2012-10-06 27 views
3

Batch measurements with sox stats

My question is similar to a previous question about getting the mean amplitude of a .wav file from sox:

Get Mean amplitude(only) of .wav from sox

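I would like to use the stat effect of sox to take batch measurements of 1,000 .wav files in a directory, and store the results in a data frame or a similar structure that can be saved as a CSV text file.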

For a single sound file, the command would be:

./sox SampleSound.wav -n stat

which gives the following output:

Samples read:   72000000 
Length (seconds): 3600.000000 
Scaled by:   2147483647.0 
Maximum amplitude:  0.778809 
Minimum amplitude: -1.000000 
Midline amplitude: -0.110596 
Mean norm:   0.062671 
Mean amplitude: -0.008131 
RMS  amplitude:  0.172914 
Maximum delta:   1.778809 
Minimum delta:   0.000000 
Mean delta:   0.014475 
RMS  delta:   0.057648 
Rough frequency:   1061 
Volume adjustment:  1.000 

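I would like to:

- take batch measurements of the 1,000 sound files in a specified directory,
- capture the stat output in columns, along with the name of the sound file that was measured,
- and export the result for use as covariates in an analysis in R.

Thanks!

Matthew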

Answers

9

First, you need to make a system call to sox and capture its output. For example:

> spam = system("sox worf.wav -n stat 2>&1", intern = TRUE) 
> spam 
[1] "Samples read:    34000" "Length (seconds):  3.083900" 
[3] "Scaled by:   2147483647.0" "Maximum amplitude:  0.999969" 
[5] "Minimum amplitude: -0.938721" "Midline amplitude:  0.030624" 
[7] "Mean norm:   0.190602" "Mean amplitude: -0.004302" 
[9] "RMS  amplitude:  0.244978" "Maximum delta:   1.340240" 
[11] "Minimum delta:   0.000000" "Mean delta:   0.051444" 
[13] "RMS  delta:   0.099933" "Rough frequency:   715" 
[15] "Volume adjustment:  1.000" 

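Setting intern = TRUE returns the output of the command so it can be assigned to a variable. Oddly, sox writes the stat output to stderr rather than stdout, hence the need for 2>&1. The best way forward now is to wrap this in a function that also post-processes the output of system: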

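get_wav_stats = function(wav_file) { 
    # Run sox; shQuote() protects file names containing spaces or shell characters
    rough_wav_stats = system(sprintf("sox %s -n stat 2>&1", shQuote(wav_file)), intern = TRUE) 
    # Split each "name: value" line at the colon and stack the pieces into two columns
    wav_stats = data.frame(do.call("rbind", strsplit(rough_wav_stats, split = ":"))) 
    names(wav_stats) = c("variable", "value") 
    # Convert the value column from text (or factor) to numeric
    wav_stats = transform(wav_stats, value = as.numeric(as.character(value))) 
    return(wav_stats) 
} 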
> spam = get_wav_stats("worf.wav") 
> spam 
      variable   value 
1  Samples read 3.400000e+04 
2 Length (seconds) 3.083900e+00 
3   Scaled by 2.147484e+09 
4 Maximum amplitude 9.999690e-01 
5 Minimum amplitude -9.387210e-01 
6 Midline amplitude 3.062400e-02 
7  Mean norm 1.906020e-01 
8 Mean amplitude -4.302000e-03 
9 RMS  amplitude 2.449780e-01 
10  Maximum delta 1.340240e+00 
11  Minimum delta 0.000000e+00 
12  Mean delta 5.144400e-02 
13  RMS  delta 9.993300e-02 
14 Rough frequency 7.150000e+02 
15 Volume adjustment 1.000000e+00 
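
The parsing step can also be kept separate from the system call, which makes it possible to check the parser without sox installed. The following is only a sketch under the assumption that every line has the `Label: value` layout shown above; `parse_sox_stat` and `get_wav_stats2` are hypothetical names introduced here, not part of sox or R. It uses `system2()` with `stdout = TRUE, stderr = TRUE`, which captures both streams, so no explicit `2>&1` is needed.

```r
# Parse the lines printed by `sox <file> -n stat` into a data frame.
# Assumption: each useful line looks like "Label:   value".
parse_sox_stat = function(lines) {
    lines = lines[grepl(":", lines, fixed = TRUE)]          # keep lines that contain a colon
    variable = trimws(sub(":.*$", "", lines))               # text before the first colon
    value = as.numeric(trimws(sub("^[^:]*:", "", lines)))   # text after the first colon
    data.frame(variable = variable, value = value, stringsAsFactors = FALSE)
}

# Hypothetical wrapper: run sox and hand its output to the parser.
get_wav_stats2 = function(wav_file) {
    out = system2("sox", c(shQuote(wav_file), "-n", "stat"),
                  stdout = TRUE, stderr = TRUE)
    parse_sox_stat(out)
}
```

If sox prints a warning line that contains a colon, its value will come back as NA (with a coercion warning), so it may be worth inspecting the result for NA rows.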

Next, you can wrap this in an apply loop to get the statistics for all files in a given directory:

# files_dir = list.files("path", full.names = TRUE) 
# For this example I create a mock list: 
files_dir = rep("worf.wav", 10) 
stat_wavs = lapply(files_dir, get_wav_stats) 
> str(stat_wavs) 
    List of 10 
    $ :'data.frame': 15 obs. of 2 variables: 
     ..$ variable: Factor w/ 15 levels "Length (seconds)",..: 13 1 14 2 8 7 6 4 10 3 ... 
     ..$ value : num [1:15] 3.40e+04 3.08 2.15e+09 1.00 -9.39e-01 ... 
    $ :'data.frame': 15 obs. of 2 variables: 
     ..$ variable: Factor w/ 15 levels "Length (seconds)",..: 13 1 14 2 8 7 6 4 10 3 ... 
     ..$ value : num [1:15] 3.40e+04 3.08 2.15e+09 1.00 -9.39e-01 ... 
<<snip>> 
    $ :'data.frame': 15 obs. of 2 variables: 
     ..$ variable: Factor w/ 15 levels "Length (seconds)",..: 13 1 14 2 8 7 6 4 10 3 ... 
     ..$ value : num [1:15] 3.40e+04 3.08 2.15e+09 1.00 -9.39e-01 ... 
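
For a real directory, the mock list can be replaced by a file listing restricted to .wav files, with each result named after its file so it can be traced back to its source. This is a rough sketch; `stats_for_dir` is a hypothetical helper, and `stat_fun` stands for whatever per-file function is used (for example `get_wav_stats` from above).

```r
# Hypothetical helper: apply a per-file statistics function to every
# .wav file in a directory and name each result after its file.
stats_for_dir = function(path, stat_fun) {
    wav_files = list.files(path, pattern = "\\.wav$", full.names = TRUE,
                           ignore.case = TRUE)
    results = lapply(wav_files, stat_fun)
    names(results) = basename(wav_files)
    results
}

# Usage, assuming get_wav_stats() from above and a directory of recordings:
# stat_wavs = stats_for_dir("path", get_wav_stats)
```

The `pattern` argument keeps non-audio files in the same directory from being passed to sox.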

Then extract only the value column, which contains the statistics you need:

stats4files = data.frame(do.call("rbind", lapply(stat_wavs, "[[", 2))) 
names(stats4files) = stat_wavs[[1]][[1]] 
rownames(stats4files) = files_dir # this doesn't work actually because I have repeated the same file multiple times :) 

> stats4files 
    Samples read Length (seconds) Scaled by Maximum amplitude Minimum amplitude Midline amplitude 
1   34000   3.0839 2147483647   0.999969   -0.938721   0.030624 
2   34000   3.0839 2147483647   0.999969   -0.938721   0.030624 
3   34000   3.0839 2147483647   0.999969   -0.938721   0.030624 
4   34000   3.0839 2147483647   0.999969   -0.938721   0.030624 
5   34000   3.0839 2147483647   0.999969   -0.938721   0.030624 
6   34000   3.0839 2147483647   0.999969   -0.938721   0.030624 
7   34000   3.0839 2147483647   0.999969   -0.938721   0.030624 
8   34000   3.0839 2147483647   0.999969   -0.938721   0.030624 
9   34000   3.0839 2147483647   0.999969   -0.938721   0.030624 
10  34000   3.0839 2147483647   0.999969   -0.938721   0.030624 
    Mean norm Mean amplitude RMS  amplitude Maximum delta Minimum delta Mean delta 
1  0.190602   -0.004302   0.244978  1.34024    0  0.051444 
2  0.190602   -0.004302   0.244978  1.34024    0  0.051444 
3  0.190602   -0.004302   0.244978  1.34024    0  0.051444 
4  0.190602   -0.004302   0.244978  1.34024    0  0.051444 
5  0.190602   -0.004302   0.244978  1.34024    0  0.051444 
6  0.190602   -0.004302   0.244978  1.34024    0  0.051444 
7  0.190602   -0.004302   0.244978  1.34024    0  0.051444 
8  0.190602   -0.004302   0.244978  1.34024    0  0.051444 
9  0.190602   -0.004302   0.244978  1.34024    0  0.051444 
10  0.190602   -0.004302   0.244978  1.34024    0  0.051444 
    RMS  delta Rough frequency Volume adjustment 
1  0.099933    715     1 
2  0.099933    715     1 
3  0.099933    715     1 
4  0.099933    715     1 
5  0.099933    715     1 
6  0.099933    715     1 
7  0.099933    715     1 
8  0.099933    715     1 
9  0.099933    715     1 
10  0.099933    715     1 
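
Since row names must be unique, one option is to store the file name in an ordinary column instead, which also carries over cleanly to a CSV file. A minimal sketch, assuming every file yields the same statistics in the same order; `stats_to_wide` and the output file name `wav_stats.csv` are hypothetical.

```r
# Hypothetical helper: turn a list of per-file stat data frames
# (columns `variable` and `value`) into one row per file.
stats_to_wide = function(stat_list, files) {
    values = do.call("rbind", lapply(stat_list, function(s) s$value))
    wide = data.frame(file = files, values, stringsAsFactors = FALSE)
    names(wide) = c("file", as.character(stat_list[[1]]$variable))
    wide
}

# Usage, assuming stat_wavs and files_dir from above:
# stats4files = stats_to_wide(stat_wavs, basename(files_dir))
# write.csv(stats4files, "wav_stats.csv", row.names = FALSE)
```

The resulting file can be read back with `read.csv()` and merged with other covariates by the `file` column.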

I would have told him to do all the preprocessing in Python or something similar, but your way is really nice. Would +2 if I could. – drammock


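Thanks Paul! This works like a charm... once I stopped trying to run it in my terminal window and ran it in R instead... ooops :) I really appreciate your elegant solution, and I have learned a few things that will help me with other challenges in R. – user1267299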


That is the case, but I am having trouble getting the first line to run: – user3535074