2014-03-25 48 views

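Convert multi-channel PyAudio into NumPy array

All the examples I can find are mono, with CHANNELS = 1. How do I read stereo or multichannel input using the callback method in PyAudio and convert it into a 2D NumPy array or multiple 1D arrays?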

For mono input, something like this works:

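import numpy as np
import pyaudio

# p is a pyaudio.PyAudio() instance and fs is the sample rate in Hz;
# both are defined elsewhere.

def callback(in_data, frame_count, time_info, status):
    global result
    global result_waiting

    if in_data:
        # np.frombuffer replaces the deprecated binary mode of np.fromstring
        result = np.frombuffer(in_data, dtype=np.float32)
        result_waiting = True
    else:
        print('no input')

    return None, pyaudio.paContinue

stream = p.open(format=pyaudio.paFloat32,
                channels=1,
                rate=fs,
                output=False,
                input=True,
                frames_per_buffer=fs,
                stream_callback=callback)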

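But this does not work for stereo input: the result array is twice as long, so I assume the channels are interleaved or something similar, but I can't find documentation for this.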

I want to write an array and play it using PyAudio. Any ideas on that? – SolessChong

@SolessChong I've added that functionality to my answer below – endolith

Answer

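The data appears to be interleaved sample by sample, with the left channel first. With a signal on the left channel input and silence on the right channel, I get: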

result = [0.2776, -0.0002, 0.2732, -0.0002, 0.2688, -0.0001, 0.2643, -0.0003, 0.2599, ... 

So to separate it out into a stereo stream, reshape it into a 2D array:

result = np.frombuffer(in_data, dtype=np.float32)
result = np.reshape(result, (frames_per_buffer, 2))

Now, to access the left channel, use result[:, 0], and for the right channel, use result[:, 1].
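
The interleaved layout described above can be checked without any audio hardware. The following is a minimal sketch, assuming float32 samples and two channels; the synthetic `in_data` buffer stands in for the bytes PyAudio would pass to the callback, and the variable names are illustrative only.

```python
import numpy as np

channels = 2
frames = 4

# Synthetic stand-in for the in_data bytes PyAudio passes to the callback:
# the left channel holds 0, 1, 2, 3 and the right channel holds 10, 11, 12, 13.
left = np.arange(frames, dtype=np.float32)
right = left + 10
# Row-major bytes of a (frames, channels) array are L0, R0, L1, R1, ...
in_data = np.column_stack((left, right)).tobytes()

flat = np.frombuffer(in_data, dtype=np.float32)
# -1 lets NumPy infer the frame count from the buffer length
result = flat.reshape(-1, channels)

print(flat)          # interleaved samples
print(result[:, 0])  # left channel
print(result[:, 1])  # right channel
```

Using `-1` for the first dimension avoids hard-coding `frames_per_buffer`, which may help if the last buffer of a stream is shorter than the others.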

def decode(in_data, channels): 
    """ 
    Convert a byte stream into a 2D numpy array with 
    shape (chunk_size, channels) 

    Samples are interleaved, so for a stereo stream with left channel 
    of [L0, L1, L2, ...] and right channel of [R0, R1, R2, ...], the output 
    is ordered as [L0, R0, L1, R1, ...] 
    """ 
    # TODO: handle data type as parameter, convert between pyaudio/numpy types 
    result = np.frombuffer(in_data, dtype=np.float32)

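    # reshape needs integer dimensions; "/" would give a float in Python 3
    chunk_length, remainder = divmod(len(result), channels)
    assert remainder == 0, "buffer length is not a multiple of channels"

    result = np.reshape(result, (chunk_length, channels))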
    return result 


def encode(signal): 
    """ 
    Convert a 2D numpy array into a byte stream for PyAudio 

    Signal should be a numpy array with shape (chunk_size, channels) 
    """ 
    interleaved = signal.flatten() 

    # TODO: handle data type as parameter, convert between pyaudio/numpy types 
    out_data = interleaved.astype(np.float32).tobytes()
    return out_data 
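
As a rough illustration of how the two directions fit together, here is a self-contained round-trip sketch. The helper names `decode_frames`, `encode_frames` and `play` are hypothetical (they are not part of PyAudio or of the answer above), the 44100 Hz rate is an assumption, and float32 samples are assumed throughout. The `play` helper uses PyAudio's blocking `stream.write` and is left uncalled so the rest runs without an audio device.

```python
import numpy as np


def decode_frames(in_data, channels):
    """Bytes -> array of shape (frames, channels); float32 is assumed."""
    return np.frombuffer(in_data, dtype=np.float32).reshape(-1, channels)


def encode_frames(signal):
    """Array of shape (frames, channels) -> interleaved float32 bytes."""
    return np.ascontiguousarray(signal, dtype=np.float32).tobytes()


def play(signal, fs):
    """Blocking playback; needs PyAudio and an output device."""
    import pyaudio  # imported here so the rest runs without PyAudio

    p = pyaudio.PyAudio()
    stream = p.open(format=pyaudio.paFloat32,
                    channels=signal.shape[1],
                    rate=fs,
                    output=True)
    try:
        stream.write(encode_frames(signal))
    finally:
        stream.stop_stream()
        stream.close()
        p.terminate()


fs = 44100  # assumed sample rate in Hz
t = np.arange(fs) / fs
# One second of stereo: a 440 Hz tone on the left, silence on the right
signal = np.column_stack((0.2 * np.sin(2 * np.pi * 440 * t),
                          np.zeros_like(t))).astype(np.float32)

out_data = encode_frames(signal)
round_trip = decode_frames(out_data, channels=2)
# play(signal, fs)  # uncomment on a machine with audio output
```

Encoding and then decoding should return the original array unchanged, which is a convenient sanity check before involving a real stream.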

Very helpful. Partially related to [this question](http://stackoverflow.com/questions/22927096/how-to-print-values-of-a-string-full-of-chaos-question-marks/22927836?noredirect=1#comment35005843_22927836) – SolessChong

[Audio encoding with other data formats](https://stackoverflow.com/a/24985016/3002273) (e.g. 'np.int16') –
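
The linked answer is not reproduced here, but the same reshape idea plausibly carries over to 16-bit integer samples (`pyaudio.paInt16` on the PyAudio side). The sketch below is one possible approach, not the linked author's code: the helper names `decode_int16` and `encode_int16` and the scaling by 32768 are assumptions chosen for illustration.

```python
import numpy as np

INT16_SCALE = 32768.0  # 2**15, magnitude of the most negative int16 value


def decode_int16(in_data, channels):
    """Interleaved int16 bytes -> float32 array in [-1.0, 1.0)."""
    samples = np.frombuffer(in_data, dtype=np.int16).reshape(-1, channels)
    return samples.astype(np.float32) / INT16_SCALE


def encode_int16(signal):
    """Float array of shape (frames, channels) -> interleaved int16 bytes."""
    scaled = np.rint(np.asarray(signal) * INT16_SCALE)
    # Clip so that +1.0 does not overflow the int16 range
    clipped = np.clip(scaled, -32768, 32767)
    return clipped.astype(np.int16).tobytes()


# Two stereo frames covering zero, both extremes and a mid-scale value
raw = np.array([[0, -32768], [16384, 32767]], dtype=np.int16).tobytes()
decoded = decode_int16(raw, channels=2)
re_encoded = encode_int16(decoded)
```

Both helpers use the machine's native byte order, so they are consistent with each other; data from another source may need an explicit dtype such as `'<i2'`.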