2017-04-18 119 views
1

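Google Cloud Speech API Python code sample has a possible bug

I am working through the code snippet provided for the Google Speech API, found here. The code should be sufficient to convert a .wav file into transcribed text.

The block in question is here: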

import io

def transcribe_file(speech_file):
    """Transcribe the given audio file."""
    from google.cloud import speech
    speech_client = speech.Client()

    with io.open(speech_file, 'rb') as audio_file:
        content = audio_file.read()
        audio_sample = speech_client.sample(
            content=content,
            source_uri=None,
            encoding='LINEAR16',
            sample_rate_hertz=16000)

    alternatives = audio_sample.recognize('en-US')
    for alternative in alternatives:
        print('Transcript: {}'.format(alternative.transcript))

At first I thought the code might be outdated, and I had to change sample_rate_hertz=16000 to sample_rate=16000.
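Independently of which keyword the installed client version expects, it can help to confirm that the .wav file really is 16-bit PCM at 16000 Hz, since that is what LINEAR16 with a 16000 sample rate declares. Below is a minimal sketch using only the standard library `wave` module; the helper names `describe_wav` and `matches_linear16` are made up for illustration and are not part of the Speech client.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, bits_per_sample, channels) read from a WAV header."""
    with wave.open(path, 'rb') as wav:
        return wav.getframerate(), wav.getsampwidth() * 8, wav.getnchannels()


def matches_linear16(path, expected_rate=16000):
    """True if the file is 16-bit PCM at the expected sample rate."""
    rate, bits, _channels = describe_wav(path)
    return rate == expected_rate and bits == 16
```

Calling `matches_linear16('audio.wav')` before sending the request separates a mismatch in the audio itself from a problem in the client call. Note that `wave` only reads uncompressed PCM files and raises `wave.Error` for anything else.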

After that, I get an error on this line:
alternatives = audio_sample.recognize('en-US')
which reads
AttributeError: 'Sample' object has no attribute 'recognize'

I am curious how to fix this. I can't seem to find any documentation on this method. Perhaps it needs to be replaced as well.

+0

Please have a look [here](http://stackoverflow.com/questions/38703853/how-to-use-google-speech-recognition-api-in-python/38788928#38788928), since there is a similar working example.

Answers

1

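You need to read the file as binary, then call service.speech().syncrecognize with a single body argument (a dict) that contains all the required parameters, such as:

  • encoding
  • sample rate
  • language

You might try something like: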

import base64
import json

# get_speech_service() is not defined in this snippet; it is assumed to come
# from the linked example.
with open(speech_file, 'rb') as speech:
    speech_content = base64.b64encode(speech.read())

service = get_speech_service()
service_request = service.speech().syncrecognize(
    body={
        'config': {
            'encoding': 'LINEAR16',  # raw 16-bit signed LE samples
            'sampleRate': 16000,  # 16 kHz
            'languageCode': 'en-US',  # a BCP-47 language tag
        },
        'audio': {
            'content': speech_content.decode('UTF-8')
        }
    })
response = service_request.execute()
print(json.dumps(response))

Please have a look here, since there is a similar working example.
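To make the shape of that request body easier to check without calling the service, here is a minimal sketch that builds the same dict from raw bytes. The function name `build_syncrecognize_body` is hypothetical; the field names are copied from the snippet above, and nothing here contacts the API.

```python
import base64
import json


def build_syncrecognize_body(audio_bytes, encoding='LINEAR16',
                             sample_rate=16000, language_code='en-US'):
    """Build the dict passed as body= to syncrecognize (hypothetical helper)."""
    return {
        'config': {
            'encoding': encoding,
            'sampleRate': sample_rate,
            'languageCode': language_code,
        },
        'audio': {
            # JSON cannot carry raw bytes, so the audio is base64-encoded
            # and the resulting bytes are decoded to a str.
            'content': base64.b64encode(audio_bytes).decode('UTF-8'),
        },
    }


# Four dummy bytes stand in for real audio data.
body = build_syncrecognize_body(b'\x00\x01\x02\x03')
print(json.dumps(body, indent=2))
```

Printing the body this way is a cheap check that the content is a plain string and that the whole structure serializes to JSON before it is sent.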

1

You are using the GitHub quickstart.py example, so I am surprised that it is out of sync with the documentation for the Google Cloud Speech API class Sample. But it is still in BETA.

Assuming isinstance(audio_sample, <class Sample(object)>) == True,
then .recognize in

alternatives = audio_sample.recognize('en-US')

should be one of

async_recognize, streaming_recognize, sync_recognize
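Since the method names appear to differ between releases, one way to find out what the installed version offers is to probe the object rather than guess. The following is a rough sketch: `pick_recognize_method` is a made-up helper, and `FakeSample` is a hypothetical stand-in for the library's Sample class (its method signature is an assumption), so the snippet runs without the client installed.

```python
def pick_recognize_method(sample, candidates=('sync_recognize',
                                              'async_recognize',
                                              'streaming_recognize')):
    """Return the first candidate method name the object has, else None."""
    for name in candidates:
        if callable(getattr(sample, name, None)):
            return name
    return None


class FakeSample(object):
    """Hypothetical stand-in for Sample; only the method name matters here."""

    def sync_recognize(self, language_code):
        return []


print(pick_recognize_method(FakeSample()))
print(pick_recognize_method(object()))
```

With the real client, passing audio_sample to the helper would show which of the three names exists on that version; `dir(audio_sample)` gives the full list of attributes if none of them match.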