How to change the audio input used for Android speech-to-text

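I'm fairly new to Android programming, and I recently discovered the speech-to-text API available on Android. I found plenty of tutorials on the web that explain well how to use this feature, but they all work the same way: the app uses an Intent to start recognition, and the audio input is never specified in the code.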

My question is: is it possible to do this the way AudioRecord does, and specify exactly which audio source to use (for example MediaRecorder.AudioSource.MIC)?

I think this is the standard way to do it, but here is how I implemented speech-to-text:

private void askSpeechInput() { 
    Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH); 
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, 
      RecognizerIntent.LANGUAGE_MODEL_FREE_FORM); 
    // EXTRA_LANGUAGE expects an IETF language tag string such as "en-US", not a Locale object 
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "en-US"); 

    try { 
     startActivityForResult(intent, REQ_CODE_SPEECH_INPUT); 
    } catch (ActivityNotFoundException a) { 
     // No speech recognition activity is installed on this device. 
    } 
}

Then I handle the result to get the text:

@Override 
public void onActivityResult(int requestCode, int resultCode, Intent data) { 
    super.onActivityResult(requestCode, resultCode, data); 

    switch (requestCode) { 
     case REQ_CODE_SPEECH_INPUT: { 
      if (resultCode == RESULT_OK && null != data) { 
       ArrayList<String> result = data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS); 
       String message = ""; 
       message = result.get(0); 
       //Do whatever i want with my message 
      } 
      break; 
     } 
    } 
}

So this code works with microphone input, but is it possible to change that input?


2017-05-15 unMaxEnRad

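What do you want to change it to? It uses Bluetooth if available... If you are trying to run speech recognition on recorded speech, Google offers an API for that (and they will happily charge you for using it). – 323go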

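Actually, I'm trying to change it to the Android call source VOICE_DOWNLINK, which is the voice received during a call. I found a workaround, which is simply to put the call on loudspeaker while talking, but that means the voice of the person holding the phone also gets converted to text, and I don't want that. – unMaxEnRad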

Well, I don't know whether it will help anyone, but I found a solution to this problem.

First, I record the sound with AudioRecord, using the input I want via MediaRecorder.AudioSource, and save it to a file.

private void startRecording() { 
    recorder = new AudioRecord(MediaRecorder.AudioSource.MIC, 
      RECORDER_SAMPLERATE, RECORDER_CHANNELS, 
      RECORDER_AUDIO_ENCODING, BufferElements2Rec * BytesPerElement); 
    recorder.startRecording(); 
    isRecording = true; 
    recordingThread = new Thread(new Runnable() { 
     public void run() { 
      writeAudioDataToFile(); 
     } 
    }, "AudioRecorder Thread"); 
    recordingThread.start(); 
}

After that, I used a FLAC encoder to encode the .wav file as .flac.
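The answer does not show how the recorded data becomes a .wav file. AudioRecord delivers raw PCM samples, so one plausible step is to prepend a standard 44-byte RIFF/WAVE header before handing the file to the FLAC encoder. The following is only a sketch under that assumption; the class name `WavWriter` and its methods are hypothetical and do not come from the original post.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not from the original answer): wraps raw PCM in a WAV container. */
class WavWriter {

    /** Builds the canonical 44-byte RIFF/WAVE header for uncompressed PCM data. */
    static byte[] header(int pcmLength, int sampleRate, int channels, int bitsPerSample) {
        int blockAlign = channels * bitsPerSample / 8;   // bytes per sample frame
        int byteRate = sampleRate * blockAlign;          // bytes per second of audio
        ByteBuffer b = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        b.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        b.putInt(36 + pcmLength);                        // size of everything after this field
        b.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        b.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        b.putInt(16);                                    // fmt chunk size for plain PCM
        b.putShort((short) 1);                           // audio format 1 = PCM
        b.putShort((short) channels);
        b.putInt(sampleRate);
        b.putInt(byteRate);
        b.putShort((short) blockAlign);
        b.putShort((short) bitsPerSample);
        b.put("data".getBytes(StandardCharsets.US_ASCII));
        b.putInt(pcmLength);                             // size of the sample data that follows
        return b.array();
    }

    /** Returns header + samples, i.e. the bytes of a complete .wav file. */
    static byte[] wrap(byte[] pcm, int sampleRate, int channels, int bitsPerSample) {
        byte[] h = header(pcm.length, sampleRate, channels, bitsPerSample);
        ByteArrayOutputStream out = new ByteArrayOutputStream(h.length + pcm.length);
        out.write(h, 0, h.length);
        out.write(pcm, 0, pcm.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Assumed recording format: 16 kHz, mono, 16-bit; adjust to match the AudioRecord setup.
        byte[] wav = wrap(new byte[32000], 16000, 1, 16);
        System.out.println("WAV size in bytes: " + wav.length);
    }
}
```

The sample rate, channel count and bit depth passed to `wrap` have to match the values used when creating the AudioRecord instance, otherwise the encoder will misinterpret the samples.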

Finally, I found some code that let me send the FLAC file directly to the Google API and receive the text I wanted!

public void getTranscription(int sampleRate) { 

    File myfil = new File(fileName); 
    if (!myfil.canRead()) { 
     Log.d("ParseStarter", "FATAL no read access"); 
     System.out.println("FATAL CAN'T READ"); 
    } 

    // first is a GET for the speech-api DOWNSTREAM 
    // then a future exec for the UPSTREAM/chunked encoding used so as not 
    // to limit 
    // the POST body sz 

    PAIR = MIN + (long) (Math.random() * ((MAX - MIN) + 1L)); 
    // DOWN URL just like in curl full-duplex example plus the handler 
    downChannel(API_DOWN_URL + PAIR, messageHandler); 

    // UP chan, process the audio byteStream for interface to UrlConnection 
    // using 'chunked-encoding' 
    FileInputStream fis; 
    try { 
     fis = new FileInputStream(myfil); 
     FileChannel fc = fis.getChannel(); // Get the file's size and then 
     // map it into memory 
     int sz = (int) fc.size(); 
     MappedByteBuffer bb = fc.map(FileChannel.MapMode.READ_ONLY, 0, sz); 
     byte[] data2 = new byte[bb.remaining()]; 
     Log.d("ParseStarter", "mapfil " + sz + " " + bb.remaining()); 
     bb.get(data2); 
     // conform to the interface from the curl examples on full-duplex 
     // calls 
     // see curl examples full-duplex for more on 'PAIR'. Just a globally 
     // uniq value typ=long->String. 
     // API KEY value is part of value in UP_URL_p2 
     upChannel(root + up_p1 + PAIR + up_p2 + api_key, messageHandler2, 
       data2); 
    } catch (FileNotFoundException e) { 
     // TODO Auto-generated catch block 
     e.printStackTrace(); 
    } catch (IOException e) { 
     // TODO Auto-generated catch block 
     e.printStackTrace(); 
    } 
} 

private void downChannel(String urlStr, final Handler messageHandler) { 

    final String url = urlStr; 

    new Thread() { 
     Bundle b; 

     public void run() { 
      Message msg = Message.obtain(); 
      msg.what = 1; 
      // handler for DOWN channel http response stream - httpsUrlConn 
      // response handler should manage the connection.... ?? 
      // assign a TIMEOUT Value that exceeds by a safe factor 
      // the amount of time that it will take to write the bytes 
      // to the UPChannel in a fashion that mimics a liveStream 
      // of the audio at the applicable Bitrate. BR=sampleRate * bits 
      // per sample 
      // Note that the TLS session uses 
      // "* SSLv3, TLS alert, Client hello (1): " 
      // to wake up the listener when there are additional bytes. 
      // The mechanics of the TLS session should be transparent. Just 
      // use 
      // httpsUrlConn and allow it enough time to do its work. 
      Scanner inStream = openHttpsConnection(url); 
      if (inStream == null) { 
       return; // connection failed or the response was not HTTP 200 
      } 
      // forward each line of the response stream to the handler 
      while (inStream.hasNextLine()) { 
       b = new Bundle(); 
       b.putString("text", inStream.nextLine()); 
       msg.setData(b); 
       messageHandler.dispatchMessage(msg); 
      } 

     } 
    }.start(); 
} 

private void upChannel(String urlStr, final Handler messageHandler, 
         byte[] arg3) { 

    final String murl = urlStr; 
    final byte[] mdata = arg3; 
    Log.d("ParseStarter", "upChan " + mdata.length); 
    new Thread() { 
     public void run() { 
      StringBuilder response = new StringBuilder(); 
      Message msg = Message.obtain(); 
      msg.what = 2; 
      Scanner inStream = openHttpsPostConnection(murl, mdata); 
      if (inStream == null) { 
       return; // POST failed or the response was not HTTP 200 
      } 
      // collect the whole response body 
      while (inStream.hasNextLine()) { 
       response.append(inStream.nextLine()); 
       Log.d("ParseStarter", "POST resp " + response.length()); 
      } 
      Bundle b = new Bundle(); 
      b.putString("post", response.toString()); 
      msg.setData(b); 
      // in.close(); // mind the resources 
      messageHandler.sendMessage(msg); 

     } 
    }.start(); 

} 

// GET for DOWNSTREAM 
private Scanner openHttpsConnection(String urlStr) { 
    InputStream in = null; 
    int resCode = -1; 
    Log.d("ParseStarter", "dwnURL " + urlStr); 

    try { 
     URL url = new URL(urlStr); 
     URLConnection urlConn = url.openConnection(); 

     if (!(urlConn instanceof HttpsURLConnection)) { 
      throw new IOException("URL is not an Https URL"); 
     } 

     HttpsURLConnection httpConn = (HttpsURLConnection) urlConn; 
     httpConn.setAllowUserInteraction(false); 
     // TIMEOUT is required 
     httpConn.setInstanceFollowRedirects(true); 
     httpConn.setRequestMethod("GET"); 

     httpConn.connect(); 

     resCode = httpConn.getResponseCode(); 
     if (resCode == HttpsURLConnection.HTTP_OK) { 
      return new Scanner(httpConn.getInputStream()); 
     } 

    } catch (MalformedURLException e) { 
     e.printStackTrace(); 
    } catch (IOException e) { 
     e.printStackTrace(); 
    } 
    return null; 
} 

// POST for UPSTREAM 
private Scanner openHttpsPostConnection(String urlStr, byte[] data) { 
    InputStream in = null; 
    byte[] mextrad = data; 
    int resCode = -1; 
    OutputStream out = null; 
    // int http_status; 
    try { 
     URL url = new URL(urlStr); 
     URLConnection urlConn = url.openConnection(); 

     if (!(urlConn instanceof HttpsURLConnection)) { 
      throw new IOException("URL is not an Https URL"); 
     } 

     HttpsURLConnection httpConn = (HttpsURLConnection) urlConn; 
     httpConn.setAllowUserInteraction(false); 
     httpConn.setInstanceFollowRedirects(true); 
     httpConn.setRequestMethod("POST"); 
     httpConn.setDoOutput(true); 
     httpConn.setChunkedStreamingMode(0); 
     httpConn.setRequestProperty("Content-Type", "audio/x-flac; rate=" 
       + rate); 
     httpConn.connect(); 

     try { 
      // this opens a connection, then sends POST & headers. 
      out = httpConn.getOutputStream(); 
      // Note : if the audio is more than 15 seconds 
      // dont write it to UrlConnInputStream all in one block as this 
      // sample does. 
      // Rather, segment the byteArray and on intermittently, sleeping 
      // thread 
      // supply bytes to the urlConn Stream at a rate that approaches 
      // the bitrate (=30K per sec. in this instance). 
      Log.d("ParseStarter", "IO beg on data"); 
      out.write(mextrad); // one big block supplied instantly to the 
      // underlying chunker wont work for duration 
      // > 15 s. 
      Log.d("ParseStarter", "IO fin on data"); 
      // do you need the trailer? 
      // NOW you can look at the status. 
      resCode = httpConn.getResponseCode(); 

      // log the status text itself; getBytes().toString() would only print the array identity 
      Log.d("ParseStarter", "POST resp " + httpConn.getResponseMessage()); 

      if (resCode/100 != 2) { 
       Log.d("ParseStarter", "POST bad io "); 
      } 

     } catch (IOException e) { 
      Log.d("ParseStarter", "FATAL " + e); 

     } 

     if (resCode == HttpsURLConnection.HTTP_OK) { 
      Log.d("ParseStarter", "OK RESP to POST return scanner "); 
      return new Scanner(httpConn.getInputStream()); 
     } 
    } catch (MalformedURLException e) { 
     e.printStackTrace(); 
    } catch (IOException e) { 
     e.printStackTrace(); 
    } 
    return null; 
} 







// DOWN handler 
Handler messageHandler = new Handler() { 

    public void handleMessage(Message msg) { 
     super.handleMessage(msg); 
     switch (msg.what) { 
      case 1: // GET DOWNSTREAM json 
       String mtxt = msg.getData().getString("text"); 
       if (mtxt.length() > 20) { 
        final String f_msg = mtxt; 
        handler.post(new Runnable() { // This thread runs in the UI 
         // TREATMENT FOR GOOGLE RESPONSE 
         @Override 
         public void run() { 
          System.out.println(f_msg); 


          String message = ""; 
          final ChatMessage chatMessage = new ChatMessage(user1, user2, 
            message, "" + random.nextInt(1000), true); 
          message = f_msg; 
          chatMessage.setMsgID(); 
          chatMessage.body = message; 
          chatMessage.Date = CommonMethods.getCurrentDate(); 
          chatMessage.Time = CommonMethods.getCurrentTime(); 
          msg_edittext.setText(""); 
          chatAdapter.add(chatMessage); 
          chatAdapter.notifyDataSetChanged(); 
         } 
        }); 
       } 
       break; 
      case 2: 
       break; 
     } 
    } 
}; // doDOWNSTRM Handler end 

// UPSTREAM channel. its servicing a thread and should have its own handler 
Handler messageHandler2 = new Handler() { 

    public void handleMessage(Message msg) { 
     super.handleMessage(msg); 
     switch (msg.what) { 
      case 1: // GET DOWNSTREAM json 
       Log.d("ParseStarter", msg.getData().getString("post")); 
       break; 
      case 2: 
       Log.d("ParseStarter", msg.getData().getString("post")); 
       break; 
     } 

    } 
}; // UPstream handler end

I got this part of the code from this project, which has a working connection to Google's API, but whose file encoder seems to be outdated.
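The comments in openHttpsPostConnection warn that, for audio longer than about 15 seconds, the bytes should not be written in one block but fed to the connection in segments at a rate close to the audio bitrate. A minimal sketch of that pacing idea is shown below. The class `PacedUploader`, the method `writePaced` and the chunk size are hypothetical illustrations, not part of the original project, and whether this satisfies the server is not verified here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical helper (not from the original project): throttled writes to a stream. */
class PacedUploader {

    /**
     * Writes data to out in chunks of at most chunkSize bytes, sleeping after each
     * chunk (except the last) so the average rate approaches bytesPerSecond.
     * Returns the number of chunks written.
     */
    static int writePaced(OutputStream out, byte[] data, int chunkSize, int bytesPerSecond)
            throws IOException, InterruptedException {
        if (chunkSize <= 0 || bytesPerSecond <= 0) {
            throw new IllegalArgumentException("chunkSize and bytesPerSecond must be positive");
        }
        int chunks = 0;
        for (int offset = 0; offset < data.length; offset += chunkSize) {
            int len = Math.min(chunkSize, data.length - offset);
            out.write(data, offset, len);
            out.flush();                      // push the chunk out instead of buffering it
            chunks++;
            if (offset + len < data.length) {
                // time this chunk would take to play at the target rate
                Thread.sleep(1000L * len / bytesPerSecond);
            }
        }
        return chunks;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for httpConn.getOutputStream(); 30000 B/s follows the "30K per sec" comment.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int chunks = writePaced(sink, new byte[10_000], 4_096, 30_000);
        System.out.println(chunks + " chunks, " + sink.size() + " bytes");
    }
}
```

In the code above, the call `out.write(mextrad)` would be the place to substitute something like `writePaced(out, mextrad, 4096, 30000)`, run on the existing background thread so the sleeps never block the UI.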


2017-05-24 14:26:04 unMaxEnRad
