
I've been working on an Alexa app for iOS for a while, but I've been struggling to send microphone audio to the AVS API as a stream. How do I stream audio to the v20160207 Alexa Voice Service API from iOS?

I have successfully pre-recorded an audio sample, sent it as a whole, and received a response.

I just want to know how to stream the data to AVS over an NSURLSession HTTP/2 connection.

The snippet below is exactly what I'm doing now:

func sendData() { 
     let request = NSMutableURLRequest(URL: NSURL(string: "https://avs-alexa-na.amazon.com/v20160207/events")!) 
     request.setValue("Bearer \(Settings.Credentials.TOKEN)", forHTTPHeaderField: "authorization") 
     request.HTTPMethod = "POST" 

     let boundry = NSUUID().UUIDString 
     let contentType = "multipart/form-data; boundary=\(boundry)" 
     request.setValue(contentType, forHTTPHeaderField: "content-type") 

     let bodyData = NSMutableData() 

     let jsonData = "{\"context\":[{\"header\":{\"namespace\":\"Alerts\",\"name\":\"AlertsState\"},\"payload\":{\"allAlerts\":[],\"activeAlerts\":[]}},{\"header\":{\"namespace\":\"AudioPlayer\",\"name\":\"PlaybackState\"},\"payload\":{\"token\":\"\",\"offsetInMilliseconds\":0,\"playerActivity\":\"IDLE\"}},{\"header\":{\"namespace\":\"Speaker\",\"name\":\"VolumeState\"},\"payload\":{\"volume\":25,\"muted\":false}},{\"header\":{\"namespace\":\"SpeechSynthesizer\",\"name\":\"SpeechState\"},\"payload\":{\"token\":\"\",\"offsetInMilliseconds\":0,\"playerActivity\":\"FINISHED\"}}],\"event\":{\"header\":{\"namespace\":\"SpeechRecognizer\",\"name\":\"Recognize\",\"messageId\":\"messageId-123\",\"dialogRequestId\":\"dialogRequestId-321\"},\"payload\":{\"profile\":\"CLOSE_TALK\",\"format\":\"AUDIO_L16_RATE_16000_CHANNELS_1\"}}}" 

     bodyData.appendData("--\(boundry)\r\n".dataUsingEncoding(NSUTF8StringEncoding)!) 
     bodyData.appendData("Content-Disposition: form-data; name=\"metadata\"\r\n".dataUsingEncoding(NSUTF8StringEncoding)!) 
     bodyData.appendData("Content-Type: application/json; charset=UTF-8\r\n\r\n".dataUsingEncoding(NSUTF8StringEncoding)!) 
     bodyData.appendData(jsonData.dataUsingEncoding(NSUTF8StringEncoding)!) 
     bodyData.appendData("\r\n".dataUsingEncoding(NSUTF8StringEncoding)!) 

     bodyData.appendData("--\(boundry)\r\n".dataUsingEncoding(NSUTF8StringEncoding)!) 

     bodyData.appendData("Content-Disposition: form-data; name=\"audio\"\r\n".dataUsingEncoding(NSUTF8StringEncoding)!) 
     //  bodyData.appendData("Content-Type: audio/L16; rate=16000; channels=1\r\n\r\n".dataUsingEncoding(NSUTF8StringEncoding)!) 
     bodyData.appendData("Content-Type: application/octet-stream\r\n\r\n".dataUsingEncoding(NSUTF8StringEncoding)!) 
     bodyData.appendData(audioData!) 
     bodyData.appendData("\r\n".dataUsingEncoding(NSUTF8StringEncoding)!) 

     bodyData.appendData("--\(boundry)\r\n".dataUsingEncoding(NSUTF8StringEncoding)!) 
     // Note: mutating sharedSession()'s configuration has no effect once the
     // session exists, so create the session from a configured object instead.
     let configuration = NSURLSessionConfiguration.defaultSessionConfiguration()
     configuration.timeoutIntervalForResource = 60000
     configuration.timeoutIntervalForRequest = 60000
     session = NSURLSession(configuration: configuration)

     let upload = session.uploadTaskWithRequest(request, fromData: bodyData) { (data, response, error) in 
      print("done") 
      if(data?.length > 0) { 
       print("break") 
      } 
      if let httpResponse = response as? NSHTTPURLResponse { 
       if let responseData = data, let contentTypeHeader = httpResponse.allHeaderFields["Content-Type"] { 

        var boundry: String? 
        let ctbRange = contentTypeHeader.rangeOfString("boundary=.*?;", options: .RegularExpressionSearch) 
        if ctbRange.location != NSNotFound { 
         let boundryNSS = contentTypeHeader.substringWithRange(ctbRange) as NSString 
         boundry = boundryNSS.substringWithRange(NSRange(location: 9, length: boundryNSS.length - 10)) 
        } 

        if let b = boundry { 
         let parts = self.parseResponse(responseData, boundry: b) 
         print("got parts") 
//      self.sendSynchronize() 
         self.successHandler?(data: responseData, parts:self.parseResponse(responseData, boundry: b)) 
        } else { 
         print("something went wrong") 
         self.errorHandler?(error: NSError(domain: Settings.Error.ErrorDomain, code: Settings.Error.AVSResponseBorderParseErrorCode, userInfo: [NSLocalizedDescriptionKey : "Could not find boundry in AVS response"])) 
        } 
       } 
      } 
     } 

     upload.resume() 
    } 

This function gets called for every 320 bytes of audio data, since that is the chunk size Amazon recommends for streaming :)
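For reference, here's a minimal sketch of that chunking (Swift 3 syntax, unlike the Swift 2 code above); forEachChunk and send are just illustrative names:

import Foundation

// Slice captured PCM into 320-byte chunks, i.e. 10 ms of 16 kHz,
// 16-bit, mono audio (AUDIO_L16_RATE_16000_CHANNELS_1).
// `send` stands in for whatever actually transmits one chunk.
func forEachChunk(of pcm: Data, send: (Data) -> Void) {
    let chunkSize = 320
    var offset = 0
    while offset < pcm.count {
        let end = min(offset + chunkSize, pcm.count)
        send(pcm.subdata(in: offset..<end))
        offset = end
    }
}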

Cheers!


Any luck @tomwyckhuys? I've run into the same problem. I also tried removing the closing boundary terminator. –

Answer


You should send the JSON metadata headers only once, at the start of the dialog request (i.e., the moment the microphone opens and recording starts), as sketched below.
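A minimal sketch of that idea in Swift 3 syntax, with makePreamble as an illustrative name: build the multipart preamble (the metadata part plus the opening of the audio part) exactly once per dialog request, so that every later write carries only raw audio bytes:

import Foundation

// Build the multipart preamble once per dialog request; `boundary`
// must stay constant for the lifetime of the request.
func makePreamble(boundary: String, metadataJSON: String) -> Data {
    var preamble = Data()
    func append(_ s: String) { preamble.append(s.data(using: .utf8)!) }
    append("--\(boundary)\r\n")
    append("Content-Disposition: form-data; name=\"metadata\"\r\n")
    append("Content-Type: application/json; charset=UTF-8\r\n\r\n")
    append(metadataJSON)
    append("\r\n--\(boundary)\r\n")
    append("Content-Disposition: form-data; name=\"audio\"\r\n")
    append("Content-Type: application/octet-stream\r\n\r\n")
    return preamble
}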

You also need to use the same boundary value every time you call your sendData method for the same stream, and the whole request has to go over the same HTTP/2 stream, which means you'll need to restructure sendData to accommodate that. An example using uploadTask(withStreamedRequest:) may help (you will most likely need to use it).
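Here's a rough sketch of how the streamed-request pieces could fit together, again in Swift 3 syntax; the class and method names are illustrative, and backpressure handling (checking hasSpaceAvailable before writing) is left out for brevity:

import Foundation

// Whatever is written to `outputStream` becomes readable from
// `inputStream`, which backs the HTTP body for as long as it stays open.
final class AVSStreamer: NSObject, URLSessionTaskDelegate {
    private var inputStream: InputStream?
    private var outputStream: OutputStream?

    func start(request: URLRequest) {
        Stream.getBoundStreams(withBufferSize: 4096,
                               inputStream: &inputStream,
                               outputStream: &outputStream)
        outputStream?.open()
        let session = URLSession(configuration: .default,
                                 delegate: self, delegateQueue: nil)
        session.uploadTask(withStreamedRequest: request).resume()
    }

    // The session pulls the request body from the stream handed over here.
    func urlSession(_ session: URLSession, task: URLSessionTask,
                    needNewBodyStream completionHandler: @escaping (InputStream?) -> Void) {
        completionHandler(inputStream)
    }

    // Call once with the multipart preamble, then once per 320-byte audio chunk.
    func write(_ data: Data) {
        data.withUnsafeBytes { (bytes: UnsafePointer<UInt8>) in
            _ = outputStream?.write(bytes, maxLength: data.count)
        }
    }

    // When the microphone closes, terminate the multipart body and
    // close the stream so the request can complete.
    func finish(boundary: String) {
        write("\r\n--\(boundary)--\r\n".data(using: .utf8)!)
        outputStream?.close()
    }
}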

I'm not familiar with the Swift HTTP/2 APIs, so I don't know whether they handle the continuation frames for you or whether you need to manage that yourself; that's something to watch out for. Good luck, I hope this helps.