This page describes how to use Cloud Speech-to-Text to transcribe audio files that include more than one channel. Multi-channel recognition is available for most, but not all, audio encodings supported by Cloud STT. For information about how many channels are recognized in audio files of each encoding type, see audioChannelCount.
Audio data usually includes a channel for each speaker present on the recording. For example, audio of two people talking over the phone might contain two channels, where each line is recorded separately.
To transcribe audio data that includes multiple channels, you must provide the number of channels in your request to the Cloud Speech-to-Text API. In your request, set the audioChannelCount field in your request to the number of channels present in your audio.
When you send a request with multiple channels, Cloud STT returns a result to you that identifies the different channels present in the audio, labeling the alternatives for each result with the channelTag field.
The following code sample demonstrates how to transcribe audio that contains multiple channels.
Protocol
Refer to the speech:recognize API endpoint for complete details.
To perform synchronous speech recognition, make a POST request and provide the appropriate request body. The following shows an example of a POST request using curl. The example uses the Google Cloud CLI to generate an access token. For instructions on installing the gcloud CLI, see the quickstart.
The following example show how to send a POST request using curl, where the body of the request specifies the number of channels present on the audio sample.
curl -X POST -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \ -H "Content-Type: application/json; charset=utf-8" \ --data '{ "config": { "encoding": "LINEAR16", "languageCode": "en-US", "audioChannelCount": 2, "enableSeparateRecognitionPerChannel": true }, "audio": { "uri": "gs://cloud-samples-tests/speech/commercial_stereo.wav" } }' "https://speech.googleapis.com/v1/speech:recognize" > multi-channel.txt
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format, saved to a file named multi-channel.json.
{ "results": [ { "alternatives": [ { "transcript": "hi I'd like to buy a Chromecast I'm always wondering whether you could help me with that", "confidence": 0.8991147 } ], "channelTag": 1, "languageCode": "en-us" }, { "alternatives": [ { "transcript": "certainly which color would you like we have blue black and red", "confidence": 0.9408236 } ], "channelTag": 2, "languageCode": "en-us" }, { "alternatives": [ { "transcript": " let's go with the black one", "confidence": 0.98783094 } ], "channelTag": 1, "languageCode": "en-us" }, { "alternatives": [ { "transcript": " would you like the new Chromecast Ultra model or the regular Chromecast", "confidence": 0.9573053 } ], "channelTag": 2, "languageCode": "en-us" }, { "alternatives": [ { "transcript": " regular Chromecast is fine thank you", "confidence": 0.9671048 } ], "channelTag": 1, "languageCode": "en-us" }, { "alternatives": [ { "transcript": " okay sure would you like to ship it regular or Express", "confidence": 0.9544821 } ], "channelTag": 2, "languageCode": "en-us" }, { "alternatives": [ { "transcript": " express please", "confidence": 0.9487205 } ], "channelTag": 1, "languageCode": "en-us" }, { "alternatives": [ { "transcript": " terrific it's on the way thank you", "confidence": 0.97655964 } ], "channelTag": 2, "languageCode": "en-us" }, { "alternatives": [ { "transcript": " thank you very much bye", "confidence": 0.9735077 } ], "channelTag": 1, "languageCode": "en-us" } ] } Go
To learn how to install and use the client library for Cloud STT, see Cloud STT client libraries. For more information, see the Cloud STT Go API reference documentation.
To authenticate to Cloud STT, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
To learn how to install and use the client library for Cloud STT, see Cloud STT client libraries. For more information, see the Cloud STT Java API reference documentation.
To authenticate to Cloud STT, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
To learn how to install and use the client library for Cloud STT, see Cloud STT client libraries. For more information, see the Cloud STT Node.js API reference documentation.
To authenticate to Cloud STT, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install and use the client library for Cloud STT, see Cloud STT client libraries. For more information, see the Cloud STT Python API reference documentation.
To authenticate to Cloud STT, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Additional languages
C#: Please follow the C# setup instructions on the client libraries page and then visit the Cloud STT reference documentation for .NET.
PHP: Please follow the PHP setup instructions on the client libraries page and then visit the Cloud STT reference documentation for PHP.
Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the Cloud STT reference documentation for Ruby.