AudioPlayer Interface Reference
The AudioPlayer interface provides directives and requests for streaming audio and monitoring playback progression. Your skill can send directives to start and stop the playback. The Alexa service sends your skill AudioPlayer requests to give you information about the playback state, such as when the track is nearly finished, or when playback starts and stops. Alexa also sends PlaybackController requests in response to hardware buttons, such as on a remote control or the next and previous tap controls on Alexa-enabled devices with a screen.
For more details about the audio player, see Stream Long-Form Audio with AudioPlayer.
Directives and requests
The AudioPlayer interface includes the following directives and request types. You include the directives in a response to Alexa to start and stop an audio stream. Alexa sends the requests to notify your skill about changes to the playback state.
| Interface | Description | Type |
|---|---|---|
| Play | Requests Alexa to stream the specified audio file. | Directive |
| Stop | Requests Alexa to stop the current audio stream. | Directive |
| ClearQueue | Requests Alexa to clear the queue of all audio streams. | Directive |
| PlaybackStarted | Notifies your skill that Alexa started the audio stream specified in a Play directive. | Request |
| PlaybackFinished | Notifies your skill when the stream comes to an end on its own. | Request |
| PlaybackStopped | Sent when Alexa stops playing an audio stream in response to a voice request or an AudioPlayer directive. | Request |
| PlaybackNearlyFinished | Notifies your skill when the current stream is nearly complete and the device is ready to receive a new stream. | Request |
| PlaybackFailed | Notifies your skill when an error occurs when attempting to play a stream. | Request |
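As a rough illustration of how a skill might route the five AudioPlayer request types, the following Python sketch dispatches on `request.type`. The function name `route_audio_player_request` and the shape of the `handlers` mapping are assumptions made for this example; they aren't part of the Alexa Skills Kit.

```python
from typing import Any, Callable, Dict, Optional

# The five AudioPlayer request types described in this reference.
AUDIO_PLAYER_REQUEST_TYPES = frozenset({
    "AudioPlayer.PlaybackStarted",
    "AudioPlayer.PlaybackFinished",
    "AudioPlayer.PlaybackStopped",
    "AudioPlayer.PlaybackNearlyFinished",
    "AudioPlayer.PlaybackFailed",
})

# Hypothetical handler signature: takes the request object, returns a response or None.
Handler = Callable[[Dict[str, Any]], Optional[Dict[str, Any]]]


def route_audio_player_request(
    envelope: Dict[str, Any], handlers: Dict[str, Handler]
) -> Optional[Dict[str, Any]]:
    """Call the handler registered for an AudioPlayer request, if any."""
    request = envelope.get("request", {})
    request_type = request.get("type")
    if request_type not in AUDIO_PLAYER_REQUEST_TYPES:
        return None  # Not an AudioPlayer request; handle it elsewhere.
    handler = handlers.get(request_type)
    return handler(request) if handler is not None else None
```

A skill would typically register one handler per request type and fall through to its regular intent handling when the function returns None.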
Play directive
Send Alexa a request to stream the audio file identified by the specified audioItem. Use the playBehavior parameter to indicate whether to play the stream immediately or to add the stream to the queue. Include the Play directive in the directives array in your response to Alexa.
When you send a Play directive, set the shouldEndSession flag in the response object to true to end the session. If you set this flag to false, Alexa sends the stream to the device for playback, and then pauses the stream to listen for the user's response.
When responding to a LaunchRequest or IntentRequest, your response can include both AudioPlayer directives and standard response properties, such as outputSpeech, card, and reprompt. For example, when you include outputSpeech in the same response as a Play directive, Alexa speaks the provided text, and then starts to stream the audio.
Example directive response
The following example shows a directive entry in your response. For the full response format, see Response Format.

    {
      "type": "AudioPlayer.Play",
      "playBehavior": "valid playBehavior value such as ENQUEUE",
      "audioItem": {
        "stream": {
          "url": "https://cdn.example.com/url-of-the-stream-to-play",
          "token": "opaque token representing this stream",
          "expectedPreviousToken": "opaque token representing the previous stream",
          "offsetInMilliseconds": 0,
          "captionData": {
            "content": "WEBVTT\n\n00:00.000 --> 00:02.107\n<00:00.006>My <00:00.0192>Audio <00:01.232>Captions.\n",
            "type": "WEBVTT"
          }
        },
        "metadata": {
          "title": "title of the track to display",
          "subtitle": "subtitle of the track to display",
          "art": {
            "sources": [
              { "url": "https://cdn.example.com/url-of-the-album-art-image.png" }
            ]
          },
          "backgroundImage": {
            "sources": [
              { "url": "https://cdn.example.com/url-of-the-background-image.png" }
            ]
          }
        }
      }
    }

Directive parameters
| Parameter | Description | Type | Required |
|---|---|---|---|
| type | Set to AudioPlayer.Play. | String | Yes |
| playBehavior | Describes playback behavior. Accepted values: REPLACE_ALL (immediately begin playback of the specified stream, replacing the current and enqueued streams), ENQUEUE (add the specified stream to the end of the current queue without affecting the currently playing stream), or REPLACE_ENQUEUED (replace all streams in the queue without affecting the currently playing stream). | String | Yes |
| audioItem | Contains an object providing information about the audio stream to play. | Object | Yes |
| audioItem.stream | Contains an object representing the audio stream to play. | Object | Yes |
| audioItem.stream.url | Identifies the location of audio content at a remote HTTPS location. | String | Yes |
| audioItem.stream.token | Opaque token that identifies the audio stream. | String | Yes |
| audioItem.stream.expectedPreviousToken | An opaque token that represents the expected previous stream. This should match the value of audioItem.stream.token for the previous stream. This property is required and allowed only when playBehavior is ENQUEUE. | String | Yes (when playBehavior is ENQUEUE) |
| audioItem.stream.offsetInMilliseconds | The timestamp in the stream from which Alexa should begin playback. Set to 0 to start playing the stream from the beginning. Set to any other value to start playback from that associated point in the stream. | Long | Yes |
| audioItem.stream.captionData | An object with two fields, content and type. | Object | No |
| audioItem.stream.captionData.type | The format of the string in the content field. | String | No |
| audioItem.stream.captionData.content | The time-encoded caption text. | String | No |
| audioItem.metadata | Information about the audio displayed on the Alexa-enabled device with a screen. The information isn't shown in the Alexa app. If you don't include this object, Alexa shows the skill name on a gray background by default. This entire object is optional. For more details, see Guidelines for images for Alexa-enabled devices with a screen. Tip: Include the audioItem.metadata with the directive regardless of whether the device supports the Alexa.Presentation.APL interface. The Echo Spot (2024 release) doesn't support APL for skills, but it does display the title, subtitle, and art metadata when streaming audio. | Object | No |
| audioItem.metadata.title | The title text to display. This is typically used for the audio track title. | String | No |
| audioItem.metadata.subtitle | Subtitle text to display, such as a category or an artist name. | String | No |
| audioItem.metadata.art | An object representing the album art to display. This object has a single property, sources, which is an array of objects that each contain a url. For best results, follow the image guidelines and specifications. | Object | No |
| audioItem.metadata.backgroundImage | An object representing the background image to display. This object has a single property, sources, which is an array of objects that each contain a url. For details about the image format, see Image requirements and recommendations. For best results, follow the image guidelines and specifications. Note: Background images for AudioPlayer skills get an automatic color overlay. The color comes from the skill's album cover color. You can't remove or change this color tint. | Object | No |
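The following Python sketch shows one way a skill might assemble a Play directive and wrap it in a response that ends the session. It's a minimal illustration, not an SDK API: the helper names `build_play_directive` and `build_play_response` are hypothetical, and the check on expectedPreviousToken reflects the rule that the property is required and allowed only with ENQUEUE.

```python
from typing import Any, Dict, Optional

PLAY_BEHAVIORS = {"REPLACE_ALL", "ENQUEUE", "REPLACE_ENQUEUED"}


def build_play_directive(
    url: str,
    token: str,
    play_behavior: str = "REPLACE_ALL",
    offset_ms: int = 0,
    expected_previous_token: Optional[str] = None,
) -> Dict[str, Any]:
    """Build an AudioPlayer.Play directive (hypothetical helper)."""
    if play_behavior not in PLAY_BEHAVIORS:
        raise ValueError(f"Unknown playBehavior: {play_behavior}")
    # expectedPreviousToken is required with ENQUEUE and not allowed otherwise.
    if (play_behavior == "ENQUEUE") != (expected_previous_token is not None):
        raise ValueError("expectedPreviousToken must be set only with ENQUEUE")
    stream: Dict[str, Any] = {
        "url": url,
        "token": token,
        "offsetInMilliseconds": offset_ms,
    }
    if expected_previous_token is not None:
        stream["expectedPreviousToken"] = expected_previous_token
    return {
        "type": "AudioPlayer.Play",
        "playBehavior": play_behavior,
        "audioItem": {"stream": stream},
    }


def build_play_response(
    directive: Dict[str, Any], speech_text: Optional[str] = None
) -> Dict[str, Any]:
    """Wrap a directive in a response that ends the session."""
    response: Dict[str, Any] = {
        "directives": [directive],
        "shouldEndSession": True,
    }
    if speech_text:
        # Only valid when responding to a LaunchRequest or IntentRequest.
        response["outputSpeech"] = {"type": "PlainText", "text": speech_text}
    return {"version": "1.0", "response": response}
```

For example, `build_play_response(build_play_directive("https://cdn.example.com/track-1.mp3", "track-1"), "Starting your playlist.")` produces a response in which Alexa speaks the text and then starts the stream.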
Playlist progression with ENQUEUE
The audioItem.stream.expectedPreviousToken property is required if playBehavior is ENQUEUE to handle situations in which requests to progress through a playlist and change tracks happen at the same time. The value of audioItem.stream.expectedPreviousToken should match the audioItem.stream.token property provided with the previous stream.
For example:
1. The skill is streaming track 2 in a playlist of several tracks.
2. The user says "Alexa, go back," which sends an AMAZON.PreviousIntent.
3. At about the same time, track 2 is nearly finished, so Alexa sends a PlaybackNearlyFinished request.
4. The skill handles the AMAZON.PreviousIntent first and sends a new Play directive with track 1. This track begins playing. The already-sent PlaybackNearlyFinished request is now outdated, since it assumed that track 2 was playing.
5. The skill handles the now-outdated PlaybackNearlyFinished request and sends a Play directive with track 3, since this is the next track after the originally playing track 2. This directive includes expectedPreviousToken set to the token for track 2.
6. The expectedPreviousToken provided in the directive doesn't match the token for the actively playing stream, so the device ignores this directive.
7. As track 1 finishes, Alexa sends a PlaybackNearlyFinished request. The skill responds with a Play directive for track 2. This track begins playing once track 1 finishes.
If this check wasn't in place, the directive sent in step 5 would put track 3 on the queue, which would cause the audio to skip from track 1 to track 3 when track 1 finishes.
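The scenario above can be sketched with a toy model of the device-side check. This isn't Alexa's actual implementation: the `DeviceQueueModel` class and the `play` helper are hypothetical, and the model compares expectedPreviousToken only against the actively playing stream, as in the example.

```python
from typing import Any, Dict, List, Optional


def play(token: str, behavior: str, expected: Optional[str] = None) -> Dict[str, Any]:
    """Build a minimal Play directive for the simulation (hypothetical helper)."""
    stream: Dict[str, Any] = {"token": token, "offsetInMilliseconds": 0}
    if expected is not None:
        stream["expectedPreviousToken"] = expected
    return {
        "type": "AudioPlayer.Play",
        "playBehavior": behavior,
        "audioItem": {"stream": stream},
    }


class DeviceQueueModel:
    """Toy model of the expectedPreviousToken check; an assumption, not Alexa's code."""

    def __init__(self, playing_token: str) -> None:
        self.playing_token = playing_token
        self.queue: List[str] = []

    def handle_play(self, directive: Dict[str, Any]) -> bool:
        """Apply a Play directive. Return False if the device ignores it."""
        stream = directive["audioItem"]["stream"]
        behavior = directive["playBehavior"]
        if behavior == "REPLACE_ALL":
            self.playing_token = stream["token"]
            self.queue.clear()
            return True
        if behavior == "ENQUEUE":
            if stream.get("expectedPreviousToken") != self.playing_token:
                return False  # Outdated directive: ignored.
            self.queue.append(stream["token"])
            return True
        raise ValueError(f"Behavior not modeled in this sketch: {behavior}")


device = DeviceQueueModel("track-2")                                      # Step 1
device.handle_play(play("track-1", "REPLACE_ALL"))                        # Step 4
stale_accepted = device.handle_play(play("track-3", "ENQUEUE", "track-2"))  # Steps 5-6
fresh_accepted = device.handle_play(play("track-2", "ENQUEUE", "track-1"))  # Step 7
```

In this model, the outdated directive for track 3 is rejected and only track 2 ends up on the queue behind track 1.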
Note: Including audioItem.stream.expectedPreviousToken when playBehavior is any other value (REPLACE_ALL or REPLACE_ENQUEUED) causes an error.
Guidelines for images for Alexa-enabled devices with a screen
If you provide images in the audioItem.metadata.art and audioItem.metadata.backgroundImage properties, note the following guidelines:
- When you send a track with new metadata, be sure to also change the audioItem.stream.token property for the track.
- Your image must meet the requirements for an audio image. For more details, see Image requirements and recommendations.
- For the audioItem.metadata.art, use a square image for the best results. If the image isn't square, it's displayed with extra black space on the device. Note that the image is cropped to a circle shape on the Echo Spot (original 2017 release).
- You can provide multiple image URLs in the sources array. The device selects the image with the highest resolution to display.
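One possible way to follow the first guideline is to derive the token from both the track identifier and its metadata, so that the token changes automatically whenever the metadata changes. The sketch below assumes this approach; the `versioned_token` helper and the `trackId:digest` token format are illustrative choices, since the token is opaque to Alexa.

```python
import hashlib
import json
from typing import Any, Dict


def versioned_token(track_id: str, metadata: Dict[str, Any]) -> str:
    """Return a token that changes whenever the metadata changes (hypothetical scheme)."""
    # Serialize with sorted keys so equal metadata always yields the same digest.
    serialized = json.dumps(metadata, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(serialized).hexdigest()[:12]
    return f"{track_id}:{digest}"


original = {"title": "Episode 1", "art": {"sources": [{"url": "https://cdn.example.com/a.png"}]}}
retitled = {"title": "Episode 1 (remastered)", "art": {"sources": [{"url": "https://cdn.example.com/a.png"}]}}

token_a = versioned_token("episode-1", original)
token_b = versioned_token("episode-1", retitled)
```

With this scheme, a skill that needs the plain track identifier back can split the token on the last colon.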
Note: The Alexa service might cache the metadata associated with the audioItem.stream.token included in the Play directive for up to five days. As a result, changes to the metadata, such as a different image or a change to the title text, might not appear on the device immediately. For example, during testing you might notice this behavior when you experiment with different images or title text for the same audio stream. To clear the cache, send a new Play directive with a different audioItem.stream.token.
Stop directive
Stops the current audio playback. Include the directive in the directives array in your response.
When responding to a LaunchRequest or IntentRequest, your response can include both AudioPlayer directives and standard response properties, such as outputSpeech, card, and reprompt.
Example directive response
The following example shows a directive entry in your response. For the full response format, see Response Format.

    {
      "type": "AudioPlayer.Stop"
    }

Directive parameters
| Parameter | Description | Type | Required |
|---|---|---|---|
| type | Set to AudioPlayer.Stop. | String | Yes |
ClearQueue directive
Clears the audio playback queue. You can set this directive to clear the queue without stopping the currently playing stream, or clear the queue and stop any currently playing stream. Include the directive in the directives array in your response.
When responding to a LaunchRequest or IntentRequest, your response can include both AudioPlayer directives and standard response properties, such as outputSpeech, card, and reprompt.
Example directive response
The following example shows a directive entry in your response. For the full response format, see Response Format.
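
    {
      "type": "AudioPlayer.ClearQueue",
      "clearBehavior": "valid clearBehavior value such as CLEAR_ALL"
    }

Directive parameters
| Parameter | Description | Type | Required |
|---|---|---|---|
| type | Set to AudioPlayer.ClearQueue. | String | Yes |
| clearBehavior | Describes the clear queue behavior. Accepted values: CLEAR_ENQUEUED (clears the queue and continues to play the currently playing stream) or CLEAR_ALL (clears the entire playback queue and stops the currently playing stream, if applicable). | String | Yes |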
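A skill might wrap the Stop and ClearQueue directives in small helpers such as the following. The function names are hypothetical, and the sketch assumes that CLEAR_ENQUEUED is the clearBehavior value that leaves the current stream playing while CLEAR_ALL also stops it.

```python
from typing import Any, Dict


def build_stop_directive() -> Dict[str, Any]:
    """Build an AudioPlayer.Stop directive."""
    return {"type": "AudioPlayer.Stop"}


def build_clear_queue_directive(stop_current: bool) -> Dict[str, Any]:
    """Build an AudioPlayer.ClearQueue directive.

    stop_current=True clears the queue and stops the current stream;
    stop_current=False clears the queue but lets the current stream finish.
    """
    behavior = "CLEAR_ALL" if stop_current else "CLEAR_ENQUEUED"
    return {"type": "AudioPlayer.ClearQueue", "clearBehavior": behavior}


# Example: a response to an intent such as "cancel" that halts everything.
cancel_response = {
    "version": "1.0",
    "response": {
        "directives": [build_clear_queue_directive(stop_current=True)],
        "shouldEndSession": True,
    },
}
```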
PlaybackStarted request
Sent when Alexa begins playing the audio stream previously sent in a Play directive. This request lets your skill verify that playback started successfully. Also, Alexa sends this request to notify your skill when Alexa resumes playback after pausing it for a voice request.
Note: This request doesn't include a session object because the request isn't sent in the context of a skill session. Use the context object to get details, such as the applicationId and userId.
Example request

    {
      "version": "1.0",
      "context": {
        "System": {
          "application": {},
          "user": {},
          "device": {}
        }
      },
      "request": {
        "type": "AudioPlayer.PlaybackStarted",
        "requestId": "unique.id.for.the.request",
        "timestamp": "timestamp of request in format: 2018-04-11T15:15:25Z",
        "token": "token representing the currently playing stream",
        "offsetInMilliseconds": 0,
        "locale": "a locale code such as en-US"
      }
    }

Request parameters
| Parameter | Description | Type |
|---|---|---|
| type | AudioPlayer.PlaybackStarted | String |
| requestId | Represents a unique identifier for the specific request. | String |
| timestamp | Provides the date and time when Alexa sent the request as an ISO 8601 formatted string. Used to verify the request when hosting your skill as a web service. | String |
| token | An opaque token that represents the audio stream. You provide this token when sending the Play directive. | String |
| offsetInMilliseconds | Identifies a track's offset in milliseconds when the PlaybackStarted request is sent. | Long |
| locale | A string indicating the user's locale. For example: en-US. See supported locale codes. | String |
For the full request format, see Request Format.
Response
Your skill can respond to PlaybackStarted with a Stop or ClearQueue directive.
The response cannot include:
- Any standard properties such as outputSpeech, card, or reprompt.
- Any other AudioPlayer directives.
- Any other directives from other interfaces, such as a Dialog directive.
PlaybackFinished request
Sent when the stream Alexa is playing comes to an end on its own. If your skill explicitly stops the playback with the Stop directive, Alexa sends PlaybackStopped instead of PlaybackFinished.
Note: This request doesn't include a session object because the request isn't sent in the context of a skill session. Use the context object to get details, such as the applicationId and userId.
Example request

    {
      "version": "1.0",
      "context": {
        "System": {
          "application": {},
          "user": {},
          "device": {}
        }
      },
      "request": {
        "type": "AudioPlayer.PlaybackFinished",
        "requestId": "unique.id.for.the.request",
        "timestamp": "timestamp of request in format: 2018-04-11T15:15:25Z",
        "token": "token representing the currently playing stream",
        "offsetInMilliseconds": 0,
        "locale": "a locale code such as en-US"
      }
    }

Request parameters
| Parameter | Description | Type |
|---|---|---|
| type | AudioPlayer.PlaybackFinished | String |
| requestId | Represents a unique identifier for the specific request. | String |
| timestamp | Provides the date and time when Alexa sent the request as an ISO 8601 formatted string. Used to verify the request when hosting your skill as a web service. | String |
| token | An opaque token that represents the audio stream. You provide this token when sending the Play directive. | String |
| offsetInMilliseconds | Identifies a track's offset in milliseconds when the PlaybackFinished request is sent. | Long |
| locale | A string indicating the user's locale. For example: en-US. See supported locale codes. | String |
Response
Your skill can respond to PlaybackFinished with a Stop or ClearQueue directive.
The response cannot include:
- Any standard properties such as outputSpeech, card, or reprompt.
- Any other AudioPlayer directives.
- Any other directives from other interfaces, such as a Dialog directive.
PlaybackStopped request
Sent when Alexa stops playing an audio stream in response to one of the following AudioPlayer directives:
- Stop
- Play with a playBehavior of REPLACE_ALL.
- ClearQueue with a clearBehavior of CLEAR_ALL.
This request is also sent if the user makes a voice request to Alexa, since this temporarily pauses the playback. In this case, the playback resumes automatically once the voice interaction is complete. If playback stops because the audio stream comes to an end on its own, Alexa sends PlaybackFinished instead of PlaybackStopped.
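A common use of this request is to remember where playback stopped so that a later Play directive can resume from the same point. The sketch below assumes a simple in-memory store; the names `saved_positions`, `on_playback_stopped`, and `build_resume_directive` are hypothetical, and a real skill would likely persist the offset per user in a database.

```python
from typing import Any, Dict

# Hypothetical in-memory store mapping stream token -> last known offset.
saved_positions: Dict[str, int] = {}


def on_playback_stopped(request: Dict[str, Any]) -> None:
    """Record the offset reported by PlaybackStopped. No response is returned."""
    saved_positions[request["token"]] = request["offsetInMilliseconds"]


def build_resume_directive(token: str, url: str) -> Dict[str, Any]:
    """Build a Play directive that resumes from the saved offset, or from 0."""
    return {
        "type": "AudioPlayer.Play",
        "playBehavior": "REPLACE_ALL",
        "audioItem": {
            "stream": {
                "url": url,
                "token": token,
                "offsetInMilliseconds": saved_positions.get(token, 0),
            }
        },
    }


on_playback_stopped({
    "type": "AudioPlayer.PlaybackStopped",
    "token": "track-1",
    "offsetInMilliseconds": 42000,
})
resume = build_resume_directive("track-1", "https://cdn.example.com/track-1.mp3")
```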
Note: This request doesn't include a session object because the request isn't sent in the context of a skill session. Use the context object to get details, such as the applicationId and userId.
Example request

    {
      "version": "1.0",
      "context": {
        "System": {
          "application": {},
          "user": {},
          "device": {}
        }
      },
      "request": {
        "type": "AudioPlayer.PlaybackStopped",
        "requestId": "unique.id.for.the.request",
        "timestamp": "timestamp of request in format: 2018-04-11T15:15:25Z",
        "token": "token representing the currently playing stream",
        "offsetInMilliseconds": 0,
        "locale": "a locale code such as en-US"
      }
    }

Request parameters
| Parameter | Description | Type |
|---|---|---|
| type | AudioPlayer.PlaybackStopped | String |
| requestId | Represents a unique identifier for the specific request. | String |
| timestamp | Provides the date and time when Alexa sent the request as an ISO 8601 formatted string. Used to verify the request when hosting your skill as a web service. | String |
| token | An opaque token that represents the audio stream. You provide this token when sending the Play directive. | String |
| offsetInMilliseconds | Identifies a track's offset in milliseconds when the PlaybackStopped request is sent. | Long |
| locale | A string indicating the user's locale. For example: en-US. See supported locale codes. | String |
Response
Your skill can't return a response to PlaybackStopped.
PlaybackNearlyFinished request
Sent when the device is ready to add the next stream to the queue.
To progress through a playlist of audio streams, respond to this request with a Play directive for the next stream and set playBehavior to ENQUEUE or REPLACE_ENQUEUED. This adds the new stream to the queue without stopping the current playback. Alexa begins streaming the new audio item once the currently playing track finishes.
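As an illustration, a handler for this request might look up the track that follows the current one and enqueue it. This is a sketch under stated assumptions: the playlist structure and the function name `on_playback_nearly_finished` are hypothetical, and the handler returns an empty response when there is nothing left to enqueue.

```python
from typing import Any, Dict, List

EMPTY_RESPONSE: Dict[str, Any] = {"version": "1.0", "response": {}}


def on_playback_nearly_finished(
    request: Dict[str, Any], playlist: List[Dict[str, str]]
) -> Dict[str, Any]:
    """Enqueue the track after the currently playing one, if there is one."""
    tokens = [track["token"] for track in playlist]
    current_token = request["token"]
    if current_token not in tokens:
        return EMPTY_RESPONSE
    next_index = tokens.index(current_token) + 1
    if next_index >= len(playlist):
        return EMPTY_RESPONSE  # End of playlist: nothing to enqueue.
    next_track = playlist[next_index]
    directive = {
        "type": "AudioPlayer.Play",
        "playBehavior": "ENQUEUE",
        "audioItem": {
            "stream": {
                "url": next_track["url"],
                "token": next_track["token"],
                # Must match the token of the stream that is playing now.
                "expectedPreviousToken": current_token,
                "offsetInMilliseconds": 0,
            }
        },
    }
    # No outputSpeech, card, or reprompt: they aren't allowed in this response.
    return {"version": "1.0", "response": {"directives": [directive]}}


playlist = [
    {"token": "track-1", "url": "https://cdn.example.com/track-1.mp3"},
    {"token": "track-2", "url": "https://cdn.example.com/track-2.mp3"},
]
response = on_playback_nearly_finished(
    {"type": "AudioPlayer.PlaybackNearlyFinished", "token": "track-1"}, playlist
)
```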
Note: This request doesn't include a session object because the request isn't sent in the context of a skill session. Use the context object to get details, such as the applicationId and userId.
Example request

    {
      "version": "1.0",
      "context": {
        "System": {
          "application": {},
          "user": {},
          "device": {}
        }
      },
      "request": {
        "type": "AudioPlayer.PlaybackNearlyFinished",
        "requestId": "unique.id.for.the.request",
        "timestamp": "timestamp of request in format: 2018-04-11T15:15:25Z",
        "token": "token representing the currently playing stream",
        "offsetInMilliseconds": 0,
        "locale": "a locale code such as en-US"
      }
    }

Request parameters
| Parameter | Description | Type |
|---|---|---|
| type | AudioPlayer.PlaybackNearlyFinished | String |
| requestId | Represents a unique identifier for the specific request. | String |
| timestamp | Provides the date and time when Alexa sent the request as an ISO 8601 formatted string. Used to verify the request when hosting your skill as a web service. | String |
| token | An opaque token that represents the audio stream that is currently playing. You provide this token when sending the Play directive. | String |
| offsetInMilliseconds | Identifies a track's offset in milliseconds when the PlaybackNearlyFinished request is sent. | Long |
| locale | A string indicating the user's locale. For example: en-US. See supported locale codes. | String |
Response
Your skill can respond to PlaybackNearlyFinished with any AudioPlayer directive.
The response cannot include:
- Any standard properties such as outputSpeech, card, or reprompt.
- Any other directives from other interfaces, such as a Dialog directive.
PlaybackFailed request
Sent when Alexa encounters an error when attempting to play a stream.
This request type includes two token properties – one as a property of the request object, and one as a property of the currentPlaybackState object. The request.token property represents the stream that failed to play. The currentPlaybackState.token property can be different if Alexa is playing a stream and the error occurs when attempting to buffer the next stream on the queue. In this case, currentPlaybackState.token represents the stream that was successfully playing.
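The difference between the two tokens can be made concrete with a small helper that inspects a PlaybackFailed request. The function name `describe_failure` and the keys of the returned dictionary are hypothetical; the sketch only compares the two token values as described above.

```python
from typing import Any, Dict


def describe_failure(request: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize which stream failed and what was playing at the time."""
    failed_token = request["token"]
    state = request.get("currentPlaybackState", {})
    playing_token = state.get("token")
    return {
        "failed_token": failed_token,
        "playing_token": playing_token,
        # True when a different stream was playing, which suggests the error
        # occurred while buffering the next stream on the queue.
        "failed_while_buffering_next": (
            playing_token is not None and playing_token != failed_token
        ),
        "error_type": request.get("error", {}).get("type"),
    }


summary = describe_failure({
    "type": "AudioPlayer.PlaybackFailed",
    "token": "track-2",
    "error": {"type": "MEDIA_ERROR_SERVICE_UNAVAILABLE", "message": "example"},
    "currentPlaybackState": {
        "token": "track-1",
        "offsetInMilliseconds": 175000,
        "playerActivity": "PLAYING",
    },
})
```

Here track 1 was still playing when track 2 failed to buffer, so a skill might respond by enqueueing a different stream instead of restarting playback.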
Note: This request doesn't include a session object because the request isn't sent in the context of a skill session. Use the context object to get details, such as the applicationId and userId.
Example request

    {
      "version": "1.0",
      "context": {
        "System": {
          "application": {},
          "user": {},
          "device": {}
        }
      },
      "request": {
        "type": "AudioPlayer.PlaybackFailed",
        "requestId": "unique.id.for.the.request",
        "timestamp": "timestamp of request in format: 2018-04-11T15:15:25Z",
        "token": "token representing the stream that failed to play",
        "offsetInMilliseconds": 0,
        "locale": "a locale code such as en-US",
        "error": {
          "type": "error code",
          "message": "description of the error that occurred"
        },
        "currentPlaybackState": {
          "token": "token representing stream playing when error occurred",
          "offsetInMilliseconds": 0,
          "playerActivity": "player state when error occurred, such as PLAYING"
        }
      }
    }

Request parameters
| Parameter | Description | Type |
|---|---|---|
| type | AudioPlayer.PlaybackFailed | String |
| requestId | Represents a unique identifier for the specific request. | String |
| timestamp | Provides the date and time when Alexa sent the request as an ISO 8601 formatted string. Used to verify the request when hosting your skill as a web service. | String |
| token | An opaque token provided by the Play directive that represents the stream that failed to play. | String |
| locale | A string indicating the user's locale. For example: en-US. See supported locale codes. | String |
| error | Contains an object with error information. | Object |
| error.type | Identifies the specific type of error. For details about each error type, see Playback errors. | String |
| error.message | A description of the error the device has encountered. | String |
| currentPlaybackState | Contains an object providing details about the playback activity occurring at the time of the error. | Object |
| currentPlaybackState.token | An opaque token that represents the audio stream that was playing when the error occurred. Note that this may be different from the value of the request.token property. | String |
| currentPlaybackState.offsetInMilliseconds | Identifies a track's offset in milliseconds when the error occurred. | Long |
| currentPlaybackState.playerActivity | Identifies the player state when the error occurred: PLAYING, PAUSED, FINISHED, BUFFER_UNDERRUN, or IDLE. | String |
Playback errors
| Error Type | Description |
|---|---|
| MEDIA_ERROR_UNKNOWN | An unknown error occurred. |
| MEDIA_ERROR_INVALID_REQUEST | The request is malformed, unauthorized, forbidden, or not found. |
| MEDIA_ERROR_SERVICE_UNAVAILABLE | Alexa was unable to reach the URL for the stream. |
| MEDIA_ERROR_INTERNAL_SERVER_ERROR | Alexa accepted the request, but was unable to process the request as expected. |
| MEDIA_ERROR_INTERNAL_DEVICE_ERROR | There was an internal error on the device. |
Response
Your skill can respond to PlaybackFailed with any AudioPlayer directive.
The response cannot include:
- Any standard properties such as outputSpeech, card, or reprompt.
- Any other directives from other interfaces, such as a Dialog directive.
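The response rules for the five AudioPlayer requests can be collected into a single check that a skill runs before returning a response. This is a sketch of one possible validation helper, not an Alexa API; the names `ALLOWED_DIRECTIVES` and `find_response_problems` are hypothetical, and the rule for PlaybackStopped is modeled as "no directives allowed" because the skill can't return a response to it.

```python
from typing import Any, Dict, List, Set

ALL_AUDIO_PLAYER = {"AudioPlayer.Play", "AudioPlayer.Stop", "AudioPlayer.ClearQueue"}
STOP_OR_CLEAR = {"AudioPlayer.Stop", "AudioPlayer.ClearQueue"}

# Directives each AudioPlayer request accepts in its response.
ALLOWED_DIRECTIVES: Dict[str, Set[str]] = {
    "AudioPlayer.PlaybackStarted": STOP_OR_CLEAR,
    "AudioPlayer.PlaybackFinished": STOP_OR_CLEAR,
    "AudioPlayer.PlaybackStopped": set(),
    "AudioPlayer.PlaybackNearlyFinished": ALL_AUDIO_PLAYER,
    "AudioPlayer.PlaybackFailed": ALL_AUDIO_PLAYER,
}

STANDARD_PROPERTIES = ("outputSpeech", "card", "reprompt")


def find_response_problems(request_type: str, response: Dict[str, Any]) -> List[str]:
    """Return a list of rule violations for a response to an AudioPlayer request."""
    problems: List[str] = []
    allowed = ALLOWED_DIRECTIVES.get(request_type)
    if allowed is None:
        return problems  # Not an AudioPlayer request; these rules don't apply.
    for name in STANDARD_PROPERTIES:
        if name in response:
            problems.append(f"{name} is not allowed in a response to {request_type}")
    for directive in response.get("directives", []):
        directive_type = directive.get("type")
        if directive_type not in allowed:
            problems.append(f"{directive_type} is not allowed in a response to {request_type}")
    return problems
```

A skill could log the returned problems during development to catch responses that would otherwise trigger an error from Alexa.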
System.ExceptionEncountered request
If a response to an AudioPlayer request causes an error, Alexa sends your skill a System.ExceptionEncountered request. Alexa ignores any directives included in the response to this request.
Example request

    {
      "type": "System.ExceptionEncountered",
      "requestId": "unique.id.for.the.request",
      "timestamp": "timestamp of request in format: 2018-04-11T15:15:25Z",
      "locale": "a locale code such as en-US",
      "error": {
        "type": "error code such as INVALID_RESPONSE",
        "message": "description of the error that occurred"
      },
      "cause": {
        "requestId": "unique identifier for the request that caused the error"
      }
    }

Request parameters
| Parameter | Description | Type |
|---|---|---|
| type | System.ExceptionEncountered | String |
| requestId | Represents a unique identifier for the specific request. | String |
| timestamp | Provides the date and time when Alexa sent the request as an ISO 8601 formatted string. Used to verify the request when hosting your skill as a web service. | String |
| locale | A string indicating the user's locale. For example: en-US. See supported locale codes. | String |
| error | Contains an object with error information. | Object |
| error.type | Identifies the specific type of error (INVALID_RESPONSE, DEVICE_COMMUNICATION_ERROR, INTERNAL_ERROR). | String |
| error.message | A description of the error the device has encountered. | String |
| cause.requestId | The requestId for the request that caused the error. | String |
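Because Alexa ignores directives returned for this request, a handler typically only records the error for later debugging. The following sketch assumes Python's standard logging module and a hypothetical logger name; `summarize_exception` and `on_exception_encountered` are illustrative names, not SDK functions.

```python
import logging
from typing import Any, Dict

logger = logging.getLogger("audioplayer.skill")  # Hypothetical logger name.


def summarize_exception(request: Dict[str, Any]) -> str:
    """Format the error details of a System.ExceptionEncountered request."""
    error = request.get("error", {})
    cause = request.get("cause", {})
    return "{}: {} (caused by request {})".format(
        error.get("type", "UNKNOWN"),
        error.get("message", ""),
        cause.get("requestId", "unknown"),
    )


def on_exception_encountered(request: Dict[str, Any]) -> None:
    """Log the error. No response is returned for this request."""
    logger.error(summarize_exception(request))


message = summarize_exception({
    "type": "System.ExceptionEncountered",
    "error": {"type": "INVALID_RESPONSE", "message": "bad directive"},
    "cause": {"requestId": "req-1"},
})
```

Matching `cause.requestId` against the requestId values your skill logged for earlier AudioPlayer requests can help identify which response triggered the error.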
Response
Your skill can't return a response to a System.ExceptionEncountered request.
Last updated: Sep 06, 2024