Media Player Queue Support API Reference
Complete API reference and implementation guide for the Media Player queue support feature. Includes data models, listener interface, usage examples, and speech marks integration with AWS Polly for synchronized audio playback.
Overview
This enhancement adds support for queuing multiple audio tracks and receiving callbacks for track playback events. It is designed for playing AWS Polly-generated MP3s alongside their corresponding speech marks JSON files.
Features
- Queue Management: Play multiple tracks in sequence
- Track Callbacks: Get notified when tracks start and complete
- Duration Info: Access track duration from speech marks
- Speech Marks: Full access to word and sentence timing data from Polly
- Error Handling: Graceful error handling with callbacks
Data Models
AudioTrack
Represents a single audio track with its associated speech marks.
data class AudioTrack(
    val mp3Url: String,       // URL to the MP3 file
    val marksJsonUrl: String  // URL to the marks.json file from Polly
)
SpeechMark
Represents a single speech mark from Polly (word or sentence timing).
data class SpeechMark(
    val time: Int,     // Time in milliseconds from start
    val type: String,  // "word" or "sentence"
    val start: Int,    // Character position start
    val end: Int,      // Character position end
    val value: String  // The word or sentence text
)
TrackInfo
Contains the track, its speech marks, and calculated duration.
data class TrackInfo(
    val track: AudioTrack,
    val marks: List<SpeechMark>,
    val durationMs: Int  // Duration calculated from marks
)
TrackPlaybackListener Interface
Implement this interface to receive playback events:
interface TrackPlaybackListener {
    fun onTrackStarted(trackIndex: Int, trackInfo: TrackInfo)
    fun onTrackCompleted(trackIndex: Int)
    fun onQueueCompleted()
    fun onError(trackIndex: Int, error: String)
}
Usage Examples
Basic Queue Playback with Listener
suspend fun playDemo() {
    val tracks = listOf(
        AudioTrack(
            mp3Url = "https://example.com/001.mp3",
            marksJsonUrl = "https://example.com/001.marks.json"
        ),
        AudioTrack(
            mp3Url = "https://example.com/002.mp3",
            marksJsonUrl = "https://example.com/002.marks.json"
        )
    )

    val listener = object : TrackPlaybackListener {
        override fun onTrackStarted(trackIndex: Int, trackInfo: TrackInfo) {
            Logger.d("Track $trackIndex started - Duration: ${trackInfo.durationMs}ms")
            // Access speech marks for word-by-word timing
            trackInfo.marks.forEach { mark ->
                Logger.d("${mark.type} at ${mark.time}ms: ${mark.value}")
            }
        }

        override fun onTrackCompleted(trackIndex: Int) {
            Logger.d("Track $trackIndex completed")
        }

        override fun onQueueCompleted() {
            Logger.d("All tracks completed!")
        }

        override fun onError(trackIndex: Int, error: String) {
            Logger.d("Error on track $trackIndex: $error")
        }
    }

    mediaPlayer.playQueue(tracks, listener)
}
Simple Playback Without Listener
suspend fun playSimple() {
    val tracks = listOf(
        AudioTrack(
            mp3Url = "https://example.com/audio.mp3",
            marksJsonUrl = "https://example.com/audio.marks.json"
        )
    )
    mediaPlayer.playQueue(tracks)
}
Control Playback
// Stop current playback
mediaPlayer.stop()
// Clear the queue
mediaPlayer.clearQueue()
Speech Marks JSON Format
The marks.json files generated by AWS Polly have the following structure:
[
  {
    "time": 0,
    "type": "sentence",
    "start": 7,
    "end": 24,
    "value": "Welcome to Krill!"
  },
  {
    "time": 25,
    "type": "word",
    "start": 7,
    "end": 14,
    "value": "Welcome"
  },
  {
    "time": 337,
    "type": "word",
    "start": 15,
    "end": 17,
    "value": "to"
  }
]
Field Descriptions
- time: Milliseconds from the start of audio
- type: “word” or “sentence”
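- start/end: Start and end offsets of the word or sentence in the original input text (Polly reports these as byte offsets, which equal character positions only for single-byte text)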
- value: The actual word or sentence
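To drive word highlighting from these fields, a playback position can be mapped to the most recent word mark whose time is at or before it. The sketch below is illustrative only: `activeWordAt` is a hypothetical helper, not part of the documented API, and it assumes the marks are sorted by ascending `time`, as in the sample above.

```kotlin
// Mirrors the SpeechMark model from the Data Models section.
data class SpeechMark(
    val time: Int,
    val type: String,
    val start: Int,
    val end: Int,
    val value: String
)

// Hypothetical helper (not part of the MediaPlayer API): returns the last
// "word" mark whose time is <= positionMs, or null if no word has started yet.
// Assumes `marks` is sorted by ascending time.
fun activeWordAt(marks: List<SpeechMark>, positionMs: Int): SpeechMark? {
    val words = marks.filter { it.type == "word" }
    // binarySearch returns the matching index, or (-insertionPoint - 1) if absent.
    val idx = words.binarySearch { it.time.compareTo(positionMs) }
    val lastStarted = if (idx >= 0) idx else -idx - 2
    return words.getOrNull(lastStarted)
}
```

With the sample marks, a position of 100 ms would resolve to the "Welcome" mark, and any position before 25 ms would resolve to no word at all.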
Implementation Details
Android Implementation
The Android implementation uses ExoPlayer3:
- Fetches speech marks from the provided URLs
- Calculates duration from the last mark’s timestamp
- Uses ExoPlayer’s listener system to detect track transitions
- Provides callbacks on the Main dispatcher
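The duration step in the list above might look roughly like the following. This is a sketch under assumptions, not the actual implementation: `durationFromMarks` is a hypothetical name. Because a mark's `time` is the offset at which that word or sentence begins, the result slightly underestimates the real audio length.

```kotlin
// Mirrors the SpeechMark model from the Data Models section.
data class SpeechMark(
    val time: Int,
    val type: String,
    val start: Int,
    val end: Int,
    val value: String
)

// Hypothetical helper: takes the largest mark timestamp as the duration,
// or 0 when no marks are available (for example after a failed fetch).
fun durationFromMarks(marks: List<SpeechMark>): Int =
    marks.maxOfOrNull { it.time } ?: 0
```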
Track Transition Detection
stateDiagram-v2
    [*] --> Idle
    Idle --> Loading: playQueue()
    Loading --> Ready: STATE_READY
    Ready --> Playing: Auto-play
    Playing --> TrackStarted: onTrackStarted()
    TrackStarted --> Transitioning: End of track
    Transitioning --> TrackCompleted: onTrackCompleted()
    TrackCompleted --> Playing: Next track
    TrackCompleted --> QueueEnded: No more tracks
    QueueEnded --> [*]: onQueueCompleted()
Callback Sequence
- onTrackStarted: Called when ExoPlayer enters the READY state and starts playing
- onTrackCompleted: Called when ExoPlayer transitions to the next media item
- onQueueCompleted: Called when ExoPlayer reaches STATE_ENDED
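The expected callback order can be checked without audio by using a recording listener. The sketch below is hypothetical throughout: `RecordingListener` and `simulateQueue` are illustrative names, and the simulation assumes, following the state diagram, that `onTrackCompleted` also fires for the final track before `onQueueCompleted`.

```kotlin
data class AudioTrack(val mp3Url: String, val marksJsonUrl: String)
data class SpeechMark(val time: Int, val type: String, val start: Int, val end: Int, val value: String)
data class TrackInfo(val track: AudioTrack, val marks: List<SpeechMark>, val durationMs: Int)

interface TrackPlaybackListener {
    fun onTrackStarted(trackIndex: Int, trackInfo: TrackInfo)
    fun onTrackCompleted(trackIndex: Int)
    fun onQueueCompleted()
    fun onError(trackIndex: Int, error: String)
}

// Test double that records the order in which callbacks arrive.
class RecordingListener : TrackPlaybackListener {
    val events = mutableListOf<String>()

    override fun onTrackStarted(trackIndex: Int, trackInfo: TrackInfo) {
        events.add("started:$trackIndex")
    }

    override fun onTrackCompleted(trackIndex: Int) {
        events.add("completed:$trackIndex")
    }

    override fun onQueueCompleted() {
        events.add("queueCompleted")
    }

    override fun onError(trackIndex: Int, error: String) {
        events.add("error:$trackIndex")
    }
}

// Hypothetical stand-in for the player: plays nothing, only emits the
// callbacks in the order described above.
fun simulateQueue(tracks: List<AudioTrack>, listener: TrackPlaybackListener) {
    tracks.forEachIndexed { index, track ->
        listener.onTrackStarted(index, TrackInfo(track, emptyList(), 0))
        listener.onTrackCompleted(index)
    }
    listener.onQueueCompleted()
}
```

A listener like this can also be passed to the real `playQueue()` in an instrumented test to compare the observed order against the documented one.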
Error Handling
- Network errors when fetching marks.json are logged but don’t stop playback
- Playback errors are reported via the onError callback
- Empty marks list is used as fallback if marks.json fetch fails
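The empty-list fallback described above could be expressed as a small wrapper around the fetch. This is a sketch, not the actual implementation: `loadMarksOrEmpty` is a hypothetical helper, and `fetch` stands in for whatever downloads and parses marks.json.

```kotlin
// Mirrors the SpeechMark model from the Data Models section.
data class SpeechMark(
    val time: Int,
    val type: String,
    val start: Int,
    val end: Int,
    val value: String
)

// Hypothetical helper: any failure in `fetch` is passed to `log` and replaced
// by an empty list, so playback can continue without timing data.
fun loadMarksOrEmpty(
    fetch: () -> List<SpeechMark>,
    log: (String) -> Unit = { println(it) }
): List<SpeechMark> =
    try {
        fetch()
    } catch (e: Exception) {
        log("Failed to load speech marks: ${e.message}")
        emptyList()
    }
```

In coroutine code, a real version would likely be a `suspend` function and should rethrow `CancellationException` rather than swallow it.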
Generating Content with AWS Polly
The Python script python/synth_narration.py generates the required files:
./python/synth_narration.py \
  --input content/narration/krillapp \
  --out-dir build/narration \
  --voice Matthew \
  --engine neural
This creates for each .ssml file:
- *.mp3 - The audio file
- *.marks.json - Speech marks data
- *.vtt - WebVTT subtitles
Example SSML Input
<speak>
  <amazon:domain name="conversational">
    Welcome to Krill! This is a demonstration of the audio queue system.
  </amazon:domain>
</speak>
Platform Support
Currently implemented for:
- ✅ Android (full support with ExoPlayer3)
- ❌ iOS (TODO)
- ❌ Desktop/JVM (TODO)
- ❌ WebAssembly (TODO)
On other platforms, calling mediaPlayer.playQueue() currently reaches TODO("Not yet implemented"), which throws a NotImplementedError at runtime rather than returning a value.
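Kotlin's `TODO()` throws `NotImplementedError`, so shared code that may run on an unsupported platform might want to guard the call. The sketch below is hypothetical: `tryPlay` and `playQueueOnUnsupportedPlatform` are illustrative names, and the real `playQueue()` is a `suspend` function, which is simplified away here.

```kotlin
// Stand-in for a platform where playQueue() is not implemented yet.
fun playQueueOnUnsupportedPlatform(): Nothing = TODO("Not yet implemented")

// Hypothetical guard: runs `play` and reports whether the platform supported it.
fun tryPlay(play: () -> Unit): Boolean =
    try {
        play()
        true
    } catch (e: NotImplementedError) {
        // Thrown by TODO(); treat as "queue playback unavailable here".
        false
    }
```

A caller could use the returned flag to hide queue controls or fall back to another playback path.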
MediaPlayer Interface
The complete MediaPlayer interface includes:
interface MediaPlayer {
    suspend fun playQueue(tracks: List<AudioTrack>, listener: TrackPlaybackListener? = null)
    fun stop()
    fun clearQueue()
}
Method Descriptions
playQueue()
Plays a queue of audio tracks sequentially.
Parameters:
- tracks: List of AudioTrack objects to play
- listener: Optional callback listener for playback events
Behavior:
- Fetches all marks.json files in parallel before starting playback
- Queues all MP3s in ExoPlayer
- Calls listener callbacks as tracks play
stop()
Stops current playback immediately.
Behavior:
- Stops ExoPlayer
- Does not call onQueueCompleted()
- Queue remains intact (use clearQueue() to clear)
clearQueue()
Clears the playback queue.
Behavior:
- Removes all queued tracks
- Stops playback if currently playing
- Resets internal state
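The difference between `stop()` and `clearQueue()` can be illustrated with a small in-memory model. This is only a sketch of the documented semantics: `QueueModel` is a hypothetical class that plays no audio, and it assumes that a new `playQueue()` call replaces any previously queued tracks.

```kotlin
// Hypothetical in-memory model of the documented queue semantics.
class QueueModel {
    private val queue = mutableListOf<String>()

    var isPlaying = false
        private set

    val size: Int
        get() = queue.size

    // Assumption: a new call replaces whatever was queued before.
    fun playQueue(urls: List<String>) {
        queue.clear()
        queue.addAll(urls)
        isPlaying = queue.isNotEmpty()
    }

    // stop(): halts playback but keeps the queued tracks.
    fun stop() {
        isPlaying = false
    }

    // clearQueue(): stops playback and removes every queued track.
    fun clearQueue() {
        isPlaying = false
        queue.clear()
    }
}
```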