This content describes chime playback in the high availability renderer (HAR).
An Audio crate exposes AudioManager to the HAR app, which controls chime
playback.
To keep latency low, playback threads run throughout the lifetime of the app, idling and yielding when no audio plays.
Terminology
- asset
AudioAsset pertains to playable audio. Assets are commonly known and exist in the app runtime.
- device
AudioDevice refers to a separate bus for playback of audio. The device is the most granular unit relating to hardware accessed by the system. In the standard SDVM implementation, AudioDevice refers to a single Advanced Linux Sound Architecture (ALSA) PCM.
- stream
An instance of playback of an asset on a device. Streams persist from the moment of being scheduled until completed, canceled, or ending in error.
Components
Figure 1 displays the component diagram for chime:
Figure 1. Component diagram.
Audio device and PCM
Audio hardware configuration follows the standard HAR platform abstraction layer
design, and har-platform-api contains it.
The HAR Audio crate defines a new structure for AudioDevice, which defines
fields for all the data structures that affect the internal HAR Audio crate
and playback. AudioDevice also uses generics to wrap potential
platform-specific additional parameters. In the case of tinyalsa,
PlatformAudioDevice contains the descriptors and properties of an ALSA PCM.
/// NOTE: The following code is a sample definition to help understanding, it is not a
/// representation of the final code/implementation.
AudioDevice<PlatformAudioDevice> {
/// Internal HAR Identifier for the device.
AudioDeviceID,
/// The size (in bytes) for chunks of audio data to stream to the device.
ChunkSize,
/// Properties necessary to control volume (details in "Mixer control" section).
VolumeControl,
/// Properties necessary to control spatialization (details in "Mixer control"
/// section).
SpatialControl,
/// Platform specific data for the AudioDevice.
/// E.g. ALSA properties and reference to opened PCM.
PlatformAudioDevice
}
/// Elaboration of the previously mentioned VolumeControl
VolumeControl {
/// Identifier for the control used to change volume.
ControlID,
/// Mapping between Decibel and control values. (see Mixer control section)
VolumeOutputIndex
}
Audio assets
This section describes how audio assets are configured and implemented.
Configuration
The initial HAR audio implementation supports statically configured audio assets. A JSON config defines which assets are available and which assets are defined as WAV files.
The implementation also supports synthesized and streamed audio assets through a more generic asset implementation, which accepts a function to generate audio data.
Implementation
Implement assets using two separate constructs, AudioAsset and AudioStream.
AudioAsset defines the static properties of an asset and acts as a container
for any internal data related to the asset. An AudioStream, which is a single
streamable instance of the asset, can be derived from AudioAsset. AudioStream
contains the internal state of that single stream playback.
/// NOTE: The following code is a sample definition to help understanding, it is not a
/// representation of the final code/implementation.
/// Static properties and definition of an Asset.
AudioAsset {
/// Perform optional initialization steps, e.g. load bytes from file into memory.
/// Can also define lazy loading, to load data at first playback instead.
fn initialize(LazyLoad);
/// Create a new AudioStream from the asset.
fn create_stream() -> AudioStream;
/// More functions for metadata etc. of the asset.
...
}
/// Single streamable instance of an AudioAsset
AudioStream {
/// Gets the next bytes to play from the Asset, together with any control signals
/// (e.g. fade-out) that apply to the current chunk of bytes.
fn get_playback(num_bytes: usize) -> ([u8], ControlSignals);
/// Gets playback Mode details used to handle special states of playback
/// e.g. when a chime is interrupted and put in "fade-out" mode.
fn playback_mode() -> PlaybackMode;
/// [0.0, 1.0] indication of how much of the stream was played.
fn progress() -> f32;
/// Reset the stream, e.g. if it should play again.
fn reset();
/// Time at which the stream was created.
fn created_at() -> Instant;
/// Additional metadata etc. for the stream.
...
}
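As a rough illustration of the split between AudioAsset and AudioStream, the following sketch keeps the audio bytes in a shared buffer and gives each stream its own cursor. The names MemoryAsset, MemoryStream, and next_chunk are hypothetical, and control signals and playback modes are left out; this is not the crate's implementation.

```rust
use std::sync::Arc;

/// Hypothetical asset: static, shareable audio bytes.
pub struct MemoryAsset {
    data: Arc<[u8]>,
}

/// Hypothetical stream: one playback instance with its own cursor.
pub struct MemoryStream {
    data: Arc<[u8]>,
    position: usize,
}

impl MemoryAsset {
    pub fn new(bytes: Vec<u8>) -> Self {
        Self { data: bytes.into() }
    }

    /// Each call yields an independent stream over the same bytes.
    pub fn create_stream(&self) -> MemoryStream {
        MemoryStream { data: Arc::clone(&self.data), position: 0 }
    }
}

impl MemoryStream {
    /// Returns up to `num_bytes` bytes and advances the cursor.
    pub fn next_chunk(&mut self, num_bytes: usize) -> &[u8] {
        let start = self.position;
        let end = start.saturating_add(num_bytes).min(self.data.len());
        self.position = end;
        &self.data[start..end]
    }

    /// Fraction of the stream already handed out, in [0.0, 1.0].
    pub fn progress(&self) -> f32 {
        if self.data.is_empty() {
            1.0
        } else {
            self.position as f32 / self.data.len() as f32
        }
    }

    /// Rewinds the stream, for example before a repeat.
    pub fn reset(&mut self) {
        self.position = 0;
    }
}
```

Because the bytes sit behind an Arc, creating a stream doesn't copy audio data, and two streams of the same asset advance independently.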
Chime playback
This section describes the API and procedure for playback of a chime. A singular chime playback is referred to as a stream.
Lifecycle of a stream
Figure 2 illustrates the lifecycle of a stream:
Figure 2. Stream playback and events.
Figure 2 describes these steps:
1. Play: Schedule stream to play.
2. Prioritize: Playback prioritization decides whether to:
- Play chime now (started event when the first bytes are written)
- Play chime later (paused or resumed event)
- Deprioritize chime (canceled event)
3. Mixer controls: If needed, update mixer controls based on configured behaviors.
4. Write bytes: Write a chunk of bytes to AudioDevice.
5. More data: If the stream has more data, return to Step 2.
6. Repeat: If the stream should be repeated, reset and return to Step 2 (restarted event).
7. Completed: The stream completed successfully (FinishedSuccessfully event).
The chime can be interrupted with pause, resume, or stop calls at any time.
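The lifecycle can be read as a small state machine. The sketch below is one hedged way to model it: the state names, the Step enum, and the advance function are hypothetical, and only the event names come from the lifecycle description. For simplicity, it emits the started event when the stream is selected rather than when the first bytes are written.

```rust
/// Events named in the lifecycle description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StreamEvent {
    Started,
    Paused,
    Resumed,
    Restarted,
    Canceled,
    FinishedSuccessfully,
}

/// Hypothetical internal states of a stream.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StreamState {
    Scheduled,
    Playing,
    Paused,
    Done,
}

/// Hypothetical observations made by the playback loop.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Step {
    /// Prioritization picked this stream to play now.
    Selected,
    /// A higher-priority stream took over.
    Preempted,
    /// Prioritization dropped this stream.
    Deprioritized,
    /// The stream ran out of data.
    EndOfData { repeat: bool },
}

/// Returns the next state and the event to emit, if any.
pub fn advance(state: StreamState, step: Step) -> (StreamState, Option<StreamEvent>) {
    match (state, step) {
        // A finished or canceled stream never changes state again.
        (StreamState::Done, _) => (StreamState::Done, None),
        (StreamState::Scheduled, Step::Selected) => {
            (StreamState::Playing, Some(StreamEvent::Started))
        }
        (StreamState::Paused, Step::Selected) => {
            (StreamState::Playing, Some(StreamEvent::Resumed))
        }
        (StreamState::Playing, Step::Preempted) => {
            (StreamState::Paused, Some(StreamEvent::Paused))
        }
        (StreamState::Playing, Step::EndOfData { repeat: true }) => {
            (StreamState::Playing, Some(StreamEvent::Restarted))
        }
        (StreamState::Playing, Step::EndOfData { repeat: false }) => {
            (StreamState::Done, Some(StreamEvent::FinishedSuccessfully))
        }
        (_, Step::Deprioritized) => (StreamState::Done, Some(StreamEvent::Canceled)),
        // Any other combination leaves the stream unchanged.
        (other, _) => (other, None),
    }
}
```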
Chime priorities
This logic sets chime priorities, in order:
1. Playback mode overrides. For example, a chime in the fade out mode is always granted top priority until the fade out is completed.
2. Specified priority.
3. If priorities are equal, the more recent chime plays first.
How chimes of equal priority are ordered is configured with an enum value
when AudioManager is instantiated.
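The priority rules can be expressed as a single comparison. The sketch below assumes that a higher number means a higher priority, that recency is tracked with a sequence number assigned at scheduling time, and that the equal-priority enum has two variants; QueuedChime, TieBreak, and play_order are hypothetical names.

```rust
use std::cmp::Ordering;

/// Hypothetical enum chosen when the manager is created.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TieBreak {
    NewestFirst,
    OldestFirst,
}

/// Hypothetical queue entry.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct QueuedChime {
    pub id: u32,
    /// True while the chime is in fade-out mode.
    pub fading_out: bool,
    /// Assumption: a higher value means a higher priority.
    pub priority: u8,
    /// Assumption: a higher value means scheduled more recently.
    pub scheduled_seq: u64,
}

/// Returns `Ordering::Less` when `a` should play before `b`.
pub fn play_order(a: &QueuedChime, b: &QueuedChime, tie: TieBreak) -> Ordering {
    // 1. Playback mode override: a fading-out chime goes first.
    b.fading_out
        .cmp(&a.fading_out)
        // 2. Specified priority: higher values go first.
        .then(b.priority.cmp(&a.priority))
        // 3. Equal priority: order by recency as configured.
        .then(match tie {
            TieBreak::NewestFirst => b.scheduled_seq.cmp(&a.scheduled_seq),
            TieBreak::OldestFirst => a.scheduled_seq.cmp(&b.scheduled_seq),
        })
}
```

Passing this comparison to sort_by on a Vec of QueuedChime puts the chime that should play next at index 0.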
API
Events
If an event channel is provided when the chime starts, HAR Audio emits a
number of events during the playback. The supported events are shown in this
example:
/// NOTE: The following code is a sample definition to help understanding, it is not a
/// representation of the final code/implementation.
StreamBehaviors<PlatformStreamBehaviors> {
/// What should happen if the stream is interrupted for a higher priority stream.
/// e.g. pause-and-resume or cancel, will also define preference for fade-out.
OverrunBehavior,
/// Urgency, if interrupted streams are allowed to "fade-out", or if the stream should
/// urgently disrupt any other playback.
Option<Urgency>,
/// Priority for the stream (or minimum if not specified).
Option<StreamPriority>,
/// Descriptor if a stream should be played on repeat.
Option<RepeatBehavior>,
/// Volume, if the stream should play at a specific volume.
Option<Volume>,
/// Spatialization, if the stream should play with specific spatialization.
Option<Spatialization>,
/// Optional generic for future expandability of the API, or pass-through of platform
/// specific Stream Behaviors
Option<PlatformStreamBehaviors>
}
/// Plays a chime on specified device with given behaviors. StreamEvents are delivered
/// using the provided event transmitter. This method won't wait for any events.
fn play(AudioDeviceID, AssetID, StreamBehaviors, Option<EventTransmitter>) -> StreamController
/// Object used to control a Stream.
StreamController {
/// Gets the current state/metadata of a stream (e.g. ID, progress, playback_state).
fn metadata() -> StreamMetadata
/// Stops the stream.
fn stop()
/// Pauses a given stream. If the specified duration expires, the stream is canceled.
/// A timeout is required so that no paused streams are left indefinitely
/// pending resumption.
fn pause(TimeoutDuration)
/// Resumes a paused stream.
fn resume()
/// Updates the spatialization of a playing stream.
fn set_spatialization(Spatialization)
/// Updates the volume of a playing stream.
fn set_volume(Volume)
}
Mixer control
This section describes how volume and spatialization are controlled.
Volume
HAR defines volume consistently in millibels. The har-platform-api crate
handles conversion from millibels to control signal.
The relation between millibels and hardware power output is logarithmic, and
varies greatly between different hardware and speaker setups. As a result,
the mapping between the two values is provided as part of the AudioDevice
configuration (see Audio device and PCM), and conversion must take place before
calling the platform layer.
For this reason, the PAL API defines two functions.
fn set_volume_millibel(AudioDeviceID, Millibel) {
/// Default implementation with conversion using DeviceConfig.
}
fn set_volume_control(AudioDeviceID, ControlValue);
The default implementation for set_volume_millibel uses the config provided
for AudioDevice, which includes a set of reference millibel-to-control
key-value pairs, to transform the millibel value to a control value. It then
calls the set_volume_control function with the converted value.
This design provides a default and enables subsequent implementations to override the default mapping.
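A minimal sketch of such a default conversion follows. It assumes the reference pairs are sorted by ascending millibel value and that values between two reference points are linearly interpolated; the source doesn't specify the interpolation, and millibel_to_control is a hypothetical name.

```rust
/// Converts a millibel value to a control value using reference points.
///
/// `points` holds (millibel, control value) pairs. Assumption: the pairs
/// are sorted by ascending millibel value. Requests outside the table are
/// clamped to the nearest end, and an empty table yields `None`.
pub fn millibel_to_control(points: &[(i32, i32)], millibel: i32) -> Option<i32> {
    let first = *points.first()?;
    let last = *points.last()?;
    if millibel <= first.0 {
        return Some(first.1);
    }
    if millibel >= last.0 {
        return Some(last.1);
    }
    points.windows(2).find_map(|pair| {
        let (mb0, c0) = pair[0];
        let (mb1, c1) = pair[1];
        if millibel < mb0 || millibel > mb1 || mb1 <= mb0 {
            return None;
        }
        // Linear interpolation in 64-bit arithmetic to avoid overflow.
        let span = i64::from(mb1) - i64::from(mb0);
        let offset = i64::from(millibel) - i64::from(mb0);
        let delta = i64::from(c1) - i64::from(c0);
        // The result lies between c0 and c1, so it fits in an i32.
        Some((i64::from(c0) + delta * offset / span) as i32)
    })
}
```

With reference points (-6000, 0), (-3000, 64), and (0, 255), a request for -4500 millibels maps to control value 32, halfway between the first two points.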
Figure 3. HAR audio flow.
Spatialization
The Audio API exposes functionality to control what spatial area audio data should play in. These parameters are passed through to the PAL and are applied downstream using hardware controls. Options are defined as part of the PAL API as:
/// NOTE: The following code is a sample definition to help understanding, it is not a
/// representation of the final code/implementation.
enum Spatialization {
Front,
FrontLeft,
FrontRight,
Center, // No spatialization
Rear,
RearLeft,
RearRight,
Right,
Left
}
Mixer control tiers
You can define volume and spatialization on an asset and for a stream. If you define these controls for a stream, the stream takes priority and overrides the controls defined by the asset.
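One hedged way to picture the two tiers is a per-field fallback from stream to asset. MixerControls and effective_controls are hypothetical names, the enum is abbreviated, and whether the real implementation resolves each control independently is an assumption.

```rust
/// Abbreviated stand-in for the PAL spatialization options.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Spatialization {
    Center,
    FrontLeft,
    RearRight,
}

/// Hypothetical bundle of optional mixer controls.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct MixerControls {
    pub volume_millibel: Option<i32>,
    pub spatialization: Option<Spatialization>,
}

/// Stream-level controls win; asset-level controls fill the gaps.
pub fn effective_controls(asset: MixerControls, stream: MixerControls) -> MixerControls {
    MixerControls {
        volume_millibel: stream.volume_millibel.or(asset.volume_millibel),
        spatialization: stream.spatialization.or(asset.spatialization),
    }
}
```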
Thread management
The audio manager maintains one thread per AudioDevice instance. Each thread
operates independently. Interaction between AudioManager and the playback
thread uses a shared stream queue sorted by priority.
ALSA calls use ASYNC writes with polling to determine when data is digested.
Figure 4. Thread management sequence.
Control signals during polling
While waiting for the sound card to digest bytes, control signals can be
issued, for example, to change the fade or spatialization of the audio. The
interval for polling the state of the audio device is either configured at the
AudioManager level or defaults to 1 millisecond. After each polling cycle, the
playback thread digests and issues any timed control commands.
Buffer management
To minimize interruption latency, buffer sizes written to the device are
kept small. When using TinyALSA as a default, buffer size is configured to be
the same as the startup_threshold parameter.
TinyALSA defines the default as the entire allocated device buffer
divided by two.
Stream interruption
When streams are interrupted, the streams maintain thread priority until data they've written to the card is drained. As a result, a transition period takes place between interruption and the new stream.
For example, if an audio sample in HAR uses a:
- Size of 3,072
- Rate of 48,000
- Sample size of two
The pending buffer is between 3,072 and 6,144 frames. At a rate of 48,000 frames per second, this results in an interruption delay of 64 to 128 milliseconds. A production implementation would require a smaller buffer.
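The arithmetic behind these figures is the number of pending frames divided by the sample rate. The helper below is a small sketch of that calculation; drain_time is a hypothetical name.

```rust
use std::time::Duration;

/// Time needed for the device to play out `pending_frames` at the given
/// rate. Returns `None` for a zero rate or on arithmetic overflow.
pub fn drain_time(pending_frames: u64, sample_rate_hz: u64) -> Option<Duration> {
    pending_frames
        .checked_mul(1_000_000)?
        .checked_div(sample_rate_hz)
        .map(Duration::from_micros)
}
```

For the values in the example, 3,072 frames at 48,000 Hz take 64 milliseconds to drain, and 6,144 frames take 128 milliseconds.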
Error management and risks
This section describes how errors are managed and potential risks.
Stale streams and queue starvation
Given that AudioStream can be paused, and because playback can occur only from
the top-priority AudioStream instance, the risk arises of a growing queue
starving low-priority streams.
To avoid this occurrence, each queue is capped at a configurable size. When this value is exceeded, the lowest-priority stream is discarded.
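The capped queue can be sketched as a vector kept sorted by priority, where an insert that exceeds the cap drops the last entry. CappedQueue and Queued are hypothetical names, and the sketch assumes that a higher number means a higher priority and that, among equal priorities, the most recently added stream is discarded first.

```rust
/// Hypothetical queue entry.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Queued {
    pub id: u32,
    /// Assumption: a higher value means a higher priority.
    pub priority: u8,
}

/// Queue sorted from highest to lowest priority with a fixed cap.
pub struct CappedQueue {
    cap: usize,
    items: Vec<Queued>,
}

impl CappedQueue {
    pub fn new(cap: usize) -> Self {
        Self { cap, items: Vec::new() }
    }

    /// Inserts a stream and returns the discarded stream, if the cap was
    /// exceeded. The discarded stream can be the one just inserted.
    pub fn push(&mut self, item: Queued) -> Option<Queued> {
        // Place the new item after all entries of equal or higher priority.
        let index = self.items.partition_point(|q| q.priority >= item.priority);
        self.items.insert(index, item);
        if self.items.len() > self.cap {
            self.items.pop()
        } else {
            None
        }
    }

    /// IDs in playback order, highest priority first.
    pub fn ids(&self) -> Vec<u32> {
        self.items.iter().map(|q| q.id).collect()
    }
}
```

Returning the discarded entry gives the caller a place to emit a canceled event for it.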
Monitor and alert
In production, the safety monitor tracks audio features to verify that playback takes place as expected.
AudioManager monitors the internal statistics specific to latencies and a flag
that defines logging performance. After setting these thresholds, warning logs
are generated for all debug builds when:
- Duration between scheduling and starting playback exceeds x milliseconds.
- (For a non-disrupted stream) asset length and playback time differ by more
than y percent.
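The two conditions can be sketched as a pure function over measured durations. Thresholds, Warning, and check_stream are hypothetical names, and the deviation is assumed to be measured relative to the asset length.

```rust
use std::time::Duration;

/// Hypothetical thresholds (the x and y values from the list above).
pub struct Thresholds {
    pub max_start_latency: Duration,
    pub max_length_deviation_percent: f64,
}

/// Hypothetical warnings produced by the check.
#[derive(Debug, PartialEq)]
pub enum Warning {
    SlowStart { latency: Duration },
    LengthMismatch { deviation_percent: f64 },
}

/// Returns the warnings that apply to one finished stream.
pub fn check_stream(
    thresholds: &Thresholds,
    start_latency: Duration,
    asset_length: Duration,
    playback_time: Duration,
    disrupted: bool,
) -> Vec<Warning> {
    let mut warnings = Vec::new();
    if start_latency > thresholds.max_start_latency {
        warnings.push(Warning::SlowStart { latency: start_latency });
    }
    // The length check applies only to streams that weren't disrupted.
    let asset_secs = asset_length.as_secs_f64();
    if !disrupted && asset_secs > 0.0 {
        let deviation_percent =
            (playback_time.as_secs_f64() - asset_secs).abs() / asset_secs * 100.0;
        if deviation_percent > thresholds.max_length_deviation_percent {
            warnings.push(Warning::LengthMismatch { deviation_percent });
        }
    }
    warnings
}
```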
Device blocked
There's always a small risk of an audio device becoming unresponsive, for example, if it's allocated and written to by another process in the system. Given that playback runs asynchronously in separate threads, and that chimes can be queued up to play later, this is completely transparent to the calling app.
To detect this, a thread health check is made whenever a new chime is scheduled to be played. The check returns an error if a playback thread has a populated queue and hasn't digested any new bytes for the last second.
In the future, it might be necessary to attempt restarting or opening devices, but for the initial implementation, errors shouldn't be invisible.
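A hedged sketch of such a health check follows. ThreadHealth and is_blocked are hypothetical names, and the one-second limit is passed in as a parameter rather than hard-coded.

```rust
use std::time::{Duration, Instant};

/// Hypothetical snapshot of a playback thread's progress.
pub struct ThreadHealth {
    /// Number of streams waiting in the thread's queue.
    pub queued_streams: usize,
    /// Last time the device accepted new bytes.
    pub last_progress: Instant,
}

/// True when the thread has queued work but made no progress for longer
/// than `stall_limit`.
pub fn is_blocked(health: &ThreadHealth, now: Instant, stall_limit: Duration) -> bool {
    health.queued_streams > 0
        && now.saturating_duration_since(health.last_progress) > stall_limit
}
```

An idle thread with an empty queue is never reported as blocked, regardless of how long ago it last wrote bytes.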
Code structure
At a high level, the code related to chime playback exists across the following crates:
CRATE: display-safety/crates/(harry-app|harry)
The existing HAR app, which issues calls to play chimes.
NEW CRATE: display-safety/crates/audio
Crate to manage audio control and playback (this is where most of the functionality exists).
CRATE: display-safety/crates/har-platform-api/audio
PAL including all system calls required for audio.
CRATE: display-safety/crates/har-platform-(android|linux)/audio
Calls to tinyalsa-rs for playback using TinyALSA. QNX support isn't
implemented in the initial solution, and this will grow as more platforms are
supported.
TINYALSA PAL: display-safety/crates/tinyalsa-audio
TinyALSA-specific code for playback. This is used by the Android and Linux platform implementations.
CRATE: display-safety/crates/tinyalsa-rs
Rust bindings for the TinyALSA C implementation.
Rust implementation details
Some specific implementation details:
- All API functions return Result<X, AudioError> where X is either () or a return value.
- No API functions are marked as unsafe.
- Mutex and synchronization mechanisms are internal and aren't exposed in the AudioManager API.
Ownership model and AudioManager
All app interaction with the audio system takes place through AudioManager or objects returned from AudioManager. AudioManager is thread safe.
AudioManager is instantiated once in the HARry app, and Moved, for Looper to have ownership.
AudioManager uses a tokio_util::CancellationToken token to manage its started playback threads, ensuring the threads are terminated and resources released if AudioManager is Dropped.
AudioManager doesn't explicitly prevent multiple instances from being created. If more than one instance exists, it logs with the warn level.
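The terminate-on-drop behavior can be illustrated with the standard library alone. The sketch below uses a shared AtomicBool as a stand-in for the cancellation token so that it has no external dependencies; it is not the tokio_util API, and Manager, start, and cancel_flag are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical manager that owns one worker thread per device.
pub struct Manager {
    cancel: Arc<AtomicBool>,
    workers: Vec<JoinHandle<()>>,
}

impl Manager {
    pub fn start(device_count: usize) -> Self {
        let cancel = Arc::new(AtomicBool::new(false));
        let workers = (0..device_count)
            .map(|_| {
                let cancel = Arc::clone(&cancel);
                thread::spawn(move || {
                    // Stand-in for the playback loop: idle until canceled.
                    while !cancel.load(Ordering::Acquire) {
                        thread::sleep(Duration::from_millis(1));
                    }
                })
            })
            .collect();
        Self { cancel, workers }
    }

    /// Exposes the flag so callers can observe cancellation.
    pub fn cancel_flag(&self) -> Arc<AtomicBool> {
        Arc::clone(&self.cancel)
    }
}

impl Drop for Manager {
    fn drop(&mut self) {
        // Signal every worker, then wait until each one has exited.
        self.cancel.store(true, Ordering::Release);
        for worker in self.workers.drain(..) {
            let _ = worker.join();
        }
    }
}
```

Joining inside drop means that by the time the manager is gone, no worker thread still holds a device.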
Shared ownership
A number of objects have shared ownership wrapped and synchronized with
exclusive access. These mechanisms aren't exposed in the AudioManager API, but
are internal to the audio and PAL implementations.
- AudioDevice: Each hardware reference (for example, TinyALSA PCM) that is opened (has a handle) has exclusive access. See SMP Design.
- AudioStream instances have exclusive access after they're scheduled for playback because they can be controlled by the app and simultaneously accessed by the playback thread.
- The playback thread doesn't hold locks during playback, but makes an immutable snapshot of the next buffer to play, and doesn't consider changes until the next buffer is digested.
- Each playback thread has a playback queue, a shared reference between AudioManager and the playback thread. As a result, the thread needs exclusive access for mutations.
- Threads with no streams become idle with the Condvar variable to receive wakeup events when new data is detected. This mechanism has shared ownership.
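The idle-and-wake mechanism described above follows the usual Mutex and Condvar pattern from the Rust standard library. The sketch below is a minimal version of it; SharedQueue, schedule, and wait_next are hypothetical names, and stream IDs stand in for real stream objects.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Condvar, Mutex};

/// Hypothetical queue shared by the manager and one playback thread.
pub struct SharedQueue {
    streams: Mutex<VecDeque<u32>>,
    wake: Condvar,
}

impl SharedQueue {
    pub fn new() -> Arc<Self> {
        Arc::new(Self {
            streams: Mutex::new(VecDeque::new()),
            wake: Condvar::new(),
        })
    }

    /// Manager side: enqueue a stream ID and wake the idle thread.
    pub fn schedule(&self, stream_id: u32) {
        self.streams.lock().unwrap().push_back(stream_id);
        self.wake.notify_one();
    }

    /// Playback side: block without spinning until a stream is queued.
    pub fn wait_next(&self) -> u32 {
        let mut streams = self.streams.lock().unwrap();
        loop {
            if let Some(stream_id) = streams.pop_front() {
                return stream_id;
            }
            // Releases the lock while waiting and reacquires it on wakeup.
            streams = self.wake.wait(streams).unwrap();
        }
    }
}
```

The loop around wait guards against spurious wakeups, and the lock is held only while the queue is inspected or changed, never during playback.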
Dependencies
Crates and audio crate are designed to reduce dependencies on crates that aren't approved to be built in the Android source tree. See this list of included crates.
Downstream platform implementations for Android and Linux depend on
TinyALSA and the existing display safety tinyalsa-rs crate.
Quality attributes
Reliability
While audio playback is safety critical, this design doesn't cover the implementation of safety monitoring. Implement this in a separate effort, to verify audio playback reliability on hardware and in production.
Scalability
The one thread per device approach is intended to scale to different hardware setups. Given that each thread is primarily idling, waiting for data, or waiting for the device to digest written data, it shouldn't be demanding on the processor or performance intensive on the system.
The design decision to only play data to a single device, combined with mixer control commands for all further output control ensures the exact output is handled by sound hardware, and should scale for future systems.
Latency
Latency is critical for the audio system, so after implementation, a set of service-level objectives (SLOs) is defined for the latency of the system. To continuously monitor latency health, the system logs any failure to meet the defined SLOs in all debug builds.
For the production versions, monitoring data is passed to some system external to the audio implementation, rather than relying on logs.
Test and test strategy
The crates and the audio crate are designed with test coverage. We added a mock platform implementation to confirm that all capabilities are tested.
The complexity of hardware and bindings precludes extensive test coverage for platform implementations. We provide sample implementations to manually test the solution on hardware and on the Cuttlefish emulator.
Documentation
The README.md file in the Audio crate (crates/audio) describes how to use
AudioManager. crates/audio/examples contains examples that show how to:
- Implement a platform.
- Create an instance of AudioManager.
- Play WavAsset.
- Play a custom function asset on repeat.
- Log playback events.