AudioFormat
class AudioFormat : Parcelable
| kotlin.Any | |
| ↳ | android.media.AudioFormat |
The AudioFormat class is used to access a number of audio format and channel configuration constants. They are for instance used in AudioTrack and AudioRecord, as valid values in individual parameters of constructors like AudioTrack#AudioTrack(int, int, int, int, int, int), where the fourth parameter is one of the AudioFormat.ENCODING_* constants. The AudioFormat constants are also used in MediaFormat to specify audio related values commonly used in media, such as for MediaFormat#KEY_CHANNEL_MASK.
The AudioFormat.Builder class can be used to create instances of the AudioFormat class. Refer to AudioFormat.Builder for documentation on the mechanics of the configuration and building of such instances. Here we describe the main concepts that the AudioFormat class allows you to convey in each instance: the sample rate, the encoding, and the channel mask.
Closely associated with the AudioFormat is the notion of an audio frame, which is used throughout the documentation to represent the minimum size complete unit of audio data.
Sample rate
Expressed in Hz, the sample rate in an AudioFormat instance expresses the number of audio samples for each channel per second in the content you are playing or recording. It is not the sample rate at which content is rendered or produced. For instance, a sound at a media sample rate of 8000Hz can be played on a device operating at a sample rate of 48000Hz; the sample rate conversion is automatically handled by the platform, so it will not play at 6x speed.
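The relationship between the content rate and the device rate can be sketched with a little arithmetic. The helper names below are hypothetical and are not platform APIs; the platform performs the actual conversion internally. The sketch only shows that the duration of the content is preserved while the number of rendered frames changes.

```kotlin
/** Duration in seconds of [frameCount] frames at [sampleRateHz]. */
fun durationSeconds(frameCount: Long, sampleRateHz: Int): Double =
    frameCount.toDouble() / sampleRateHz

/** Frames a device at [deviceRateHz] would render for content at [contentRateHz]. */
fun framesAfterResampling(frameCount: Long, contentRateHz: Int, deviceRateHz: Int): Long =
    frameCount * deviceRateHz / contentRateHz

fun main() {
    val contentFrames = 16_000L // two seconds of 8000 Hz content
    val deviceFrames = framesAfterResampling(contentFrames, 8_000, 48_000)
    println(deviceFrames)                           // 96000
    println(durationSeconds(contentFrames, 8_000))  // 2.0
    println(durationSeconds(deviceFrames, 48_000))  // 2.0
}
```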
As of API android.os.Build.VERSION_CODES#M, sample rates up to 192kHz are supported for AudioRecord and AudioTrack, with sample rate conversion performed as needed. To improve efficiency and avoid lossy conversions, it is recommended to match the sample rate for AudioRecord and AudioTrack to the endpoint device sample rate, and limit the sample rate to no more than 48kHz unless there are special device capabilities that warrant a higher rate.
Encoding
Audio encoding is used to describe the bit representation of audio data, which can be either linear PCM or compressed audio, such as AC3 or DTS.
For linear PCM, the audio encoding describes the sample size, 8 bits, 16 bits, or 32 bits, and the sample representation, integer or float.
- ENCODING_PCM_8BIT: The audio sample is an 8 bit unsigned integer in the range [0, 255], with a 128 offset for zero. This is typically stored as a Java byte in a byte array or ByteBuffer. Since the Java byte is signed, be careful with math operations and conversions as the most significant bit is inverted.
- ENCODING_PCM_16BIT: The audio sample is a 16 bit signed integer typically stored as a Java short in a short array, but when the short is stored in a ByteBuffer, it is native endian (as compared to the default Java big endian). The short has full range from [-32768, 32767], and is sometimes interpreted as fixed point Q.15 data.
- ENCODING_PCM_FLOAT: Introduced in API android.os.Build.VERSION_CODES#LOLLIPOP, this encoding specifies that the audio sample is a 32 bit IEEE single precision float. The sample can be manipulated as a Java float in a float array, though within a ByteBuffer it is stored in native endian byte order. The nominal range of ENCODING_PCM_FLOAT audio data is [-1.0, 1.0]. It is implementation dependent whether the positive maximum of 1.0 is included in the interval. Values outside of the nominal range are clamped before sending to the endpoint device. Beware that the handling of NaN is undefined; subnormals may be treated as zero; and infinities are generally clamped just like other values for AudioTrack. Try to avoid infinities because they can easily generate a NaN.
To achieve higher audio bit depth than a signed 16 bit integer short, it is recommended to use ENCODING_PCM_FLOAT for audio capture, processing, and playback. Floats are efficiently manipulated by modern CPUs, have greater precision than 24 bit signed integers, and have greater dynamic range than 32 bit signed integers. AudioRecord as of API android.os.Build.VERSION_CODES#M and AudioTrack as of API android.os.Build.VERSION_CODES#LOLLIPOP support ENCODING_PCM_FLOAT.
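The sample representations described above can be illustrated with a few conversion helpers. This is a minimal sketch in plain Kotlin: the function names are hypothetical, and mapping NaN to silence is a choice made for this example only, since the platform leaves NaN handling undefined.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

/** ENCODING_PCM_8BIT: unsigned [0, 255] with 128 as zero, held in a signed Byte. */
fun pcm8ToFloat(sample: Byte): Float {
    val unsigned = sample.toInt() and 0xFF // undo the sign extension of the Byte
    return (unsigned - 128) / 128f
}

/** ENCODING_PCM_16BIT: signed 16 bit sample, read here as Q.15 fixed point. */
fun pcm16ToFloat(sample: Short): Float = sample / 32768f

/** Clamp to the nominal float range; NaN becomes 0 (an assumption of this sketch). */
fun clampFloat(sample: Float): Float =
    if (sample.isNaN()) 0f else sample.coerceIn(-1f, 1f)

/** Pack 16 bit samples into a ByteBuffer using native byte order. */
fun packPcm16(samples: ShortArray): ByteBuffer {
    val buffer = ByteBuffer.allocate(samples.size * 2).order(ByteOrder.nativeOrder())
    for (s in samples) buffer.putShort(s)
    buffer.flip() // make the written bytes readable from position 0
    return buffer
}
```

Setting the byte order explicitly matters because a newly allocated ByteBuffer defaults to big endian, which differs from the native order on most Android devices.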
For compressed audio, the encoding specifies the method of compression, for example ENCODING_AC3 and ENCODING_DTS. The compressed audio data is typically stored as bytes in a byte array or ByteBuffer. When a compressed audio encoding is specified for an AudioTrack, it creates a direct (non-mixed) track for output to an endpoint (such as HDMI) capable of decoding the compressed audio. For (most) other endpoints, which are not capable of decoding such compressed audio, you will need to decode the data first, typically by creating a MediaCodec. Alternatively, one may use MediaPlayer for playback of compressed audio files or streams.
When compressed audio is sent out through a direct AudioTrack, it need not be written in exact multiples of the audio access unit; this differs from MediaCodec input buffers.
Channel mask
Channel masks are used in AudioTrack and AudioRecord to describe the samples and their arrangement in the audio frame. They are also used in the endpoint (e.g. a USB audio interface, a DAC connected to headphones) to specify allowable configurations of a particular device.
As of API android.os.Build.VERSION_CODES#M, there are two types of channel masks: channel position masks and channel index masks.
Channel position masks
Channel position masks are the original Android channel masks, and have been used since API android.os.Build.VERSION_CODES#BASE. For input and output, they imply a positional nature: the location of a speaker or a microphone for recording or playback. For a channel position mask, each allowed channel position corresponds to a bit in the channel mask. If that channel position is present in the audio frame, that bit is set, otherwise it is zero. The order of the bits (from lsb to msb) corresponds to the order of that position's sample in the audio frame.
The canonical channel position masks by channel count are as follows:
| channel count | channel position mask |
|---|---|
| 1 | CHANNEL_OUT_MONO |
| 2 | CHANNEL_OUT_STEREO |
| 3 | CHANNEL_OUT_STEREO \| CHANNEL_OUT_FRONT_CENTER |
| 4 | CHANNEL_OUT_QUAD |
| 5 | CHANNEL_OUT_QUAD \| CHANNEL_OUT_FRONT_CENTER |
| 6 | CHANNEL_OUT_5POINT1 |
| 7 | CHANNEL_OUT_5POINT1 \| CHANNEL_OUT_BACK_CENTER |
| 8 | CHANNEL_OUT_7POINT1_SURROUND |
These masks are an ORed composite of individual channel masks. For example, CHANNEL_OUT_STEREO is composed of CHANNEL_OUT_FRONT_LEFT and CHANNEL_OUT_FRONT_RIGHT.
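The composition of position masks can be sketched in plain Kotlin. The individual mask values below are not listed in this section; they are assumed from the platform source and declared locally so the example runs without the Android SDK. The helper names are hypothetical.

```kotlin
// Assumed values of the individual position bits (check against the SDK).
const val CHANNEL_OUT_FRONT_LEFT = 0x4
const val CHANNEL_OUT_FRONT_RIGHT = 0x8
const val CHANNEL_OUT_FRONT_CENTER = 0x10
const val CHANNEL_OUT_STEREO = CHANNEL_OUT_FRONT_LEFT or CHANNEL_OUT_FRONT_RIGHT

/** Number of channels in a position mask: one per set bit. */
fun channelCountOf(positionMask: Int): Int = Integer.bitCount(positionMask)

/** True when every position in [wanted] is present in [mask]. */
fun hasPositions(mask: Int, wanted: Int): Boolean = (mask and wanted) == wanted

fun main() {
    val threeChannels = CHANNEL_OUT_STEREO or CHANNEL_OUT_FRONT_CENTER
    println(channelCountOf(CHANNEL_OUT_STEREO))                    // 2
    println(channelCountOf(threeChannels))                         // 3
    println(hasPositions(threeChannels, CHANNEL_OUT_FRONT_CENTER)) // true
}
```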
Channel index masks
Channel index masks were introduced in API android.os.Build.VERSION_CODES#M. They allow the selection of a particular channel from the source or sink endpoint by number, i.e. the first channel, the second channel, and so forth. This avoids problems with artificially assigning positions to channels of an endpoint, or figuring out what the ith position bit is within an endpoint's channel position mask.
Here's an example where channel index masks address this confusion: dealing with a 4 channel USB device. Using a position mask, and based on the channel count, this would be a CHANNEL_OUT_QUAD device, but really one is only interested in channel 0 through channel 3. The USB device would then have the following individual bit channel masks: CHANNEL_OUT_FRONT_LEFT, CHANNEL_OUT_FRONT_RIGHT, CHANNEL_OUT_BACK_LEFT and CHANNEL_OUT_BACK_RIGHT. But which is channel 0 and which is channel 3?
For a channel index mask, each channel number is represented as a bit in the mask, from the lsb (channel 0) upwards to the msb; numerically, this bit value is 1 << channelNumber. A set bit indicates that channel is present in the audio frame; otherwise it is cleared. The order of the bits also corresponds to that channel number's sample order in the audio frame.
For the previous 4 channel USB device example, the device would have a channel index mask 0xF. Suppose we wanted to select only the first and the third channels; this would correspond to a channel index mask 0x5 (the first and third bits set). If an AudioTrack uses this channel index mask, the audio frame would consist of two samples, the first sample of each frame routed to channel 0, and the second sample of each frame routed to channel 2.
The canonical channel index masks by channel count are given by the formula (1 << channelCount) - 1.
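The bit arithmetic behind channel index masks can be sketched as follows. The function names are hypothetical helpers written for this illustration; only the formulas 1 << channelNumber and (1 << channelCount) - 1 come from the text above.

```kotlin
/** Canonical index mask: the lowest [channelCount] bits set. */
fun canonicalIndexMask(channelCount: Int): Int {
    require(channelCount in 1..30) { "channelCount out of range for this sketch" }
    return (1 shl channelCount) - 1
}

/** Index mask selecting the given zero-based channel numbers. */
fun indexMaskOf(vararg channels: Int): Int =
    channels.fold(0) { mask, ch -> mask or (1 shl ch) }

/** Channel numbers present in [indexMask], in audio frame sample order. */
fun channelsIn(indexMask: Int): List<Int> =
    (0 until Int.SIZE_BITS).filter { (indexMask and (1 shl it)) != 0 }

fun main() {
    println(canonicalIndexMask(4)) // 15, i.e. 0xF for the 4 channel USB device
    println(indexMaskOf(0, 2))     // 5, i.e. 0x5 for the first and third channels
    println(channelsIn(0x5))       // [0, 2]
}
```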
Use cases
- Channel position mask for an endpoint: CHANNEL_OUT_FRONT_LEFT, CHANNEL_OUT_FRONT_CENTER, etc. for HDMI home theater purposes.
- Channel position mask for an audio stream: Creating an AudioTrack to output movie content, where 5.1 multichannel output is to be written.
- Channel index mask for an endpoint: USB devices for which input and output do not correspond to left or right speaker or microphone.
- Channel index mask for an audio stream: An AudioRecord may only want the third and fourth audio channels of the endpoint (i.e. the second channel pair), and not care about the position it corresponds to, in which case the channel index mask is 0xC. Multichannel AudioRecord sessions should use channel index masks.
Audio Frame
For linear PCM, an audio frame consists of a set of samples captured at the same time, whose count and channel association are given by the channel mask, and whose sample contents are specified by the encoding. For example, a stereo 16 bit PCM frame consists of two 16 bit linear PCM samples, with a frame size of 4 bytes. For compressed audio, an audio frame may alternately refer to an access unit of compressed data bytes that is logically grouped together for decoding and bitstream access (e.g. MediaCodec), or a single byte of compressed data (e.g. AudioTrack#getBufferSizeInFrames()), or the linear PCM frame result from decoding the compressed data (e.g. AudioTrack#getPlaybackHeadPosition()), depending on the context where audio frame is used. For the purposes of AudioFormat#getFrameSizeInBytes(), a compressed data format returns a frame size of 1 byte.
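To make the linear PCM frame concrete, the sketch below splits a buffer of 16 bit samples into one array per channel. It assumes the usual interleaved layout, in which the samples of one frame are stored next to each other in channel order; the function name is a hypothetical helper, not a platform API.

```kotlin
/** Splits interleaved PCM samples into one array per channel. */
fun deinterleave(interleaved: ShortArray, channelCount: Int): List<ShortArray> {
    require(channelCount > 0 && interleaved.size % channelCount == 0) {
        "buffer must hold a whole number of frames"
    }
    val frameCount = interleaved.size / channelCount
    return List(channelCount) { ch ->
        // Sample `ch` of frame `frame` sits at index frame * channelCount + ch.
        ShortArray(frameCount) { frame -> interleaved[frame * channelCount + ch] }
    }
}

fun main() {
    // Three stereo frames: (1, 2), (3, 4), (5, 6).
    val channels = deinterleave(shortArrayOf(1, 2, 3, 4, 5, 6), 2)
    println(channels[0].toList()) // [1, 3, 5]
    println(channels[1].toList()) // [2, 4, 6]
}
```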
Summary
| Nested classes | |
|---|---|
| class | AudioFormat.Builder: Builder class for AudioFormat objects. |
| Constants | |
|---|---|
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | CHANNEL_INVALID: Invalid audio channel mask |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | CHANNEL_OUT_DEFAULT: Default audio channel mask |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | |
| static Int | ENCODING_AAC_ELD: Audio data format: AAC ELD compressed |
| static Int | ENCODING_AAC_HE_V1: Audio data format: AAC HE V1 compressed |
| static Int | ENCODING_AAC_HE_V2: Audio data format: AAC HE V2 compressed |
| static Int | ENCODING_AAC_LC: Audio data format: AAC LC compressed |
| static Int | ENCODING_AAC_XHE: Audio data format: AAC xHE compressed |
| static Int | ENCODING_AC3: Audio data format: AC-3 compressed |
| static Int | ENCODING_AC4: Audio data format: AC-4 sync frame transport format |
| static Int | ENCODING_DEFAULT: Default audio data format |
| static Int | ENCODING_DOLBY_MAT: Audio data format: Dolby MAT (Metadata-enhanced Audio Transmission). Dolby MAT bitstreams are used to transmit Dolby TrueHD, channel-based PCM, or PCM with metadata (object audio) over HDMI (e.g. Dolby Atmos content). |
| static Int | ENCODING_DOLBY_TRUEHD: Audio data format: DOLBY TRUEHD compressed |
| static Int | ENCODING_DTS: Audio data format: DTS compressed |
| static Int | ENCODING_DTS_HD: Audio data format: DTS HD compressed |
| static Int | ENCODING_E_AC3: Audio data format: E-AC-3 compressed |
| static Int | ENCODING_E_AC3_JOC: Audio data format: E-AC-3-JOC compressed. E-AC-3-JOC streams can be decoded by downstream devices supporting ENCODING_E_AC3. |
| static Int | ENCODING_IEC61937: Audio data format: compressed audio wrapped in PCM for HDMI or S/PDIF passthrough. |
| static Int | ENCODING_INVALID: Invalid audio data format |
| static Int | ENCODING_MP3: Audio data format: MP3 compressed |
| static Int | ENCODING_OPUS: Audio data format: OPUS compressed. |
| static Int | ENCODING_PCM_16BIT: Audio data format: PCM 16 bit per sample. |
| static Int | ENCODING_PCM_8BIT: Audio data format: PCM 8 bit per sample. |
| static Int | ENCODING_PCM_FLOAT: Audio data format: single-precision floating-point per sample |
| static Int | SAMPLE_RATE_UNSPECIFIED: Sample rate will be a route-dependent value. |
| Public methods | | |
|---|---|---|
| Int | describeContents() | |
| Boolean | equals(other: Any?) | |
| Int | getChannelCount() | Return the channel count. |
| Int | getChannelIndexMask() | Return the channel index mask. |
| Int | getChannelMask() | Return the channel mask. |
| Int | getEncoding() | Return the encoding. |
| Int | getFrameSizeInBytes() | Return the frame size in bytes. |
| Int | getSampleRate() | Return the sample rate. |
| Int | hashCode() | |
| String | toString() | |
| Unit | writeToParcel(dest: Parcel!, flags: Int) | |
Properties |
|
|---|---|
| static Parcelable.Creator<AudioFormat!> | |
Constants
CHANNEL_CONFIGURATION_DEFAULT
static val CHANNEL_CONFIGURATION_DEFAULT: Int
Deprecated: Use CHANNEL_OUT_DEFAULT or CHANNEL_IN_DEFAULT instead.
Value: 1
CHANNEL_CONFIGURATION_INVALID
static val CHANNEL_CONFIGURATION_INVALID: Int
Deprecated: Use CHANNEL_INVALID instead.
Value: 0
CHANNEL_CONFIGURATION_MONO
static val CHANNEL_CONFIGURATION_MONO: Int
Deprecated: Use CHANNEL_OUT_MONO or CHANNEL_IN_MONO instead.
Value: 2
CHANNEL_CONFIGURATION_STEREO
static val CHANNEL_CONFIGURATION_STEREO: Int
Deprecated: Use CHANNEL_OUT_STEREO or CHANNEL_IN_STEREO instead.
Value: 3
CHANNEL_INVALID
static val CHANNEL_INVALID: Int
Invalid audio channel mask
Value: 0
CHANNEL_IN_FRONT_PROCESSED
static val CHANNEL_IN_FRONT_PROCESSED: Int
Value: 256
CHANNEL_IN_RIGHT_PROCESSED
static val CHANNEL_IN_RIGHT_PROCESSED: Int
Value: 128
CHANNEL_OUT_7POINT1
static val CHANNEL_OUT_7POINT1: Int
Deprecated: Not the typical 7.1 surround configuration. Use CHANNEL_OUT_7POINT1_SURROUND instead.
Value: 1020
CHANNEL_OUT_7POINT1_SURROUND
static val CHANNEL_OUT_7POINT1_SURROUND: Int
Value: 6396
CHANNEL_OUT_DEFAULT
static val CHANNEL_OUT_DEFAULT: Int
Default audio channel mask
Value: 1
CHANNEL_OUT_FRONT_LEFT_OF_CENTER
static val CHANNEL_OUT_FRONT_LEFT_OF_CENTER: Int
Value: 256
CHANNEL_OUT_FRONT_RIGHT_OF_CENTER
static val CHANNEL_OUT_FRONT_RIGHT_OF_CENTER: Int
Value: 512
ENCODING_AAC_ELD
static val ENCODING_AAC_ELD: Int
Audio data format: AAC ELD compressed
Value: 15
ENCODING_AAC_HE_V1
static val ENCODING_AAC_HE_V1: Int
Audio data format: AAC HE V1 compressed
Value: 11
ENCODING_AAC_HE_V2
static val ENCODING_AAC_HE_V2: Int
Audio data format: AAC HE V2 compressed
Value: 12
ENCODING_AAC_LC
static val ENCODING_AAC_LC: Int
Audio data format: AAC LC compressed
Value: 10
ENCODING_AAC_XHE
static val ENCODING_AAC_XHE: Int
Audio data format: AAC xHE compressed
Value: 16
ENCODING_AC3
static val ENCODING_AC3: Int
Audio data format: AC-3 compressed
Value: 5
ENCODING_AC4
static val ENCODING_AC4: Int
Audio data format: AC-4 sync frame transport format
Value: 17
ENCODING_DEFAULT
static val ENCODING_DEFAULT: Int
Default audio data format
Value: 1
ENCODING_DOLBY_MAT
static val ENCODING_DOLBY_MAT: Int
Audio data format: Dolby MAT (Metadata-enhanced Audio Transmission). Dolby MAT bitstreams are used to transmit Dolby TrueHD, channel-based PCM, or PCM with metadata (object audio) over HDMI (e.g. Dolby Atmos content). Added in API level 29.
Value: 19
ENCODING_DOLBY_TRUEHD
static val ENCODING_DOLBY_TRUEHD: Int
Audio data format: DOLBY TRUEHD compressed
Value: 14
ENCODING_DTS
static val ENCODING_DTS: Int
Audio data format: DTS compressed
Value: 7
ENCODING_DTS_HD
static val ENCODING_DTS_HD: Int
Audio data format: DTS HD compressed
Value: 8
ENCODING_E_AC3
static val ENCODING_E_AC3: Int
Audio data format: E-AC-3 compressed
Value: 6
ENCODING_E_AC3_JOC
static val ENCODING_E_AC3_JOC: Int
Audio data format: E-AC-3-JOC compressed. E-AC-3-JOC streams can be decoded by downstream devices supporting ENCODING_E_AC3. Use ENCODING_E_AC3 as the AudioTrack encoding when the downstream device supports ENCODING_E_AC3 but not ENCODING_E_AC3_JOC. Added in API level 28.
Value: 18
ENCODING_IEC61937
static val ENCODING_IEC61937: Int
Audio data format: compressed audio wrapped in PCM for HDMI or S/PDIF passthrough. IEC61937 uses a stereo stream of 16-bit samples as the wrapper. So the channel mask for the track must be CHANNEL_OUT_STEREO. Data should be written to the stream in a short[] array. If the data is written in a byte[] array then there may be endian problems on some platforms when converting to short internally.
Value: 13
ENCODING_INVALID
static val ENCODING_INVALID: Int
Invalid audio data format
Value: 0
ENCODING_MP3
static val ENCODING_MP3: Int
Audio data format: MP3 compressed
Value: 9
ENCODING_OPUS
static val ENCODING_OPUS: Int
Audio data format: OPUS compressed.
Value: 20
ENCODING_PCM_16BIT
static val ENCODING_PCM_16BIT: Int
Audio data format: PCM 16 bit per sample. Guaranteed to be supported by devices.
Value: 2
ENCODING_PCM_8BIT
static val ENCODING_PCM_8BIT: Int
Audio data format: PCM 8 bit per sample. Not guaranteed to be supported by devices.
Value: 3
ENCODING_PCM_FLOAT
static val ENCODING_PCM_FLOAT: Int
Audio data format: single-precision floating-point per sample
Value: 4
SAMPLE_RATE_UNSPECIFIED
static val SAMPLE_RATE_UNSPECIFIED: Int
Sample rate will be a route-dependent value. For AudioTrack, it is usually the sink sample rate, and for AudioRecord it is usually the source sample rate.
Value: 0
Public methods
describeContents
fun describeContents(): Int
| Return | |
|---|---|
| Int | a bitmask indicating the set of special object types marshaled by this Parcelable object instance. Value is either 0 or android.os.Parcelable#CONTENTS_FILE_DESCRIPTOR |
equals
fun equals(other: Any?): Boolean
| Parameters | |
|---|---|
| other | the reference object with which to compare. |
| Return | |
|---|---|
| Boolean | true if this object is the same as the other argument; false otherwise. |
getChannelCount
fun getChannelCount(): Int
Return the channel count.
| Return | |
|---|---|
| Int | the channel count derived from the channel position mask or the channel index mask. Zero is returned if neither the channel position mask nor the channel index mask is set. |
getChannelIndexMask
fun getChannelIndexMask(): Int
Return the channel index mask. See the section on channel masks for more information about the difference between index-based masks, and position-based masks (as returned by getChannelMask()).
| Return | |
|---|---|
| Int | one of the values that can be set in Builder#setChannelIndexMask(int) or AudioFormat#CHANNEL_INVALID if not set or an invalid mask was used. |
getChannelMask
fun getChannelMask(): Int
Return the channel mask. See the section on channel masks for more information about the difference between index-based masks (as returned by getChannelIndexMask()) and the position-based mask returned by this function.
| Return | |
|---|---|
| Int | one of the values that can be set in Builder#setChannelMask(int) or AudioFormat#CHANNEL_INVALID if not set. |
getEncoding
fun getEncoding(): Int
Return the encoding. See the section on encodings for more information about the different types of supported audio encoding.
| Return | |
|---|---|
| Int | one of the values that can be set in Builder#setEncoding(int) or AudioFormat#ENCODING_INVALID if not set. |
getFrameSizeInBytes
fun getFrameSizeInBytes(): Int
Return the frame size in bytes. For PCM or PCM packed compressed data this is the size of a sample multiplied by the channel count. For all other cases, including invalid/unset channel masks, this will return 1 byte. As an example, a stereo 16-bit PCM format would have a frame size of 4 bytes, an 8 channel float PCM format would have a frame size of 32 bytes, and a compressed data format (not packed in PCM) would have a frame size of 1 byte. Both AudioRecord and AudioTrack process data in multiples of this frame size.
| Return | |
|---|---|
| Int | The audio frame size in bytes corresponding to the encoding and the channel mask. Value is 1 or greater. |
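The documented rule can be mirrored in a few lines of plain Kotlin. This is a sketch rather than the platform implementation: the encoding values are copied from the constant list on this page, the helper names are hypothetical, and PCM packed compressed data (such as ENCODING_IEC61937) is deliberately left out, so it falls into the 1 byte case here.

```kotlin
// Encoding values copied from the constants documented on this page.
const val ENCODING_PCM_16BIT = 2
const val ENCODING_PCM_8BIT = 3
const val ENCODING_PCM_FLOAT = 4
const val ENCODING_AC3 = 5

/** Bytes per sample for the linear PCM encodings above, or null otherwise. */
fun bytesPerSample(encoding: Int): Int? = when (encoding) {
    ENCODING_PCM_8BIT -> 1
    ENCODING_PCM_16BIT -> 2
    ENCODING_PCM_FLOAT -> 4
    else -> null
}

/** Sample size times channel count for PCM, else 1 byte. */
fun frameSizeInBytes(encoding: Int, channelCount: Int): Int {
    val sampleSize = bytesPerSample(encoding)
    return if (sampleSize == null || channelCount <= 0) 1 else sampleSize * channelCount
}

/** Largest byte count not exceeding [byteCount] that holds whole frames. */
fun alignToFrames(byteCount: Int, frameSize: Int): Int = byteCount - byteCount % frameSize

fun main() {
    println(frameSizeInBytes(ENCODING_PCM_16BIT, 2)) // 4
    println(frameSizeInBytes(ENCODING_PCM_FLOAT, 8)) // 32
    println(frameSizeInBytes(ENCODING_AC3, 2))       // 1
    println(alignToFrames(1001, 4))                  // 1000
}
```

Rounding a buffer length down with a helper like alignToFrames is one way to keep reads and writes in whole frames.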
getSampleRate
fun getSampleRate(): Int
Return the sample rate.
| Return | |
|---|---|
| Int | one of the values that can be set in Builder#setSampleRate(int) or SAMPLE_RATE_UNSPECIFIED if not set. |
toString
fun toString(): String
| Return | |
|---|---|
| String | a string representation of the object. |
writeToParcel
fun writeToParcel(
dest: Parcel!,
flags: Int
): Unit
| Parameters | |
|---|---|
| dest | Parcel!: The Parcel in which the object should be written. |
| flags | Int: Additional flags about how the object should be written. May be 0 or PARCELABLE_WRITE_RETURN_VALUE. Value is either 0 or a combination of android.os.Parcelable#PARCELABLE_WRITE_RETURN_VALUE, and android.os.Parcelable.PARCELABLE_ELIDE_DUPLICATES |