USBAudio API discussion document

I'm hoping that this document will stimulate discussion on the API that the USBAudio module will present.

This document has only just begun - it's a work in progress! Please join in the discussion on the RISC OS Open fora.

Comparison between built-in and USB

Computers like the Risc PC, Iyonix, BeagleBoard etc. have audio interfaces built in. Although the interfaces differ between models of computer, they are built in, and so every specimen of a given model has the same interfaces. These interfaces also tend to be relatively simple.

USB audio devices come and go as they are plugged in or unplugged. More than one can be plugged in at the same time. Different models have different interfaces. Some models can be quite complex, with many input and output terminals. They have to be handled somewhat differently, hence the need for a USBAudio module to facilitate their handling.

What facilities are needed

• Enumerate the available audio devices.

• Get a text description of an audio device, e.g. the manufacturer and the device name.

• Enumerate the input and output terminals of an audio device.

• Open an isochronous pipe to or from an audio device.

• Enumerate the paths through an audio device.

• Enumerate the controls (volume, mute, etc.) of a path through an audio device. Note that a stereo or multi-channel interface is likely (but not guaranteed) to have one volume control per channel.

• Control the volume and mute of one or more channels.

• Return the endpoint number of any HID associated with an audio device.

HID endpoints

HIDs associated with audio devices are not as straightforward as you might imagine. For example, I have two headset interfaces. One has four buttons on it legended as volume up, volume down, mute headphones and mute microphone. The associated HID report descriptor shows usages of "Volume Increment", "Volume Decrement", "Mute", and several unassigned. We have to make an assumption as to which volume we are incrementing and decrementing and which we are muting (it could be either the headphones or the microphone), and what we might use any of the unassigned channels for. Note that there are more controls in the report than present on the device. I also have a second headset adaptor, which has identical descriptors, but has no controls whatsoever! Presence of something in the HID report descriptor is no guarantee that you will get a signal from it.

Proposed USBAudio API

USBAudio_EnumerateDevices (SWI &59480)

Entry
R0 Pointer to a buffer for returned string
R1 Buffer length in bytes
Exit
Buffer updated with returned string

Use

The returned string contains a comma-separated, null-terminated list of the USB device names of all USB audio devices found, e.g. "USB8" or "USB8,USB11"
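As an illustration only, here is a minimal Python sketch of how a caller might split the returned buffer into device names. The function name parse_device_list is hypothetical, and the handling of an empty list (no devices found) is an assumption, since the document does not say what the buffer holds in that case.

```python
def parse_device_list(buffer: bytes) -> list[str]:
    """Split a comma-separated, null-terminated list into device names."""
    # Keep only the bytes before the first null terminator.
    text = buffer.split(b"\x00", 1)[0].decode("ascii")
    # Assumption: an empty string means that no audio devices were found.
    return [name for name in text.split(",") if name]
```

For example, a buffer holding "USB8,USB11" followed by a null byte would give the two names USB8 and USB11.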

USBAudio_GetDeviceName (SWI &59481)

Entry
R0 Pointer to buffer with null-terminated USB device name, e.g. "USB8"
R1 Index of name to get
R2 Pointer to buffer for returned name
R3 Buffer length
Exit
R4 Contains the USB language ID
Buffer updated with returned string, e.g. "Griffin Technology, Inc"

Use

Normally indices 1 and 2 are the only ones of any use, 1 usually being the manufacturer's name, and 2 being the device's description. The string is in Unicode (UTF-16LE, as used by USB string descriptors), and the USB language ID is required in order to interpret the string correctly.

Comment

The language ID is almost always 0x0409 (I have yet to see any other value). In practice you are most likely to find that the characters are ASCII with every alternate byte 0x00. I have one device that does not return a string in response to index 2.
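A hedged sketch of decoding the returned name in Python follows. It assumes that the buffer holds plain UTF-16LE code units ended by a 16-bit zero; the document does not state how the string is terminated, so treat that as a guess. The name decode_usb_string is hypothetical.

```python
def decode_usb_string(buffer: bytes) -> str:
    """Decode a UTF-16LE string from a buffer, stopping at a 16-bit zero."""
    units = []
    # Walk the buffer two bytes at a time; a trailing odd byte is ignored.
    for i in range(0, len(buffer) - 1, 2):
        unit = buffer[i:i + 2]
        if unit == b"\x00\x00":
            # Assumed terminator: stop at the first all-zero code unit.
            break
        units.append(unit)
    return b"".join(units).decode("utf-16-le")
```

A device that returns no string for a given index would simply yield an empty result here.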

USBAudio_OpenOut (SWI &59482) and USBAudio_OpenIn (SWI &59483)

Entry
R0 Pointer to buffer with null-terminated USB device name, e.g. "USB8"
R1 Pointer to 32 byte parameter block
Offset  Length  Purpose
+0      4       Sample rate, Hz
+4      1       Resolution, bits
+5      1       Bytes per sample for 1 channel
+6      1       Number of channels
+7      1       Format code, usually 1 for linear PCM
+8      1       Configuration number, almost always 0
+9      1       (Unused)
+10     1       (Unused)
+11     1       (Unused)
+12     4       USB buffer size
+16     4       Volume bitfield
+20     4       Mute bitfield
+24     1       Streaming interface number
+25     1       Alternate setting
+26     1       Endpoint number
+27     1       Feature unit ID
+28     1       Control interface number
+29     1       (Unused)
+30     1       (Unused)
+31     1       (Unused)
Exit
R0 The stream handle, or 0 if the open fails
Parameter block updated

Use

Given the sample rate, resolution, bytes per sample, number of channels, format code, configuration number and USB buffer size, this call opens a stream for output or input. From then on, the calling application needs to either keep filling the USB buffer (for output) so that it never becomes empty, or keep emptying the USB buffer (for input) so that it never becomes full. The USB buffer is assigned by the OS, not the user. The stream can be closed after use by calling USBAudio_Close.

Parameters in locations +0 to +15 must be supplied by the calling application, although locations +9 to +11 are currently unused. Parameters in locations +16 to +31 are returned by the USBAudio module, although locations +29 to +31 are currently unused.
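To make the layout concrete, here is a rough Python sketch that builds the caller-supplied half of the parameter block and reads back the half filled in by the module. It assumes little-endian byte order (as on ARM under RISC OS); the function names and dictionary keys are hypothetical, chosen only for this example.

```python
import struct

CALLER_FIELDS = "<IBBBBB3xI"    # offsets +0 to +15, supplied by the caller
MODULE_FIELDS = "<IIBBBBB3x"    # offsets +16 to +31, returned by the module


def build_open_block(sample_rate, resolution_bits, bytes_per_sample,
                     channels, usb_buffer_size,
                     format_code=1, configuration=0) -> bytearray:
    """Return a zeroed 32-byte block with offsets +0 to +15 filled in."""
    block = bytearray(32)
    struct.pack_into(CALLER_FIELDS, block, 0,
                     sample_rate, resolution_bits, bytes_per_sample,
                     channels, format_code, configuration, usb_buffer_size)
    return block


def read_returned_fields(block) -> dict:
    """Read offsets +16 to +28 after a successful open."""
    (volume, mute, streaming_if, alt_setting,
     endpoint, feature_unit, control_if) = struct.unpack_from(
        MODULE_FIELDS, block, 16)
    return {
        "volume_bitfield": volume,
        "mute_bitfield": mute,
        "streaming_interface": streaming_if,
        "alternate_setting": alt_setting,
        "endpoint": endpoint,
        "feature_unit_id": feature_unit,
        "control_interface": control_if,
    }
```

The format code and configuration number default to 1 and 0 here because the table describes those as the usual values.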

Comment

If a calling application only wants to control the volume or mute settings, all it needs is the feature unit ID and the volume or mute bitfields - and it doesn't need to understand them, merely provide them when calling USBAudio_SetVolume and USBAudio_SetMute. The rest are included for completeness.

The volume and mute bitfields represent the channels that can have volume or mute controlled. Bits 0 and 1 are channel 0 (the master channel, i.e. it applies to all audio channels). Bits 2 and 3 are channel 1, bits 4 and 5 are channel 2, etc. for the individual channels. This is how a feature unit tells you what you can control.

For Release 2 devices, bits 0, 2, 4 etc. are set if the feature can be read; bits 1, 3, 5 etc. are set if the feature can be written to. If a feature can be written, it must (according to the specification) be readable too.

Release 1 only tells us whether the feature exists, i.e. it can be read from and/or written to, but leaves us to guess which we can do. In order to make the returned values from Releases 1 and 2 compatible, Release 1 devices have both the read and write bits set in the bitfields.
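The bit layout described above can be sketched in Python as follows. This is only an illustration; the name decode_channel_bitfield and the shape of the result are hypothetical.

```python
def decode_channel_bitfield(bitfield: int) -> dict[int, tuple[bool, bool]]:
    """Map channel number to (readable, writable) for each channel present.

    Channel 0 is the master channel; channels 1 to 15 are the individual
    audio channels. Two bits are used per channel.
    """
    channels = {}
    for channel in range(16):
        readable = bool((bitfield >> (2 * channel)) & 1)
        writable = bool((bitfield >> (2 * channel + 1)) & 1)
        if readable or writable:
            channels[channel] = (readable, writable)
    return channels
```

For instance, a bitfield of &3C describes a stereo device with readable and writable controls on channels 1 and 2 and no master control.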

USBAudio_Close (SWI &59484)

Entry
R0 Stream handle returned by USBAudio_OpenOut or USBAudio_OpenIn
Exit
-

Use

Call this to close the stream after use.

USBAudio_SetVolume (SWI &59485)

Entry
R0 Pointer to buffer with null-terminated USB device name, e.g. "USB8"
R1 Pointer to 32 byte parameter block from OpenOut or OpenIn
R2 Volume bitfield
R3 Volume setting
Exit
-

Use

The parameter block is that passed to, and updated by, USBAudio_OpenOut or USBAudio_OpenIn. For this call, only the Feature Unit ID and Control Interface Number fields are used, and are only valid after the USBAudio_OpenOut or USBAudio_OpenIn operation. All other fields are ignored. No field is updated by this call.

The volume setting is the absolute volume, in units of 1/256 dB, and is a signed integer in the range +32767 to -32768, representing +127.996 dB to -128 dB. A well-behaved device will set to its nearest available setting.
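As a small worked example of the units, the following Python sketch converts between decibels and the 1/256 dB setting. The function names are hypothetical, and clamping out-of-range values is a choice made for this sketch rather than something the module is stated to do.

```python
def db_to_volume_units(db: float) -> int:
    """Convert decibels to a signed setting in units of 1/256 dB."""
    units = round(db * 256)
    # Clamp to the signed 16-bit range described in the text.
    return max(-32768, min(32767, units))


def volume_units_to_db(units: int) -> float:
    """Convert a setting in units of 1/256 dB back to decibels."""
    return units / 256
```

So -10 dB corresponds to a setting of -2560, and the largest setting, 32767, corresponds to just under +128 dB.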

The volume bitfield should normally be the volume bitfield entry of the parameter block. If a different bitfield is supplied, any set bit that corresponds to a channel that does not have a volume control is likely to cause a "Bad request" error. All the channels represented by the bitfield have their volume set by this call.

USBAudio_SetMute (SWI &59486)

Entry
R0 Pointer to buffer with null-terminated USB device name, e.g. "USB8"
R1 Pointer to 32 byte parameter block from OpenOut or OpenIn
R2 Mute bitfield
R3 Mute setting: 1 to mute, 0 to not mute
Exit
-

Use

The parameter block is that passed to, and updated by, USBAudio_OpenOut or USBAudio_OpenIn. For this call, only the Feature Unit ID and Control Interface Number fields are used, and are only valid after the USBAudio_OpenOut or USBAudio_OpenIn operation. All other fields are ignored. No field is updated by this call.

The mute bitfield should normally be the mute bitfield entry of the parameter block. If a different bitfield is supplied, any set bit that corresponds to a channel that does not have a mute control is likely to cause a "Bad request" error. All the channels represented by the bitfield have their mute state set by this call.

USBAudio_GetVolume (SWI &59487)

Entry
R0 Pointer to buffer with null-terminated USB device name, e.g. "USB8"
R1 Pointer to 32 byte parameter block from OpenOut or OpenIn
R2 Volume bitfield
Exit
R3 Current volume setting and resolution
R4 Maximum and minimum volume settings

Use

The parameter block is that passed to, and updated by, USBAudio_OpenOut or USBAudio_OpenIn. For this call, only the Feature Unit ID and Control Interface Number fields are used, and are only valid after the USBAudio_OpenOut or USBAudio_OpenIn operation. All other fields are ignored. No field is updated by this call.

The volume bitfield should normally be the volume bitfield entry of the parameter block. The volume settings returned are in units of 1/256 dB, and are signed integers in the range +32767 to -32768, representing +127.996 dB to -128 dB.

The volume bitfield is likely to contain more than one set bit, i.e. to refer to more than one channel. This call returns the settings of the lowest numbered channel among the set bits. Since all channels are likely to have been set to the same volume, one result is probably what you want. If a different bitfield is supplied, any set bit that corresponds to a channel that does not have a volume control is likely to cause a "Bad request" error.

R3 on exit contains the current volume setting in the top 16 bits, and the volume setting resolution in the bottom 16 bits. The resolution is not likely to be of much interest, since (a) it's largely mythical, (b) well behaved devices set to their nearest available setting when instructed to set to a volume that doesn't match the resolution. Simply, then, arithmetic shift R3 right by 16 bits, and you have the current volume setting in units of 1/256 dB.

R4 on exit contains the maximum volume setting in the top 16 bits, and the minimum volume setting in the bottom 16 bits.
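The packing of R3 and R4 can be illustrated with a short Python sketch. It assumes that the current, maximum and minimum settings are signed 16-bit values (as the text says the settings are signed) and that the resolution is unsigned; the latter is an assumption. The function names are hypothetical.

```python
def _signed16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def unpack_get_volume(r3: int, r4: int) -> dict:
    """Split the R3 and R4 exit values into their 16-bit halves."""
    return {
        "current": _signed16(r3 >> 16),   # top half of R3, 1/256 dB
        "resolution": r3 & 0xFFFF,        # bottom half of R3 (assumed unsigned)
        "maximum": _signed16(r4 >> 16),   # top half of R4, 1/256 dB
        "minimum": _signed16(r4),         # bottom half of R4, 1/256 dB
    }
```

With R3 = &F6000100 and R4 = &0000C400, this gives a current setting of -2560 (-10 dB), a resolution of 256 (1 dB), a maximum of 0 dB and a minimum of -15360 (-60 dB).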

USBAudio_GetMute (SWI &59488)

Entry
R0 Pointer to buffer with null-terminated USB device name, e.g. "USB8"
R1 Pointer to 32 byte parameter block from OpenOut or OpenIn
R2 Mute bitfield
Exit
R3 Mute setting: non-zero if muted, 0 if not muted

Use

The parameter block is that passed to, and updated by, USBAudio_OpenOut or USBAudio_OpenIn. For this call, only the Feature Unit ID and Control Interface Number fields are used, and are only valid after the USBAudio_OpenOut or USBAudio_OpenIn operation. All other fields are ignored. No field is updated by this call.

The mute bitfield should normally be the mute bitfield entry of the parameter block.

The mute bitfield may contain more than one set bit, i.e. may refer to more than one channel. This call returns the setting of the lowest numbered channel. Since all channels are likely to have been given the same mute setting, one result is probably what you want. If a different bitfield is supplied, any set bit that corresponds to a channel that does not have a mute control is likely to cause a "Bad request" error.

USBAudio_EnumerateResolutions (SWI &59489)

Entry
R0 Pointer to buffer with null-terminated USB device name, e.g. "USB8"
R1 Pointer to buffer for returned results
R2 Length of return buffer
Exit
R2 Contains the used or required buffer length - see comment
Buffer updated with returned direction/resolution/subframe size values

Use

This call returns every combination of direction, resolution and subframe size offered by the named device. The returned values are as follows:

Offset  Length  Purpose
+0      1       Direction: 0x00 for output, 0x80 for input
+1      1       Resolution, bits
+2      1       Subframe size, bytes per channel

Each entry, therefore, requires 3 bytes. It must be very rare for a device to support many different resolutions. I suggest calling with a buffer of 96 bytes, which would support (for example) 16 different resolutions for output and 16 different resolutions for input. If you find a device that has that many or more, I'd like to hear of it! If the buffer is long enough, the actual length used is returned in R2. If the buffer is not long enough, the call returns the error "Buffer too short" and R2 is returned 3 greater than the length on entry. It isn't possible to return the required length when a buffer is too short because the unique resolutions are stored in the buffer and counted from there; a buffer that is too short cannot contain them all and therefore they cannot all be counted.
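A minimal Python sketch of walking the returned 3-byte entries is shown below. The name parse_resolutions is hypothetical; used_length stands for the value returned in R2.

```python
def parse_resolutions(buffer: bytes, used_length: int) -> list[tuple[str, int, int]]:
    """Return (direction, resolution_bits, subframe_bytes) for each entry."""
    entries = []
    # Only whole 3-byte entries are read; any stray trailing bytes are ignored.
    for offset in range(0, used_length - used_length % 3, 3):
        direction, bits, subframe = buffer[offset:offset + 3]
        entries.append(("input" if direction == 0x80 else "output",
                        bits, subframe))
    return entries
```

With the suggested 96-byte buffer, a typical device offering 16-bit output and 16-bit input would use just 6 bytes.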

USBAudio_EnumerateSampleRates (SWI &5948A)

Entry
R0 Pointer to buffer with null-terminated USB device name, e.g. "USB8"
R1 Pointer to buffer for returned results
R2 Length of return buffer
Exit
R2 Contains the used or required buffer length - see notes
Buffer updated with returned sample rate entries

Use

This call returns every combination of direction, number of channels, resolution, subframe size, format code and range of sample rates offered by the named device. The returned values are as follows:

Offset  Length  Purpose
+0      1       Direction: 0x00 for output, 0x80 for input
+1      1       Number of channels
+2      1       Resolution, bits
+3      1       Subframe size, bytes per channel
+4      1       Format code, e.g. 1 for linear PCM
+5      1       (Unused)
+6      1       (Unused)
+7      1       Number of sample rates - see below
+8...   4 each  Sample rate data - see below

If the value 'n' at offset +7 is non-zero, there are 'n' values at offsets +8 onwards, each being a discrete sample rate.

If the value 'n' at offset +7 is 0, the values at offsets +8, +12 and +16 represent a range of sample rates as follows:

Offset  Length  Purpose
+8      4       Minimum sample rate, Hz
+12     4       Maximum sample rate, Hz
+16     4       Sample rate step, Hz

Note that Release 1 devices return a step size of 0, leaving us all to guess at the step size.

The value returned in R2 is the buffer size needed. If the size needed is less than or equal to the size provided on entry, that number of bytes is used. If the buffer was too small, the call returns with the error "Buffer too short" and R2 is the required size, but the last entry in the buffer may be incomplete.
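The variable-length entries could be parsed along the following lines. This Python sketch assumes that entries are stored back to back in the buffer and that multi-byte values are little-endian; neither is stated explicitly above. The function name and dictionary keys are hypothetical.

```python
import struct


def parse_sample_rates(buffer: bytes, used_length: int) -> list[dict]:
    """Parse sample rate entries, skipping an incomplete final entry."""
    entries = []
    offset = 0
    while offset + 8 <= used_length:
        direction, channels, bits, subframe, fmt, count = struct.unpack_from(
            "<BBBBB2xB", buffer, offset)
        offset += 8
        # count > 0: that many discrete rates; count == 0: min, max, step.
        n_words = count if count else 3
        if offset + 4 * n_words > used_length:
            break  # the last entry was cut short by a too-small buffer
        words = struct.unpack_from(f"<{n_words}I", buffer, offset)
        offset += 4 * n_words
        entry = {
            "direction": "input" if direction == 0x80 else "output",
            "channels": channels,
            "resolution_bits": bits,
            "subframe_bytes": subframe,
            "format_code": fmt,
        }
        if count:
            entry["rates_hz"] = list(words)
        else:
            entry["range"] = {"min_hz": words[0], "max_hz": words[1],
                              "step_hz": words[2]}
        entries.append(entry)
    return entries
```

Stopping at an incomplete entry means the sketch still returns whatever whole entries fit when the call reported "Buffer too short".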

USBAudio_GetConfigurationDescriptor (SWI &5948B)

Entry
R0 Pointer to buffer with null-terminated USB device name, e.g. "USB8"
R1 Index of configuration to get
R2 Pointer to buffer for returned configuration descriptor - must be at least 4 bytes long
R3 Buffer length
Exit
R3 Contains the required buffer length
Buffer updated with returned configuration descriptor

Use

Normally index 0 is the only configuration that exists, therefore call with R1 = 0. The length of the return buffer must be at least 4, because the descriptor's length is in the third and fourth bytes. This means that there are two obvious ways in which to use this call:

• Call with a buffer longer than you can reasonably expect the descriptor to need (2048 bytes is probably plenty), and check that the returned value of R3 is no greater than the buffer length you supplied. If it is greater, call again with a buffer at least R3 bytes long.

• Call with a 4 byte buffer. Call again with a buffer R3 bytes long.
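For reference, the total length held in the third and fourth bytes of a USB configuration descriptor (the wTotalLength field) is a little-endian 16-bit value. A small Python sketch of reading it from the first four bytes follows; the function name is hypothetical.

```python
def descriptor_total_length(first_bytes: bytes) -> int:
    """Read wTotalLength from the first four bytes of a configuration descriptor."""
    if len(first_bytes) < 4:
        raise ValueError("need at least 4 bytes of the descriptor")
    # Bytes 2 and 3 hold the total length, low byte first.
    return int.from_bytes(first_bytes[2:4], "little")
```

This should agree with the value returned in R3 when the call is made with a 4 byte buffer.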

Comment

You only need to call this if you're interested in picking through the device's configuration descriptor yourself. You probably aren't. It will make your brain hurt.

USBAudio_TextifySampleRates (SWI &5948C)

This SWI will be detailed in a further revision of this document. It is a diagnostic aid to programmers.

USBAudio_GetFeatureVector (SWI &5948D)

Entry
R0 Pointer to buffer with null-terminated USB device name, e.g. "USB8"
R1 Pointer to 60-byte buffer for returned results
R2 Buffer length
R3 Zero to get the output feature descriptor, non-zero to get the input feature descriptor
Exit
Buffer updated with returned results: 15 4-byte bitfields, as follows
Offset  Length  Purpose
+0      4       Mute bitfield
+4      4       Volume bitfield
+8      4       Bass bitfield
+12     4       Mid bitfield
+16     4       Treble bitfield
+20     4       Graphic eq bitfield
+24     4       Automatic gain bitfield
+28     4       Delay bitfield
+32     4       Bass boost bitfield
+36     4       Loudness bitfield
Bitfields at offsets +40 to +56 are only obtainable from Release 2 devices
+40     4       Input gain bitfield
+44     4       Input gain pad bitfield
+48     4       Phase inverter bitfield
+52     4       Underflow bitfield
+56     4       Overflow bitfield

Use

This SWI returns a bit vector of all the features supported by a given feature unit in the master channel and the first 15 channels. The words in the bit vector are intended for use in calls to USBAudio_SetFeature and USBAudio_GetFeature.

If the supplied buffer length is less than 60, the error "Buffer too short" is returned.

The bitfields are of exactly the form described in USBAudio_OpenOut and USBAudio_OpenIn for the mute and volume bitfields. The mute and volume bitfields are returned in this call too.
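As a sketch, the 60-byte result can be unpacked into named bitfields in Python like this. Little-endian word order is assumed, and the names used as dictionary keys are hypothetical labels for the rows of the table above.

```python
import struct

FEATURE_NAMES = (
    "mute", "volume", "bass", "mid", "treble", "graphic_eq",
    "automatic_gain", "delay", "bass_boost", "loudness",
    # The remaining five are only obtainable from Release 2 devices.
    "input_gain", "input_gain_pad", "phase_inverter", "underflow", "overflow",
)


def unpack_feature_vector(buffer: bytes) -> dict[str, int]:
    """Return a mapping of feature name to its 4-byte channel bitfield."""
    words = struct.unpack_from("<15I", buffer, 0)
    return dict(zip(FEATURE_NAMES, words))
```

A feature whose bitfield is zero is not offered on any channel of that feature unit.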

USBAudio_SetFeature (SWI &5948E)

Entry
R0 Pointer to buffer with null-terminated USB device name, e.g. "USB8"
R1 Pointer to 32 byte parameter block from OpenOut or OpenIn
R2 Feature bitfield
R3 Feature number
R4 Feature setting value
Exit
-

Use

This SWI sets the current value of a feature.

The parameter block is that passed to, and updated by, USBAudio_OpenOut or USBAudio_OpenIn. For this call, only the Feature Unit ID and Control Interface Number fields are used, which are only valid after a successful USBAudio_OpenOut or USBAudio_OpenIn operation. All other fields are ignored. No field is updated by this call.

The feature bitfield is bit-mapped to the channels in the same way as the volume and mute settings. The normal way to get the bitfield is to call USBAudio_GetFeatureVector and copy the 4-byte value of the wanted feature. Note that most devices do not implement any features other than volume and/or mute; some don't even implement those.

The feature number is shown in the following table. Beware: it is off by one from the word offsets of the buffer returned by USBAudio_GetFeatureVector!

Feature numbers

Number  Type  Feature
1       1     Mute
2       3     Volume
3       2     Bass
4       2     Mid
5       2     Treble
6       X     Graphic eq
7       1     Automatic gain
8       4     Delay
9       1     Bass boost
10      1     Loudness
Features 11 to 15 are only obtainable from Release 2 devices
11      3     Input gain
12      3     Input gain pad
13      1     Phase inverter
14      5     Underflow
15      5     Overflow
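The off-by-one relationship between feature numbers and the USBAudio_GetFeatureVector buffer can be written down as a one-line Python helper. The name feature_vector_offset is hypothetical.

```python
def feature_vector_offset(feature_number: int) -> int:
    """Byte offset in the GetFeatureVector buffer for a feature number (1 to 15)."""
    if not 1 <= feature_number <= 15:
        raise ValueError("feature number must be in the range 1 to 15")
    # Feature 1 (Mute) lives in word 0, feature 2 (Volume) in word 1, etc.
    return (feature_number - 1) * 4
```

So the volume bitfield (feature 2) is found at offset +4, and the overflow bitfield (feature 15) at offset +56.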

Type 1 features

The feature setting value is TRUE or FALSE. I have yet to find an official definition of TRUE for USB Audio, but most of the devices I've tried seem to obey and return the value 1. FALSE is 0.

Type 2 features

The feature setting value ranges from +127 (representing +31.75 dB) to -128 (representing -32 dB) in increments of 1/4 dB.

Type 3 features

The feature setting value ranges from +32767 (representing +127.9961 dB) to -32768 (representing -128 dB) in units of 1/256 dB.

Type 4 features

The feature setting is a 10.22 unsigned fixed point value representing the delay in seconds, with a maximum value just short of 1024 seconds.

Type 5 features

These features can only be read; see USBAudio_GetFeature. Any attempt to set them returns the error "Illegal call to SetFeature".

Type X features

These features are not yet implemented. Any attempt to set them returns the error "Not yet implemented".
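To illustrate the scaling of type 2 and type 4 values, here is a hedged Python sketch. The function names are hypothetical, and clamping out-of-range inputs is a choice made for this example, not documented behaviour of the module.

```python
def encode_type2(db: float) -> int:
    """Type 2 (bass, mid, treble): signed 8-bit value in units of 1/4 dB."""
    return max(-128, min(127, round(db * 4)))


def encode_type4(seconds: float) -> int:
    """Type 4 (delay): unsigned 10.22 fixed point value, in seconds."""
    raw = round(seconds * (1 << 22))
    # 32 bits in total, so the maximum is just short of 1024 seconds.
    return max(0, min(0xFFFFFFFF, raw))
```

For example, +6 dB of bass is the value 24, and a delay of half a second is &200000.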

Comment

The volume and mute features are functional duplicates of USBAudio_SetVolume and USBAudio_SetMute, but there was no good reason to exclude them from here since volume and mute are just two of the list of possible features.

The bitfield should normally be the bitfield entry returned by USBAudio_GetFeatureVector for the feature in question. If a different bitfield is supplied, any set bit that corresponds to a channel that does not have a control for that feature is likely to cause a "Bad request" error. All the channels represented by the bitfield have their chosen feature set by this call.

USBAudio_GetFeature (SWI &5948F)

Entry
R0 Pointer to buffer with null-terminated USB device name, e.g. "USB8"
R1 Pointer to 32 byte parameter block from OpenOut or OpenIn
R2 Feature bitfield
R3 Feature number
Exit
R4 Feature setting value

Use

This SWI gets the current value of a feature.

The parameter block is that passed to, and updated by, USBAudio_OpenOut or USBAudio_OpenIn. For this call, only the Feature Unit ID and Control Interface Number fields are used, which are only valid after a successful USBAudio_OpenOut or USBAudio_OpenIn operation. All other fields are ignored. No field is updated by this call.

The feature bitfield is bit-mapped to the channels in the same way as the volume and mute settings. The normal way to get the bitfield is to call USBAudio_GetFeatureVector and copy the 4-byte value of the wanted feature. Note that most devices do not implement any features other than volume and/or mute; some don't even implement those.

The feature number is shown in the following table. Beware: it is off by one from the word offsets of the buffer returned by USBAudio_GetFeatureVector!

Feature numbers

Number  Type  Feature
1       1     Mute
2       3     Volume
3       2     Bass
4       2     Mid
5       2     Treble
6       X     Graphic eq
7       1     Automatic gain
8       4     Delay
9       1     Bass boost
10      1     Loudness
Features 11 to 15 are only obtainable from Release 2 devices
11      3     Input gain
12      3     Input gain pad
13      1     Phase inverter
14      5     Underflow
15      5     Overflow

Type 1 features

The feature setting value is TRUE or FALSE. I have yet to find an official definition of TRUE for USB Audio, but most of the devices I've tried obey and return the value 1. FALSE is 0.

Type 2 features

The feature setting value ranges from +127 (representing +31.75 dB) to -128 (representing -32 dB) in increments of 1/4 dB.

Type 3 features

The feature setting value ranges from +32767 (representing +127.9961 dB) to -32768 (representing -128 dB) in units of 1/256 dB.

Type 4 features

The feature setting is a 10.22 unsigned fixed point value representing the delay in seconds, with a maximum value just short of 1024 seconds.

Type 5 features

The feature setting value is TRUE or FALSE. I have yet to find an official definition of TRUE for USB Audio, but you can expect a non-zero value, probably 1. FALSE is 0. These features are read only; the act of reading them resets the device's internal value to FALSE.

Type X features

These features are not yet implemented. Any attempt to get them returns the error "Not yet implemented".

Comment

The volume and mute features are functional duplicates of USBAudio_GetVolume and USBAudio_GetMute, but there was no good reason to exclude them from here since volume and mute are just two of the list of possible features.

The bitfield should normally be the bitfield entry returned by USBAudio_GetFeatureVector for the feature in question. If a different bitfield is supplied, any set bit that corresponds to a channel that does not have a control for that feature is likely to cause a "Bad request" error. All the channels represented by the bitfield have their chosen feature set by this call.

USBAudio_NameToVIDPID (SWI &59490)

Entry
R0 Pointer to buffer with null-terminated USB device name, e.g. "USB8"
Exit
R1 Contains the USB vendor ID
R2 Contains the USB product ID

Use

Given a USB device name, this call returns the USB vendor ID and product ID. Since the device name is unique, there can only be one vendor ID and product ID returned at any time. Note that a device's USB device name can change, for example if the device is unplugged and reconnected.

If no matching device is found, the call returns with the error "Device not found", vendor ID = 0 and product ID = 0.

This call works for any USB device; it is not restricted to audio devices.

USBAudio_VIDPIDToName (SWI &59491)

Entry
R0 Contains the USB vendor ID
R1 Contains the USB product ID
R2 Pointer to buffer for returned results
R3 Length of return buffer
Exit
R3 Contains the buffer length used or required
Buffer updated with returned values

Use

This call returns a list of all the USB device names whose USB vendor and product IDs match those given. The result is in the form of a comma-separated, null-terminated list similar to that returned by USBAudio_EnumerateDevices. It is possible, though unlikely, that multiple devices connected at one time may have the same USB vendor and product IDs, therefore the result has to be in the form of a list.

If no matching device is found, the call returns with the error "Device not found".

If the length of the output buffer is at least enough for the returned results including the terminator, R3 is returned equal to the number of bytes used. If the buffer is not long enough, R3 is returned equal to the number of bytes that are required, and the error "Buffer too short" is returned.

This call works for any USB vendor and product ID; it is not restricted to audio devices.

USBAudio_EnumerateSubframeSizes (SWI &59492)

Entry
R0 Pointer to buffer with null-terminated USB device name, e.g. "USB8"
R1 Pointer to 32 byte parameter block
Offset  Length  Purpose
+0      4       Sample rate, Hz
+4      1       Resolution, bits
+5      1       (Unused)
+6      1       Number of channels
+7      1       Format code, usually 1 for linear PCM
+8      1       Configuration number, almost always 0
+9      1       Direction: 0 for output, &80 for input
+10     22      (Unused)
R2 Pointer to buffer for returned list
R3 Buffer length
Exit
R3 Number of results in the buffer
Buffer updated

Use

This SWI returns a list of the subframe sizes (i.e. the number of bytes for one sample in one channel) that the device supports, for a given sample rate, resolution, direction and number of channels.

The 32 byte parameter block is compatible with that used for OpenOut and OpenIn, but please note:

• The byte at offset +9 must be set according to the direction desired. This byte is unused for OpenOut and OpenIn;

• Many bytes are unused in this SWI;

• The parameter block is not updated by this SWI.

The output is a list of subframe sizes, each one byte long, terminated by a zero. The list will probably contain either zero or one size, plus the terminating zero. However, it is possible that devices support more than one subframe size, so the output has to be in the form of a list.

Note that R3 on exit contains the number of results in the buffer, not including the terminating zero.
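Reading the zero-terminated list is straightforward; a minimal Python sketch is given below, with the hypothetical name parse_subframe_sizes.

```python
def parse_subframe_sizes(buffer: bytes) -> list[int]:
    """Return the one-byte subframe sizes that precede the terminating zero."""
    sizes = []
    for value in buffer:
        if value == 0:
            break  # terminating zero reached
        sizes.append(value)
    return sizes
```

The length of the returned list should match the count returned in R3.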

How to use the USBAudio API

In simple terms, this is what an application needs to do in order to find the available audio device(s) and play out audio:

• Optionally, call USBAudio_EnumerateDevices to get the list of device names;

• Optionally, call USBAudio_GetDeviceName, so that a user can see what devices are available and choose from them, if more than one is available;

• Call USBAudio_OpenOut to open a stream;

• Keep the buffer full (stop it from emptying, anyway);

• Call USBAudio_SetVolume and USBAudio_SetMute as necessary;

• Finally, call USBAudio_Close to close the stream.

Work in progress - this document is incomplete

Page last updated 2014 February 5