Developing an Audio Unit effect

UPDATE (1/2008): Note that it is much easier to get started coding up DSP effects now than it was in 2004, since Xcode 3 includes AU effect and synth templates.

Introduction

This fall, I set out to develop a DSP music effect using Apple’s Audio Unit API. This document describes my experiences and provides pointers for other programmers who are interested in developing software using Apple’s technology.

Basics

Audio Units are software components that implement Apple’s plug-in interface and provide digital signal processing and/or sound synthesis capabilities to a host program (such as a sequencer or an audio editor). An Audio Unit may operate either as a synthesizer, generating sound, or as an effect, processing sound. The API provides for flexible sound routing capabilities, including m-to-n channel effects and multiple-out instruments. Finally, effect audio units may operate either in real time or offline.

“Pull” model

Audio Units operate with a “pull” model. This means that, instead of a program “pushing” some amount of audio data to a (virtual or physical) audio interface, the host program invokes a callback function on every plugin at fixed intervals, synchronously “pulling” a fixed number of frames from each. I’ve read that the pull model provides better scalability and lower latency; it seems that most current audio APIs (e.g. PortAudio, JACK, and ALSA) require a pull model.
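To make the pull model concrete, here is a minimal sketch (not taken from the SDK samples) of a render callback as a host might install it. The AURenderCallback signature and AudioUnitRender() come from the AudioUnit framework headers; the function and variable names are made up for illustration.

    #include <AudioUnit/AudioUnit.h>

    // Hypothetical render callback: the caller "pulls" exactly inNumberFrames
    // frames, and the callback must fill ioData before returning. Here we just
    // forward the pull upstream to another Audio Unit via AudioUnitRender().
    static OSStatus PullFromSource(void                       *inRefCon,
                                   AudioUnitRenderActionFlags *ioActionFlags,
                                   const AudioTimeStamp       *inTimeStamp,
                                   UInt32                      inBusNumber,
                                   UInt32                      inNumberFrames,
                                   AudioBufferList            *ioData)
    {
        AudioUnit source = *(AudioUnit *)inRefCon;   // upstream unit (assumed)
        return AudioUnitRender(source, ioActionFlags, inTimeStamp,
                               inBusNumber, inNumberFrames, ioData);
    }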

Implementing an Audio Unit plugin

To implement an Audio Unit plugin, one must write a C++ class that extends one of the Audio Unit component classes; I list some notable ones here:

  1. AUEffectBase: a basic audio effect
  2. AUMIDIEffectBase: an audio effect that can receive MIDI data
  3. MusicDeviceBase: a software synthesizer that responds to MIDI note and continuous-controller messages

Extending AUEffectBase

The easiest way to start developing a subclass of AUEffectBase is to copy the SampleEffectUnit code from the developer tools. SampleEffectUnit is a simple effect that passes its input through to its output, so you can replace the pass-through code with your own processing. It assumes an n-in, n-out effect with no interaction between channels and uses a subclass of AUKernelBase to do the bulk of the DSP work.
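For orientation, here is a rough skeleton of the two classes a SampleEffectUnit-style project defines. The class names (MyEffect, MyKernel) are hypothetical; the method signatures follow the AUEffectBase/AUKernelBase headers shipped with the SDK.

    #include "AUEffectBase.h"

    // The per-channel DSP worker; its Process() method is discussed below.
    class MyKernel : public AUKernelBase {
    public:
        MyKernel(AUEffectBase *inAudioUnit) : AUKernelBase(inAudioUnit) {}
        virtual void Process(const Float32 *inSourceP, Float32 *inDestP,
                             UInt32 inFramesToProcess, UInt32 inNumChannels,
                             bool &ioSilence);
    };

    // The effect itself; it hands the SDK one kernel per channel.
    class MyEffect : public AUEffectBase {
    public:
        MyEffect(AudioUnit component) : AUEffectBase(component) {}
        virtual AUKernelBase *NewKernel() { return new MyKernel(this); }
    };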

Using AUKernelBase

The class AUKernelBase provides the interface for a monophonic processing unit; it is provided as a convenience for cases in which, as in SampleEffectUnit, n inputs are processed independently to produce n outputs. For an effect processing n channels, n objects extending AUKernelBase will be instantiated, so each channel can have independent audio state.

To build an Audio Unit effect based on AUKernelBase, subclass both AUEffectBase and AUKernelBase. In your AUEffectBase subclass, override the AUEffectBase::NewKernel() method to return a newly allocated instance of your AUKernelBase subclass. NewKernel() is invoked once for each channel from the AUEffectBase::MaintainKernels() method. (This is in contrast to the documentation for NewKernel(), which is ambiguous on this point.)

The AUKernelBase subclass does all of its audio processing in the AUKernelBase::Process() method. (Although the documentation for this method claims that it receives a stream of interleaved samples, this is not the case.) Simply read the input samples (as 32-bit floating-point values) and write the output samples. Per-channel state can be stored in data members of the AUKernelBase subclass.
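As an illustration of per-channel state, here is a minimal sketch of a Process() body for the hypothetical MyKernel above, assuming it also has a Float32 data member mLastOutput initialized to zero. It implements a simple one-pole lowpass, so each channel’s filter history lives in that channel’s own kernel; the coefficient value is arbitrary.

    void MyKernel::Process(const Float32 *inSourceP, Float32 *inDestP,
                           UInt32 inFramesToProcess, UInt32 inNumChannels,
                           bool &ioSilence)
    {
        // Each kernel sees one deinterleaved channel, despite what the
        // documentation says about interleaving.
        const Float32 a = 0.1f;   // smoothing coefficient (assumed value)
        for (UInt32 i = 0; i < inFramesToProcess; ++i) {
            // y[n] = y[n-1] + a * (x[n] - y[n-1])
            mLastOutput += a * (inSourceP[i] - mLastOutput);
            inDestP[i] = mLastOutput;
        }
        ioSilence = false;        // conservatively report a non-silent output
    }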

Overriding AUEffectBase::Render()

The AUKernelBase approach is convenient for common cases. However, it is not flexible enough for an effect that does not process equal numbers of independent input and output channels. (We’ll call the simpler effects one-to-one effects.) When developing an effect that either has

  1. different numbers of input and output channels, or
  2. interactions between channel states,

one must use a more flexible method.

AUEffectBase::Render() is the callback method invoked by host applications. Its default implementation calls AUEffectBase::ProcessBufferLists(); in turn, the default implementation of ProcessBufferLists() invokes the Process() method of each channel’s effect kernel on that channel’s input. For effects that are not one-to-one, you must override ProcessBufferLists() to perform an appropriate mapping from input channels to output channels.
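Here is a hedged sketch of what such an override might look like for a mono-in, stereo-out effect. The class name and the trivial fan-out are made up, and the exact signature of ProcessBufferLists() should be checked against your SDK version (older headers use ComponentResult rather than OSStatus).

    OSStatus MyMonoToStereoUnit::ProcessBufferLists(
            AudioUnitRenderActionFlags &ioActionFlags,
            const AudioBufferList      &inBuffer,
            AudioBufferList            &outBuffer,
            UInt32                      inFramesToProcess)
    {
        // One input channel, two output channels (non-interleaved Float32).
        const Float32 *in   = (const Float32 *)inBuffer.mBuffers[0].mData;
        Float32       *outL = (Float32 *)outBuffer.mBuffers[0].mData;
        Float32       *outR = (Float32 *)outBuffer.mBuffers[1].mData;

        for (UInt32 i = 0; i < inFramesToProcess; ++i) {
            outL[i] = in[i];      // here the "mapping" is just a fan-out;
            outR[i] = in[i];      // a real effect would do its DSP here
        }
        return noErr;
    }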

Appendix: Developing effects with state

It’s pretty straightforward to develop a stateless effect, that is, one in which the value of each output sample depends solely on the value of exactly one corresponding input sample. (Of course, there are many useful distortion-type effects that fit into this category, like waveshapers, clippers, bit-depth reducers, and so on.) However, the “pull” model makes it a little more difficult to develop effects that require maintaining state, like filters, reverbs, and frequency-domain effects, even though some of these are straightforward to develop outside the “pull” model.

For example, if you were doing frequency-domain manipulation offline on an audio file, it would be as simple as reading in a chunk of audio data, windowing it, performing the FFT, manipulating the FFT results, and resynthesizing the result into an output buffer. After doing this for all of the samples in the file, you could write the output buffer to a file.

In a synchronous, “pull” environment, it’s harder. You can’t choose which samples you get or when you get them. Rather, you get each sample in order, exactly once, and in fixed-size buffers. Making matters worse, you have to output a fixed number of samples at the same time. Therefore, if you want to develop an effect that relies on state, you’ll have to do some buffering yourself. The figure below indicates the input-to-output relationship for an FFT-based effect with 1x overlap.
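To make the buffering idea concrete, here is a minimal sketch of a kernel that collects input into fixed-size analysis blocks and plays back the processed result with one block of latency. The class, constants, and ProcessBlock() placeholder are all hypothetical, overlap is omitted for brevity, and the Float32/UInt32 typedefs stand in for the CoreAudio ones so the snippet is self-contained.

    #include <vector>
    #include <deque>

    typedef float        Float32;   // as in the CoreAudio headers
    typedef unsigned int UInt32;    // (repeated here only for illustration)

    class BlockKernel /* : public AUKernelBase */ {
    public:
        static const UInt32 kBlockSize = 1024;        // analysis block size (assumed)

        BlockKernel() : mOutput(kBlockSize, 0.0f) {}  // prime output: one block of latency

        void Process(const Float32 *inSourceP, Float32 *inDestP,
                     UInt32 inFramesToProcess, UInt32 /*inNumChannels*/,
                     bool &ioSilence)
        {
            for (UInt32 i = 0; i < inFramesToProcess; ++i) {
                mInput.push_back(inSourceP[i]);       // accumulate incoming samples
                if (mInput.size() == kBlockSize) {    // a full block is ready
                    ProcessBlock(mInput);             // window/FFT/resynthesis would go here
                    mOutput.insert(mOutput.end(), mInput.begin(), mInput.end());
                    mInput.clear();
                }
                inDestP[i] = mOutput.front();         // emit the delayed, processed stream
                mOutput.pop_front();
            }
            ioSilence = false;
        }

    private:
        void ProcessBlock(std::vector<Float32> &block) { /* frequency-domain work */ }

        std::vector<Float32> mInput;                  // samples awaiting a full block
        std::deque<Float32>  mOutput;                 // processed samples awaiting output
    };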

Appendix: References

  1. Audio Unit SDK Documentation. (This is a local link and only works on a Mac with the developer tools installed.)
  2. CoreAudio-API mailing list. A useful source of information.
  3. CoreAudio Wiki at UCSB. A collaborative web site with comments on several topics relating to Audio Units.
  4. OSXAudio.com. Check the “Developer forum” for some helpful tips.
  5. Urs Heckmann’s CAUGUI toolkit, for custom interfaces.
  6. Airy André’s AUGUI framework, also for custom interfaces.