All About Fragments

In my on-going series Writing Better Audio Applications for Linux, here's another installment: a little explanation how fragments/periods and buffer sizes should be chosen when doing audio playback with traditional audio APIs such as ALSA and OSS. This originates from some emails I exchanged with the Ekiga folks. In the last weeks I kept copying this explanation to various other folks. I guess it would make sense to post this on my blog here too to reach a wider audience. So here it is, mostly unedited:

Yes. You shouldn't misuse the fragments logic of sound devices. It's
like this:

   The latency is defined by the buffer size.
   The wakeup interval is defined by the fragment size.

The buffer fill level will oscillate between 'full buffer' and 'full
buffer minus 1x fragment size minus OS scheduling latency'. Setting
smaller fragment sizes will increase the CPU load and decrease battery
time since you force the CPU to wake up more often. OTOH it increases
drop out safety, since you fill up playback buffer earlier. Choosing
the fragment size is hence something which you should do balancing out
your needs between power consumption and drop-out safety. With modern
processors and a good OS scheduler like the Linux one setting the
fragment size to anything other than half the buffer size does not
make much sense.

Your [Ekiga's ptlib driver that is] ALSA output is configured
to set the the fragment size to the size of your codec audio
frames. And that's a bad idea. Because the codec frame size has not
been chosen based on power consumption or drop-out safety
reasoning. It has been chosen by the codec designers based on
different reasoning, such as latency.

You probably configured your backend this ways because the ALSA
library docs say that it is recommended to write to the sound card in
multiples of the fragment size. However deducing from this that you
hence should configure the fragment size to the codec frame size is
wrong!

The best way to implement playback these days for ALSA is to write as
much as snd_pcm_avail() tells you to each time you wake up due to
POLLOUT on the sound card. If that is not a multiple of your codec
frame size then you need to buffer the the remainder of the decoded
data yourself in system memory.

The ALSA fragment size you should normally set as large as possible
given your latency constraints but that you have at least two
fragments in your buffer size.

I hope this explains a bit how frag_size/buffer_size should be
chosen. If you have questions, just ask.

(Oh, ALSA uses the term 'period' for what I call 'fragment'
above. It's synonymous)

Category: projects