レナート   TBFKAYIBYNYAAYB   ﻟﻴﻨﺎﺭﺕ

Fri, 16 Jun 2006

TPFKAPA: The Project Formerly Known as Polypaudio

It came to our attention that some people really disliked the name of Polypaudio, because it reminded them of that medical condition, though the software was actually named after the sea dweller. I actually liked that double entendre, but many did not and expressed concerns that the name would hinder Polypaudio's adoption. After a long discussion on #polypaudio we came to the conclusion that a name change is a good idea in this case. Name changes are usually a bad idea, but this time it's worth it, we think.

The new name we agreed on is PulseAudio, or shorter just Pulse. It has the nice advantage that it abbreviates to pa, just as Polypaudio did. This allows us to keep source code compatibility (and binary compatibility to a certain degree) with the current releases of Polypaudio, because the symbol prefix can stay pa_. In addition the auxiliary tools paman, pavucontrol, pavumeter need not be renamed.

We will try to make the transition as smooth as possible and would like to apologize to all the packagers, who need to rename their packages now.

The next release of Polypaudio (0.9.2) will be a bugfix release and be the first to bear the new name: PulseAudio 0.9.2.

Polypaudio is dead. Long live PulseAudio!

posted at: 18:22 | path: /projects | permanent link to this entry | 68 comments


Posted by Alan H at Sun Aug 27 05:11:57 2006
Suggestions:

Pulse audio preferences. 
Avoid acronyms like LAN.  Break it down: Local Area Network (LAN).  Network is the keyword. 

"Allow other machines on the Network to browse for local sound devices"
At a glance this setting looks very similar to the setting just above it.  Maybe there is a good reason for it but if it is not obvious at a glance then odds are it could be made easier and clearer. 

The Gnome Human Interface Guidelines recommend "Don't be negative" ;) and I recommend "Do not abbreviate" which would change the next option to
"Require Authentication" and then you set the default value of the checkbox as needed. 

More acronyms in the dialog, avoid them if at all possible.  Acronyms make intelligent people look stupid, just ask a lawyer/doctor/accountant to throw some of their acronyms at you and prepare to be baffled. 

The Pulse Audio Volume control uses mnemonics on the tab labels.  The Gnome HIG recommends against doing this, part of the reason being that it is far too easy to end up with conflicting mnemonics. 
Requiring right click is bad for accessibility, and the fact that you have to label it so clearly at the bottom shows you already know how bad it can be for discoverability. 

From the larger screenshot showing various dialogs I see the sink dialog has bold text which is centre aligned.  The Gnome HIG recommends you right align the text, it all fits with how English (and other languages) is written from left to right and how the reader can most easily follow it. 

And one final suggestion, and I understand you will probably disagree, but only German allows you to put capitals in the middle of words, so you should definitely put a space between Pulse and Audio. 

Looks like great software, thanks for writing about it.

Posted by John at Sun Aug 27 05:38:16 2006
This is amazing news. Thanks for the PerfectSetup wiki page also.

I can't wait to see this as the default for GNOME!

Posted by MattW at Sun Aug 27 11:08:45 2006
Impressive. Very impressive. Now if only everything used GStreamer...

Posted by Peteris Krisjanis at Sun Aug 27 12:13:00 2006
I am just confused: what is the reason for doing this separately, in a special sound server, rather than in GNOME Mixer, adding the missing pieces and fixing bugs there? It just adds complexity to the whole system and drains developer resources, in my humble opinion. However, it is nice that someone is at least doing something on this front, so I do not downplay this effort at all. I just want to know whether the devs are really sure that it won't introduce a lot of additional bugs and, if it does, how to deal with that.

Would just like to know, that's all :)

Peter.

Posted by Lennart at Sun Aug 27 15:16:45 2006
MattW: Why do you ask for everything to use GStreamer? PulseAudio is in no way specific to GStreamer. We have a plugin for ALSA's libasound library which allows you to access PulseAudio through the ALSA libraries much like a normal hardware sound card. In addition we support the OSS API very well. You should be able to run about 90% of all Linux sound applications on top of PulseAudio.

Posted by Lennart at Sun Aug 27 15:21:59 2006
Peteris: GNOME Mixer? Do you mean GSmartMix? PulseAudio is not really comparable with GSmartMix. In addition PulseAudio predates GSmartMix. And eventually PulseAudio will offer some way of integration into GSmartMix allowing volume control of PA with the GSmartMix tools.

The old sound server used by GNOME ("Esound") shows its age. PulseAudio tries to be its successor, one that does right everything that Esound did wrong.

Posted by Lennart at Sun Aug 27 15:26:34 2006
Alan: thanks for the advice. BTW: German doesn't allow you to put capitals in the middle of words any more than English does. However it's used that way in ads.

Posted by Marius Gedminas at Sun Aug 27 15:41:05 2006
Wow.  I can't wait until PulseAudio makes it into my distro.  Switching output from my local speakers to those connected to a different machine on my LAN in the middle of playback?  Wow.

Can PulseAudio play the same stream on multiple sinks (say, two different machines on the LAN) with proper time synchronization?

Posted by Lennart at Sun Aug 27 15:47:09 2006
Marius: Yes it already can do that for you to a certain degree. However this still leaves some room for improvements.

Posted by Richard Palmer at Wed Nov 15 14:52:34 2006
I am relatively new to Linux and currently enamoured of Zenwalk. Research suggests that PulseAudio will enable me to seamlessly integrate RealPlayer, Skype, Gzine etc.

As a former Win XP user I got used to the convenience of being able to switch between running sound applications without difficulty.


Best regards

Richard Palmer

Posted by Tony McNamara at Thu Dec 6 05:41:39 2007
thank you so much for PulseAudio, I use it in Ubuntu Gutsy. All that remains is to get RealPlayer to use PulseAudio. So far I have mplayer,xine (uses xine alsa-plugin), FlightGear flight simulator, vlc, StreamTuner using Audacious, and several more - yes I am a junkie for getting things to work! But really it is so good to not have the sound card blocked!
Thanks again,
Tony

Posted by Paul Davis at Tue Apr 8 22:36:03 2008
Congratulations on redesigning the internals of CoreAudio :) Apple has at least a couple of papers on the subject. It's an excellent design, much better for USB and ieee1394 devices than the one inspired by PCI-style devices.

Although putting this in PA doesn't strike me as a bad idea at all, to be honest I think your effort would be much better applied to getting ALSA to switch over to this model (which is where it can have the most benefit).

2 technical points. (a) this design adds latency to the output path (because of the need for a gap between the h/w and s/w ptrs; apple calls it the "safety buffer"), which is not always a desirable thing, and therefore can be a step backwards for some use cases (b) you've made an assumption that every h/w device can accurately and reliably tell you where its h/w pointer is. this isn't true, and it's one reason why interrupts are helpful.

Posted by Philip Withnall at Tue Apr 8 22:46:17 2008
Absolutely brilliant work!

Posted by Lennart at Tue Apr 8 23:01:23 2008
Paul: Care to post the links to those Apple papers? I think I read most of them, but I am always interested in more fodder.

I doubt it makes sense to move ALSA over to this model, because it would make ALSA a complex userspace daemon. And quite honestly I don't see why we would need to do this, since PA already is just that.

Regarding your points:

a) Sure it adds latency. However, according to my measurements my code that estimates the deviation of the sound card timer is accurate to a subsample precision (on my HDA PCI hardware that is). Since you should stay away a few samples from the hw read ptr anyway this should be more than good enough. I only tested this on one piece of hw however, and I am not sure what exactly caused Apple to add that extra safety, though.

Also, while I try to offer very low latencies with PA I am not aiming for the low-latency crown. I had to make a few compromises (like not being able to synchronously wait for each client to provide me with audio data on each iteration, which adds a bit of buffering latency) which will certainly increase the latency, for robustness, security and network-transparency reasons.

b) Yep, the interrupt issues are real. The idea is to use the timestamp data from the ALSA timing structure to correlate the sound card clock with the system clock. Unfortunately this doesn't work properly yet, since ALSA lacks support for CLOCK_MONOTONIC. The scheme would then be this: for soundcards which do support sample-granularity for querying the playback time, use it and disable interrupts. For all others set NFRAGS=1 and use the ALSA timestamp. I talked to a few ALSA people about this, but this needs more love.

I am not even sure if this glitch-free model makes any sense for pro-audio stuff at all. In pro-audio you need the most exact timing info you can get and the lowest latency you can get, your latency requirements are well known in advance, the user is very technically skilled, you don't care about power consumption and so on. So most of the reasons to use this model don't matter. Right now I am tempted to say that software like JACK should better stay away from this model.

Posted by Stoffe at Wed Apr 9 00:32:58 2008
Fantastic work. What you do for free sound is purely amazing, and there are many more of us cheering you on than wanting you to get off the lawn. :)

Keep it up!

Posted by Paul Davis at Wed Apr 9 02:10:28 2008
I'll try to dig up the Apple refs I have, but I suspect you've read them.

I'm a bit tired to think much right now, but I will note regarding your final point that JACK runs on top of this model on CoreAudio and works extremely well there. I am not sure why you think this has to be done in user-space - CoreAudio does it all in the kernel. I wasn't advocating a user-space daemon in ALSA - I was suggesting (and have been mumbling about it for a year or three) that ALSA should abandon its current interrupt-driven model for the "timer-based" model (in which interrupts are used to drive a DLL that lets you sync with the h/w sample clock). What happens in user-space on top of that is irrelevant (though of course, interesting :)

Great work by the way.

Posted by Paul Davis at Wed Apr 9 02:16:06 2008
Lennart, actually let me just outline my view of how this model works. It's not meant to be a replacement for what you've explained, but it might (just might) shed a different light on the whole system in a way that could be useful. Or not :)

There are two clocks in the system. One of them is a system clock (no matter what its actual h/w origins). The other is the clock that drives audio to/from the interface. If you stop using interrupts as the driving force, the problem of audio i/o reduces to one of knowing the relative position of the s/w pointer and the h/w pointer. If you can establish the correct relationship between the system clock and the audio clock, this problem is essentially solved. Doing that can be done (relatively easily too, and with amazing accuracy) using a DLL that is driven by the interrupts. At each interrupt, you determine the h/w pointer position and the system time, and input the deltas since last time into the DLL. You end up with massively sub-sample accurate prediction of the h/w pointer based on the system time, and if the s/w pointer is driven by system timer events, the issues are solved.

You can do this in the kernel or in user-space.
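
The loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: the class name `ClockDLL`, the 0.5 Hz loop bandwidth, and the rule that every interrupt arrives exactly one period of frames after the previous one are all made up for the example, and none of this is code from ALSA, CoreAudio or PulseAudio.

```python
import math


class ClockDLL:
    """Second-order delay-locked loop relating system time to the audio clock.

    Hypothetical helper for illustration only. The gains give a damping
    factor of about 0.7 for the chosen loop bandwidth.
    """

    def __init__(self, nominal_rate, period_frames, bandwidth_hz=0.5):
        self.period_frames = period_frames
        self.period = period_frames / nominal_rate  # filtered period estimate (s)
        omega = 2.0 * math.pi * bandwidth_hz * self.period
        self.b = math.sqrt(2.0) * omega  # proportional gain
        self.c = omega * omega           # integral gain
        self.t0 = None     # filtered time of the latest interrupt
        self.t1 = None     # predicted time of the next interrupt
        self.frames0 = 0   # h/w pointer (in frames) at t0

    def on_interrupt(self, system_time, hw_frames):
        """Feed one interrupt: the system time and the h/w pointer read at it."""
        if self.t0 is None:
            self.t0 = system_time
            self.t1 = system_time + self.period
        else:
            err = system_time - self.t1  # how late (or early) the interrupt was
            self.t0 = self.t1
            self.t1 += self.b * err + self.period
            self.period += self.c * err
        self.frames0 = hw_frames

    def estimated_rate(self):
        """Audio frames per second of system time, as seen by the loop."""
        return self.period_frames / self.period

    def predict_hw_pointer(self, system_time):
        """Extrapolate the h/w pointer to an arbitrary system time."""
        if self.t0 is None:
            raise RuntimeError("no interrupt seen yet")
        return self.frames0 + (system_time - self.t0) * self.estimated_rate()
```

Between interrupts, a timer-driven writer would call `predict_hw_pointer()` with the current system time instead of waiting for the hardware to announce its position.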

Posted by Xav at Wed Apr 9 12:16:20 2008
"we configure a system timer to wake us up 10ms before the buffer would run empty and fill it up again then. If the overall latency is configured to less than 10ms we wakeup after half the latency requested."

Curious, I would have set the wakeup to half the requested latency when it reaches 20ms (not 10ms), just to have a continuity in the wakeup size:
- if you request 22ms, you have 10ms
- if you request 20ms, you have 10ms
- if you request 18ms, you have 9ms
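
The two rules being compared can be written down directly. The function names are invented for this sketch; the 10 ms figure is the one quoted from the post, and "margin" here means how long before the buffer runs empty the timer fires.

```python
def margin_original(latency_ms, threshold_ms=10.0):
    """Rule quoted from the post: a fixed 10 ms margin, or half the
    requested latency once the latency drops below 10 ms."""
    if latency_ms < threshold_ms:
        return latency_ms / 2.0
    return threshold_ms


def margin_proposed(latency_ms, threshold_ms=10.0):
    """Xav's variant: switch to half the latency already at 20 ms,
    so the margin never jumps."""
    return min(threshold_ms, latency_ms / 2.0)
```

With the quoted rule the margin drops from 10 ms to just under 5 ms as the requested latency crosses 10 ms; with the proposed rule it shrinks smoothly from 10 ms downwards starting at a requested latency of 20 ms.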

Posted by Thomas at Wed Apr 9 13:20:40 2008
When remixing, it should be sufficient to have the previous mix result (which is in the buffer?) and not all previous inputs. Mixing is scaling and adding, and the order of adding should not matter.

Mixing in the new sound could mean a reduction of the other sound levels. If the original and new sound levels are known, this can at least be approximated, unless only a single original signal was supposed to be muted.

Good work!

Posted by Lennart at Wed Apr 9 13:50:59 2008
Thomas, no, this won't work since we do saturated integer mixing. Mixing is thus generally not invertible.
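
A small demonstration of why saturation breaks invertibility, assuming signed 16-bit samples (the sample format and the helper names are assumptions made for the example, not taken from the PulseAudio source):

```python
INT16_MIN, INT16_MAX = -32768, 32767


def mix_saturated(a, b):
    """Add two streams sample by sample and clamp to the 16-bit range."""
    return [max(INT16_MIN, min(INT16_MAX, x + y)) for x, y in zip(a, b)]


def unmix(mixed, b):
    """Naive attempt to recover the other stream by subtracting b again."""
    return [m - y for m, y in zip(mixed, b)]


a = [30000, 100]
b = [10000, 200]
mixed = mix_saturated(a, b)   # first sample clips at 32767
recovered = unmix(mixed, b)   # first sample no longer matches a
```

Wherever the sum clipped, the information about how far it exceeded the range is gone, so subtracting one input does not give back the other.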

Posted by Lennart at Wed Apr 9 13:55:03 2008
Paul, what you suggest is basically what glitch-free PA does now. pcm_status->tstamp and pcm_status->delay (in conjunction with a user maintained sample counter) contain the necessary information to establish the correct relationship between the system clock and the audio clock.
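
One way to picture how those three values combine: each status snapshot yields a pair (system time, frames actually played), where frames played is the user-maintained write counter minus the reported delay. A straight-line fit through such pairs then gives the audio clock rate in terms of the system clock. The sketch below is an assumption about how this could be done, with invented function names and a plain least-squares fit; it makes no ALSA calls and is not a description of PulseAudio's actual estimator.

```python
def played_frames(frames_written, delay_frames):
    """Frames that have left the device: what we wrote minus what is still queued."""
    return frames_written - delay_frames


def fit_clock(snapshots):
    """Least-squares line through (system_time_s, played_frames) pairs.

    Returns (rate, offset) such that played ~= rate * system_time + offset.
    """
    n = len(snapshots)
    if n < 2:
        raise ValueError("need at least two snapshots")
    mean_t = sum(t for t, _ in snapshots) / n
    mean_f = sum(f for _, f in snapshots) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in snapshots)
    if var_t == 0:
        raise ValueError("snapshots must span more than one timestamp")
    cov = sum((t - mean_t) * (f - mean_f) for t, f in snapshots)
    rate = cov / var_t
    return rate, mean_f - rate * mean_t
```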

Posted by Paul Davis at Wed Apr 9 15:41:08 2008
Lennart, yes, they sort of do. What is sad, and what I would love to see changed, is that ALSA could establish the clock-to-clock relationship itself, thus allowing any app to ask "where is the h/w pointer now?" and get an at-least sample accurate answer at any time. This would remove quite a bit of complexity from user space (let's be honest, as great as PA is, we're going to continue to see at least 2 or 4 audio APIs living on for some time), and would also solve a fundamental problem with USB and ieee1394 devices where the interrupt interval is not slaved to the audio sample clock.

Posted by Joe Henley at Wed Apr 9 20:33:50 2008
Will it be implemented as poorly as PA was in FC8?

Posted by Lennart at Wed Apr 9 20:40:33 2008
Troll Henley: No, of course, even poorer. What did you expect? Find some other place to troll! Oh, and of course you filed BZ reports for all issues you found, right?

Posted by William Lovaton at Wed Apr 9 22:17:05 2008
Lennart, you are truly a hero!  Congratulations.

I was wondering, isn't Rawhide very close to a final release? Are you sure this is Fedora 9 material? If yes, then great! But I'd like to know your point of view about this.

Another question, right now I'm using totem to listen to my music on my up to date Fedora 8 system and powertop shows interrupts from:
- HDA Intel (43 wps)
- totem : schedule_timeout (process_timeout) (21)
- totem : do_nanosleep (hrtimer_wakeup) (20)
- totem : futex_wait (hrtimer_wakeup) (15)

PA is well down with 0.5 -> wps pulseaudio : schedule_timeout (process_timeout).

Reading your post I understand that most of the totem's interrupts will be reduced by a great extent.  Am I wrong?  Do you have numbers?

Cheers and keep up the good work!

Posted by seringen at Thu Apr 10 07:14:07 2008
hi, i was wondering how portable pulseaudio is for non linux use, or even for oss4 use on linux? i'd be interested on your take about that.

Posted by ethana2 at Thu Apr 10 09:40:36 2008
Keep up the awesome work, guys!
~that is all.

Posted by Lennart at Thu Apr 10 14:28:05 2008
seringen: Older versions of PA have been ported to non-Linux systems (the BSDs, Solaris, Windows). The current code has not, although a lot of glue code is in place and the code should generally be friendly to porters. Patches welcome.

OSS is supported as backend. However, OSS4 tends to be less compatible with the established OSS API ("3") than ALSA. (Yes, this is not a joke) So running PA on OSS should generally work, but YMMV.

Also, I consider OSS to be more like a zombie. Dead but still coming back to haunt people. It would be great if it finally died a silent death. But I guess due to intensified support from the Solaris camp it won't be so easy. I do think that ALSA is by far the more capable system, and while it has issues it still is not as fucked up as OSS, not by a far margin. (And I say that as someone who knows both APIs very, very well on a technical level, and is not a lame fanboy with no clue)

Anyway, I am a Linux developer, paid to bring Linux forward. I only care about ALSA. Basic OSS support is there. It's not as fancy as the ALSA code, i.e. can't do glitch-free and stuff. If anyone wants to see support for this in PA, then he is welcome to contribute. But for myself I don't see any reason why I should invest more time on this than the most basic housekeeping.

Posted by David at Thu Apr 10 18:28:13 2008
Lennart - Thanks so much for all the fantastic work!  I'm thankful not only for the heaps of software, but also for the great posts explaining the internals.  Thanks again!

Posted by Aster at Fri Apr 11 01:02:48 2008
Lennart, maybe you should use a temporary buffer with no saturation (32 bit/sample to avoid overflows)? This way you can do O(1) remixing when a client rewrites its data. Of course you have to convert the buffer afterwards, but you can use a more clever dynamic range compression than simple saturation.

Posted by Donald Wallace Rouse II at Sat Apr 12 04:03:45 2008
I agree with Aster; for each channel, use a 32-bit/sample simple additive mixing buffer, plus an unsigned-8-bit/sample number-of-samples buffer.
Scale each 32-bit sample by number-of-samples and the channel volume (if any).
Mixing in a new sound will be much easier.
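
A sketch of the wide-accumulator idea Aster and Donald describe, under some assumptions: 16-bit output samples, an accumulator that never clips (Python integers are unbounded, standing in for the 32-bit buffer), and invented names throughout. Donald's per-sample stream counter and the scaling step are left out.

```python
INT16_MIN, INT16_MAX = -32768, 32767


class WideMixBuffer:
    """Keeps the unclipped sum of all streams so one stream can be swapped out."""

    def __init__(self, length):
        self.acc = [0] * length  # wide accumulator, no saturation applied here

    def add(self, samples):
        """Mix a stream into the accumulator."""
        for i, s in enumerate(samples):
            self.acc[i] += s

    def replace(self, old, new):
        """Swap one stream's data: subtract what it contributed, add the rewrite."""
        for i, (o, n) in enumerate(zip(old, new)):
            self.acc[i] += n - o

    def render(self):
        """Saturate only when producing the samples handed to the device."""
        return [max(INT16_MIN, min(INT16_MAX, v)) for v in self.acc]
```

Note that `replace()` still needs the old data of the rewritten stream, and the accumulator itself is wider than the output buffer.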

Posted by Karl Zollner at Sat Apr 12 13:34:25 2008
Part 1:
Lennart, this is the third time I have read disparaging comments from you about OSS. I know you work at Redhat and are working specifically for Linux-and ALSA has won in the Linux camp. Yet I wonder what your solution is to the problems of cross-platform free software audio.

I can understand your resentment for OSS3-that API and its use by proprietary apps for Linux has caused lots and lots of problems in the past. But the situation seems to have changed. I can also understand a desire to have a single low-level audio driver system-this simplifies things for you. Yet OSS4 is being adopted by a number of different platforms, in spite of your lambasting it, and there are probably some thousands of Linux users using it right now.

ALSA has its own problems. It is seriously under-manned. It is an absolute usability nightmare-god forbid a mere mortal must ever edit asoundrc. The labeling of mixer elements and the non-deterministic nature of many of its controls make it extremely painful to even use the mixer. Its documentation is among the worst of all free software documentation: where there is documentation it is outdated, incorrect, incomplete or simply misleading. And ALSA is simply not getting the kind of love and attention it would need to begin to fix these problems. ALSA cannot and will not ever be ported to any other platforms. Moreover the lack of love for ALSA plays no small part in ALSA failing to provide things that would help Pulseaudio; in fact, if ALSA had proper community support, a lot of what you are doing in Pulseaudio could/should be done in ALSA proper, i.e. in kernelspace.

Most of the free software written today can and will be used on platforms other than Linux. While I do not believe that any of the other free software platforms will ever be as popular as Linux this does not mean that those who write applications which manipulate audio can simply ignore these other platforms. OSS4 is poised to be the solution for some of these platforms and as long as OSS provides better support for some hardware and any support for ALSA unsupported hardware there will be Linux users using OSS.

Now, admittedly, it is not your responsibility to provide good OSS support for Pulseaudio-those who develop OSS should be doing this work. Yet the interesting work you are doing on Pulseaudio does not appear to be portable, which is one of the things which leads me to believe that your probing of the bounds of that which can be done in userspace could probably be done better in kernelspace. I have seen your attempts to sway the GNOME community to embrace Pulseaudio. After having not received support for Pulseaudio by the GNOME and KDE communities Pulseaudio is now being embraced by the distributions(fedora, mandriva, ubuntu etc.). Congratulations for this victory! Perhaps libsydney will end up being the portable portion of Pulseaudio that does get adoption by the GNOME and KDE communities.

I do not want to see Linux being held back by concerns about cross-platform availability, this applies not only to Pulseaudio but also projects like FUSE and HAL. But as promising as Pulseaudio currently is the outlook still seems awfully bleak. Every time I start to think I see light at the end of this tunnel, this tunnel seems to start stretching out into infinity.

What is it going to take to get a rallying call around freedesktop(*cross desktop/cross platform) audio ? What is it going to take to get ALSA the needed love and attention to make it truly viable ? Right now the freedesktop audio community has virtually 0 support from the audio manufacturers. Right now 3rd party application writers(proprietary) do not have a platform-neutral API against which they can program(with perhaps the exception of gstreamer). What is it going to take to get freedesktop audio to a point that it can demand support from the manufacturers ? that it can dictate to 3rd parties which API's to use(ie. if you want Flash to properly integrate you must use this API) ?

continued...

Posted by Karl Zollner at Sat Apr 12 13:35:18 2008
Part2:
If we look over our shoulder at Xorg what we see is truly amazing. Xorg has a vibrant community which has undergone a true revolution in the last 5 years. The manufacturers of graphic cards are working with us now. Opengl is being adapted to our needs. This is empowerment.

If we look over our shoulders at ALSA what we see is nigh on outright depressing. The glaciers are moving at a far faster rate. There is no community communication between ALSA and anyone else. Not one blog. No forums. No communication. ALSA is at its best when no one even notices it exists-for mere mortals are confronted with their own mortality by merely using and configuring it.

If we look over our shoulders at OSS we see new free software, which is cross-platform, easily configurable, and not user hostile. We see a new API which addresses the egregious mistakes of previous OSS APIs and which is being adopted by other platforms. We see in some cases better support for the same hardware that is available under ALSA and we see cases where some hardware is supported under OSS and not under ALSA. 

My concerns are easy to understand. I have been fighting with Linux audio and been left defeated more times than I can count for more than 10 years now. Because some of the apps I use were written for OSS3 I was forced to install a second audio card-and with 2 cards I still cannot do what I easily can do with 1 card under Windows. I have such an interest in Pulseaudio because it promises to solve some of the nightmares I have been fighting with-but it remains, for me, a promise.

We need the manufacturers to open up documentation for audio hardware like what has happened in the graphics community. We, as a community, need to be able to bring something to the bargaining table to get the manufacturers to work with us. We need Adobe, Skype and others to use the APIs that are created by our communities. My fear is that we are facing another wave of balkanization-where Linux has ALSA, and the rest has OSS, and we are left with no bargaining position-because hardware manufacturers would be forced to support ALSA and OSS-and in all likelihood would simply choose OSS because it is already cross-platform-and 3rd party app writers must support ALSA and OSS-and probably should just support OSS because it is already cross-platform.

Given this scenario it is imperative that there is a community supported cross platform alternative (is this gstreamer? should it be Pulseaudio ? perhaps libsydney ?) Perhaps Pulseaudio should just abandon any attempts at being cross-platform, allowing libsydney to fill this role, and actually try to tie itself even more closely with ALSA-and in so doing infuse ALSA with enough creativity, talent and life to raise it from its current zombie-like status. Could ALSA incorporate some of your work directly ? Could Pulseaudio become the userland counterpart to ALSA and free us from the god-awful ALSA configuration and tools ? Is there absolutely NOTHING that can be learned and adopted in either ALSA or Pulseaudio from OSS4 ? Is there any hope of any freedesktop activity to actually sort these issues out ?

enough of my rambling, sorry for writing so much.

Posted by Wout Mertens at Sat Apr 12 16:09:26 2008
I too am curious why you don't just use a big pre-mixed sample buffer as Aster and Donald suggest.

Even at 96 kHz that's less than 1MB/channel, assuming you use 24bit precision per stream with 256 max voices. The buffer would already be resampled from the stream bitrate to the audio card bitrate, which is the most expensive part, no?
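
The arithmetic behind that estimate can be checked quickly. The comment does not say how much audio the buffer would hold; the 2 seconds used below is purely an assumption for the sake of the calculation, and the function names are invented.

```python
import math


def accumulator_bits(sample_bits, max_streams):
    """Bits needed to sum max_streams samples of sample_bits each without overflow."""
    return sample_bits + math.ceil(math.log2(max_streams))


def buffer_bytes(rate_hz, seconds, bits_per_sample):
    """Size of a one-channel buffer holding `seconds` of audio."""
    return rate_hz * seconds * bits_per_sample // 8


bits = accumulator_bits(24, 256)      # 24 bits plus 8 bits of headroom
size = buffer_bytes(96000, 2, bits)   # assumed 2 s of audio per channel
```

So 24-bit streams and 256 voices fit a 32-bit accumulator, and at 96 kHz such a buffer costs 384000 bytes per second per channel.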

I'm armchair programming here; I'm just curious why such a method would not work...

In any case, great work! Thanks for taking the time to explain the technicalities.

Posted by Lennart at Sat Apr 12 16:44:38 2008
Karl: There are a few misconceptions in what you are saying:

- You claim that the situation generally changed in regards to the mess that is OSS3 to OSS4. I would claim the contrary. There are some things fundamentally flawed in OSS, among other things this is the fact that it tries to be a portable kernel interface. This makes it inherently difficult to virtualize. The approach is just wrong, wrong, wrong.

- I am not sure why you came to the conclusion that ALSA was undermanned while OSS was not. AFAIK there's just Hannu on OSS, while there are at least Takashi and Jaroslav looking after ALSA full-time on the payroll of Novell resp. Red Hat, in addition to James and Clemens, and relatively strong community support.

- The ALSA API has been ported to other platforms. There are even plugins for libasound that use OSS as a backend, they are shipped as part of alsa-plugins.

- ALSA still provides what I need much better than OSS does. The ALSA people are aware of the problems I have to deal with and have already made good inroads into fixing those issues. I have listed my issues on http://pulseaudio.org/wiki/WhyIHateALSA and it's getting better.

- The things PA does should not happen in kernel space. The kernel people made that clear before, and everyone who has a clue acknowledges that. The fact that OSS4 does mixing in kernel space is another one of these inherently wrong approaches, again caused by the mistaken goal of defining a portable kernel interface. If your API is ioctl() based you are forced to do mixing in kernel space. And that's just evil. No mixing in kernel space, please. (And you know, the thought about mixing FP in kernel is just frightening, since kernel space is territory where FP is forbidden) Believe me, doing this mixing in kernel space is wrong, really really wrong.

- The "interesting" part of PA (as you call it) is not inherently unportable. Given a powerful driver interface that implements the right number of basic operations this is implementable on non-ALSA backends too.

- I think you didn't really get the story right behind PA and GNOME.

- libsydney is intended to be the cross-platform API you are asking for. And nothing stops people from porting PA to non-Linux systems again.

Posted by Lennart at Sat Apr 12 16:45:15 2008
Karl here's my second part:

- Most (consumer) audio HW these days is based on HDA which is very well documented. The driver situation is much much better than with 3D, where running the newest hardware is far more troublesome than with ALSA.

- Communication between ALSA and me is not as bad as you might claim. There's also a mailing list where Jaroslav and Takashi and the others post regularly. Sure they don't maintain "forums" or blogs, but I am not sure if that's really what makes a software project a good software project. Handling user support requests takes up a lot of time, and is quite frankly not what Takashi, Jaroslav or I are being paid for. The fact that OSS4 apparently manages to keep a forum around is probably more due to the fact that its userbase is much much smaller than ALSA's. Also note that there is an #alsa irc channel where people answer user questions.

- Again, the biggest problems of OSS3 are not addressed at all in OSS4.

- Documentation for audio HW is pretty much available. Certainly much better than for video HW.

- Adobe, Skype and others are using the ALSA APIs these days.

- Quite frankly I don't care too much about the other Unixes. And I don't think anyone should really care. I will not make my code inherently unportable, and I will happily take portability patches. But I won't invest time in portability for OSes that only the tiniest fraction of people actually use.

- There's no need to configure ALSA in any way these days. The default config should be fine for almost everyone. If you however want some fancy setups with channel routing and so on then ALSA lets you do this. And for that it offers you a complex config language. But you shouldn't be complaining about that. OSS4 doesn't allow you to do things like that at all.

- Again, libsydney is intended to be a cross-platform API. The relevant platforms for that API are Linux, Windows and MacOSX. It will run on top of PA, of ALSA, on CoreAudio and DirectSound.

- ALSA is no zombie. It's a very lively project working closely with the kernel community.

I think you have a bit of an unfounded hatred against ALSA. I don't know why? ALSA has problems, sure, every software does. But it ain't any worse than anything else. And certainly not worse than OSS4. Au contraire, mon ami!

Posted by Lennart at Sat Apr 12 17:10:21 2008
Donald, Aster, Wouter:

Inverting the mixing for one specific stream would not have any positive impact on memory consumption: I'd still have to keep a copy of the recent past of every single stream around, so that I can subtract it from the mix buffer, before adding the new data.

As said, mixing is generally not invertible, due to saturation (yes, and doing dynamic range compression when mixing is on the TODO list and won't help here either). And of course for FP samples it will add numeric noise.

To work around the fact that we generally cannot invert the mixed buffer we'd have to keep around a copy of the mixed buffer in a higher resolution. And this copy of course takes up more memory due to the increased sample width. Cache pressure will also be much bigger, since we have to store away the high-precision sample we are summing into instead of just dropping it after we did the saturation.

So, what you propose has a negative effect on memory consumption and cache pressure in the general case when we do not have to remix.

It does -- however -- have a good effect on CPU time if we have at least four streams to remix. If M is the mixed buffer; A,B,C the data of the first three streams; D1 being old data of the fourth stream, D2 being new data of the fourth stream: Then first we'd have to calculate M as A+B+C+D1. If D is then rewritten, we'd have to calculate for my algorithm: A+B+C+D2. In your algorithm it would be M-D1+D2. The latter calculation has a smaller cache pressure and one operation less. However, this is only the case if we have four streams. If we have fewer than that your alg is worse. Now, think of how many streams you usually play back at the same time on your desktop. You'll notice that usually you play back only one, maybe 2 when there is an event sound, and in the worst case 3. Optimizing for four streams and more is thus not really a good idea I would say. Also, rewinding is not the common case.

Also, let's not forget that redoing the same calculation in case of rewinding requires much simpler code than implementing a different scheme when rewinding.

Posted by Wout Mertens at Sun Apr 13 10:10:26 2008
Hi Lennart,

I completely forgot about the old data you need to subtract. I now see the error of my ways :-) Thanks for explaining.

Posted by Aster at Sat Apr 19 19:51:48 2008
Lennart, you're right that many simultaneous samples (>=4) is not a common case. And yes, rewinding also is not a common case. But I don't care if it is common or not - PulseAudio should not suck in such "rare" cases (I hope you agree). However, after giving it more thought, I realised that you would want to rewind only a small number of samples (<1s I guess). So my optimization may not be necessary.

One idea to think about is to dynamically choose the algorithm based on the number of streams. If you have fewer than, let's say, 8, you use your algorithm (remixing from scratch). If you have more than 8 streams, you store the 32bit/sample mix, so that you can remix it later in a bounded time (independent of the number of streams, which could go really high for some pro applications or games).

One more thing: In my opinion saturation is NOT ACCEPTABLE. PulseAudio should have some high quality dynamic range compression when it gets too loud (to protect your ears or speakers, for example).

Posted by Aster at Sat Apr 19 19:58:06 2008
About memory consumption: applications could specify that they will not rewind. Using my algorithm, you wouldn't have to store their audio for remixing at all. You would never have to subtract the old sound, so why store it?

What about an API (I'm not familiar with pulse audio, maybe it already has this), which would allow an application to specify an upper bound for rewinding? You would have to store a history of only this many samples.
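
A rough sketch of what such a bound could look like, assuming a per-stream history capped at a declared number of samples. The class and method names are hypothetical and do not correspond to any existing PulseAudio API.

```python
from collections import deque


class StreamHistory:
    """Keep only the most recent max_rewind samples of one stream."""

    def __init__(self, max_rewind):
        # A deque with maxlen silently drops the oldest samples when full.
        self.samples = deque(maxlen=max_rewind)

    def write(self, new_samples):
        """Record samples that have been handed to the mixer."""
        self.samples.extend(new_samples)

    def rewindable(self):
        """Number of samples that could still be subtracted from the mix."""
        return len(self.samples)

    def last(self, n):
        """Return the n most recent samples, oldest first."""
        if n > len(self.samples):
            raise ValueError("rewind exceeds the stored history")
        return list(self.samples)[len(self.samples) - n:]


history = StreamHistory(max_rewind=4)
history.write([1, 2, 3])
history.write([4, 5, 6])
print(history.rewindable(), history.last(2))

no_rewind = StreamHistory(max_rewind=0)  # a stream that promises never to rewind
no_rewind.write([1, 2, 3])
print(no_rewind.rewindable())
```

A stream that declares a bound of zero stores nothing, which is the memory saving described above.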

Posted by gordboy at Sat May 17 15:53:37 2008
Having a cross-platform audio system is something people have been crying out for, for years. However, Pulseaudio is one of the most amateur, crass audio projects I have ever seen.

To even HAVE a "glitch-free" branch, speaks volumes about the toytown nature of the project. Call this trollery if you like, but I'm a serious sound programmer, not some fly-by-night amateur muppet.

How you ever got any exposure in the Linux World completely baffles me.

Posted by bodorgy at Sun May 18 06:38:36 2008
gordboy, "serious sound programmer" extraordinaire: Knowing someone like you is a "serious sound programmer" for something that is the polar-opposite of PA makes me want to use PA even more.

Go back to your "serious sound" programming and stop wasting time by drooling on your bib in public.

Posted by Lennart at Sun May 18 20:51:28 2008
gordboy: I love you too, you serious sound programmer, you.

I'd love to have a peek on your serious sound code. Care to share a link with an amateur muppet like me?

Posted by Andy at Mon May 19 16:45:54 2008
I suppose if I were a serious sound programmer and knew that people were crying out for a cross-platform audio system for years, that even an amateur muppet (not even a professional muppet? harsh...) could get undeserved publicity by working on one, that my "serious" implementation would be pretty far along by now... and I would be ready to sit-back with all the coder-groupies I had acquired, sort through my lucrative job offers from various technology corporations, and "kick it," as they say, rather than hanging out on a website comment board trying to start the coder equivalent of a "mine is bigger" argument.

But on a more serious note...mess with the people who make me free software and I'll fight you. :P

I suppose it is coincidence "gordboy" and "bodorgy" consist of all the same letters?

Lennart: I'm a techie and musician, but someone not familiar with audio programming, there is plenty here I don't understand, but quite a bit that I do. I always appreciate when the coder gurus on various projects take the time to explain and dialog with the community, and I hope you don't let any negative comments or trolling behavior discourage you.

The gist of what I get from this is, we are going to have a sound server that is up there with the modern ones on the "big boys": more efficient, less error prone. Sounds good to me. :)

Anyway, now that we've all fed the troll enough for hibernation through the winter...back to the important stuff...making us free software :D

Posted by Paul Wayper at Tue May 20 06:36:42 2008
[Andy] I'd say bodorgy was a back-flame at gordboy.

[Lennart] This sounds great - memory is cheap these days and keeping the wake-ups to a minimum is something that everyone likes. I'd suggest multiplying the wake-up anticipation time (10ms normally) by a factor of ten if you get an underrun, and then turning it back down in a linear fashion until it was a set time away (I'd go for 7msec but it should be configurable) from when the buffer is empty. This exponential-lower-on-fail linear-rise-on-success is what TCP and other well-tested algorithms already do. You could probably even calculate how much time you had spare and back down intelligently. The idea here is to make sure there's a vanishingly small chance of a second underrun, but also to get to a state where the wake-up time no longer has to be calculated or checked - thus reducing the amount of extra calculation you do and thus the amount of time that the processor is awake.
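
One possible reading of this suggestion, as a minimal sketch: the margin is multiplied on an underrun and stepped back down linearly to a floor otherwise. The 1 ms step is an assumption (the comment gives no step size), the function name is hypothetical, and none of this is PulseAudio's actual scheduling code.

```python
def next_margin_ms(margin_ms, underrun, factor=10.0, step_ms=1.0, floor_ms=7.0):
    """Return the wake-up margin to use for the next period.

    On an underrun the margin is multiplied by `factor`; otherwise it is
    lowered by `step_ms` (an assumed step size) but never below `floor_ms`.
    """
    if underrun:
        return margin_ms * factor
    return max(floor_ms, margin_ms - step_ms)


margin = 10.0  # the normal 10 ms anticipation time
margin = next_margin_ms(margin, underrun=True)
after_underrun = margin

for _ in range(200):  # a long run of periods without underruns
    margin = next_margin_ms(margin, underrun=False)
settled = margin

print(after_underrun, settled)
```

Once the margin has settled at the floor, it stays constant until the next underrun.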

Keep up the good work!

Posted by gianni at Tue Jun 3 02:50:32 2008
How would this glitch-free model affect or be used for capturing?

Currently in PA (that distributed in Fedora 8, at least) there are major problems with PA and sound capturing (I have not been able to capture any sound from any source while PA is running, in my laptop or desktop), how will glitch-free PA integrate with capture (especially with applications such as Skype or Mumble which use ALSA as their backend)?

Posted by Alex Lukin at Tue Jun 3 17:48:45 2008
I'd like to point you to a use case where PA is not good yet and causes only problems.

I use Rosegarden+jackd+fluidsynth+qsynth for composing and recording some music tracks. It works fine with rt kernels and with "timerless" kernels I compile myself.
It does not work in F8 and F9 with PA on. Only scratching sounds and 3 times accelerated tempo.

How will PA work with the JACK infrastructure?

I think that PA must grow into some reliable and professional sound API like Steinberg's ASIO to allow low latency high quality sound on *nix.

I understand that a lot of things depend on the rt capability of the kernel and scheduler but even more depends on the sound server.

Anyway, thanks for the good work and good luck!

Posted by Don Fitzpatrick at Mon Jun 9 22:39:28 2008
When using a Xen DomU running Windows32 to test software, the lack of sound output is a serious limitation.  PA appears to be a means of directing sound to the Linux Dom0 VM.  However, I can't seem to get PA working in Windows.  Is Windows not supported as a sound source now?  If so, might it be supported in the future?

Thanks for your contribution to the Linux community.

Posted by Downhill Games at Sat Jun 14 18:56:57 2008
Lennart, it would appear your payment is compensation for putting up with idiots, while your programming is just doing what you love :) I'm glad there is some counter-point against(for?) the morons, otherwise we might not have this wonderful and MUCH NEEDED daemon. I really do hope the other daemons (maybe except arts?) die off a quiet, unnoticed death. Of course, users will notice when their [sound-enabled] applications start working without a hitch -- I know I have :)

Thanks a lot for your work, and I hope they pay you well enough for putting up with idiots like Karl Spamzalot^

Take care ^_^

Posted by Downhill Games at Sat Jun 14 18:59:21 2008
Oh, oh! And I can't wait to try .11 :D

Posted by Pizuz at Sun Jul 20 12:43:41 2008
I'm certainly curious.

Are those latencies applications can "request" the real-world latencies you will actually hear? Or will there be an additional delay? I mean, a latency of 20ms or even 10ms would be pretty awesome already for games or emulators.

Posted by Simos Xenitellis at Thu Jul 24 19:49:18 2008
Congratulations Lennart!
Amazing work and great write-up.

Posted by apanloco at Sat Aug 2 19:58:43 2008
It seems this is now released with PA 0.9.11 :)

Posted by Lorenz at Tue Aug 12 18:36:56 2008
Thanks for your great work, Pulse Audio lets me do things I never thought were possible!

Posted by Animesh Saxena at Sat Sep 20 13:11:27 2008
This is truly awesome :).

Initially I also couldn't help moving back to alsa at the slightest error, but I think once anyone sees the numerous advantages that pulseaudio has, nobody would switch back ;).

Thanks for explaining how it all works. Got me interested in the code as well!

Posted by JoeS at Sun Oct 5 16:00:41 2008
It might be great when it finally works, but so far it's as bad as KDE4. Blame the distros, blame the apps, but audio worked for the majority of people before pulse was included in distros. The number of bug reports related to audio has quadrupled since pulse. Jack did a better job and still does. All we needed was configuration for ALSA and JACK. Why add another layer? I might be impressed in a few years, but I'll still be wondering why we went through the hassle when we had a better engineered audio all along (jackd).

Anyone that wants to record audio needs to remove all traces of pulse to get decent results and from what I can read in all the pulse code and comments, pulse will never be useful for an audio workstation. Nor has it solved any of the issues with audio that were there before pulse. Multi-speaker setup, multi-channel setup, asound.rc configuration, low-latency recording and playback, etc...

Please quit blaming others and start working with distros, ALSA, JACK, OSS and existing audio app developers to solve these problems or quit pushing pulse down the distros' throats. And yes, Pulse was marketed to distros, not requested.

Posted by Towner at Sun Oct 12 02:08:46 2008
Thanks for the article, a thoroughly interesting read. The API neck beard comment wars are a fairly hilarious side show as well.

Posted by Muppet at Sat Nov 22 23:24:11 2008
For 10 years sound in Linux was a nightmare. Now it seems to get better and better, many thanks!

But the user interface is not really muppet-proof yet. E.g. the pulse audio manager mentioned in the perfect setup is such an example.

Also configuration in Gnome is far from muppet-proof. One has preferences-sound, every application also has a chooser of the backend and individual tests, the names are then different in skype, in sound recorder, in pidgin, in wine, etc.

Posted by anon at Wed Nov 26 05:52:11 2008
Whatever, will the new release resolve the dlopen issue? http://www.google.com/search?hl=en&q=ubuntu+%2B%22pulse+audio%22+dlopen&btnG=Search

Posted by Jerome at Fri Nov 28 11:19:27 2008
Saturated integer mixing? I was under the impression that floating point mixing was just as fast (if not faster) on modern hardware, and of course much easier to implement without fear of clipping etc. Correct me if I'm way off-base with this.

Posted by Paul Wayper at Tue Dec 2 00:29:21 2008
[Lennart] Will these changes support ultra-large playback buffers?  One thing that I saw on Intel's power saving pages (http://www.lesswatts.org/projects/applications-power-management/large-buffers.php) was the idea that applications decode large amounts of audio (and video) to memory and then let the decoding (thread/process) go idle while the media is played back.  This is especially useful if reading of DVDs or CDs where keeping the device spun up uses more power than reading a large chunk and letting it spin down.  Of course, applications would get more out of buffering the compressed data rather than uncompressed, but if an application can be simpler by pushing the last twenty seconds of decoded audio out to PulseAudio before then buffering the next two minutes of compressed data, then this is an overall win.

Hope that makes sense :-)

Posted by Lennart at Wed Dec 10 22:56:51 2008
Paul: yes, the ultra-large buffers is exactly what g-f is about. But not 2mins. Just 2s -- which is very long already.

Posted by zero latency at Fri Feb 27 06:33:10 2009
I think the pulseaudio server is too aggressive in rewinding the application pointer when starting a new audio stream while another stream is playing.

Refer to image 1: the server can only safely rewind the application pointer up to the start of fragment 3 to mix the new stream and the running stream.

If glitch-free mode uses NFRAGS = 1, the server can only rewind to the start of the buffer (i.e. the start of fragment 1).

Posted by DMA transfer at Thu Mar 5 03:22:24 2009
"the hardware reads the samples from the buffer, one at a time, and passes it on to the DAC so that eventually it reaches the speakers"

http://thread.gmane.org/gmane.linux.alsa.devel/60428/focus=60478

At the end of the interrupt, the driver just sets up the starting address and the number of samples for the DMA transfer, and the pointer callback returns the value from a hardware register holding the number of processed samples.

i.e. snd_pcm_hw_rewindable() is not equal to snd_pcm_mmap_hw_avail() for drivers using DMA transfer, especially those drivers which use snd_pcm_indirect_playback_transfer()

e.g. cs46xx copies a period of samples from system memory to the DSP's memory and performs the mixing on the DSP before sending it to the AC97's DAC

Posted by Grr at Sun May 17 03:27:52 2009
"The code in the glitch-free is still rough and sometimes incomplete ... I hope this text also explains to the few remaining PA haters a little better why PA is a good thing, and why everyone should have it on his Linux desktop."

No, it shouldn't be on our desktops.  It's incomplete, "rough", buggy, and has a horribly complex user interface.

It should be finished first, and given an intuitive interface, and tested thoroughly, and THEN it should be on our desktops.

Posted by Konrad at Wed Aug 5 15:21:04 2009
I'm more than confused by audio in Linux. For programmers everything is OK, but for normal users there is never a single example of basic settings for the audio tools!!

Every mathematics textbook first gives the explanation and then follows it with one example with real figures.

Best regards

Konrad

Posted by MrQuincle at Sat Dec 12 14:14:19 2009
At the moment I am in a team that creates a physically realistic robot simulator. Although I am responsible for AI (and not game AI, but real AI about which you don't wanna know), we are also running into audio problems.

OpenAL Soft seems to allow multiple robots with each their own microphone. I've to check that though.

From what I understood, the remarks by Donald, Aster, Wouter would be valid if there are more than 4 streams involved in mixing. So, would that mean that their suggestion would be better as soon as we have a robot simulator with more than 4 robots listening to the sounds of each other with their own (virtual) microphones?

Sorry for my layman question, but I was intrigued by the high quality of the post, and even the comments. Even so, I think this question about the scalability of the sound daemon is not for the faint of heart. :-)

Kind regards!


Please note that this is neither a support forum nor a bug tracker! Support questions or bug reports posted here will be ignored and not responded to!


It should be obvious but in case it isn't: the opinions reflected here are my own. They are not the views of my employer, or Ronald McDonald, or anyone else.

Please note that I take the liberty to delete any comments posted here that I deem inappropriate, off-topic, or insulting. And I exercise this liberty quite aggressively. So yes, if you comment here, I might censor you. If you don't want to be censored you are welcome to comment on your own blog instead.


Lennart Poettering <mzoybt (at) 0pointer (dot) net>
Syndicated on Planet GNOME, Planet Fedora, planet.freedesktop.org, Planet Debian Upstream. feed RSS 0.91, RSS 2.0
Archives: 2005, 2006, 2007, 2008, 2009, 2010
