A couple of years ago, Pau Guillamet and I wrote a series of tutorials on Linux audio tools for the mainstream Spanish magazine Personal Computer & Internet.
Well, after all this time the tutorials finally see the light!

Actually, I have had the editors’ permission to release them online for some time; but this last week I had the perfect excuse to work on their formatting (using wiko, of course), since I gave a seminar on this topic at the ESMUC. A seminar that, by the way, I very much enjoyed giving, and I wouldn’t mind repeating the experience!

Indeed, some applications –Ardour especially, I’d dare say– have changed a lot during this time. And other apps have not changed that much. Anyway, I hope it can be of some use to people willing to introduce themselves to the power of the Linux audio tools.

Comments on the tutorials can be posted as comments on this blog.


3D audio made with Clam

January 25, 2008

While it is true that the clam-devel mailing list and IRC channel have been a little quiet recently –especially compared with the summer period (well, it was called “summer of code” for a good reason!)–, this doesn’t mean that development activity has been low. (Being an open-source project, the commits say it all.)

The quietness is related to David and me now being involved with the acoustics group of the Fundació Barcelona Media, where we work in a more traditional –and so less distributed– fashion, collaborating with people who actually sit together. Further, I very much enjoy working with such a skilled and interdisciplinary team (half are physicists and half computer scientists), and also seeing that Clam is very useful in these 3D-audio projects. These latest developments on 3D audio rendering were mostly driven by the IP-RACINE European project, which aims to enhance digital cinema.

The kind of development we do in Clam has also changed since last summer. Instead of improving the general infrastructure (for example the multi-rate data-flow system or the NetworkEditor) or improving the existing signal processing algorithms, what we’ve done is… writing plugins. Among many other things, the new plugins feature a new lightweight spectrum and FFT, and efficient low-latency convolutions.

And this feels good. Not only because the code-compile cycle is sooo fast, but because it means that the framework infrastructure is sufficiently mature and its extension mechanisms are very useful in practice. Further, rewriting the core spectral processing classes allowed us to make a lot of simplifications in the new code and its dependencies. Therefore, the new plugins only depend on the infrastructure, which I’d dare say is the most polished part of Clam.

And now that IP-RACINE final project demos have been successfully passed, it is a great time to show some results here.

Flamencos in a virtual loft

Download and watch the video in the preferred format:


Listen to it carefully through headphones (yes, it will only work with headphones!). You should be able to hear as if you were actually moving in the scene, identifying the direction and distance of each source. It is not made by just automating panning and volumes, but by modeling the room so that it takes into account how the sound bounces off all the surfaces of the room. This is done with ray-tracing and impulse-response techniques.

This stereo version has been made using 10 HRTF filters. However, our main target exhibition setup was 5.0 surround, which gives a more immersive sensation than the stereo version. So, try it if you have surround equipment around:

Credits: Images rendered by Brainstorm Multimedia and audio rendered by Barcelona Media. And music performed by “Artelotú”.

Well, the flamenco musicians in the video should have been real actors. Ah! Wouldn’t that have been nice?

What was planned

The IP-Racine final testbed was all about integration work-flows among different technological partners. All the audio work-flow is very well explained in this video (Toni Mateos speaking, and briefly featuring me playing with NetworkEditor.)

So, one of the project outcomes was this augmented-reality flamenco video in a high-definition digital cinema format. To that end a chroma set was set up (as the picture below shows), and it was to be shot with a high-end prototype video camera with position and zoom tracking. The tracking metadata stream fed both the video and audio rendering, which took place in real time. All quite impressive!

The shooting of the flamenco group “Artelotú” in a chroma set

Unfortunately, at the very last moment a little demon jumped in: the electric power got unstable for a moment and some integrated circuits of the high-end camera literally burned.

That’s why the flamencos are motionless pictures. Also, in the absence of a camera with a position-tracking mechanism, we chose to freely define the listener path with a 3D modelling tool.

How we did it

In our approach, a database of pressure and velocity impulse responses (IRs) is computed offline for each (architectural) environment using physically based ray-tracing techniques. During playback, the real-time system retrieves the IRs corresponding to the source and target positions, performs a low-latency partitioned convolution and smooths IR transitions with cross-fades. Finally, the system is flexible enough to decode to any surround exhibition setup.
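To make the cross-fade step a bit more concrete, here is a minimal sketch of how two blocks, one convolved with the old IR and one with the new IR, could be blended. It is not the Clam code: the function name crossFade and the linear ramp are assumptions made only for illustration, and a real system might well use a different fade curve.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Clam): blends the block rendered with the
// old IR into the block rendered with the new IR using a linear ramp.
std::vector<float> crossFade(const std::vector<float>& oldBlock,
                             const std::vector<float>& newBlock)
{
    // Both blocks must cover the same stretch of audio
    assert(oldBlock.size() == newBlock.size());
    const std::size_t n = oldBlock.size();
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i)
    {
        // gain goes from 0 (only old IR) to 1 (only new IR) across the block
        const float gain = (n > 1) ? static_cast<float>(i) / (n - 1) : 1.0f;
        out[i] = (1.0f - gain) * oldBlock[i] + gain * newBlock[i];
    }
    return out;
}
```

The idea is that, whenever the listener moves and a different IR is retrieved, one block is rendered with both IRs and blended this way, so the change is not heard as a click.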


The audio rendering (both real-time and offline) is done with Clam, while the offline IR calculation and 3D navigation are done with other tools.

The big thanks

This work is a collaborative effort, so I’d like to mention all the FBM acoustics/audio group: Toni Mateos, Adan Garriga, Jaume Durany, Jordi Arques, Carles Spa, David García and Pau Arumí. And of course we are thankful to whoever has contributed to Clam.

And last but not least, we’d like to thank “Artelotú”, the flamenco group that put the duende into such a technical demo.

Lessons for Clam

To conclude, this is my quick list of lessons learnt during the realization of this project using Clam.

  • The highly modular and flexible approach of Clam was very well suited to this kind of research-while-developing. The multi-rate capability and data type plugins were especially relevant.
  • The data-flow and visual infrastructure is sufficiently mature.
  • Prototyping and visual feedback are very important while developing new components. The NetworkEditor data monitors and controls were the most valuable debugging aids.
  • Everybody seems to like plugins!

When I upgraded Ubuntu from Feisty to Gutsy, the newer freebob audio-firewire driver broke support for the Focusrite Saffire Pro. I have not tested the non-pro versions (Saffire and Saffire LE), but I guess it also applies to them.

The main Freebob/ffado developer quickly identified the problem and proposed a provisional patch, which is what I’m gonna explain in this post. For a more definitive solution we’ll probably have to wait for the next ffado release. Hopefully soon.

First, the symptoms. This is how Gutsy’s freebob complains when starting jack with a Saffire Pro:

$ jackd -d freebob
jackd 0.103.0
Copyright 2001-2005 Paul Davis and others.
This is free software, and you are welcome to redistribute it
under certain conditions; see the file COPYING for details
JACK compiled with System V SHM support.
loading driver ..
Freebob using Firewire port 0, node -1
unexpected response received (0x9)
Error (bebob_light/bebob_light_avdevice.cpp)[1679] setSamplingFrequencyPlug: setSampleRate: Could not set sample rate 48000 to IsoStreamInput plug 0
Error (bebob_light/bebob_light_avdevice.cpp)[1696] setSamplingFrequency: setSampleRate: Setting sample rate failed
FreeBoB ERR: FREEBOB: Error creating virtual device
cannot load driver module freebob

The problem is related to the sample rate interface. The quick solution is not using that interface. Just a matter of commenting out a piece of code.

Update 9th March:
Pieter Palmer points out that since version 1.0.9 there is a ./configure switch that does what the patch below does. So you can safely skip the part of this post about downloading version 1.0.7 and applying the patch.

Instead you should download the svn trunk

svn co libfreebob

and then use --disable-samplerate at the ./configure step.

Begin deprecated:

Apply the following patch. If you don’t know how to do it, follow these steps: copy-paste the following patch into a file (e.g. /tmp/saffire.patch).

Index: src/libfreebobstreaming/freebob_streaming.c
--- src/libfreebobstreaming/freebob_streaming.c	(revision 449)
+++ src/libfreebobstreaming/freebob_streaming.c	(working copy)
@@ -154,7 +154,7 @@
 	 * This should be done after device discovery
	 * but before reading the bus description as the device capabilities can change
+#if 0 //disabled for Focusrite Saffire
    if(options.node_id > -1) {
        if (freebob_set_samplerate(dev->fb_handle, options.node_id, options.sample_rate) != 0) {
@@ -178,8 +178,8 @@
			return NULL;

	/* Read the connection specification

Then change dir to the checked out repository and apply the patch:

cd libfreebob-1.0.7
patch -p0 < /tmp/saffire.patch

End deprecated:

Now, the building phase. Get the build dependencies.

sudo apt-get build-dep libfreebob0

There is a problem here. For some reason libtool (or some package that includes it) is missing in the debian/control file. This seems to me a packaging bug that makes autoreconf give errors like this one:

possibly undefined macro: AC_ENABLE_STATIC

So, install libtool:

sudo apt-get install libtool

And build:

autoreconf -f -i -s
./configure
make
sudo make install

To make sure that jack picks up the new libfreebob (in /usr/local), we will hide the current freebob libs from jackd. I mean the ones installed by the debian package.

cd /usr/lib
sudo mkdir freebob_hidden
sudo mv libfreebob.* freebob_hidden/

That’s all.
As always, check that the raw1394 module is ready. Otherwise do this:

sudo modprobe raw1394
sudo chmod a+rw /dev/raw1394

Now jack should work without complaining:

 jackd -d freebob

How to compile freebob with optimizations

CFLAGS="-march=core2" ./configure --enable-optimize --enable-mmx --enable-sse

Or use this for 64bits enabled cpus:

CFLAGS="-march=nocona" ./configure --enable-optimize --enable-mmx --enable-sse


I’m very thankful to Pieter Palmer for the quick help at #ffado irc channel.

My next posts will talk about the reason I needed audio-firewire in linux. It was related to a real-time 3D audio exhibition developed with CLAM. It just happened yesterday and I’m very happy that it all worked really well.

Like the other how-tos of this series, this one is also a side effect of the project I’ve been working on recently.

Libsndfile, written by Erik de Castro Lopo, is a very popular library for reading and writing lossless audio files. We use it in Clam and I’ve used it in other small applications.

This time I wanted to use the C++ wrapper (sndfile.hh header) added recently and, since I couldn’t find an example of use, well, time to post mine here.

I like the C++ API a lot better than the C one. See also the C sndfile API documentation.

#include <sndfile.hh>

int main()
{
    const int format=SF_FORMAT_WAV | SF_FORMAT_PCM_16;
    // const int format=SF_FORMAT_WAV | SF_FORMAT_FLOAT;
    const int channels=1;
    const int sampleRate=48000;
    const char* outfilename="foo.wav";

    SndfileHandle outfile(outfilename, SFM_WRITE, format, channels, sampleRate);
    if (not outfile) return -1;

    // prepare a 3 seconds buffer and write it
    const int size = sampleRate*3;
    float sample[size];
    float current=0.;
    // The original loop body was lost from this post; filling the buffer with
    // a slow rising ramp is an assumed reconstruction, not the original code.
    for (int i=0; i<size; i++)
    {
        sample[i]=current;
        current+=1.f/size;
    }
    outfile.write(sample, size);
    return 0;
}

You’ll find a complete reference on the available formats in the sndfile API doc. But these are typical subformats of the Wav format. As in the example above, put them after the SF_FORMAT_WAV | portion:

Signed 16, 24, 32 bit data:

SF_FORMAT_PCM_16
SF_FORMAT_PCM_24
SF_FORMAT_PCM_32

Float 32, 64 bit data:

SF_FORMAT_FLOAT
SF_FORMAT_DOUBLE

In my last how-to we played a multichannel video with surround audio (coded with ac3) using mplayer and jack. Jack (the audio server and router) is perfect to use with firewire audio interfaces. However, the Ubuntu Gutsy mplayer is not jack enabled by default.

To see the list of audio backends:

mplayer -ao help

However, it is very easy to build a new jack-enabled mplayer .deb package. First, create a temp dir (for example debs) and change to it. Install libjack with its development headers:

sudo apt-get install libjack-dev

Install all the packages to build the mplayer-nogui package. This is done in a single command:

sudo apt-get build-dep mplayer-nogui

Now just get the source package and build it (the -b option is for build). This will take time.

sudo apt-get -b source mplayer-nogui

The mplayer build system automatically searches for libjack headers and when found configures its build system to use jack.

Finally install the new packages (dpkg complains with an “error processing mplayer” message, which you can ignore):

sudo dpkg -i *.deb

To use it, choose the jack audio driver with -ao jack. A nice feature is that it automatically connects all the outputs to the physical device inputs.

mplayer -ao jack someaudiofile.mp3

If your file has more than 2 channels:

mplayer -ao jack -channels 5 surround.wav

As a final note, I’ve checked that the debian (not ubuntu) package is jack enabled by default. The reason is that their packaging scripts differ a little bit (specifically the debian/control file). The Ubuntu one doesn’t define the jack-dev build dependency.

AC3 is a compressed (lossy) audio format that allows multichannel audio (from 1 to 6 channels) and is widely used in DVDs and video (mpeg2 and mpeg4) files.

Surprisingly, I found creating multichannel ac3 files in Linux much harder than expected. This is basically because most of my searches led to experiences with ffmpeg and no other tools. However, the fact is that encoding more than 2 channels doesn’t work (at least I wasn’t able to make it work) using the ffmpeg version that comes with Ubuntu Gutsy.

For example, this seemingly correct ffmpeg command produced an ac3 file with correct metadata, but totally silent and useless:

ffmpeg -i l.wav -i r.wav -i sl.wav -i sr.wav -i c.wav -ab 192k -ar 48000 -ac 5 -acodec ac3 -y surround.ac3

Luckily there is an alternative to ffmpeg in the open-source/linux arena: aften. And it is, actually, a very flexible and overall better ac3 encoder.

It is still not packaged for Ubuntu, so you’ll need to download and build it from sources

svn co aften

Change to the aften directory,
create a new directory, e.g. default, and change into it.
Then compile:

cmake .. -DCMAKE_INSTALL_PREFIX:STRING="/usr/local"
sudo make install

To create a multichannel .ac3 you first need to create a multichannel wav.
Of course sox comes in very handy:

sox -M L.wav R.wav C.wav SL.wav SR.wav surround.wav
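Conceptually, merging mono files into one multichannel wav means interleaving their samples frame by frame, since PCM wav data stores one sample per channel for each frame. The following sketch illustrates that idea on in-memory buffers. It is not how sox is implemented, and the function name interleave is an assumption introduced just for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: takes one buffer per channel (all of the same length)
// and returns a single buffer laid out frame by frame:
// ch0[0], ch1[0], ..., chN[0], ch0[1], ch1[1], ...
std::vector<float> interleave(const std::vector<std::vector<float> >& channels)
{
    if (channels.empty()) return std::vector<float>();
    const std::size_t frames = channels[0].size();
    for (std::size_t c = 0; c < channels.size(); ++c)
        assert(channels[c].size() == frames); // every channel needs the same length
    std::vector<float> out;
    out.reserve(frames * channels.size());
    for (std::size_t f = 0; f < frames; ++f)
        for (std::size_t c = 0; c < channels.size(); ++c)
            out.push_back(channels[c][f]);
    return out;
}
```

The order in which the channels are passed is the order they get inside each frame, which is why the order of the files on the sox command line matters for the encoding step.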

And now the ac3 encoding

aften -b 448 -cmix 0 -smix 0 -dsur 2 -acmod 7 surround.wav surround.ac3

That’s all. Now you’re ready to use the created 5.0 ac3 file.

Since the options are not self-explanatory, here goes a brief explanation of the ones used:

  • -b 448: Bitrate in kbps (4 channels: 384 kbps, 5 channels: 448 kbps)
  • -smix 0: Surround mix level ( 0: -3dB, 1: -6dB, 2: 0 )
  • -dsur 2: (2: Dolby surround encoded, 0: not indicated, 1: not Dolby surround encoded)
  • -acmod 7: Audio coding mode, or channel order (7: L,R,C,SL,SR)

Now let’s put some images to the sound.

Embed the created .ac3 audio to a video

It is very easy to do using avidemux, a very nice graphical application designed for video encoding and cutting tasks. Use the menus to Open and choose the video file. Then choose Audio, Main Track, choose external AC3, and Save Video.

Or, alternatively, you can use a quick command line with mencoder:

mencoder -ovc copy video.avi -audiofile surround.ac3 -oac copy -o newvideo.avi

Now it’s ready to play the video with the 6 audio channels. For example with a (jack enabled) mplayer

mplayer newvideo.avi -ao jack -channels 6

Changes: small update on Dec 16

Reading data files in C++

December 3, 2007

Text processing in Python is so easy that I don’t feel like doing that kind of programming in C++ at all. However, sometimes I don’t have the choice. For example, I just had to implement a CLAM Processing that loads a big table of float data from a text file. Specifically, this is done in the processing’s ConcreteConfigure method.

What annoys me most about C++ file streams is the time I spend understanding their API. Let alone remembering it! So, since I’ve just implemented a fairly generic solution that loads a table of floats, let’s blog about it and keep it at hand.

The input file. Actually, the file format is very flexible in terms of separators and the number of columns per row. The only requirement is that everything can be read as a float.

0 51.0 0.00164916 -0.00770348 73.0007 40.7776 214.276 76.1719
1 51.0 0.00164916 -0.00770348 73.0007 40.7777 214.276 76.1719
2 51.0 0.00164916 -0.00770348 73.0007 40.7782 214.276 76.1719
3 51.0 0.00164916 -0.00770348 73.0007 40.7795 214.275 76.1719

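And this is the C++ code that reads the file into a data structure and then uses it:

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Data type where to load the table
typedef std::vector<float> Row;

int main()
{
    std::vector<Row> table;

    // Load table from file
    std::ifstream file("/tmp/data");
    std::string line;
    while (std::getline(file, line))
    {
        std::istringstream is(line);
        Row row;
        float data;
        // reading in the loop condition avoids storing a bogus last value
        while (is >> data) row.push_back(data);
        if (not row.empty()) table.push_back(row);
    }

    // Now let's use the table
    // (the original loop was cut off in this post; printing is an assumed use)
    for (unsigned i=0; i<table.size(); i++)
    {
        for (unsigned j=0; j<table[i].size(); j++)
            std::cout << table[i][j] << " ";
        std::cout << std::endl;
    }
    return 0;
}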

This is a quick post (or should I say nanoblog?) to share my new home page. It’s at (I did a big update on Dec 2nd). After using so many web2.0 social networks I felt the need to have a boring, simple and static home page written in html. Actually, I happily failed at writing directly in html just because I had wiko at hand.

Wiko stands for wiki compiler, and is the project name we gave to some python scripts David and I have lately been developing for personal use. It basically converts wiki text files to html, LaTeX and blogs; as is easy to imagine, it combines very well with your favorite version control system to create an actual wiki (it’s collaborative, but only through committing to the version control system). Visit the wiko home page to learn more. And maybe use it.

I’ve been longing for a N700 or N800 for a long time, so this short message that just popped into my inbox made my day ;)

Subject: N810 maemo submission accepted
Date: Fri, 9 Nov 2007 18:17:31 +0200 (EET) (17:17 CET)

Congratulations! You have been accepted to the N810 maemo
device program. We will send your discount and instructions
as soon as the device is available in your selected shop (soon).

maemo team –

The N810 maemo device program aims to offer a low price for the new Nokia N810 Internet Tablet (99€) to the active contributors of the maemo community, open source programmers, designers, bloggers and the like.
I’m eager to have it in my hands. We’ll see how hard it is to port Clam and other Linux audio apps to it.

Since the switch from CVS to SVN in Clam (about a year ago) we do not explicitly tag the releases, and that’s fine because SVN revision numbers are global. On the other hand, SVN can create tags, but it is dangerous because a tag is exactly the same thing as a branch. So SVN doesn’t prevent you from committing to a tag.

Our tagging approach is very simple and has proved useful: just write the revision number of each release in the CHANGES files. This simplifies the release process and also the way to make diffs –since you always use the same syntax.

Now an example: Let’s say we want to see changes in SMSTranspose files from last stable release (1.1.0):

1) Look for the svn revision corresponding to a stable version in the CHANGES file

NetworkEditor$ head CHANGES

2007-??-?? NetworkEditor 1.1.1 SVN $Revision: 10220 $

2007-06-08 NetworkEditor 1.1.0
'More eye-candy, please'
* SVN Revision: 10216
* Using CLAM 1.1.0
* New examples
* genderChange: fully working now and with an interface

So version 1.1.0 is revision 10216.
By the way, you may be curious about the $Revision: 10220 $ part of the first line. This is an SVN keyword (or pattern). Each time you commit the CHANGES file the number gets updated to the current revision. That means that we never actually write the revision numbers in CHANGES files; we only have to remove the “$” when we decide to tag the release.
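If you do this lookup often, it can be automated. The following is only a sketch, assuming the CHANGES layout shown above: a release header line that mentions the version, followed by bullet lines, one of which contains "SVN Revision: ", with a blank line closing the entry. The function name findReleaseRevision is made up for this example and is not part of Clam or SVN.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: returns the SVN revision recorded for `version` in a
// CHANGES stream, or -1 when no such release entry is found.
long findReleaseRevision(std::istream& changes, const std::string& version)
{
    assert(!version.empty());
    const std::string marker = "SVN Revision: ";
    bool inRelease = false;
    std::string line;
    while (std::getline(changes, line))
    {
        if (!inRelease)
        {
            // A release header is a non-bullet line that mentions the version
            const bool isBullet = !line.empty() && line[0] == '*';
            if (!isBullet && line.find(version) != std::string::npos)
                inRelease = true;
            continue;
        }
        if (line.empty()) break; // a blank line ends that release entry
        const std::string::size_type pos = line.find(marker);
        if (pos != std::string::npos)
            return std::stol(line.substr(pos + marker.size()));
    }
    return -1;
}
```

Note that the untagged first line uses the "$Revision: ... $" form, which this sketch deliberately does not match, so only releases that were actually tagged are found.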

2) Now diff the files of interest on that version and head

NetworkEditor$ svn diff -r 10216:HEAD src/processing/SMSTranspose.*xx
Index: src/processing/SMSTranspose.cxx
--- src/processing/SMSTranspose.cxx (revision 10216)
+++ src/processing/SMSTranspose.cxx (revision 10281)
@@ -20,11 +20,15 @@
#include "SMSTranspose.hxx"

Last, a quick tip for gvim users: pipe the diff result to gvim using the -R - options:
$ svn diff | gvim -R -
And take advantage of vim syntax highlighting and quick navigation.