Wavelet Transform for sound synthesis | 14SEP2010

Download Poster, Soundfile

The Discrete Wavelet Transform (DWT) is a mathematical technique used mostly for signal and image analysis. Our research goal was to find out whether it has any use in digital audio synthesis: the algorithmic generation of sound using only computers. Our interest is motivated primarily by electronic music. Producers in this genre are always looking for new ways to make sounds. I know, because I am one of them. According to subject matter experts, we aren’t the first group to try using the DWT for music. Unfortunately, there is virtually no documentation of prior attempts or findings, apparently because no one found any use for it. Here we introduce our own efforts. A research poster and a demonstration of the sounds are available for download above. This research was carried out with funding from the California Institute for Telecommunications and Information Technology (Calit2).

It is fair to ask, “Why wavelets?” Wavelets are small pulses that are carefully constructed to have very specific mathematical properties. We can decompose a signal into combinations of these delicate pulses, then reconstruct the exact original signal from them. What we’ve done is akin to smashing a Rolex and using the little parts that come flying out as confetti. What appealed to us about this method was its time-frequency representation of signals. Another factor is that much of the infrastructure needed to implement the DWT already exists, which meant less work for us. That said, we found little use for wavelets, specifically, in sound synthesis, so our focus has changed somewhat since the original project was proposed.
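
To make the "decompose, then reconstruct exactly" idea concrete, here is a minimal sketch using the Haar wavelet, the simplest wavelet there is. It is an illustration only, not the code or the wavelet used in our experiments. The function names (`haar_dwt`, `haar_idwt`, and their helpers) are made up for this example, and the input length is assumed to be divisible by 2 raised to the number of levels.

```python
import math

def haar_step(signal):
    """One analysis level: scaled pairwise sums (approximation)
    and scaled pairwise differences (detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """One synthesis level: undo haar_step."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out

def haar_dwt(signal, levels):
    """Return [detail_1 (finest), ..., detail_L, approx_L].
    Assumes len(signal) is divisible by 2**levels."""
    coeffs = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        coeffs.append(detail)
    coeffs.append(approx)
    return coeffs

def haar_idwt(coeffs):
    """Rebuild the signal from the output of haar_dwt."""
    approx = coeffs[-1]
    for detail in reversed(coeffs[:-1]):
        approx = haar_inverse_step(approx, detail)
    return approx

# One cycle of a sine wave, eight samples long.
x = [math.sin(2.0 * math.pi * n / 8) for n in range(8)]
bands = haar_dwt(x, 3)
y = haar_idwt(bands)
max_error = max(abs(a - b) for a, b in zip(x, y))
```

Each level halves the number of samples it works on, so the bands get shorter as they get coarser, and `max_error` comes out at the level of floating-point rounding. The synthesis question is what happens when you change the coefficients in `bands` before calling `haar_idwt`.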

We can abandon the constraint that the wavelets have any precise mathematical form, so from this point forward we will just call them pulses. Our new model takes an input waveform (say, the same one we started with in the DWT experiment) and expands it in time for each higher octave of the filter bank. These pulses are fed into the reconstruction filters as before. Now we can experiment with different forms for the filter coefficients. What constraints would we put on such a filter? We want linear phase, so that we can line up the different channels at the end of the filter bank. We also want to skirt the problem of not being able to modify the fundamental frequency, which was previously dominated by the computational block size.
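
Here is a rough sketch of the two ingredients just described, under some loudly stated assumptions. The pulse is stretched by a factor of two per filter-bank level using plain linear interpolation, which is only one reading of "expand it in time". The filter is a Hann-windowed sinc lowpass, picked purely because its taps are symmetric and symmetric taps are what give an FIR filter linear phase. None of the names or parameter values below come from the project itself.

```python
import math

def stretch(pulse, factor):
    """Expand a pulse in time by an integer factor (linear interpolation)."""
    n = len(pulse)
    out = []
    for i in range(n * factor):
        pos = i / factor
        lo = int(pos)
        hi = min(lo + 1, n - 1)  # hold the last sample at the end
        frac = pos - lo
        out.append((1.0 - frac) * pulse[lo] + frac * pulse[hi])
    return out

def linear_phase_lowpass(num_taps, cutoff):
    """Hann-windowed sinc lowpass with symmetric taps.
    cutoff is a fraction of the sample rate (0 < cutoff < 0.5)."""
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = n - mid
        if t == 0:
            sinc = 2.0 * cutoff
        else:
            sinc = math.sin(2.0 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    return taps

def group_delay(taps):
    """Delay in samples of a symmetric FIR filter, the same at every frequency."""
    return (len(taps) - 1) / 2.0

# A hypothetical four-sample pulse, stretched for three levels.
pulse = [0.0, 1.0, 0.0, -1.0]
pulses = [stretch(pulse, 2 ** level) for level in range(3)]

# Two channels with different filter lengths.
short_taps = linear_phase_lowpass(15, 0.25)
long_taps = linear_phase_lowpass(31, 0.125)

# Extra delay the short-filter channel needs to line up with the long one.
alignment = group_delay(long_taps) - group_delay(short_taps)
```

Because each filter delays every frequency by the same `(num_taps - 1) / 2` samples, lining up the channels reduces to padding each one with a fixed number of samples, here the value of `alignment`. With a filter that is not linear phase, the delay would vary with frequency and no single shift would do.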
