Warm analogue it ain’t. I knew when I started coding my synth-sequencer, Foldy, a few months ago, that it’d be harshly digital and crude sounding. I was inspired by tracker software as well as by two old PC-only music programs, Drumsynth and Hammerhead (which were the basis of my beat-creating project last year).
I’m releasing it today and calling it version 1.0. It works, but some iffy design decisions mean I won’t keep developing it.
That said, the code quality is a step up from my last release, the experimental art tool MoiréTest. I was able to go back and make big changes in Foldy, without the whole thing crumbling, which is always a good sign.
For the rest of this post I’ll explain what the program does, then what questionable decisions I made and how I would do it again.
(To try it yourself, download Foldy.jar from here and double click on it. If that doesn’t work try the further instructions in the readme.)
Foldy takes in a musical sequence, which you can type into a box in the app window. Notes are entered as MIDI note numbers separated by commas, where A=440 is at 69 and notes range from 0 to 127. A rest is -1.
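For anyone who wants to see the note numbering in code, here is a minimal sketch of turning such a sequence into frequencies using the usual equal-temperament formula. It is not Foldy's actual code: the class name SequenceParser and its methods are made up for illustration, and I'm assuming the standard MIDI range of 0 to 127.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only, not taken from Foldy's source.
class SequenceParser {
    static final int REST = -1;

    /** Equal-tempered frequency in Hz for a MIDI note, with note 69 = 440 Hz. */
    static double frequency(int midiNote) {
        return 440.0 * Math.pow(2.0, (midiNote - 69) / 12.0);
    }

    /** Parses text such as "69, 57, -1, 60" into note numbers; -1 is a rest. */
    static List<Integer> parse(String text) {
        List<Integer> notes = new ArrayList<>();
        for (String token : text.split(",")) {
            int note = Integer.parseInt(token.trim());
            // Assumption: anything outside 0..127 (other than a rest) is rejected.
            if (note != REST && (note < 0 || note > 127)) {
                throw new IllegalArgumentException("Not a MIDI note: " + note);
            }
            notes.add(note);
        }
        return notes;
    }

    public static void main(String[] args) {
        for (int note : parse("69, 57, -1, 60")) {
            System.out.println(note == REST ? "rest" : note + " -> " + frequency(note) + " Hz");
        }
    }
}
```

Each step of 12 note numbers doubles or halves the frequency, which is why note 57 comes out an octave below A=440.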
(By the way, did you know that, incredibly annoyingly, there is no industry standard for numbering the octaves of MIDI notes? The frequencies are agreed on, but one manufacturer’s C3 is another’s C4… how sad. This doesn’t impact Foldy though, I just work from the frequencies.)
The speed at which notes are played is set using tempo and beat subdivision controls. All the other parameters in the window modify the sound of individual notes. Only one note can play at a time. This kept things a bit simpler, though with the Java Sound API, opening another output line or mixing two together wouldn’t be much harder.
I was going to include a choice of mathematical curves, possibly Bezier curves, for the amplitude envelope, out of a perverse desire to avoid the bog-standard Attack-Decay-Sustain-Release model, which is suited to a keyboard instrument where a note is attacked, held and released. I was thinking this synth could be more percussive, inspired by the basic sample-playback model of drum machines and trackers (a type of sampler software originally made for Amiga computers and associated with the demoscene).
Unfortunately I didn’t finish the Bezier stuff, but in any case it probably wasn’t suitable. (For one thing, Bezier curves can easily have two y values for one x value.) In fact, I didn’t do any extra envelope options, partly because envelopes typically drive filters or modulations, but these are not allowed by my architecture. If there’s an obvious v1.1 feature, extra envelope curves is it.
One feature that did make it in is “wave-folding”. To get more complex waveforms, I cut a sine wave at a certain amplitude, and invert anything above that amplitude. This can be done multiple times to add a lot of harmonics.
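Here is a rough sketch of the folding idea on integer samples. It is a guess at the general technique rather than Foldy's real implementation: the WaveFolder name is invented, and I'm assuming "done multiple times" means reflecting again whenever the folded value still pokes past the threshold on the other side.

```java
// Hypothetical illustration of wave-folding, not taken from Foldy's source.
class WaveFolder {
    /** Reflects a sample back inside [-threshold, threshold], repeating until it fits. */
    static int fold(int sample, int threshold) {
        if (threshold <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        while (sample > threshold || sample < -threshold) {
            if (sample > threshold) {
                sample = 2 * threshold - sample;   // mirror the overshoot downwards
            } else {
                sample = -2 * threshold - sample;  // mirror the undershoot upwards
            }
        }
        return sample;
    }

    /** One cycle of a full-scale 16-bit sine, folded at the given threshold. */
    static int[] foldedSineCycle(int samplesPerCycle, int threshold) {
        int[] cycle = new int[samplesPerCycle];
        for (int i = 0; i < samplesPerCycle; i++) {
            int sine = (int) Math.round(Short.MAX_VALUE * Math.sin(2.0 * Math.PI * i / samplesPerCycle));
            cycle[i] = fold(sine, threshold);
        }
        return cycle;
    }

    public static void main(String[] args) {
        for (int sample : foldedSineCycle(16, 12000)) {
            System.out.println(sample);
        }
    }
}
```

The lower the threshold relative to the peak, the more times the wave gets mirrored per cycle and the more harmonics appear.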
However, this is a restrictive technique with a distinctive grinding, mechanical sound. All we’re doing here is shaping a waveform which is then repeated exactly at the period of the note frequency. The ear instantly picks up the lack of complexity.
I remember when I was a teenager, having the following bright idea: if I can see that the recorded waveform from my bass consists of repeated bumps, can’t I just sample one of those and repeat it/change the speed of it to get any bass note I want?
This is the basic concept of wavetable synthesis. However, when done as simply as that, it sounds completely artificial, not at all like a bass guitar. The sound of any real instrument has complexities like propagating resonances, changes in pitch, string rattle and other distortions/energy loss.
(E.g. listen to the low note in this sampled bassline – it starts really sharp, then reverts to normal. That’s because plucking a stringed instrument raises the pitch of the note momentarily, especially on an open string – I think this was an open E string on the original sampled recording, just it’s been pitched up here.)
Foldy has no capability for such modulations. I could try to put them in, but here we come up against the compromises I made at the start.
Because I was afraid that rounding errors would mount up and give me grief, I decided to keep everything as whole numbers, taking advantage of the fact that digital audio ultimately is whole numbers: a series of amplitudes or “samples”, each expressed as, for example, a 16-bit or “short” integer. (Most studios mix at 24-bit these days, but CD audio, say, only goes up to 16-bit precision.)
This informed the basis of the synth. Desired frequencies and tempos are approximated by a wavelength and a subdivision length expressed in whole samples. 44100 samples per second might seem fairly precise, but for musical pitches, it isn’t. So I found a compromise that bounded pitch error to about 20 cents:
Foldy tries to fit multiple wave cycles within a whole number of samples, for example 3 cycles in 401 samples. This gives a bit more precision, because the wavelength is 401/3 = 133.667 samples, in between the 133 and 134 wavelengths that are all I could get otherwise.
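To make the arithmetic concrete, here is a small sketch that searches for the best number of cycles to pack into a whole number of samples and measures the leftover pitch error in cents. The ChunkFitter class, its method names and the maxCycles limit are all assumptions for illustration; Foldy's own search may well work differently.

```java
// Hypothetical sketch of fitting several cycles into a whole number of samples.
class ChunkFitter {
    /** Pitch difference in cents between two frequencies (100 cents = 1 semitone). */
    static double cents(double actualHz, double targetHz) {
        return 1200.0 * Math.log(actualHz / targetHz) / Math.log(2.0);
    }

    /** Returns {cycles, samples} giving the smallest pitch error for up to maxCycles cycles. */
    static int[] fit(double targetHz, int sampleRate, int maxCycles) {
        double idealWavelength = sampleRate / targetHz; // in samples, usually fractional
        int bestCycles = 1;
        int bestSamples = 1;
        double bestError = Double.MAX_VALUE;
        for (int cycles = 1; cycles <= maxCycles; cycles++) {
            // Nearest whole number of samples that holds this many cycles.
            int samples = (int) Math.max(1, Math.round(cycles * idealWavelength));
            double actualHz = (double) sampleRate * cycles / samples;
            double error = Math.abs(cents(actualHz, targetHz));
            if (error < bestError) {
                bestError = error;
                bestCycles = cycles;
                bestSamples = samples;
            }
        }
        return new int[] {bestCycles, bestSamples};
    }

    public static void main(String[] args) {
        double targetHz = 329.63; // roughly the E above middle C
        int[] chunk = fit(targetHz, 44100, 4);
        double actualHz = 44100.0 * chunk[0] / chunk[1];
        System.out.println(chunk[0] + " cycles in " + chunk[1] + " samples, error "
                + cents(actualHz, targetHz) + " cents");
    }
}
```

The trade-off is that more cycles per chunk means finer pitch steps but longer chunks, so there are fewer points at which anything about the note can change.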
I then use these bits of audio, which I call “chunks”, and which could contain a single cycle or a handful of cycles, in the same way I was using single wave cycles originally. So every note would contain hundreds of them. Then I decided I could reuse this division to store amplitude envelopes – I gave each chunk a starting amplitude, and interpolated between these. (Of course, this is redundant at the moment because my overall envelopes are merely a linear interpolation from maximum to zero! But with a curved envelope, the result would be to store the curve within a few dozen or hundred points, with straight lines from point to point.)
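As a sketch of the per-chunk envelope idea, the following scales one chunk while ramping linearly between its starting amplitude and the next chunk's starting amplitude, using only integer arithmetic. The ChunkEnvelope name and the maxAmp scale parameter are hypothetical, not lifted from Foldy.

```java
import java.util.Arrays;

// Hypothetical sketch of interpolating amplitude across one chunk with integers only.
class ChunkEnvelope {
    /**
     * Scales a chunk of 16-bit samples, ramping linearly from startAmp
     * (at the first sample) towards endAmp (the next chunk's starting amplitude).
     * Amplitudes are on a 0..maxAmp scale.
     */
    static short[] apply(short[] chunk, int startAmp, int endAmp, int maxAmp) {
        short[] out = new short[chunk.length];
        for (int i = 0; i < chunk.length; i++) {
            // Straight line between the two amplitude points, evaluated at sample i.
            int amp = startAmp + (endAmp - startAmp) * i / chunk.length;
            out[i] = (short) (chunk[i] * amp / maxAmp);
        }
        return out;
    }

    public static void main(String[] args) {
        short[] chunk = {1000, 1000, 1000, 1000};
        System.out.println(Arrays.toString(apply(chunk, 100, 0, 100)));
    }
}
```

A curved envelope would then just be a list of such amplitude points, one per chunk, joined by these straight segments.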
Ugh… I don’t even want to write about it anymore. It wasn’t well conceived and caused me a lot of hassle. It precluded any of the more intriguing synthesis techniques I like, such as frequency modulation, because pitch in this system is fixed for each note (and imprecise).
Long story short, when I opened up the source code of Drumsynth recently, I realised that… it just uses floats and gets along fine. For modulation, it simply keeps track of phase as another float. I should’ve done that.
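For comparison, a float-based oscillator that tracks phase might look roughly like this. This is my own sketch of the general phase-accumulator technique in Java, not a port of Drumsynth's C code; the PhaseOscillator class and the simple linear pitch sweep are assumptions chosen to show that pitch can change from sample to sample.

```java
// Hypothetical phase-accumulator oscillator; not Drumsynth's actual code.
class PhaseOscillator {
    /** Renders a sine whose frequency glides linearly from startHz to endHz. */
    static short[] render(double startHz, double endHz, int sampleRate, int numSamples) {
        short[] out = new short[numSamples];
        double phase = 0.0; // current position in the cycle, in radians
        for (int i = 0; i < numSamples; i++) {
            double t = numSamples > 1 ? (double) i / (numSamples - 1) : 0.0;
            double hz = startHz + (endHz - startHz) * t;
            out[i] = (short) Math.round(Math.sin(phase) * Short.MAX_VALUE);
            // Advance by however much of a cycle this sample covers at the current pitch.
            phase += 2.0 * Math.PI * hz / sampleRate;
            if (phase >= 2.0 * Math.PI) {
                phase -= 2.0 * Math.PI; // wrap so the float stays small and precise
            }
        }
        return out;
    }

    public static void main(String[] args) {
        short[] sweep = render(440.0, 220.0, 44100, 44100);
        System.out.println("Rendered " + sweep.length + " samples, first = " + sweep[0]);
    }
}
```

Because the frequency is only consulted when advancing the phase, any modulation (a pitch envelope, vibrato, FM) can be dropped in at that one line.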
(That said, I think Drumsynth’s sound quality is far from pristine. This isn’t from rounding errors, I’m certain, but from not doing more complex stuff like supersampling. But, that’s out of my ability level right now anyway.)
Using floats, I still would have had trouble with the timing for the sequencer, probably… but that would have led me to the realisation that I was biting off too much!
It’s not a complete loss. I really enjoyed trying to calculate sine waves while sticking to integer arithmetic. I found out about Bhaskara’s approximation, implemented it, and then found some really nice code using bitshifts to do a Taylor series approximation of a sine wave. (I wish I had the chops to come up with it myself!)
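For reference, Bhaskara's formula approximates sin(x) for x between 0 and 180 degrees as 4x(180 - x) / (40500 - x(180 - x)). Below is a sketch of one way to use it with integers only, rescaled so the phase is a sample index within a period rather than degrees. The IntSine class and its signature are my own invention for illustration, not the code that ended up in Foldy.

```java
// Hypothetical integer-only sine using Bhaskara's approximation.
class IntSine {
    /**
     * Approximates amplitude * sin(2*pi*index/period) without floating point.
     * Works in half-sample units so that odd periods need no fractions.
     */
    static int sine(int index, int period, int amplitude) {
        long n = period;                               // half a cycle, in half-sample units
        long p = 2L * Math.floorMod(index, period);    // position, in half-sample units
        int sign = 1;
        if (p >= n) {       // second half of the cycle mirrors the first, negated
            p -= n;
            sign = -1;
        }
        long bump = p * (n - p);                       // parabola peaking mid-half-cycle
        // Bhaskara's formula rescaled from degrees: 16*bump / (5*n*n - 4*bump)
        long value = amplitude * 16L * bump / (5L * n * n - 4L * bump);
        return (int) (sign * value);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 360; i += 30) {
            System.out.println(i + " degrees -> " + sine(i, 360, 1000));
        }
    }
}
```

The approximation is exact at 0, 30, 90, 150 and 180 degrees and stays within about 0.002 of the true sine elsewhere, which is below what I could hear in a synth this crude.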
Reading the source of Drumsynth also completely changed my approach to the GUI code. I originally had all of the classes that make up the synth – Note, Chunk, Sequence and so on – also be GUI elements by inheriting Java Swing component classes. I think I picked this up from some book or tutorial, but it’s obviously not good. It breaks the basic principle of decoupling.
Drumsynth blew my mind with its simplicity. There are no classes as it’s written in C, an imperative language. The synthesis is just one long function! I almost didn’t know you could do that, having spent a year studying Java and OOP. But given that the app is non-realtime (meaning that there is a third of a second pause to calculate the sound before you can hear it)… this is the sensible approach. Logically, it is one long straight task that we’re doing.
So I ripped out the GUI code from my main classes, and stuck it into one class called Control. Drumsynth’s GUI is even more decoupled: it’s written in a different language – a Visual Basic form that calls DLLs to access the synth functions!
(Yes, I know this is pretty out-of-date inspiration – heck Drumsynth even cheekily uses INI files for configuration though they were officially deprecated – but I think the lesson on directness and decoupling stands.)
My overall lessons from this project are:
- Do normal stuff rather than trying to reinvent things.
- Find exactly what draws you to a project and make that the focus. E.g. with this I would’ve been better off making something smaller and more conventional but which allowed me to try some unusual FM stuff.
- Even though I’ve so, so much further to go, I kinda like low-level stuff. I mean, okay, nothing in Java is actually low-level, but still I was dealing with buffers, overflows, even endianness! Those are fun errors to fix.
- Read other people’s code!
Even more generally, there’s a kind of tricky question here. This project showed me that it’d be a huge amount of work to approach the quality level of some of the audio programming frameworks out there such as JSFX, VST/Synthmaker, or JUCE. If I’m interested in actually programming synths for musical purposes, I should use one of those.
On the other hand, these are all coded in C or C++ (maybe with another abstraction layer such as the EEL scripting language in the case of JSFX). If I really want to understand fundamentals, I should learn C.
But, it’s not very likely I’ll get a job doing high performance programming of that sort, considering the competition from those with as much enthusiasm as me for flash graphics or cool audio, but much more chops! I’m at peace with that – I quit music to get out of a profession that is flooded with enthusiasts.
Stuff to mull over.