This blog was founded to promote study of black music. In recent months anti-racism has become unprecedentedly mainstream with the 2020 Black Lives Matter protests in the US and their global echoes. I support that cause and I’m glad to see this widespread shift in opinion affecting many organisations.
Closer to home, Irish people are channeling that energy into the End Direct Provision movement. Direct provision is a disgraceful, inhumane and wasteful system that deprives individuals and families seeking asylum in Ireland of the right to work or cook their own food, for years on end.
The other anti-racism challenge for Ireland, from what I can see, is integrating immigrants, especially second-generation youth, at the community level.
I don’t play music or do musicological research anymore, but the respect I gained for black culture through both of those activities will always stay with me. I love black music so much, I could go on for days! And don’t get me started on the black philosophy, metaphysics, style and other wonders I glimpsed in the course of my old studies.
This blog stands for fairness for black people. How beautiful that will be when we get there, probably only song can express.
I was working on video rendering on Android devices – reading through and mashing together libraries and example code I found. The task as it’s currently structured has five main classes interacting:
An extractor to present encoded frames of video from a file
A decoder to decode each frame
An animator to keep this happening regularly and check if it’s time to show a new frame
A view which is the interface element the video will be shown in, and which I’m using as a convenient repository for all the other action (this seems to be a pretty standard approach)
A renderer which talks to OpenGL, the rendering API
I’ve been bodging everything together in one prototyping/sketchpad project. My approach was to keep it just-about-functional for my one test case (displaying three overlaid alpha-blended video elements) as I swapped in and out components and ported bits from Java to Kotlin.
I definitely learnt a lot from that, but today I tried a different tack.
Instead of accomplishing a fairly complete task messily and crudely, I took the single component I wanted to understand and placed it in a blank, fresh project. Then I tested it!
I got the class, TimeAnimator, to report all salient variables to GUI elements. Like debug text but organised so I can see the situation at a glance.
I realised I couldn’t make it do what I thought it had been doing, which was fire events at a steady framerate equal to the video framerate (of 30fps).
I shuffled through variants of the class and learnt what they do.
After a bit, I realised none of them did what I wanted. I went back to the library I was studying and finally twigged that the controlling framerate/timing information came from the decoded video data itself, which was merely being checked (at a frequency much faster than the framerate) by the TimeAnimator – which isn’t meant to run anywhere near as slow as 30fps.
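The gist of that discovery can be sketched in plain Java. This is my own illustration with my own names – not the library’s code or the Android API – assuming a tick every 5ms, much faster than the 30fps video: the tick doesn’t generate frame timing, it just checks the decoder’s presentation timestamps and releases each frame once its time has passed.

```java
// Sketch of timestamp-driven frame pacing (hypothetical names, not Android API).
public class FramePacing {
    /** Should the frame stamped presentationTimeUs be shown,
     *  given microseconds elapsed since playback started? */
    static boolean isDue(long presentationTimeUs, long elapsedUs) {
        return elapsedUs >= presentationTimeUs;
    }

    public static void main(String[] args) {
        long[] frameTimesUs = {0, 33_333, 66_666}; // ~30fps timestamps from the decoder
        int next = 0;
        // Simulate a fast animator ticking every 5ms.
        for (long now = 0; now <= 70_000; now += 5_000) {
            if (next < frameTimesUs.length && isDue(frameTimesUs[next], now)) {
                System.out.println("tick at " + now + "us renders frame " + next);
                next++;
            }
        }
    }
}
```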
So far, so trivial. But I might have kept playing with the original code for ages without actually challenging my assumptions, distracted by all its moving parts and the fact that it looked somewhat like the final product I wanted.
I will definitely be reusing this technique of isolating the smallest component that I don’t understand, and “prickin’ it and frickin’ it” as my favourite rapper Guru might say.
Hope to be back here soon to discuss what this video functionality is actually for!
Warm analogue it ain’t. I knew when I started coding my synth-sequencer, Foldy, a few months ago, that it’d be harshly digital and crude sounding. I was inspired by tracker software as well as by two old PC-only music programs, Drumsynth and Hammerhead (which were the basis of my beat-creating project last year).
I’m releasing it today and calling it version 1.0. It works, but some iffy design decisions mean I won’t keep developing it.
That said, the code quality is a step up from my last release, the experimental art tool MoiréTest. I was able to go back and make big changes in Foldy, without the whole thing crumbling, which is always a good sign.
For the rest of this post I’ll explain what the program does, then what questionable decisions I made and how I would do it again.
(To try it yourself, download Foldy.jar from here and double click on it. If that doesn’t work try the further instructions in the readme.)
Foldy takes in a musical sequence, which you can type into a box in the app window. Notes are numbered as MIDI notes, where A=440 is note 69; notes range from 0 to 127 and are separated by commas. A rest is -1.
(By the way, did you know that, incredibly annoyingly, there is no industry standard for numbering the octaves of MIDI notes? The frequencies are agreed on, but one manufacturer’s C3 is another’s C4… how sad. This doesn’t impact Foldy though, I just work from the frequencies.)
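Working from the frequencies just means the standard equal-temperament conversion. A minimal sketch (the method name is mine, not Foldy’s):

```java
public class MidiPitch {
    /** Equal-temperament frequency for a MIDI note number (A4 = note 69 = 440 Hz). */
    static double noteToFreq(int note) {
        return 440.0 * Math.pow(2.0, (note - 69) / 12.0);
    }

    public static void main(String[] args) {
        System.out.println(noteToFreq(69)); // 440.0
        System.out.println(noteToFreq(81)); // one octave up: 880.0
    }
}
```

Note that the formula never mentions octave names at all – which is why the C3/C4 disagreement doesn’t matter here.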
The speed at which notes are played is altered using tempo and beat subdivision controls. All the other parameters in the window modify the sound of individual notes. Only one note can play at a time. This kept things a bit simpler, though with the Java Sound API, opening another output line or mixing two together wouldn’t be much harder.
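As a rough sketch of how tempo and subdivision might map onto the audio stream (assuming 44.1 kHz; the names are mine, not Foldy’s actual code), the two controls together give a step length in samples – and the truncation hints at why whole-number timing gets awkward:

```java
public class StepLength {
    static final int SAMPLE_RATE = 44_100;

    /** Samples per sequencer step at a given tempo (BPM) and beat subdivision.
     *  Note the truncation: 120 BPM with 4 steps per beat is really 5512.5
     *  samples per step, so a whole-number grid drifts slightly. */
    static int samplesPerStep(double bpm, int subdivision) {
        return (int) (SAMPLE_RATE * 60.0 / (bpm * subdivision));
    }

    public static void main(String[] args) {
        System.out.println(samplesPerStep(120, 4)); // 5512
        System.out.println(samplesPerStep(60, 1));  // 44100
    }
}
```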
I was going to include a choice of mathematical curves, possibly Bezier curves, for the amplitude envelope, out of a perverse desire to avoid the bog-standard Attack-Decay-Sustain-Release model, which is suited to a keyboard instrument where a note is attacked, held and released. I was thinking this synth could be more percussive, inspired by the basic sample-playback model of drum machines and trackers (a type of sampler software originally made for Amiga computers and associated with the demoscene).
Unfortunately I didn’t finish the Bezier stuff, but in any case it probably wasn’t suitable. (For one thing, Bezier curves can easily have two y values for one x value.) In fact, I didn’t do any extra envelope options, partly because envelopes typically drive filters or modulations, which my architecture doesn’t allow. If there’s an obvious v1.1 feature, extra envelope curves is it.
One feature that did make it in is “wave-folding”. To get more complex waveforms, I cut a sine wave at a certain amplitude, and invert anything above that amplitude. This can be done multiple times to add a lot of harmonics.
However, this is a restrictive technique with a distinctive grinding, mechanical sound. All we’re doing here is shaping a waveform which is then repeated exactly at the period of the note frequency. The ear instantly picks up the lack of complexity.
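A minimal sketch of that folding step (my own naming and code, not Foldy’s): any sample beyond the threshold is reflected back inside it, and repeating the fold at lower thresholds piles on more harmonics.

```java
public class WaveFold {
    /** Fold a sample: anything beyond ±threshold is reflected back inside. */
    static double fold(double x, double threshold) {
        if (x > threshold) return 2 * threshold - x;
        if (x < -threshold) return -2 * threshold - x;
        return x;
    }

    public static void main(String[] args) {
        // A sine peak of 0.9 folded at 0.5 reflects down to ~0.1.
        System.out.println(fold(0.9, 0.5));
        // A second fold at a lower threshold adds further harmonics.
        System.out.println(fold(fold(0.9, 0.5), 0.05));
    }
}
```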
I remember, as a teenager, having the following bright idea: if I can see that the recorded waveform from my bass consists of repeated bumps, can’t I just sample one of those and repeat it/change its speed to get any bass note I want?
This is the basic concept of wavetable synthesis. However, when done as simply as that, it sounds completely artificial, not at all like a bass guitar. The sound of any real instrument has complexities like propagating resonances, changes in pitch, string rattle and other distortions/energy loss.
(E.g. listen to the low note in this sampled bassline – it starts really sharp, then reverts to normal. That’s because plucking a stringed instrument raises the pitch of the note momentarily, especially on an open string – I think this was an open E string on the original sampled recording, it’s just been pitched up here.)
Foldy has no capability for such modulations. I could try to put them in, but here we come up against the compromises I made at the start.
Because I was afraid that rounding errors would mount up and give me grief, I decided to keep everything as whole numbers, taking advantage of the fact that digital audio ultimately is whole numbers: a series of amplitudes or “samples”, each expressed as, for example, a 16-bit or “short” integer. (Most studios mix at 24-bit these days, but CD audio, say, only goes up to 16-bit precision.)
This informed the basis of the synth. Desired frequencies and tempos are approximated by a wavelength and a subdivision length expressed in whole samples. 44100 samples per second might seem fairly precise, but for musical pitches, it isn’t. So I found a compromise that bounded pitch error to about 20 cents:
Foldy tries to fit multiple wave cycles within a whole number of samples, for example 3 cycles in 401 samples. This gives a bit more precision, because the wavelength is 401/3 = 133.667 samples, in between the 133- and 134-sample wavelengths that are all I could get otherwise.
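That search can be sketched like this (my own code and names, not Foldy’s): try a few cycle counts, round each to a whole number of samples, and keep the combination with the smallest pitch error in cents.

```java
public class ChunkFit {
    static final double SAMPLE_RATE = 44_100;

    /** Pitch error, in cents, of approximating targetHz with `cycles`
     *  wave cycles packed into `samples` whole samples. */
    static double centsError(double targetHz, int cycles, int samples) {
        double actualHz = SAMPLE_RATE * cycles / samples;
        return 1200 * Math.log(actualHz / targetHz) / Math.log(2);
    }

    /** Best cycle count (1..maxCycles) for the target frequency. */
    static int bestCycles(double targetHz, int maxCycles) {
        int best = 1;
        double bestErr = Double.MAX_VALUE;
        for (int c = 1; c <= maxCycles; c++) {
            int samples = (int) Math.round(SAMPLE_RATE * c / targetHz);
            double err = Math.abs(centsError(targetHz, c, samples));
            if (err < bestErr) { bestErr = err; best = c; }
        }
        return best;
    }

    public static void main(String[] args) {
        // Around 330 Hz, one cycle rounds to 134 samples (~4.7 cents flat),
        // while 3 cycles in 401 samples lands within half a cent.
        System.out.println(centsError(330, 1, 134));
        System.out.println(centsError(330, 3, 401));
        System.out.println(bestCycles(330, 4));
    }
}
```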
I then use these bits of audio, which I call “chunks”, and which could contain a single cycle or a handful of cycles, in the same way I was using single wave cycles originally. So every note would contain hundreds of them. Then I decided I could reuse this division to store amplitude envelopes – I gave each chunk a starting amplitude, and interpolated between these. (Of course, this is redundant at the moment because my overall envelopes are merely a linear interpolation from maximum to zero! But with a curved envelope, the result would be to store the curve within a few dozen or hundred points, with straight lines from point to point.)
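The per-chunk envelope idea might look something like this sketch (names and structure are mine, not Foldy’s): store one starting amplitude per chunk and draw straight lines from each to the next.

```java
public class ChunkEnvelope {
    /** Amplitude at sample position `pos` within a chunk of `chunkLen` samples,
     *  linearly interpolated from this chunk's starting amplitude to the
     *  next chunk's starting amplitude. */
    static double amplitudeAt(double startAmp, double nextStartAmp,
                              int pos, int chunkLen) {
        double t = (double) pos / chunkLen;
        return startAmp + t * (nextStartAmp - startAmp);
    }

    public static void main(String[] args) {
        // Halfway through a chunk that ramps from 1.0 down to 0.5:
        System.out.println(amplitudeAt(1.0, 0.5, 200, 400)); // 0.75
    }
}
```

With enough chunks per note, a curved envelope becomes a polyline of a few dozen or hundred such segments.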
Ugh… I don’t even want to write about it anymore. It wasn’t well conceived and caused me a lot of hassle. It precluded any of the more intriguing synthesis techniques I like, such as frequency modulation, because pitch in this system is fixed for each note (and imprecise).
Long story short, when I opened up the source code of Drumsynth recently, I realised that… it just uses floats and gets along fine. For modulation, it simply keeps track of phase as another float. I should’ve done that.
(That said, I think Drumsynth’s sound quality is far from pristine. This isn’t from rounding errors, I’m certain, but from not doing more complex stuff like supersampling. But that’s beyond my ability level right now anyway.)
Using floats, I still would have had trouble with the timing for the sequencer, probably… but that would have led me to the realisation that I was biting off too much!
It’s not a complete loss. I really enjoyed trying to calculate sine waves while sticking to integer arithmetic. I found out about Bhaskara’s approximation, implemented it, and then found some really nice code using bitshifts to do a Taylor series approximation of a sine wave. (I wish I had the chops to come up with it myself!)
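Bhaskara I’s 7th-century formula is a lovely fit for integer arithmetic: for x in degrees from 0 to 180, sin(x) ≈ 4x(180−x) / (40500 − x(180−x)), so with integer x everything stays whole until the final division. A sketch (my own implementation, shown with a floating-point divide for clarity):

```java
public class BhaskaraSine {
    /** Bhaskara I's approximation of sin(x degrees), valid for 0 <= x <= 180.
     *  The numerator and denominator are whole numbers for integer x;
     *  only the final division needs fractions (or a fixed-point scale). */
    static double sinApprox(int degrees) {
        int p = degrees * (180 - degrees);
        return 4.0 * p / (40500 - p);
    }

    public static void main(String[] args) {
        System.out.println(sinApprox(90)); // exactly 1.0
        System.out.println(sinApprox(30)); // exactly 0.5
        // Worst-case error across the range is around 0.002:
        System.out.println(sinApprox(45) - Math.sin(Math.toRadians(45)));
    }
}
```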
Reading the source of Drumsynth also completely changed my approach to the GUI code. I originally had all of the classes that make up the synth – Note, Chunk, Sequence and so on – also be GUI elements by inheriting Java Swing component classes. I think I picked this up from some book or tutorial, but it’s obviously not good. It breaks the basic principle of decoupling.
Drumsynth blew my mind with its simplicity. There are no classes as it’s written in C, an imperative language. The synthesis is just one long function! I almost didn’t know you could do that, having spent a year studying Java and OOP. But given that the app is non-realtime (meaning that there is a third of a second pause to calculate the sound before you can hear it)… this is the sensible approach. Logically, it is one long straight task that we’re doing.
So I ripped out the GUI code from my main classes, and stuck it into one class called Control. Drumsynth’s GUI is even more decoupled: it’s written in a different language – a Visual Basic form that calls DLLs to access the synth functions!
(Yes, I know this is pretty out-of-date inspiration – heck, Drumsynth even cheekily uses INI files for configuration though they were officially deprecated – but I think the lesson on directness and decoupling stands.)
My overall lessons from this project are:
Do normal stuff rather than trying to reinvent things.
Find exactly what draws you to a project and make that the focus. E.g. with this I would’ve been better off making something smaller and more conventional but which allowed me to try some unusual FM stuff.
Even though I’ve so, so much further to go, I kinda like low-level stuff. I mean, okay, nothing in Java is actually low-level, but still I was dealing with buffers, overflows, even endianness! Those are fun errors to fix.
Read other people’s code!
Even more generally, there’s a kind of tricky question here. This project showed me that it’d be a huge amount of work to approach the quality level of some of the audio programming frameworks out there such as JSFX, VST/Synthmaker, or JUCE. If I’m interested in actually programming synths for musical purposes, I should use one of those.
On the other hand, these are all coded in C or C++ (maybe with another abstraction layer such as EEL scripting language in the case of JSFX). If I really want to understand fundamentals, I should learn C.
But, it’s not very likely I’ll get a job doing high performance programming of that sort, considering the competition from those with as much enthusiasm as me for flash graphics or cool audio, but much more chops! I’m at peace with that – I quit music to get out of a profession that is flooded with enthusiasts.