Mural Musings

Last week, as I scrolled around Google Maps trying to orient myself in Brussels – where I have just relocated – I spotted a familiar name: Schuiten.

It turns out that Francois Schuiten, one of my favourite ever comic book artists, executed a mural in 2013 (collaborating with a painter and set designer named Alexandre Obolensky) on the Chaussée d’Ixelles, only a few minutes’ walk from my temporary accommodation. I’ve been a fan of Schuiten’s work for ages, so I had to pay a visit.

On my way over, though, I saw this:

A mural of a cartoony city scene with a sinister figure skulking on a steeple in the foreground. The mural takes up a whole three-storey wall.
Photo by Lionel Gripon, from trompe-l-oeil.info

It’s not by Schuiten. Nor is it exactly a mural: it’s printed on vinyl. At first glance I thought it was some American superhero thing. The French pun on moustique (“mosquito”) for the hotel name, plus the European houses and cars, should have alerted me. It’s actually from a comic book panel drawn by Olivier Schwartz in 2013 for a spin-off of the Spirou series. The artwork is called La Femme Léopard (“The Leopard-Woman”).

I passed it by fairly quickly first time round. I don’t love superhero themes, and the cartoony humour in the details is something I could dismiss as “charming”.

It was only once I reached my intended destination, L’Arche (“The Ark”) by Schuiten and Obolensky, that some shock set in.

My favourite living comic book illustrator – an artist whose utopian visions of an alternate-history Brussels were part of the reason I moved here in the first place – put his name to this. Is it just me, or is it badly lacking in punch?

A painted mural of a vast ship with trees bursting out the top looming over an old-fashioned city. The mural is three storeys high.
Image credited to Graphivore, from bd-best.com

I couldn’t deny it had notably less impact than the cartoony piece I’d passed on my way over.

How did Schuiten, an experienced illustrator and even something of a cultural ambassador for this city, come off second best like this?

Some technical points came to mind as I compared the two pieces.

When your image is surrounded by the sky, just standing out against the light is a challenge. Unfortunately, Schuiten’s mural is north-facing and therefore permanently in shadow – with even its brightest parts much dimmer than the sky.

Compounding the problem, though, the tones of the ship’s hull and the watching figures are not dark enough. (We can see in the bottom left corner of the painting that darker tones were available, so it’s not just a limitation of the paint used.)

This lack of contrast saps the mural of its impact. Meanwhile La Femme Léopard features pure black lines and two large masses of pure or nearly pure black (one of which is also the focal point of the composition).

Both murals use the visual idea of a bright space between dark elements. But L’Arche pulls its punches, not getting dark enough where it counts!

I do have to acknowledge Schuiten’s ambitious use of the building. While its north-facing aspect is a problem, another quirk is incorporated quite cleverly. The trick relies on a visual idea in the source drawing: the near head-on view of a ship whose curved prow becomes a vertical line leading the eye upwards, where it hits the dramatic peak of the ship’s 2D shape – and appreciates its looming mass!

A drawing of a vast ship with trees bursting out the top looming over an old-fashioned city
Francois Schuiten’s original drawing L’Arche, from altaplana.be

In the mural, that vertical line is aligned with an edge where the wall turns slightly, so that the two sides of the hull are angled away from each other in real life. It’s a good idea, yet not entirely successful. I suspect that the eye would prefer a straightforward flat image to peruse.

On the topic of looming weight, though, let’s examine how both pieces arrange their main shapes.

Huge, hulking vehicles and buildings are a preferred subject of Schuiten’s – they dominate entire pages in L’Archiviste, for example, to great effect.

An intricate drawing of a sci-fi cityscape with house-sized buildings that resemble cylinder heads from a car engine. In the foreground two figures clamber over one such construction. The background fades into pale mist.
Here, we have buildings that look like parts from vehicles. A page from the book that got me into Schuiten

Giant, rugged, inhuman masses, swathed in mist… I’m no expert, but it’s a lot like the Romantic era’s striving for the sublime. I feel like in the last decade or so this kind of depiction of fantasy machines/buildings went solidly mainstream. The Blade Runner and Dune remakes, games like Shipbreakers, the recent Star Wars spin-off Andor, all celebrate such harsh grandeur. And in general, I’m all for it.

But, filling a page with a subject is different to filling a wall – the eye can parse the former instantly and feel the impact, while my first impression of Schuiten and Obolensky’s mural was of having to drag my gaze over a large, unrewarding expanse of rusted panels. And the already aggressive perspective of the source image is even more distorted when seen from below.

A different view of the mural of the looming ship, revealing the rusty panelled sides of the ship and the figures looking up at it from below
A large, unrewarding expanse of rusting panels? Photo by Linda De Volder

The big, weird, exotic shape is just too big and weird to read at that scale!

The Schwartz mural, despite being copied straight from a comic book, fits better into its surrounds. The picture’s horizon line (the fundamental axis of its city vista) is placed at the same height as the building’s eave line. This lends a reliable frame for the perspective. (And above that horizon is a starry night sky, rather wittily skirting the problem of competing with the real sky’s brightness.)

A mural of a cartoony city scene with a sinister figure skulking on a steeple in the foreground. The mural is seen between two other buildings.
Photo by Sanza Francois, from bdmurales.wordpress.com

Rather than trompe-l’oeil, we have an unambiguous window into a completely different perspective. But some subtle interactions with the surrounds help to sell it: the foreground masses feel vaguely contiguous with the real life walls on each side of the mural, and the two-point perspective means that verticals (sides of buildings) in the image are vertical in real life. They align with the mural’s edge and verticals in surrounding buildings.

It all adds up to a feeling of peering between buildings at a city scene. In the modern world, anyhow, we’re used to viewing photos and screens from less-than-perfect angles: with the clear orientation provided by the framing devices, the mural reads fine from street level.

By contrast, Schuiten and Obolensky, in pursuit of something more illusionistic, felt the need to tweak proportions. They scaled down the human figures, the element closest to the viewer. (Similar to Michelangelo sculpting David’s body at a smaller scale than his head.) But this weakens the composition. The source drawing has a point of tension where a human figure almost brushes against the downward sweep of the hull. The mural shrinks down that figure, leaving more blank space between it and the ship, and so that focal point loses energy.

I don’t like to keep ragging on Obolensky (who passed away a few years ago) and Schuiten. Their piece is the more ambitious and technically difficult of the two. Even the fact that it’s all hand-painted rather than printed out deserves respect.

So. Although Schwartz’ blown-up comic panel is crisp and attractive, “stick with what you know” is too narrow and prescriptive a conclusion to draw from this tale of two murals.

Maybe I’d go with “don’t muddy your intent with gimmickry”. I think L’Arche would have worked better on one flat wall, keeping the proportions and the distribution of tones of the source image.

I still love your work, Francois Schuiten!

Diatonic Tetrachords

Recently, I started thinking about what four-note combinations are possible within the diatonic scale. I was familiar with the idea of 4-note cells made of adjacent tones, for example, D E F G. If I broadened this to allow non-adjacent tones, for example, D F A B, would there be an unmanageable number of combinations?

Actually, there are only 7 x 5 = 35. All possibilities are covered if, from each of the 7 scale notes, we ascend in 5 different ways: in seconds, in thirds, in fourths; and then following these two patterns: second, second, third and third, second, second.
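If you'd like to check that count rather than take my word for it, here is a minimal Python sketch. It treats the seven scale notes as degrees 0 to 6 (pitch classes, so octaves are ignored) and confirms that the five ascending patterns, started from each degree, produce every possible four-note subset exactly once. The names `PATTERNS` and `build` are just my own labels for this illustration.

```python
from itertools import combinations

SCALE = ["C", "D", "E", "F", "G", "A", "B"]

# The five ascending patterns from the text, written as scale-step sizes
# (1 = a second, 2 = a third, 3 = a fourth).
PATTERNS = {
    "seconds": (1, 1, 1),
    "thirds": (2, 2, 2),
    "fourths": (3, 3, 3),
    "second-second-third": (1, 1, 2),
    "third-second-second": (2, 1, 1),
}


def build(start, steps):
    """Return the four scale degrees (0-6) reached by ascending from `start`."""
    degrees = [start]
    for step in steps:
        degrees.append((degrees[-1] + step) % 7)
    return degrees


# Every tetrachord the five patterns can generate, as unordered sets of degrees.
generated = {
    frozenset(build(start, steps))
    for steps in PATTERNS.values()
    for start in range(7)
}

# Every possible way of choosing 4 of the 7 scale degrees.
all_subsets = {frozenset(c) for c in combinations(range(7), 4)}

print(len(generated), generated == all_subsets)  # 35 True
print([SCALE[d] for d in build(1, PATTERNS["seconds"])])  # ['D', 'E', 'F', 'G']
```

Because the sets are unordered, a combination like D F A B shows up here as the stacked thirds starting from B.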

(By the way, this whole analysis is written without regarding what octave a note is in. The technical term for a particular note – say, C – considered without regard for what octave it’s in, is a “pitch class”. So, except when I discuss inversions, my analysis is in terms of pitch classes.)

So anyway, I had this grid of chords.

Or can we call them that? Seven of them are indeed conventional seventh chords, and there are some other common chords in there such as the Maj(add2) “Steely Dan chord”. But most of these structures would be too dissonant to be conventional chords. I think their interest is as melodic source material. I’ll call them “tetrachords” without the implication that these notes should be sounded together.

So we have this grid. But actually I hate when music theory books give grids of algorithmically-derived combinations. How boring. Where’s the meaning?

To try to find out, I investigated some of the characteristics of each of my 5 movement types and came up with… another grid, but with more interesting info.

Three of my categories (actually, I call them “types” in the column names of my second table) yield tetrachords with inner symmetry. That means that two of the notes of the tetrachord are exact reflections (inversions) of the other two.

I use Steve Coleman’s terminology for the two types of symmetry (one reflects around a single note, Spiral #1, the other reflects around the crack between two notes, Spiral #2).

Next, the “span” stuff is about the interval between the top and bottom note of the tetrachord if they are crammed into one octave (known as a “closed position” chord). We see that the stacked seconds fit within a fourth, which makes sense as they are just four adjacent notes in the major scale; and that the stacked thirds (the seventh chords) cannot fit in an interval smaller than a sixth.

In my investigations I was drawn to tetrachords with tritones as these are, unsurprisingly, particularly colourful. They are marked in the second last column.

Finally, although there are 35 tetrachords, a number of these are exact transpositions of each other. If we consider the exact transpositions to be duplicates and remove them, we are left with 20 unique structures. For more direct comparison, I’ve transposed these 20 so they all start on C. I was able to coin or reuse a name for all but three of them, as you can see.
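One way to reproduce that count of 20 is sketched below. I'm assuming the C major scale and describing each tetrachord by the semitone distance of each note above its starting note, so two tetrachords count as duplicates when those distances match. The helper names (`pitch`, `shape`) are made up for this example, and it is only one possible definition of "exact transposition".

```python
MAJOR_SEMITONES = [0, 2, 4, 5, 7, 9, 11]  # C D E F G A B

# The five ascending patterns, in scale steps (1 = second, 2 = third, 3 = fourth).
PATTERNS = [(1, 1, 1), (2, 2, 2), (3, 3, 3), (1, 1, 2), (2, 1, 1)]


def pitch(degree):
    """Semitones above C for a scale degree, continuing upward past the octave."""
    return MAJOR_SEMITONES[degree % 7] + 12 * (degree // 7)


def shape(start, steps):
    """Semitone offsets of each note above the starting note."""
    degrees = [start]
    for step in steps:
        degrees.append(degrees[-1] + step)
    return tuple(pitch(d) - pitch(start) for d in degrees)


# Collapse the 35 tetrachords into their distinct interval shapes.
shapes = {shape(start, steps) for steps in PATTERNS for start in range(7)}

print(len(shapes))  # 20
print(shape(0, (2, 2, 2)) == shape(3, (2, 2, 2)))  # True: C maj7 and F maj7 share a shape
```

Each of the five patterns contributes four distinct shapes, which is where the 20 comes from.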

To finish, I’ll chatter for a bit about the musical possibilities of these structures. This part will reflect my past as a jazz bass player… I’ll be mostly thinking of possibilities for writing riffs and improvising.

Here’s a good riff. When I first heard this song as a young teenager, the darkness of that clean guitar pattern blew my mind.

We could say the riff is in F phrygian, or thinking in diatonic tetrachords we could say it’s a C diminished triad (VII degree of Db major, that is) with an added fourth. That fourth, F, is the root in this particular inversion.

Here’s another good riff.

The tetrachord used here is, well, a jazzer would say it’s a G minor 6th arpeggio, but in terms of our categorisation, it’s stacked thirds off the 7th degree, or an E half-diminished seventh arpeggio.

When you are considering the melodic possibilities of these structures, remember that any of the four tones could be the root.

To my mind all 20 of the unique shapes are interesting. All the stacked 4th ones have flavour. All the ones with tritones have flavour.

The stacked step ones (perhaps better thought of as four adjacent scale tones) are what started me down this investigation. Steve Coleman finds them in Charlie Parker. It’s my understanding that medieval music theory built modes by combining two of these tetrachords, placing one at the root and the other at the fifth.

I find that these cells (the four adjacent scale tones) invite syncopated improv. Lacking a fifth, they are unsettled, and with their small range, very melodic, so it’s tempting to explore the constrained yet rich combinations.

Perhaps the strongest tetrachords of all, in a tonal context, are the filled triads that don’t involve tritones. The first three bars of the horns melody in Parker’s “Relaxing at Camarillo” are entirely made of a triad plus a fourth.

And the minor reflection of the major triad with a fourth – that is, the minor triad with a second – is equally strong.

Sticking to one diatonic scale might seem constraining. I’ll end with two ideas for generating further tasty sets of pitches.

You could take one of these tetrachords and add a new note, and consider the new note the root note. (The resulting structure needn’t be diatonic.) To my ear this can create a nice tension between two worlds, the inwardly consistent tetrachord and the root note which can be a rare release or contrast.

Or you could take one of the tetrachords and add it to a transposed version of itself. In this scenario, zero, one, two or three of the notes might overlap so you end up with a whole bunch of possible scale-like structures of 5-8 notes. But the point being that each of the two tetrachords could retain its identity and inner dynamic.

Hope you enjoyed today’s foray into theory, and maybe even got an idea for something to investigate or jam over yourself!

Subtractive Sounds

I made some presets for a nifty little software synth called Lokomotiv, and felt like sharing!

The sounds are on the basic side. A few sources of inspiration were: the crappy soundcard FM synthesis that I heard when I first started sequencing little riffs on my computer aged about 12; the soundtrack for Age of Empires II; some of the sounds in Hiroki Kikuta’s soundtrack for Secret of Mana (which I’ve blogged about before); and maybe some other game soundtracks like the mighty Deus Ex OST. I went for lots of “guitar”, lots of mellow pads, and lots of “organ”. Plus some vaguely ethnic and folky things.

As a synth, Lokomotiv is quite constrained – which I like, but it limits the sound palette. In particular, the filter and the amp share the same envelope, so adding independent filter movement like a sharp initial attack, a funky quack or some movement in the tail, is pretty much impossible.

There are three oscillators, but each is locked to a central frequency and only produces a limited palette of waves. It’s a smaller possibility space even than my cheap little Korg Minilogue.

So I had to squeeze as much juice as I could out of what was there. This pushed me to learn about hard sync and pulse-width modulation. And I also made use of detuned saw waves. (BTW if you want authoritative info on these topics, check out the Synth Secrets series from Sound on Sound magazine.)

I had never understood hard sync before although it’s a feature on my Minilogue. The idea sounds silly: there are two oscillators, each with their own frequency, but the second resets every time the first completes a cycle. So, you’d be right in saying that the second does not in fact have its own frequency. The subtlety is that the second oscillator does vibrate at its own speed in between every reset: if its speed is faster, it will fit in multiple peaks in between each reset; if its speed is slower, only part of its waveform will play before each reset. So actually, although the second oscillator’s base frequency is locked, you can create a wide variety of wave shapes at that frequency. The shape is that multiple or else fractional waveform that the second oscillator spits out between each reset.
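To make the reset idea concrete, here is a rough Python sketch of a hard-synced sawtooth. It is a naive digital illustration, not how Lokomotiv or the Minilogue do it internally, and the function name and parameters are my own. I'm also assuming the sample rate divides evenly by the first (master) oscillator's frequency, to keep the arithmetic simple.

```python
def hard_sync_saw(master_hz, slave_hz, sample_rate, num_samples):
    """Naive hard-synced sawtooth: the slave restarts on every master cycle."""
    # Samples per master cycle (assumed to divide evenly, for simplicity).
    period = sample_rate // master_hz
    out = []
    for i in range(num_samples):
        # How far we are into the current master cycle, i.e. since the last reset.
        since_reset = i % period
        # The slave runs at its own speed from that reset point.
        phase = (since_reset * slave_hz / sample_rate) % 1.0
        # Map phase 0..1 onto a sawtooth from -1 up towards +1.
        out.append(2.0 * phase - 1.0)
    return out


# Master at 100 Hz, slave at 250 Hz: two and a half slave cycles fit between resets.
wave = hard_sync_saw(100, 250, 48000, 960)
print(wave[:480] == wave[480:])  # True: the result repeats at the master's frequency
```

Changing `slave_hz` changes the shape inside each 480-sample cycle but never the length of the cycle, which is the whole trick.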

So you’re generating extra harmonics, basically, of the base frequency. For technical reasons, the spectra generated this way tend to have gaps in them, and this can mimic (extremely crudely) the unimaginably complex sound generation of acoustic instruments like guitars. Particularly if you modulate the frequency of the second oscillator (and hence the resulting harmonic content).

I used this technique for lots of my guitars, pads, basses and organs. They often reminded me of FM sounds, even though this is a subtractive synth, which is meant to be the polar opposite of cold FM. The thing is, at the mild levels of modulation that one uses for these instruments (as opposed to say a distorted guitar or brass sound or a bell), both hard sync and FM are just about adding in a few harmonics, and end up sounding comparable. (“Adding in a few harmonics”, of course, is a synth technique in its own right, additive synthesis.) However, when you increase the severity of the effect, FM goes into a different territory, metallic and shooshy, while hard sync gets very gritty and noisy (and unusable, really).

And that last point has to do with a limitation of hard sync that got annoying. I couldn’t explain the maths to you, but hard sync basically always generates a side frequency. It’s often out of audible range, but unfortunately, particularly in the upper registers, many of my sounds betray this unwanted aliased, whistling tone.

Pulse-width modulation is a bit simpler to visualise. You take a square wave, and lengthen the part where the wave is up and shorten the part where it’s down. Changing these lengths, known as changing the “pulse width”, also modulates harmonic content in a manner some describe as “stringed instrument like”.
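Here is a small Python sketch of why the pulse width changes the harmonic content. It uses the textbook Fourier series of an ideal pulse wave swinging between +1 and -1, so it describes the maths rather than any particular synth, and the function names are invented for the example.

```python
import math


def pulse_wave(duty, samples_per_cycle):
    """One cycle of a pulse wave: up for `duty` of the cycle, down for the rest."""
    high = round(duty * samples_per_cycle)
    return [1.0 if i < high else -1.0 for i in range(samples_per_cycle)]


def harmonic_level(n, duty):
    """Magnitude of the nth harmonic of an ideal +/-1 pulse wave."""
    return abs(4.0 / (n * math.pi) * math.sin(n * math.pi * duty))


print(pulse_wave(0.25, 8))

# A 50% pulse (a square wave) has essentially no even harmonics;
# narrowing the pulse to 25% brings the second harmonic back in.
for duty in (0.5, 0.25):
    print(duty, [round(harmonic_level(n, duty), 3) for n in range(1, 5)])
```

Sweeping `duty` slowly back and forth is what makes the harmonics fade in and out, which is the "modulation" part.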

I used PWM for some basses, pads and synth leads.

The detuned saw waves, meanwhile, got used for chorusy stuff like choirs and pads, and at lower levels of detuning, for reed instruments like sax and accordion. (I think the sax was so un-sax-like I left it out of the sample pack.)

Finally, Lokomotiv has an amp section with some overdrive and I used that to pull out some more sizzle and shine from some of the pads, synth basses and electric pianos.

All in all… well I think some of the sounds are nice and evocative, for sure, of my 90s-2000s youth of fooling about on PCs – bad MIDI, dinky game OSTs. I’m not sure if the samples are all that useful. They’re quite plain really. But I was glad to make them just as a recognition of my achievement.

I’ve put my presets in an RPL file you can load into Lokomotiv. Lokomotiv itself is a free VST instrument you can easily add to your Digital Audio Workstation (Ableton, Reaper, Logic, whatever), and I think it’s a good tool, so give it a shot by all means. If you search for it you can find a 32-bit version if you need one (I did ’cause I’m a Windows 7 boy).

And in any case, you can check out the samples here. 577 files, sampling about 41 instruments.

Thanks for reading!

Keep It Simple

Having been into game design since way back, I’ve read my share of “how to make a game” articles. Many of these advise creating a finished version of the simplest computer game you can think of. Accordingly, last month I started designing a version of Tic Tac Toe, and coding it in a programming language I’m learning for work: Go.

Thinking that this would be too easy, I upped the challenge by taking a proper software engineering approach: structuring my code into reusable parts that only interact with each other via narrow, strictly-defined channels. I managed to create a clean separation between my underlying game engine and the specific Tic Tac Toe game logic, using object-oriented programming techniques (methods and unexported fields) and functional programming ones (closures), plus some common patterns such as events and a dirty flag.

Actually, though, defining the Tic Tac Toe game logic wasn’t trivial. It took multiple sessions of pen and paper revision to figure out the data structures, the underlying engine entities, and the AI.

(Some of that was just learning the intricacies of Go. I’ve found that any programming language, no matter how elegantly-designed, contains unpalatable complexities somewhere, but Go staves that off for as long as possible. Which is nice. All the same, pointers, embedded structs, and the details of how slices work, had me scratching my head at times.)

As on previous occasions, I’ll present my account of developing my app – which I call Triplet – as a software postmortem. I hope you enjoy the read.

You can check out the game code here.

What went well

  • Lots of design over about a week of evenings. I was influenced by some advice from the preface of How To Design Programs. I tried to determine what I want my game to do (functions) to what data (structures passed between functions).
  • Isolating the input and output code. I used the ubiquitous SDL library (via the go-sdl2 bindings) to handle opening a window, drawing to the screen, and capturing mouse clicks. But I was proud that I was able to hide this implementation detail from the game logic. (I.e., I would be able to switch out SDL for a different library, without any modification to my game logic.)
  • Creating reusable UI widgets. I made generic Button and TileGrid objects in my engine, which were then used for the reset button and the game over message, and to display the Tic Tac Toe board.
  • Callbacks and closures. Taking inspiration from JavaScript, I had my Button objects incorporate a custom click function which can be defined at the moment of creation (using Go’s struct literals, so no need for a constructor or factory method). This function is both a callback (a bit of code handed over to an object which it can execute at will later on) and a closure (a function which remembers what variables were visible to it at the moment it is defined and can access those variables forever).
  • “Triplet positions”. This concept refers to all possible “threes-in-a-row” which one can claim to win a game. I made a data structure simply representing these positions, and then had the AI and the game rules use it when interpreting the raw board state.
  • AI that can play perfectly. It has weights for all possible states of a triplet position (the highest weighting is for blocking an imminent enemy win), and then gives each empty square on the board a value by summing the weightings for each triplet position it’s part of. The highest valued squares represent perfect plays. (To make the game moderately fun, I have the AI take random moves now and then, otherwise the human player could never win.)
  • A finished game and a codebase that I never lost control of. Such are the benefits of designing beforehand. Even then, I found myself at times having to design on the fly, but the underlay of sensible structures meant it never turned into a mess.
  • Picking the easiest project I could think of. A few months ago I tried to code a Breakout clone in Python – and got bogged down in some bounce physics and couldn’t finish it. A bit of humility this time led to a much more satisfying project.
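For anyone curious what the “triplet positions” and weighted AI might look like, here is a rough sketch in Python rather than Go. It is not Triplet's actual code: the names `TRIPLETS`, `WEIGHTS`, `score_squares` and `best_move` are hypothetical, and the weight values are invented for illustration (in this sketch a winning move outranks a block, which may not match the real weights).

```python
# The 8 possible threes-in-a-row, as indices into a 9-square board.
TRIPLETS = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]

# Hypothetical weights keyed by (own marks, enemy marks) in a triplet.
# A triplet holding both players' marks is dead and scores nothing.
WEIGHTS = {
    (0, 0): 1,    # untouched line
    (1, 0): 2,    # building my own line
    (0, 1): 2,    # crowding the enemy's line
    (2, 0): 100,  # I can win here
    (0, 2): 50,   # I must block here
}


def score_squares(board, me):
    """Give each empty square the sum of the weights of the triplets it is in."""
    scores = {}
    for square, mark in enumerate(board):
        if mark is not None:
            continue
        total = 0
        for triplet in TRIPLETS:
            if square not in triplet:
                continue
            marks = [board[i] for i in triplet]
            own = marks.count(me)
            enemy = sum(1 for m in marks if m is not None and m != me)
            total += WEIGHTS.get((own, enemy), 0)
        scores[square] = total
    return scores


def best_move(board, me):
    """Pick the empty square with the highest score."""
    scores = score_squares(board, me)
    return max(scores, key=scores.get)


# X threatens the top row, so O should take the top-right square (index 2).
board = ["X", "X", None,
         None, "O", None,
         None, None, None]
print(best_move(board, "O"))  # 2
```

Keeping the triplets in one shared list means the win check and the AI can both read the board through the same lens.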

What went badly

  • No tests *hands-over-face emoji*. Yes, I was that guy and it’s bad practice. But on the other hand, the game works. I got away with this because of the extremely limited possibility space in the game (and the lack of features, see below).
  • Multiple, unreconciled conceptions of time. We have the time of sequential Tic Tac Toe turns, basically up to 9 discrete “moments” – and also the real time in which the game waits for mouse clicks. This became an issue because I didn’t want the AI to move instantaneously after the player, which would feel jarring. So the AI has to participate in “real time”, waiting for 750ms to take its turn. I did this by sending in a “tick” callback function which is called by the engine in every iteration of the game loop, to check whether enough time has elapsed for the AI to take a move. This doesn’t fit with the otherwise event-based design. If I wanted to make animations, say, this would have to be refactored.
  • A bug (which I find rather amusing) with the delayed AI reaction. The player can actually play the AI’s move if they click an empty square during the wait period!
  • Unfinished graphics and input handling. You can’t clear a graphic, only draw on top of it; and overlapping only works if the order of items in the drawables slice is the draw order you want. There is no conception of overlaid buttons blocking clicks from whatever’s below. But… for the current design, it’s workable. But perhaps a bit silly to bother making an engine that provides hardly any features.
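The delayed AI turn can be sketched as a closure that the game loop polls on every iteration. This is a Python approximation under my own assumptions, not the Go code from Triplet: `make_ai_tick`, `schedule` and `tick` are hypothetical names, and time is passed in as plain milliseconds so the example runs without a real game loop.

```python
def make_ai_tick(delay_ms, take_turn):
    """Build a (schedule, tick) pair that fires `take_turn` after a delay."""
    state = {"deadline": None}  # remembered by both closures below

    def schedule(now_ms):
        # Called when the human has just moved: start the waiting period.
        state["deadline"] = now_ms + delay_ms

    def tick(now_ms):
        # Called by the engine on every loop iteration.
        if state["deadline"] is not None and now_ms >= state["deadline"]:
            state["deadline"] = None
            take_turn()
            return True
        return False

    return schedule, tick


moves = []
schedule, tick = make_ai_tick(750, lambda: moves.append("AI moves"))

schedule(1000)        # player clicked at t = 1000 ms
print(tick(1500))     # False: still waiting
print(tick(1750))     # True: 750 ms have passed, the AI moves
print(moves)          # ['AI moves']
```

A fix for the amusing bug above would presumably be to ignore board clicks while a deadline is pending, though I haven't tried that here.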

Well, that’s about it. I hope this post wasn’t too narrowly focused on details. Overall I had a ball with this, it’s delightful to sit down and do an hour of work, and have it slip neatly into a larger structure (and compile!). My imperfect but dogged attempt to stick with boring, standard programming patterns, enabled that.

I appreciated how much I’ve soaked up from reading game dev articles, post-mortems, code. I have a bit of an idea of what a game’s codebase should look like. I’m sure my implementations are somewhat incoherent and wrong-headed, but even half an idea of what I was doing really helped.

I also appreciated what I perceive as a contradiction at the heart of game dev…

In the rest of the software industry, programmers reject approaches likely to cause unpredictability: functions with side effects, objects with persistent state, or strong coupling.

But fun computer games, aiming to present “a series of interesting choices”, are all about those rejected things! An engrossing game often requires side effects (of player decisions), persistent state (of the world), and coupling of widely separated systems.

I believe games have an inherent pull away from computation’s abstraction, towards concreteness – ultimately, towards the embodied human being who’s experiencing the pleasure. There’s a resistance there (to the correct, mainstream, rational approach) that I find philosophically provocative.

Anyway, thanks for reading! Till next time.

Boom Bap

It’s been nearly a year since I posted here! In the meantime, I bought a sweet synth, and today I’m sharing some drum samples I made on it.

My synth is a Korg Minilogue, a two-oscillator analogue synth but with digital preset management and self-tuning. I recommend this little beast, it’s a lot of fun to play with. I honestly haven’t used it to jam with people or make a finished song yet. But I have programmed a bunch of patches (sounds) for it.

You can download them here! That’s 99 WAVs ready to play in your music production software of choice.

The Minilogue can’t synthesise realistic drum hits. But it can get within shouting distance of the drum machines that have been ubiquitous in pop for a few decades, your 808s and 909s (which were analogue circuits, by the way). This level of fidelity was and is about mimicking a few key timbral qualities of a snare, cymbal or bass drum – not convincingly replicating the full sound.

I crafted my sounds mostly by ear, but I did use some basic recipes which I’ll very quickly summarise here. (If you’re interested, I recommend the drum synth series on Sound On Sound for more depth.)

Kicks: these are done with a downward swooping tone. There are two ways to get this on the Minilogue: using the PITCH EG INT knob to cause the second envelope to detune the second oscillator; or turning up the filter resonance so high that it’s the primary tone and then modulating the cutoff so it swoops down. I tended to use the first oscillator as a subtle extra tone, suggesting overdrive. Similarly I used the SHAPE knob on VCO2 (the second oscillator) to add some harmonics for that distorted feel. Some filtered noise was used too.
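If you want to hear the “downward swooping tone” idea without a synth to hand, here is a bare-bones digital imitation in Python. It is only a sketch of the principle (a sine wave whose pitch and volume both fall away quickly), not a model of the Minilogue, and every parameter value is a guess you would tune by ear.

```python
import math


def kick(sample_rate=44100, length_s=0.25, start_hz=200.0, end_hz=50.0,
         sweep_s=0.05, decay_s=0.1):
    """Sine tone that swoops from start_hz down towards end_hz while fading out."""
    samples = []
    phase = 0.0
    for i in range(int(sample_rate * length_s)):
        t = i / sample_rate
        # Pitch envelope: starts high and settles exponentially on end_hz.
        freq = end_hz + (start_hz - end_hz) * math.exp(-t / sweep_s)
        # Amplitude envelope: a simple exponential decay.
        amp = math.exp(-t / decay_s)
        samples.append(amp * math.sin(phase))
        # Advance the phase by this sample's share of one cycle.
        phase += 2.0 * math.pi * freq / sample_rate
    return samples


hit = kick()
print(len(hit))  # 11025 samples, a quarter of a second at 44.1 kHz
```

The list of floats could be written out with Python's standard `wave` module after scaling to 16-bit integers.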

Snares: these are done with two tones, generally about a sixth apart (for reasons to do with the physics of drum skins and snare rattles, their harmonics are not spaced according to the normal harmonic series – this sixth approximates the distance between the first two harmonics of a snare), plus filtered noise. Then you tweak the knobs and hope for the best. The PITCH EG INT was good for adding subtle pitch movement for a more “boing-y” feel.

Cymbals: you can’t really make anything like a cymbal on a simple analogue synth. The closest I could get was by turning on ring modulation for some vaguely metallic effects, and also using CROSS MOD DEPTH which I believe causes some frequency modulation, also vaguely metallic. Then mucking about with noise and filtering. The Minilogue seems to have interesting edge cases and interactions between its few elements, so this approach is not as bad as it sounds.

Other drums: again, as with the snare, I used two tones pitched by ear, and tried to use the SHAPE knobs, CROSS MOD DEPTH and all the rest to get away from sounding like a musical interval.

Unlike when programming dance music basses or leads, the filtering tends to be pretty subtle: resonance turned down, envelope inconspicuous. I did use the HI PASS CUTOFF which is a handy high-pass filter later in the chain, to lighten up some cymbals, snares and claps – but again, subtle.

I think that’s about it. I would be delighted if you found a use for the samples. A credit would be appreciated of course. Hope you have fun and you find something tasty in there.

Synth Update

Just a real quick one today. I’ve made a new version of my software synth Golden (that uses the golden ratio as its sound source!) and you can check it out here: https://drive.google.com/file/d/1LnacJhq6x6Zuam2NLdw2seiM93gbcf2Q/view?usp=sharing

I simplified the interface a lot, mostly by removing unnecessary options. There are no longer two instances of the additive synthesis engine bundled together – a more flexible way to experiment with that stacking effect is to use multiple tracks in your DAW.

For example, you can get instant deep drones by making three identical tracks with a long MIDI note, and then setting track 2’s instance to “First overtone” and track 3’s instance to “Second overtone”. This will get you a chord tuned according to golden ratio intervals! The sound is a little harsh but it’s amazing with the well-known free delay effect NastyDLA providing some dusty air.

I also fixed the polyphony/retriggering issue so notes will behave as expected. And I fixed a bug in the 8th voice and standardised the startup values.

As always with additive synthesis, watch out, it can get very loud.

That’s it from me, have fun if you download it!

Revisiting A Classic

I finished another programming project and I think it’s my strongest yet, thanks to me finally getting serious about testing! I called my app HuffmanRevisited because it implements the classic Huffman coding algorithm from 1951, and also because I had previously tried to program this algorithm a few months ago.

This time round, I coded it in Java, not JavaScript. It probably required about a solid week of work. And unlike my earlier attempt, I made a finished app. It can:

  • load a text file, encode (compress) it and save the encoded version, or else
  • load a previously compressed file, decode it and save the original content back as a text file

(You can read my code here if you like.)

A few aspects of my approach helped make coding this an enjoyable task: I tried to specify it rigorously, I wrote a lot of tests, and I tackled a clear, well-known problem.

Before Christmas, I read a book – well, no, I skimmed a book after reading the prologue (which took multiple attempts) – called How To Design Programs. It’s an MIT textbook, you can read it here. I recommend it.

My paraphrase of the book’s prologue is: “precisely specifying what a program does is the hardest part of making it.”

Of course, I had encountered this sentiment in my Object Oriented Software Engineering module in the National College of Ireland. But the MIT textbook avoids the familiar paraphernalia of design patterns, verbose conventions and profuse diagrams. Instead, the book’s prologue challenges you to specify a problem by answering two questions: how is the data represented (in terms of data types that the machine can understand, and at all steps of the program)? and what are the functions (in terms of operations on those types)?

I mean, when I write it out, it’s dull and obvious. Yet, time and again I’ve found myself skimping on this specification step. Because, yes, it is the hardest part and one’s brain tries to escape it.

Even after my epiphany from reading the MIT book, I still evaded it. I specified most but not all of my Huffman encoding app before I started, thinking that the remainder was “simple”. But the simple part is never simple and if you haven’t thought it through, the solution will be incoherent even if it seems to work.

I failed to specify all of HuffmanRevisited, but at least I knew that this failure was to blame when I witnessed complexity mushrooming in front of my eyes as I tried to solve new, small problems that kept cropping up.

BTW, I’ll mention a couple of those little problems to see if you spot a pattern that kind of pleased me:

  • accessing individual bits from an array of bytes
  • packing an unpredictable number of bits into an array of bytes
  • turning Java’s int and char types into a byte representation (not just casting which truncates them)
  • saving a compact representation of a binary tree containing characters in its leaf nodes (the ‘Huffman tree’)
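To make the first three of those little problems concrete, here is a minimal sketch. It is written in JavaScript (the language of my earlier attempt) rather than the Java of the finished app, and every function name in it is hypothetical, made up for illustration and not taken from HuffmanRevisited. It assumes bits are numbered from the most significant bit of the first byte.

```javascript
// Hypothetical helpers: none of these names come from HuffmanRevisited.

// Read bit number `index` (0 = most significant bit of the first byte).
function getBit(bytes, index) {
  const byte = bytes[index >> 3];   // which byte holds the bit
  const shift = 7 - (index & 7);    // position of the bit inside that byte
  return (byte >> shift) & 1;
}

// Pack an array of 0/1 values into bytes; unused bits in the last byte stay 0.
function packBits(bits) {
  const bytes = new Uint8Array(Math.ceil(bits.length / 8));
  bits.forEach((bit, i) => {
    if (bit) bytes[i >> 3] |= 0x80 >> (i & 7);
  });
  return bytes;
}

// Split a 32-bit integer into four bytes, most significant byte first.
function intToBytes(n) {
  return Uint8Array.of((n >>> 24) & 0xff, (n >>> 16) & 0xff, (n >>> 8) & 0xff, n & 0xff);
}

// Nine bits need two bytes: 10110000 then 10000000.
const packed = packBits([1, 0, 1, 1, 0, 0, 0, 0, 1]);
console.log(Array.from(packed)); // [176, 128]
```

Because the last byte is padded with zeros, a real decoder also needs to know how many bits are meaningful, for instance by storing the bit count alongside the data.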

Yeah… the pattern I spotted is that I’m doing low-level stuff and pushing somewhat against the limitations of Java. This is nice because I’ve been looking for an excuse to study a more low-level language!

The other thing that went really well with this project was testing. I’d written tests in JUnit before, but to be honest I was doing it to fulfil obligations in school assignments. Just like with specifying rigorously, I knew that tests are a great idea but was lazy about writing them.

I totally changed my tune once I had the framework up and running. (I used JUnit 5, Maven and NetBeans 11, and I mention this combination because I had no joy with JUnit 4 or with Ant.) I realised I’ve always done a lot of testing, but amateurishly: printing variables to the console all the time. That works okay… until your program starts to have a decent amount of methods on the call stack (i.e., functions that call functions that call functions that call functions…) and you spend your time trying to remember in which method did your gnomic text originate. Plus, all those print statements mess up your code.

Not to sound too much like a new convert, but using a test framework was a delight after all that. It’s much more organised. (And with my let’s say “wide-ranging” mind, I need all the organisation I can get!) It’s just like what I was doing before, except you only see debug text if there’s a problem, you get to choose any range of values you like to test, you can test as small or as large a section of the program as you like (as long as you’ve made the program itself reasonably articulated), and all of this business lives in separate files. Oh, and you get a cheerful screen of green when you pass your tests!

It’s enough to warm anyone’s heart

Okay, so specification and testing are non-negotiable aspects of real-world software development, but the last aspect I want to discuss can be more of a luxury: a clearly defined problem.

Until I get a start in a programming job, I can’t be sure, but my impression is that even communicating what a problem is, never mind completing a rigorous specification, can be hard in a typical business context.

However, I did this project for self-study so I got to choose exactly what to work on.

(I was helped in this by a comp sci book called The Turing Omnibus that a mentor very kindly lent me! It has a chapter on Huffman coding. The hook of the book, I would say, is that it conversationally introduces topics but doesn’t take you through every nuance. For example, unlike the Wikipedia article on Huffman coding, it mentions neither the need to pad bytes with zeros, nor any scheme for storing a binary tree.)

I was so glad I chose such an old chestnut of an algorithm to implement! When I was refactoring my way out of that mushrooming complexity I mentioned earlier, the clarity of my app’s intention was a godsend.

Even better was the lack of edge cases. I could be certain my program had worked when it took a text file, compressed it into a smaller file, and then decompressed that encoded version into the exact same data I started with!

That’s intrinsically neater than some other areas I’ve boldly attempted, for example digital audio or vector graphics, where you need good control of sampling and rounding.

When I do go back to such complex topics, I’ll have a crucial bit of extra experience with the exact weapons needed to contain the ambiguity. Testing and full specification.
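As a small illustration of the testing weapon, here is a sketch of the round-trip check described a few paragraphs back: compress, decompress, compare with the original. It is a toy Huffman coder in JavaScript, not the code of the Java app. All names are hypothetical, bits are kept as a string of "0" and "1" characters instead of packed bytes, and the text is assumed to be non-empty.

```javascript
// Toy Huffman coder for illustration only (assumes non-empty input text).

// Build the tree by repeatedly merging the two lightest nodes.
function buildTree(text) {
  const counts = new Map();
  for (const ch of text) counts.set(ch, (counts.get(ch) || 0) + 1);
  let nodes = [...counts].map(([ch, weight]) => ({ ch, weight }));
  while (nodes.length > 1) {
    nodes.sort((a, b) => a.weight - b.weight);
    const [left, right, ...rest] = nodes;
    nodes = [...rest, { left, right, weight: left.weight + right.weight }];
  }
  return nodes[0];
}

// Walk the tree, recording the path ("0" = left, "1" = right) to each leaf.
function buildCodes(node, prefix = "", codes = new Map()) {
  if (node.ch !== undefined) {
    codes.set(node.ch, prefix || "0"); // a one-symbol text still needs one bit
  } else {
    buildCodes(node.left, prefix + "0", codes);
    buildCodes(node.right, prefix + "1", codes);
  }
  return codes;
}

function encode(text, tree) {
  const codes = buildCodes(tree);
  return [...text].map((ch) => codes.get(ch)).join("");
}

// Follow the bits down the tree; emit a character at each leaf and restart.
function decode(bits, tree) {
  let out = "";
  let node = tree;
  for (const bit of bits) {
    if (node.ch === undefined) node = bit === "0" ? node.left : node.right;
    if (node.ch !== undefined) {
      out += node.ch;
      node = tree;
    }
  }
  return out;
}

// The round-trip property: decoding the encoded text gives back the original.
const original = "abracadabra";
const tree = buildTree(original);
const bits = encode(original, tree);
const restored = decode(bits, tree);
console.log(restored === original); // true
```

In a test framework the final comparison would become an assertion, run over a range of input texts rather than a single string.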

So, I’ll sign off there. My app could easily absorb some more effort. The next thing to work on would be efficiency. Do some profiling, and just comb through it for wastages. I can also think of some cool ways to present it, but no point hyping work I may not get around to.

Anyway, I’m pleased with it already. Jaysusin’ thing works, and the code structure is fairly sensible.

Thanks for reading. Take care in these hard times!

The header image “Massachusetts Institute of Technology Parking Lot, Memorial Drive, Viewed from Graduate Housing” by MIT-Libraries is licensed under CC BY-NC 2.0. I chose it because David A. Huffman might have seen such a view as he wrote the term paper in which he invented his compression technique.

JRPG Song Forms

I love classic Japanese console RPG soundtracks like Final Fantasy VII and Secret of Mana. The idea of writing in that style appeals to me. But one thing that saps my confidence is when I struggle to find a section to follow a fragment I’ve already written. I tend to grab at the first possibility, even when the connection is weak or forced.

It would be great to have a general idea of how sections are shaped and connected in these songs.

So today I’ve analysed ten of my fave tracks by Nobuo Uematsu (Final Fantasy VII) and Hiroki Kikuta (Secret of Mana, Seiken Densetsu 3). I wanted to know:

  • How long are the songs and sections?
  • What elements are repeated, and how many times?
  • What textural and harmonic progressions connect sections?

I ended up with this giant chart (which you can see in full size here):

TMI

Let me try explain this crazy chart. As you can see, I laid out a timeline for each song, boiling down everything in it to the following abstract categories:

  • beats & riffs – 1-4 bars long:
    Sometimes riffs change their note content to match a chord progression. But I still view it as the same riff. E.g., the synth arpeggio in ‘Prelude’.
  • phrases – the units of melody, 2-6 bars long:
    Of course, the judgement of phrase length can be arbitrary. I just decided these cases intuitively, trying to avoid fussiness. So, my chart doesn’t show every little motif.
  • sections – the large-scale divisions

If you’re familiar with sequencer software, you’ll recognise where I got the idea for all this. It’s how these tracks would look in a sequencer’s “Arrange” window: horizontal lanes containing MIDI clips: short, repeated grooves and beats, and longer melodic or chordal themes.

However, I’m not representing every instrument. I use the “Ostinato” lane for any repeating figure or combination of repeating figures, and anything that I deem to be a melody or theme (whether single note, harmonised, counterpoint or chordal) goes in the “Phrases” lane.

I’ve done the jazz musician thing and reduced the harmony to chord symbols. I don’t condone this in general. It’s just to sketch out what’s going on for skimming purposes. And while I’m confessing sins, I also used mode names to describe chords. In a past post I complained about overuse of modes as an explanatory device. However, I think modes are the best explanation for aspects of Hiroki Kikuta’s music.

Let’s analyse!

Now, these are game soundtracks and the structure is first and foremost determined by having to loop indefinitely. Every one of these tunes has a section, the loop, that will cycle for as long as your game character stays in that location or game state (e.g. the battle screen). Four of the songs also have a preceding section that I call the intro.

The looping is part of the aesthetic, providing a hypnotic dreaminess, a melancholy, an escapism into something both boundless and yet safely predictable.

Obviously, looped music needs both variety and smoothness if it’s to avoid annoying the listener.

I never completed Final Fantasy VII (or even played either Secret of Mana or Seiken Densetsu 3) but I remember songs getting annoying when you had to redo a task too many times, like the Chocobo race. Or even the battle music, sometimes it’s the last thing you want.

Entrances and starts of sections are almost all square and on the beat. Melodic pickups are used for sure, and drum fills, but there’s never a sensation of skipping the downbeat or disturbing the start of a section. The music, after all, shouldn’t demand too much attention. It should provide drama and atmosphere, and depth for repeated listening, without snagging the ear. This doesn’t prohibit dissonance, strange sounds or unusual time signatures. But they must be safely contained in comfortable box-like structures.

Changes in instrumentation or texture are obviously important to provide diversity within the short loops. I tried to depict the instrumentation changes in the following chart:

4 songs have a (purple) intro section. Each cell of text stands for a musical texture. So, ‘Prelude’ has two textures, synth for the intro, and synth, strings & woodwinds for the loop. ‘Tifa’s Theme’ has no intro, but 5 distinct textures (instrument combinations).

Full size version here.

Again, I’m not happy with this chart. The bars look like a bar chart, but although I am depicting the song structures chronologically from left to right, longer bars don’t represent longer time periods: instead, they represent songs that have more instrumentation changes.

That’s confusing and I’d like to improve on this in future.

Generally there’s a lot of keyboards, woodwinds, strings, mallets. Bit of voice, reed instruments, plucked strings. And a leaning towards kitsch things like barrel organ, accordion, music box.

The orchestration is not dense. I counted at most five different instruments at any time. This has to do with available tech, of course. These tracks are in a sample-based format, similar to tracker music, with (I’m guessing) 8 or 16 simultaneous samples permitted at once.

I presume the instruments were sampled from Yamaha digital synths. It can be hard to tell if something is meant to sound “like a synth”, or like a synthesised version of something real. That kind of stuff gives a lot of the aura of these soundtracks. I’ve spoken about it a bit before.

All right, let’s get onto the structures!

About half of the tunes have a loop length below a minute, while half have a length from 1:30 to 2:30. If you are composing in this idiom, you’ll be writing stuff shorter than a short pop song. Maybe that’s part of the appeal: a bijou version of generally long-winded genres like classical, prog rock and fusion.

‘Prelude’, ‘Tifa’s Theme’, and ‘Fond Memories’ (Uematsu) and ‘Still of the Night’, ‘A Curious Happening’ and ‘Raven’ (Kikuta): all these have a roughly ternary form for the loop. Kikuta in particular uses an AAB form with no variation between the As, a couple of times.

‘Few Paths Forbidden’ (Kikuta) and ‘Anxious Heart’ (Uematsu) have four equal sections in the loop. ‘Sending a Dream into the Universe’ (Uematsu) has only two but the theme’s phrase form is compensatorily more complex. ‘Now Flightless Wings’ (Kikuta) is a special case which I’ll discuss later.

Five of the tunes use repetitions with variation. Strategies for variation are all very familiar:

  • add (or remove) a countermelody, as in ‘Prelude’ and the second part of ‘Now Flightless Wings’
  • octave shifts, that old classic
  • change instrumentation, like flute to oboe in ‘Tifa’s Theme’

Most of the tunes centre around a continuous chunk of thematic melody of around 30-50 seconds’ length. It depends on the tempo, but often that’s 16 bars long. Perhaps because I chose a lot of melancholy and pensive and nostalgic pieces, many of these tracks have a similar moderate 4/4 tempo. Both games feature some 3/4 or 6/8, but less than I expected.

8 out of the 10 tunes have an ostinato of some kind, so that’s definitely a technique to reach for. Of those 8, 6 have it basically throughout.

Finally, let’s mention rests and breaks. All of the songs except for ‘Still of the Night’ and ‘Tifa’s Theme’ and the tiny loop of ‘Now Flightless Wings’, feature a tag or a breakdown to rhythmic hits. This provides a relief from the main melody, within the loop. ‘Raven’ has two different rhythmic breakdown sections.

‘Tifa’s Theme’, ‘Few Paths Forbidden’, ‘Now Flightless Wings’, ‘Anxious Heart’ and ‘Sending a Dream into the Universe’ (all lyrical, emotive ones!) feature prolongations of melody endings by a bar or two, either of a V chord or a I. Nothing too surprising, but another little technique for the toolbox.

In the end, I think I’ve reached the limitations of this kind of analysis. I could try eke out some conclusions about the phrase divisions of these melodies, but we’d learn more by transcribing a couple and talking about them as, you know, melodies.

Okay, time to wrap up with individual comments on each tune.

I apologise for presenting the tunes in no sensible ordering. It’s because I (rashly) chose LibreOffice Calc to lay out my data. Putting the tunes in a sensible ordering would involve too much layout hacking to be worth it.

I gotta say, I haven’t been too impressed with Calc. I encountered a fair few tiny glitches and the export functions are unfinished: I couldn’t find a way to choose what page or what cells to export to image, and the pagination options in the PDF export appear to do nothing.

Anyway. Now comes the fun part!

‘Prelude’ (C major) – Nobuo Uematsu, Final Fantasy VII

This is the first thing you hear when you start the game. Confidently, for 16 bars it features only solo synth arpeggios that climb and fall through 4 octaves with a calm wave-like effect. The synth is warm and woody in its lower registers and chime-like at the very top. An echo effect adds magic dust. The triads are decorated with 9s and, at the end, 7s, providing a bit of extra colour.

Harmonically, it’s a four-chord trick until the parallel minor chords – all familiar but powerful stuff. The mood is mystical but noble. After that full round of synth, a majestic theme, with full chords, in strings and woodwinds, begins.

One smart detail is the order of the theme variants: first a version with ascending countermelodies in the accompaniment, then a plainer version without countermelodies, providing some easing and rest.

‘Still Of the Night’ (A minor) – Hiroki Kikuta, Secret of Mana

This isn’t a million miles away from the hypnotic, chimey, magical mood of Prelude, yet Kikuta’s style is distinctive. It’s more mysterious and warmer, cheekier. This stems from a static dorian modality, alternating with major chords off flattened degrees like bII, bVI and even bI. That sense of mystery comes from the ambiguous voicings (there isn’t a clear bass note) and tensions created by the shifting, slow ostinato against a droning tonic note.

This particular tune is very open in texture though we’ll see him do busier stuff elsewhere. Sonically, we’re in chimey, dreamy land again, but Kikuta’s sounds are warmer. He famously crafted the samples himself rather than leaving it to an engineer, and the result is gorgeous.

‘Tifa’s Theme’ – Nobuo Uematsu, Final Fantasy VII

Wow, this is such a catchy theme, I’ve had it in my head all day. Like Prelude, it’s in a major key with some colourful chords from the parallel minor. Also like Prelude, the progression is basic and powerful. Legato orchestral sounds plus a near-constant vibes arpeggio combine in a mood I’d call soulful.

The strings are done in a bit of a hurry, I think, but we get some contrary motion from variations in the vibes. There’s some not-particularly-subtle symbolism in the melody textures, that nonetheless drew a tear from me, about how Tifa wants a man to love and a return to the happiness she had with her childhood friend Cloud: flute and oboe together, then flute alone, then oboe an octave lower with flute finally rejoining.

The loop back to the start harmonically goes to I from a II, although the melody does strongly lean on the V note. It’s as if the theoretically necessary, bridging V7 chord is only briefly hinted at.

‘Few Paths Forbidden’ – Hiroki Kikuta, Seiken Densetsu 3

What a groover! This one has an awesome syncopated drums and bass guitar groove, a warm hooting synth harmonised melody, with wheeling syncopated marimba riffage in the background.

We’re getting into Kikuta’s secret sauce here: notice how the marimba has a quiet lower harmony line which subtly contributes some pulsing bass activity alongside the expertly sparse bass guitar throbs. The slapback echo adds texture and emphasises the woody quality while pleasantly obscuring that lower line – just another example of Kikuta’s gorgeous (yet economical) sonic layering – pleasant depth like a bed of bracken.

The slightly out of tune mallet sound adds flavour and realism.

The pumping bass uses the slab-like weight of bass guitar as a powerful device in itself. This is a composer who gets it.

‘Now Flightless Wings’ (Ab major) – Hiroki Kikuta, Secret of Mana

This one’s a special case. From reading the YouTube comments, I glean that it’s the last song heard in the game and it’s there to deliver an emotional payoff at the story’s end. Tense strings chords get harmonically warmer, into an infinite loop of gorgeous glowing barrel-organ and music box sounds. I haven’t played the game but even so the bittersweet “life is sad” loveliness affects me. The extreme shortness and simplicity of the loop makes it like a lullaby, childish, vulnerable and ephemeral. That said, some subtle counterpoint and harmonic variations bring depth and ornamentation so it’s not too plain. Brilliant stuff.

‘Anxious Heart’ (F minor) – Nobuo Uematsu, Final Fantasy VII

This one starts with cinematic string swells. The harmony is tenser than in the other Uematsu pieces we’ve seen: minor to parallel major shifts with roots moving in thirds, featuring that shift from a major to minor 3rd which signals a mood of awe and transcendence. A lot of emotional payoffs in music happen on these types of big, simple colour shifts. So good!

Then it goes into what I think of as “rainforest” vibraphone, after this amazing Jay Hoggard exotica track that I’ve always loved.

The intro is in 5/4, I think, just to lengthen out the chords.

‘A Curious Happening’ (C minor) – Hiroki Kikuta, Secret of Mana

Swung sixteenths sleazy freaky noir funk. There’s probably something that could be said here about Japan’s relationship to African-American culture, but I amn’t informed enough to grasp it.

This track has very funky timbres. Both the synth and the xylophone in the intro vamp are primarily sonic/timbral. Although they’re outlining a Im6 to I-7b5 jazzy chord alternation, what we’re most aware of is the warm, nearly buzzing fatness from the synth, and dry niggling woody oddness from the percussion. Both are staccato sounds, putting that African emphasis (speaking very, very, very broadly) on note onset (and hence rhythmic expression) over the continuous pure tones of classical music.

In this context, the simple clave rhythm for the breakdown was the perfect choice.

‘Sending a Dream into the Universe’ (C minor) – Nobuo Uematsu, Final Fantasy VII

This one has, I dunno, maybe “Celtic New Age” instrumentation? Keening woodwind, acoustic accompaniment, slow rock drums and synth pads.

There’s a cool programmatic sequence in the harmony. Three times, we change to a minor key a fifth above, via a pivot chord sitting a third away from each key. E.g. Cm Eb Gm. Then Gm Bb Dm. The effect is simultaneously uplifting and sad. Doing it three times in a row emphasises the theme of the title, with a feeling of hopefully, nobly surging upwards. Nice work, Uematsu-san.

‘Fond Memories’ (C major) – Hiroki Kikuta, Secret of Mana

It’s little wonder people get nostalgic about these games… they were made with a clear-eyed understanding of the mechanics and value of nostalgia! This sparkling gauze of single-note piano and faint accordion, with its shimmering delay effect, just gets right down to the business of plucking your heartstrings. Nice balance between the 4-bar major part and the 16-bar minor part. The harmony is triadic, diatonic then relative minor and finally just a bit of parallel minor in the form of a bVII to get us to a colourful and rather inexplicable, but definitely good VI7 chord before going back to the tonic.

‘Raven’ (A minor) – Hiroki Kikuta, Seiken Densetsu 3

This one’s a pure groove/riff tune. A foot tapper! Like in ‘Few Paths Forbidden’, Kikuta does his dorian two-part harmonising thing in the marimba, and also in the woodwinds. This tune just stays on one chord though, with a stomping rhythmic breakdown followed by an ominous, pulsing, pizz strings and flute tag, for variation.

Thanks so much for joining me. Hope these classic JRPG songs warmed your heart! And I hope I put these lessons to use some day soon myself!

P.S. Here’s a playlist of all the tracks I analysed, here’s the full Secret of Mana OST, here’s Seiken Densetsu 3, and here’s Final Fantasy VII.

The Alternate Web

I want to bust a real quick one today on my recent experiences of dipping a toe into alternate and smaller-scale web platforms.

Of course, this article itself is hosted on a dominant web platform, WordPress. And I use Facebook daily for mundane purposes, mostly keeping up with people. (Twitter, on the other hand, gets no love from me.) I’m not writing to rag on big platforms, but to acknowledge a cultural moment when a lot of people are contemplating this switch.

I’ve been reading Hacker News (itself a big platform – they’re everywhere!) for a couple of years and quickly grew familiar with “bring back the old web” sentiments there. I would guess programmers, with their love of the esoteric and the stripped-down, have been saying such things forever. The argument, if I may sum it up crudely, is that personal webpages (whether self-hosted or on services like Geocities) and pre-Web-2.0 media like blogs, newsletters and forums, fostered a more diverse, friendly, expressive, open culture online.

Part of that nostalgia is people remembering a period when only nerds were online – no racist uncles or Karens, to reach for current stereotypes. Also, I’ve the impression that a lot of good memories come from participation in subcultures like MP3 blogs or Flash games, that would obviously have drawn together like-minded folks.

Fast forward to 2021, then, and it makes sense that the many current revivals of the old-school web favour nerdiness over mass appeal. I’ll discuss that a bit more below when I get to my actual experiences.

Another driver of interest in alternative platforms is the manifest inadequacies of Facebook, Twitter and so on. Those companies have the impossible task of trying to please everyone. High-profile bans and legal challenges show that the security, conflict-of-interest and privacy problems of ad-driven social media are out in the open these days.

That recently drove a lot of people from WhatsApp onto the competitor app Signal, including myself.

I also started my own personal website, kevinhiggins.dev, to have an online outlet where the form as well as the content are in my control.

Finally, and mostly inspired by one guy I follow called JP LeBreton, a mild-mannered, leftist game dev, I joined Mastodon, the platform I call “Twitter for nerds”.

I feel much freer to post on Mastodon than on FB, because I don’t have, nominally, 1000 people who know who I am and might be following my posts. The lack of an audience (I’ve no followers on it yet and only got a couple of transient likes) is okay by me. Same with Drum Chant, I never focused on driving traffic to here. This gets right to my perhaps idiosyncratic stance on web publishing of any kind: for me, “putting it out there” is more important than getting a reaction.

I know why this is, it’s a quirk in my personality whereby things feel much realer to me if I’ve written them down. (Hence this blog – and privately, I also journal and keep a half-dozen diaries and logs for various activities.)

Hmm. I’d thought this article might be an encouragement to others to try out alternate platforms, yet now I’m persuading myself that they’re for people like me who are mostly into organising an archive of their thoughts over hanging out with others.

That’s not to say I don’t want the hangs. My own motivation to try out these venues of expression was very simple: lockdown is very lonely and I’m hoping to meet new, like-minded people.

And there are some such on Mastodon, for sure. But rather than starting conversations, for now anyway, I’m taking the shy fellow’s tactic of crafting the feed I’d like to follow.

It’s been fun, and I especially like posting abrupt juxtapositions of content, e.g. counterpoint exercises one minute, rap lyrics the next. I feel free to perform a multipotentialite and intense persona there.

When it comes to my site I imposed more structure to present a neater picture for say a prospective employer. (Check out the site icon!) However, I chose a serif font and some moody colours specifically to hint at 90s web mischief. The links section is intended to send readers off into a maze of esoteric personal pages. Mixing business with pleasure.

I’ll wrap up today with a related trend I’ve noticed and then some blue-sky ideas for more alternate platforms I might try.

A lot of the writing that affected me most last year came by email newsletter. When I contacted the author of one of these to say hi, he mentioned in his answer that he’d found the supposedly old-fashioned format unprecedentedly effective.

I list the three newsletters I follow in the links page of my site.

And to finish… two more avenues for expressing myself online that I’ve been considering are Neocities and Project Gemini. The first is a user-friendly webpage-hosting and linking service, explicitly about recreating the old-school web. I think they might even have, whatchamacall those things, link rings? Webrings!

That could be a place to do something pseudonymous and weird. Prose poetry? Moodboards? Naughty fiction? Something warm and indulgent, anyhow.

(I already have one or two pseudonymous outlets, I recommend it. Though I’m ignorant of the whole web culture of “alts” built on the concept!)

Project Gemini is different. It’s a whole new web protocol, a communication format for online interchange like the Hypertext Transfer Protocol that underlies the whole web. So, instead of an address like https://kevinhiggins.dev, you’d have gemini://gemini.circumlunar.space/servers/

You need special software to view content using this protocol, and it’s text only. It took me more than an hour to find an app that worked, but when I did, it was weirdly fun to read people’s random posts by such a covert, strange route. I remember one person seemed to write only about guitar tunings they were exploring. That kind of thing.

If I publish in “geminispace”, I’d like to write about spirituality and wisdom literature, to lend my own brand of esotericism to the initiative. (Since the Christmas holidays I’ve been reading Chinese philosophy every day, and I’m also a big fan of the likes of M. Scott Peck… and I read a bit of Western philosophy too, until my brain gets tired.) That won’t be under a pseudonym and I’ll let you know here on Drum Chant if I get round to it!

Oh, last thing, I never said anything about Signal. Well, it’s very much like WhatsApp except I found the setup to be a bit more fiddly and tricky – getting stuck in loops asking for permissions on the phone, not immediately importing contacts. It also uses a spaced-repetition technique to get you to learn off your PIN, which is super-nerdy. (Though probably a good idea, I’m sure.) Nothing too surprising there.

Striving For 3D

This week I made some progress towards coding 3D rendering!

I remember when I was in my early teens and a bit bored on holidays at my grandparents, trying to code an image of a road with 1-point perspective. I asked my grandfather to show me how to load the version of BASIC he had on his ancient Amstrad PC (it was GW-BASIC).

Back then, I didn’t get the basic idea of 3D perspective, but it isn’t actually very difficult: if objects are in a space in front of you where X is across, Y up and Z forward (and you are at 0, 0, 0), dividing their X and Y coordinates by the Z coordinate will create the necessary distortion.

(It seems that, typically, a small number is added to the Z component first to reduce the strength of the distortion, otherwise things get madly stretched off screen when the Z approaches zero. I bumped into that problem when making my Pink Sparks demo the other week.)
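Here is a minimal sketch of that Z-divide in JavaScript. The function name projectPoint and the default offset of 1 are my own assumptions for illustration, not taken from any particular demo.

```javascript
// Hypothetical helper: project a 3D point onto a 2D plane by dividing by depth.
// `zOffset` is the "small number" added to Z; its default of 1 is arbitrary.
function projectPoint([x, y, z], zOffset = 1) {
  const depth = z + zOffset;
  return [x / depth, y / depth];
}

// The same X and Y land closer to the centre as the point moves away.
const near = projectPoint([2, 1, 1]); // [1, 0.5]
const far = projectPoint([2, 1, 3]);  // [0.5, 0.25]
console.log(near, far);
```

A larger offset flattens the effect, because the ratio between near and far depths gets closer to 1.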

The real issue is that, to keep the code organised, manipulations of 3D data are best done using matrices. This way, a command can become data. Instead of running some code for each manipulation (such as rotating or resizing a shape), you have one piece of code that obeys a data representation of the desired command. This data is a transformation matrix.

You can then conveniently store these commands. For example, the operations “Rotate by 30 degrees around the X axis, mirror in a plane with the same orientation as the X-Z plane but 5 units above it, resize to 80% scale” could be represented as three matrices (the off-origin mirror includes a translation, so these need an extra row and column: 4×4 rather than 3×3).

If you use homogeneous coordinates, which are like ordinary coordinates but containing an extra element by which all the others are divided, then the Z-divide for perspective correction can be represented by a matrix which copies the Z value into that extra, dividing element (typically called W).
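To show the W trick in action, here is a small sketch in JavaScript. The names transform, perspective and fromHomogeneous are hypothetical, and matrices are stored as plain arrays of rows, which is just one possible layout.

```javascript
// Multiply a 4x4 matrix (an array of four rows) by a 4-element column vector.
function transform(m, v) {
  return m.map((row) => row.reduce((sum, value, i) => sum + value * v[i], 0));
}

// Perspective as data: X, Y and Z pass through, and the last row copies Z into W.
const perspective = [
  [1, 0, 0, 0],
  [0, 1, 0, 0],
  [0, 0, 1, 0],
  [0, 0, 1, 0],
];

// Back to ordinary coordinates: divide everything by W.
function fromHomogeneous([x, y, z, w]) {
  return [x / w, y / w, z / w];
}

const point = [4, 2, 2, 1]; // the ordinary point (4, 2, 2), with W = 1
const projected = fromHomogeneous(transform(perspective, point));
console.log(projected); // [2, 1, 1]
```

The divide itself still happens in ordinary code at the end, but what to divide by is now decided by the matrix, so it can be stored and combined like any other transformation.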

But enough of my attempts at understanding linear algebra! Let’s talk implementation details.

As usual, I made my demos in JavaScript. Being able to trivially publish stuff on the web, and send no-hassle links to friends or relatives, makes this the most attractive choice.

However, I decided not to use WebGL, the graphics-card-accelerated renderer that now comes with all browsers. I’ve been having a good time with WebGL tutorials, but connecting up buffers and typed arrays introduces more places to make a mistake and lose time debugging. I’ll return to WebGL someday because the raw power, the basic idea of shader language, and the depth-buffer and texture manipulation capabilities all attract me deeply. But for now there was, again, a clear best option: HTML5 Canvas.

This is a graphics API with higher-level features such as line-drawing commands – precisely what I wanted for my demos!

The first one I made demonstrated linear transformations in two dimensions. As you can see if you click here, these all operate around the origin (the centre point where the two axes meet), or to put it another way they preserve the origin. I used setInterval(..) to make the animation – not a good choice as we’ll see in a sec.

Then, I made a demo of affine transformations – a larger category which includes the linear transformations, as well as translations (i.e. just moving shapes around with no distortion) and mixtures of linear transforms and translations. To show the way that affine transformations can occur around any point, I added some quick interactivity to let the user set the centre point and choose a transformation. I also used matrix multiplication to iteratively apply the same transformation to my shape.

Affine transformations – area-preserving squash, in this case, which incidentally describes hyperbolic curves.
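A transformation around an arbitrary centre point can be sketched as three matrices multiplied together: move the centre to the origin, apply the linear part, move back. This is a minimal JavaScript illustration using 3×3 matrices in homogeneous 2D coordinates. The function names are hypothetical and this is not the demo’s actual code.

```javascript
// 3x3 matrices in homogeneous 2D coordinates, stored as arrays of rows.
function multiply(a, b) {
  return a.map((row) =>
    b[0].map((_, col) => row.reduce((sum, value, k) => sum + value * b[k][col], 0))
  );
}

const translate = (tx, ty) => [[1, 0, tx], [0, 1, ty], [0, 0, 1]];
const scale = (sx, sy) => [[sx, 0, 0], [0, sy, 0], [0, 0, 1]];

// Move the centre to the origin, apply the linear part, then move back.
function aboutPoint(linear, cx, cy) {
  return multiply(translate(cx, cy), multiply(linear, translate(-cx, -cy)));
}

// Apply a matrix to an ordinary 2D point (implicitly using W = 1).
function apply(m, [x, y]) {
  return [m[0][0] * x + m[0][1] * y + m[0][2], m[1][0] * x + m[1][1] * y + m[1][2]];
}

// Area-preserving squash (twice as wide, half as tall) centred on (3, 4).
const squash = aboutPoint(scale(2, 0.5), 3, 4);
const twice = multiply(squash, squash); // the same transformation applied twice

console.log(apply(squash, [3, 4])); // [3, 4]: the centre stays put
console.log(apply(squash, [4, 6])); // [5, 5]
console.log(apply(twice, [4, 6]));  // [7, 4.5]
```

Multiplying the matrix by itself repeatedly gives the iterated version: each step stretches the point further along X and pulls it closer in Y.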

Around here I started thinking of possibilities for game graphics:

  • a stick person who squashes a bit before and after jumping
  • a stick person who leans back before and at the end of a run (wind-up and braking), done with a shear transformation
  • explosions setting off shockwaves that pass through numerous characters on the screen, squashing them in the direction of travel as they go
  • interactive objects that cause the player character to change size, Alice In Wonderland-style

(I was definitely channeling some old Flash games from my teens… stickdeath.com, anyone? I think that was the URL.)

I’ll get back to these ideas in a second to discuss what I think would actually be the hard part about making them…

My final demo was in actual 3D. Still working off Gregg Tavares’s nice WebGL tutorials (though NOT following his convention for ordering matrix elements in a 1D array), I implemented homogeneous coordinates and a Z-divide. My first attempt had an annoying error in the Y-axis. It turned out everything was working; I had just put my transformation matrices in the wrong order, so the up-and-down bobbing was happening after perspective had been applied!
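To illustrate that kind of ordering mistake (a toy reconstruction, not the demo's real code), here is a sketch comparing the two orders. The bob and perspective matrices are hypothetical, and it assumes column vectors, so the rightmost matrix in a product is applied first.

```javascript
// Product of two 4x4 matrices stored as arrays of rows.
function multiply(a, b) {
  return a.map((row) =>
    b[0].map((_, col) =>
      row.reduce((sum, value, k) => sum + value * b[k][col], 0)
    )
  );
}

// Transform the point (x, y, z, 1), then divide x and y by the resulting w.
function project(m, [x, y, z]) {
  const [px, py, , pw] = m.map(
    (row) => row[0] * x + row[1] * y + row[2] * z + row[3]
  );
  return [px / pw, py / pw];
}

// Hypothetical "bobbing" matrix: shift everything by dy along Y.
const bob = (dy) => [
  [1, 0, 0, 0],
  [0, 1, 0, dy],
  [0, 0, 1, 0],
  [0, 0, 0, 1],
];

// Simplest possible perspective matrix: copy z into w.
const perspective = [
  [1, 0, 0, 0],
  [0, 1, 0, 0],
  [0, 0, 1, 0],
  [0, 0, 1, 0],
];

const bobThenProject = multiply(perspective, bob(2)); // bob acts first
const projectThenBob = multiply(bob(2), perspective); // perspective acts first

const intended = project(bobThenProject, [0, 0, 4]); // [0, 0.5]
const wrongOrder = project(projectThenBob, [0, 0, 4]); // [0, 2]
console.log(intended, wrongOrder);
```

In the first order the bob shrinks with distance like everything else. In the second, the point is shifted by the full 2 units however far away it is, because the shift is no longer scaled down by the depth.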

That’s what it looks like! Click here to see it hosted on my site.

If you look at the working demo, you may see the star seemingly spinning the wrong way, despite the perspective cues, a classic illusion. I think this is just a general fault of wireframe graphics.

BTW, the animation here is handled with the preferred modern JS technique, requestAnimationFrame(..).
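For anyone who hasn't used it, a loop built on requestAnimationFrame can look roughly like the sketch below. The browser calls the callback just before each repaint and passes in a timestamp in milliseconds, so the motion can be driven by elapsed time rather than by counting ticks. The names angleAt, startAnimation and draw are hypothetical, and the schedule parameter is only there so the loop can be exercised outside a browser.

```javascript
// Angle in radians after a given elapsed time, at a fixed turning speed.
function angleAt(elapsedMs, turnsPerSecond = 0.25) {
  return (elapsedMs / 1000) * turnsPerSecond * 2 * Math.PI;
}

// `draw` is a hypothetical callback that renders one frame at a given angle.
// `schedule` defaults to the browser's requestAnimationFrame.
function startAnimation(draw, schedule = (cb) => requestAnimationFrame(cb)) {
  let start = null;
  function frame(timestamp) {
    if (start === null) start = timestamp; // remember when the first frame ran
    draw(angleAt(timestamp - start));
    schedule(frame); // ask for the next frame
  }
  schedule(frame);
}

// In a browser, with some drawStar(angle) function of your own:
//   startAnimation(drawStar);
```

Because the angle comes from the timestamp, a dropped frame makes the star skip ahead slightly rather than slow down.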

All this demo-making raises the question: could anything here become reusable software?

The matrix stuff is eminently reusable. To make it convenient, I would need to make an engine or interface allowing a programmer to load geometrical data, transform it and display it, through well-documented, user-friendly functions, while hiding inner workings.

So the last thing I did this week was some design work on a personal 3D library. Eventually this should be in WebGL, but to test the design I might do it in Canvas and maybe just with wireframes. The crucial point is that geometry exists in all these different spaces before it’s fully processed:

  • object space, that is, vertexes positioned relative to the centre of the object they represent
  • world space, so now that object is positioned in a world
  • camera space, now the world is spun around to face the camera
  • screen space, now anything visible is referred to by the position it takes up on the screen (in this case, the rectangular Canvas on a webpage)
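To make the list above concrete, here is a minimal sketch of one vertex passing through all four spaces. Everything in it is an assumption for illustration: the matrix names, the 200 by 200 canvas, and a camera that needs only a translation (no spinning) to bring the world in front of it.

```javascript
// Multiply a 4x4 matrix (array of rows) by a 4-element column vector.
function transform(m, v) {
  return m.map((row) => row.reduce((sum, value, i) => sum + value * v[i], 0));
}

const translate = (tx, ty, tz) => [
  [1, 0, 0, tx],
  [0, 1, 0, ty],
  [0, 0, 1, tz],
  [0, 0, 0, 1],
];

// Hypothetical scene: the object sits 3 units along Z in the world, and the
// camera sits at z = -1 looking along +Z, so the world shifts by +1 in Z.
const objectToWorld = translate(0, 0, 3);
const worldToCamera = translate(0, 0, 1);

// Perspective divide, then map the -1..1 range onto canvas pixels.
// Canvas Y grows downwards, hence the flip.
function toScreen([x, y, z], width, height) {
  const px = x / z;
  const py = y / z;
  return [((px + 1) * width) / 2, ((1 - py) * height) / 2];
}

const inObject = [1, 1, 0, 1]; // relative to the object's centre
const inWorld = transform(objectToWorld, inObject); // [1, 1, 3, 1]
const inCamera = transform(worldToCamera, inWorld); // [1, 1, 4, 1]
const onScreen = toScreen(inCamera, 200, 200); // [125, 75]
console.log(inWorld, inCamera, onScreen);
```

Keeping each stage as its own named step is what leaves room for experiments: a fish-eye effect, say, would slot in between camera space and screen space without touching the rest.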

All of these have potential for interesting experimentation. What exactly defines an object is an open question – can an object be composed of others, and in what ways might those sub-objects be transformed? Once in camera space, what are the possibilities for fish-eye effects or non-Euclidean geometry? And of course screen space is the traditional domain of the visual artist, the flat sheet.

Well that’s some big talk on the back of a spinning star. Baby steps though!