Striving For 3D

This week I made some progress towards coding 3D rendering!

I remember when I was in my early teens and a bit bored on holidays at my grandparents, trying to code an image of a road with 1-point perspective. I asked my grandfather to show me how to load the version of BASIC he had on his ancient Amstrad PC (it was GW-BASIC).

Back then, I didn’t get the basic idea of 3D perspective, but it isn’t actually very difficult: if objects are in a space in front of you where X is across, Y up and Z forward (and you are at 0, 0, 0), dividing their X and Y coordinates by the Z coordinate will create the necessary distortion.

(It seems that, typically, a small number is added to the Z component first to reduce the strength of the distortion, otherwise things get madly stretched off screen when the Z approaches zero. I bumped into that problem when making my Pink Sparks demo the other week.)
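In code, the divide is just a couple of lines. A minimal sketch (the names and numbers are mine, not from the demo):

```javascript
// Project a 3D point to 2D with a simple perspective divide.
// zOffset pushes geometry away from the viewer so z never reaches zero.
function project(x, y, z, zOffset, scale) {
  const d = z + zOffset;                 // distance used for the divide
  return { x: (x / d) * scale, y: (y / d) * scale };
}

// A point twice as far away lands half as far from the screen centre:
const near = project(1, 1, 1, 1, 100);   // divides by 2 → (50, 50)
const far  = project(1, 1, 3, 1, 100);   // divides by 4 → (25, 25)
```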

The real issue is that, to keep the code organised, manipulations of 3D data are best done using matrices. This way, a command can become data. Instead of running some code for each manipulation (such as rotating or resizing a shape), you have one piece of code that obeys a data representation of the desired command. This data is a transformation matrix.

You can then conveniently store these commands. For example, the operations “rotate by 30 degrees around the X axis, mirror in a plane with the same orientation as the X–Z plane but 5 units above it, resize to 80% scale” could be represented as three matrices (though the offset mirror moves the origin, so strictly it needs the homogeneous form described next rather than a plain 3×3).

If you use homogeneous coordinates, which are like ordinary coordinates but containing an extra element by which all the others are divided, then the Z-divide for perspective correction can be represented by a matrix which copies the Z value into that extra, dividing element (typically called W).
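As a sketch of how that works (row-major flat arrays are my own choice of layout, not any particular tutorial's):

```javascript
// Multiply a 4x4 matrix (flat array, row-major) by a homogeneous point.
function transform(m, p) {
  const out = [0, 0, 0, 0];
  for (let row = 0; row < 4; row++)
    for (let col = 0; col < 4; col++)
      out[row] += m[row * 4 + col] * p[col];
  return out;
}

// A perspective matrix: identity, except the last row copies Z into W.
const perspective = [
  1, 0, 0, 0,
  0, 1, 0, 0,
  0, 0, 1, 0,
  0, 0, 1, 0, // W_out = Z_in
];

const p = transform(perspective, [4, 2, 2, 1]); // → [4, 2, 2, 2]
const screen = [p[0] / p[3], p[1] / p[3]];      // divide by W → [2, 1]
```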

But enough of my attempts at understanding linear algebra! Let’s talk implementation details.

As usual, I made my demos in JavaScript. Being able to trivially publish stuff on the web, and send no-hassle links to friends or relatives, makes this the most attractive choice.

However, I decided not to use WebGL, the graphics-card-accelerated renderer that now comes with all browsers. I’ve been having a good time with WebGL tutorials, but connecting up buffers and typed arrays introduces more places to make a mistake and lose time debugging. I’ll return to WebGL someday because the raw power, the basic idea of shader language, and the depth-buffer and texture manipulation capabilities all attract me deeply. But for now there was, again, a clear best option: HTML5 Canvas.

This is a graphics API with higher-level features such as line-drawing commands – precisely what I wanted for my demos!

The first one I made demonstrated linear transformations in two dimensions. As you can see if you click here, these all operate around the origin (the centre point where the two axes meet), or to put it another way they preserve the origin. I used setInterval(..) to make the animation – not a good choice as we’ll see in a sec.

Then, I made a demo of affine transformations – a larger category which includes the linear transformations, as well as translations (i.e. just moving shapes around with no distortion) and mixtures of linear transforms and translations. To show the way that affine transformations can occur around any point, I added some quick interactivity to let the user set the centre point and choose a transformation. I also used matrix multiplication to iteratively apply the same transformation to my shape.
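The around-any-point trick boils down to: shift the chosen centre to the origin, apply the linear transform, shift back. A sketch with illustrative numbers:

```javascript
// Apply a 2x2 linear matrix [a, b, c, d] around an arbitrary centre point:
// translate so the centre sits at the origin, transform, translate back.
function applyAround(m, cx, cy, pt) {
  const x = pt.x - cx, y = pt.y - cy;
  return {
    x: m[0] * x + m[1] * y + cx,
    y: m[2] * x + m[3] * y + cy,
  };
}

// Iteratively squash a point towards the centre (2, 2), halving each step.
const half = [0.5, 0, 0, 0.5];
let p = { x: 10, y: 6 };
for (let i = 0; i < 2; i++) p = applyAround(half, 2, 2, p);
// after two steps: (4, 3)
```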

Affine transformations – an area-preserving squash, in this case, which incidentally traces out hyperbolic curves.

Around here I started thinking of possibilities for game graphics:

  • a stick person who squashes a bit before and after jumping
  • a stick person who leans back before and at the end of a run (wind-up and braking), done with a shear transformation
  • explosions setting off shockwaves that pass through numerous characters on the screen, squashing them along the direction of travel as they pass
  • interactive objects that cause the player character to change size, Alice In Wonderland-style

(I was definitely channeling some old Flash games from my teens… stickdeath.com, anyone? I think that was the URL.)

I’ll get back to these ideas in a second to discuss what I think would actually be the hard part about making them…

My final demo was in actual 3D. Still working off Gregg Tavares’ nice WebGL tutorials (though NOT following his convention for ordering matrix elements in a 1D array), I implemented homogeneous coordinates and a Z-divide. My first attempt had an annoying error in the Y axis. It turned out everything was working; I had just put my transformation matrices in the wrong order, so the up-and-down bobbing was happening after perspective had been applied!

That’s what it looks like! Click here to see it hosted on my site.

If you look at the working demo, you may see the star seemingly spinning the wrong way, despite the perspective cues – a classic illusion. I think this is just a general fault of wireframe graphics: with no occlusion, nothing tells your eye which edges are nearer.

By the way, the animation here is handled with requestAnimationFrame(..), the preferred modern JS technique.

All this demo-making raises the question: could anything here become reusable software?

The matrix stuff is eminently reusable. To make it convenient, I would need to make an engine or interface allowing a programmer to load geometrical data, transform it and display it, through well-documented, user-friendly functions, while hiding inner workings.

So the last thing I did this week was some design work on a personal 3D library. Eventually this should be in WebGL, but to test the design I might do it in Canvas and maybe just with wireframes. The crucial point is that geometry exists in all these different spaces before it’s fully processed:

  • object space, that is, vertexes positioned relative to the centre of the object they represent
  • world space, so now that object is positioned in a world
  • camera space, now the world is spun around to face the camera
  • screen space, now anything visible is referred to by the position it takes up on the screen (in this case, the rectangular Canvas on a webpage)
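A sketch of one vertex’s trip through those spaces – stand-in arithmetic where the real library would use matrices, and all the numbers are invented:

```javascript
// Each stage below stands in for a full matrix transform.
const objectToWorld  = v => ({ x: v.x + 5, y: v.y + 1, z: v.z + 10 }); // place the object in the world
const worldToCamera  = v => ({ x: v.x, y: v.y, z: v.z - 4 });          // camera at z = 4, looking down +Z
const cameraToScreen = v => ({                                          // perspective divide onto a 640x480 canvas
  x: 320 + (v.x / v.z) * 200,
  y: 240 - (v.y / v.z) * 200,                                           // canvas Y grows downwards, so flip
});

const screenPos = cameraToScreen(worldToCamera(objectToWorld({ x: 0, y: 0, z: 0 })));
// the object's centre lands right of and above the canvas centre
```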

All of these have potential for interesting experimentation. What exactly defines an object is an open question – can an object be composed of others, and in what ways might those sub-objects be transformed? Once in camera space, what are the possibilities for fish-eye effects or non-Euclidean geometry? And of course screen space is the traditional domain of the visual artist, the flat sheet.

Well that’s some big talk on the back of a spinning star. Baby steps though!

7 Days

Last Monday I decided I would work on a different coding-related skill each day, for a week. These days I’m familiar enough with my own brain to know that swapping subject areas and deep-diving into topics suit my attentional style. Because motivation can be hard to come by when jobhunting and self-studying during pandemic restrictions, I thought I’d stimulate myself with novelty. It might even help me get psyched up to work on longer-term ambitions, like releasing my ritual Android app Candle Shrine, and also making a portfolio website.

Day 1 – C

On Monday I worked on C programming – an area I haven’t touched since 1999, ha! Yes, as a kid I did a summer course which touched on some C. Anyhow, the aim was to set up the compiler and get console input and output. I used a build of Tiny C Compiler nabbed from the Tiny C Games bundle. My project is a simulation of life as experienced by our family cat, Goldie. Actually I’m pretty pleased with this; my sister played it through and enjoyed it. I was inspired by Robert Yang’s concept of “local level design” – a design aesthetic celebrating small-scale social meanings rather than top-down formalism. (This suited me because I don’t yet know enough C to write anything other than this ladder of if statements. Still, it works!)

Day 2 – Linux

The next mini-project was Linux shell commands. I used VirtualBox to dip my toes in – it lets you run any distro on emulated hardware, from a Windows desktop. It nearly locked up on me a couple of times, but in fairness my computer was running two OSes at once, so I forgave it. It never fully crashed.

I’d hoped to get into shell scripting, the power-user technique of saving listings of shell commands as text files to be invoked as, effectively, little programs (a shell being a console for directly running programs, OS utilities and so on – like the ol’ Command Prompt in Windows).

But all I had time for was to learn about 20 standard shell commands. However, I really liked this stuff. I can see why there’s a stereotype that devs use Linux. It’s rather satisfying to install stuff, edit files, and set up the file system via typed commands and not all that intimidating either.

Day 3 – Pink Sparks

See it in action here

This one was fun. I made a 3D particle demo – a spinning cube made of flying pink sparks. My focus here was to prove I could make a simple particle system, which indeed wasn’t hard, cribbing off Jonas Wagner’s Chaotic Particles and leveraging the extremely handy Canvas feature in HTML5. I also wanted to do 3D perspective, which was hard. Here I used the simplest possible version, a bare z-divide where x and y coordinates are divided by distance from the viewer. The proper way to do this involves matrices and transforms, but I’m not there yet.
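The particle-system side of it boils down to spawn, move, cull. A minimal sketch (the constants are illustrative, not the demo's):

```javascript
// Create a spark with a random drift velocity and a lifespan in frames.
function makeSpark(x, y, z) {
  return {
    x, y, z,
    vx: (Math.random() - 0.5) * 2,
    vy: (Math.random() - 0.5) * 2,
    vz: (Math.random() - 0.5) * 2,
    life: 60,
  };
}

// One frame of simulation: move every spark, age it, drop the dead ones.
function update(sparks) {
  for (const s of sparks) {
    s.x += s.vx; s.y += s.vy; s.z += s.vz;
    s.life--;
  }
  return sparks.filter(s => s.life > 0);
}

let sparks = [makeSpark(0, 0, 5)];
sparks = update(sparks); // the spark has drifted and aged by one frame
```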

If I get back to this, the two things I’ll do are define line-shaped sources of sparks rather than point-shaped ones, and use some nicer data than a cube – say, a chunky letter ‘K’. That wouldn’t be particularly hard.

Making a display of my initial fits with my interest in what I call ceremonial coding, which I believe will be an emerging cultural field in years to come. As life goes online, we’re already finding the need to program and design software for celebrations and community rituals – an example being my graduation from my computer science course, which is being held on Zoom. I am certain that techniques from game design and aesthetics from digital culture will be important to create spiritual meanings and affirmations of identity on computers. My upcoming ritual app for Android phones expresses this conviction.

Again, Robert Yang’s post is very close to the spirit of this: “What if we made small levels or games as gifts, as tokens, as mementos?”

Day 4 – Huffman Coding

I hit a roadblock on Thursday. Huffman coding compresses text by taking advantage of the fact that certain symbols (for example, ‘z’) occur far less often than others. Making a Java app implementing this was a meaty challenge, requiring binary buffer manipulations, a binary tree, sorting and file I/O. Still, it should have been achievable – but I let myself down by not rigorously figuring out the data representations at the start. This meant I threw away work; for example, I figured out how to flatten the tree into an array and save that to disk, only to twig that the naive representation I’d used created a file far bigger than the original text file.
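For reference, the core of the technique is greedily merging the two least-frequent nodes. A sketch in JavaScript (the project itself was in Java, and this skips the binary I/O that caused me the grief):

```javascript
// Build a Huffman tree by repeatedly merging the two least-frequent nodes.
function buildTree(text) {
  const freq = {};
  for (const ch of text) freq[ch] = (freq[ch] || 0) + 1;
  let nodes = Object.entries(freq).map(([sym, f]) => ({ sym, f }));
  while (nodes.length > 1) {
    nodes.sort((a, b) => a.f - b.f);      // a priority queue would be faster
    const [a, b] = nodes.splice(0, 2);
    nodes.push({ f: a.f + b.f, left: a, right: b });
  }
  return nodes[0];
}

// Walk the tree to read off each symbol's bit string.
function codes(node, prefix = '', out = {}) {
  if (node.sym !== undefined) out[node.sym] = prefix || '0';
  else {
    codes(node.left, prefix + '0', out);
    codes(node.right, prefix + '1', out);
  }
  return out;
}

const table = codes(buildTree('aaaabbc'));
// the most frequent symbol, 'a', gets the shortest code
```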

Though I was working from a textbook with an understandable description of the Huffman coding technique, that was nowhere near enough: I still needed to design my program, and I failed to.

So I ran out of motivation as poor design decisions kept bubbling up. This was a stinger and a reminder that no project is too small to require the pencil-and-paper stage. On the plus side, I did implement a tree with saving and loading to disk, plus text analysis.

Hopefully I can reuse these if I come back to this. It’s a fun challenge, particularly the raw binary stuff and the tree flattening (although I don’t know yet whether I want to store the tree in my compressed file – probably just the symbol table needed to restore the original).

Day 5 – WebGL Black Triangle

In the spirit of Jay Barnson’s Black Triangles – though I’m sure it’s a million times easier these days!

Now this was pure fun. I used a tutorial to learn how to display graphics on a webpage using WebGL. I… frankly love the feel of OpenGL Shader Language. The idea of using a harshly constrained programming language to express some low-level color or geometry calculations, which is then compiled and run on your graphics card so you can feed it astronomical amounts of data for ultra-fast processing, is so satisfying. (Actually, especially the compilation process, and the narkiness of the parser where for example 1 is different to 1.0… it feels like you’ve loaded a weapon when you’ve successfully compiled at last.) I love graphical magic, but previously have been doing it at several removes, using wrappers on wrappers like Processing. I will definitely be doing more of this.

Day 6 – RESTful API

Another failure! I wanted to make a RESTful API demo as a bullet point on my CV, and host it for free on Heroku. But although I did some good revision on API design, when I got into implementation I totally got tangled up in trying to get libraries to work. Blehh!

I wanted my system to be standards-compliant, so I tried using libraries that would let me use JSON:API instead of raw JSON. Some people say raw JSON is not good for APIs, as it has no standard way to include hyperlinks – which are, in fairness, central to the concept of REST (i.e. that instead of the client knowing to use certain hardcoded URLs, each response from the API includes fresh hyperlinks that the client can choose to follow).
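To make that concrete, here is a hypothetical JSON:API response (the shape follows the spec; the resource and URLs are invented for illustration):

```json
{
  "data": {
    "type": "articles",
    "id": "1",
    "attributes": { "title": "Striving For 3D" },
    "links": { "self": "/articles/1" },
    "relationships": {
      "comments": {
        "links": { "related": "/articles/1/comments" }
      }
    }
  }
}
```

The client never hardcodes `/articles/1/comments` – it just follows the link the server handed it.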

But I got stuck when the examples for the library I chose wouldn’t compile because they required a new version of a build tool, Gradle, and despite trying some things off forums my IDE failed repeatedly to automatically install this.

They don’t call it “dependency heaven”!

If I get back to this I’ll use the build tooling I had working already for school projects. Life’s too short!

I wonder if the area of web services – so essential, stolid, bland – might be a natural home for rather pedantic personalities. The type who would make a typology of all things and publish it as a web standard. In any case, wading through some comments and blog posts and Wikipedia pages gave me a stronger understanding of state in web services than I had before.

Day 7 – Pathfinding in JS

Like the previous one, this project spun off into piles of research. But it’s all valuable stuff, I revised a lot of CSS and particularly the grid system, and got deep into JavaScript using this quite good book.

My plan was to implement a classic pathfinding algorithm, Dijkstra’s algorithm. But I wanted to have it so a little monster would chase your cursor around elements on a webpage! Well, as usual, I should’ve thought this through more. The fact is, web page elements are not intended to be processed as 2D shapes. HTML is semantic – web content is structured and manipulated as elements like paragraphs, headings, links that have meaningful relationships in the context of the document itself – with the final presentation of these elements done on the fly according to the user’s needs.
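For what it’s worth, the algorithm itself is compact once the page is reduced to a grid. A sketch (with uniform step costs it behaves like breadth-first search):

```javascript
// Dijkstra on a small grid: 0 = open cell, 1 = blocked.
function shortestPath(grid, start, goal) {
  const h = grid.length, w = grid[0].length;
  const dist = grid.map(row => row.map(() => Infinity));
  dist[start[1]][start[0]] = 0;
  const queue = [start];
  while (queue.length) {
    // pull the closest pending cell (a priority queue would be faster)
    queue.sort((a, b) => dist[a[1]][a[0]] - dist[b[1]][b[0]]);
    const [x, y] = queue.shift();
    if (x === goal[0] && y === goal[1]) return dist[y][x];
    for (const [dx, dy] of [[1, 0], [-1, 0], [0, 1], [0, -1]]) {
      const nx = x + dx, ny = y + dy;
      if (nx < 0 || ny < 0 || nx >= w || ny >= h || grid[ny][nx]) continue;
      if (dist[y][x] + 1 < dist[ny][nx]) {
        dist[ny][nx] = dist[y][x] + 1;
        queue.push([nx, ny]);
      }
    }
  }
  return Infinity;
}

const grid = [
  [0, 1, 0],
  [0, 1, 0],
  [0, 0, 0],
];
const steps = shortestPath(grid, [0, 0], [2, 0]); // around the wall: 6 steps
```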

Anyway… my point is, I had to compromise to get this going. My original vision was of words arrayed around the page randomly, at different sizes, with a monster sprite threading his way around them.

My solution doesn’t have the words, just boring blocks, and though I think I could do words at a fixed size, having splashy words in different sizes could be quite a hassle.

As you can see, the paths go through diagonal choke points – something to fix.

Nor did I get around to the monster, although I have the sprite for when I do:

Heheheheheh.

But the thing that worked well for this mini-project was: web standards! In particular, I made the excellent decision not to hack CSS positioning from JS, but instead take the time to revise the CSS Grid system. Which, as you might imagine from the name, was actually perfect for this use case. Those numbered cells above are arranged by Grid.

Conclusion

That was fun. I might even do it again!

Inspired by “Permacomputing”

I read Viznut’s piece on “Permacomputing” last night and got all fired up. We do live in an age of unparalleled waste of computing resources. He is far from the first to discuss this – I recall Fabien Sanglard‘s and Derek Sivers’ polemics – but Viznut has the uncompromisingly ecological and long-term vision to contextualise and channel the typical hacker’s anger at wasted CPU cycles.

Viznut a.k.a. Ville-Matias Heikkilä.

That made me think of my recent graphics-heavy work, using large resolutions and high-level frameworks; and of the kind of work I see myself doing long term: “coding ceremonial space, presence, substance and light”.

My recent work….

Frankly, I felt indulgent. I think Viznut is right: our civilisation needs to make drastic adjustments to save itself from ecological collapse. And although those changes are gonna be harder than a few well-intentioned Westerners turning to gardening, common sense and intuition and aesthetics and the environmental statistics suggest that permaculture and related ideologies – emphasising resilience, adaptation, local conditions and lowered resource use – may hold more solutions than our current dependence on growth.

When I read Viznut’s weighty opinions I wanted to do some work in the vein of his competition-winning computationally minimal art. Work that uses no more computing power than needed for the goal at hand.

Viznut and his ilk are so much more learned than I am, as coders. They specialise in low-level hardware hacking, an esoteric and difficult topic. However, one skill of theirs is more generally applicable yet fits precisely in the permacomputing ideology. As Viznut puts it, “Optimization/refactoring is vitally important and should take place on all levels of abstraction.”

So I’ll get technical now and chat about how I optimised my most recent learning project: “Iridesce”, a raycaster!

(In case you don’t know what raycasting is, it’s an old and simple way of rendering a maze from a first-person perspective. Windows 95 used it in an iconic screensaver, and it featured in 90s games, most famously Wolfenstein 3D.)

My raycaster is written in JavaScript and so runs in a browser. Try it now if you like!

I used the profiling tool available in my browser, Slimjet (a clone of Google Chrome). This gives an indication of how much CPU time is spent on various aspects of running your site’s frontend code.

The pie chart at the bottom right shows the breakdown.

How did it go? Long story short, I was able to reduce CPU usage from 68% to 20%! The things that worked were:

  • Lookup tables – so, instead of calculating the rainbow colour function 400 times per frame, I save all possible values of it into an array and access that instead. This was the biggest single saving and is a pretty classic technique, whenever you have more memory than CPU available. I was also already using a lookup table for the angles of the raycaster.
  • Using the HTML5 Canvas graphics API properly, in particular the ImageData objects which I was using to manipulate raw pixel data. Originally, from a vestigial memory of how raycasters are traditionally done, I was working with one-pixel-wide strips of data for every single segment of wall. This was entirely unnecessary and I saved a lot of CPU time by switching to a single large ImageData object covering the whole canvas. Incidentally, this necessitated some manual byte offset calculations, which felt pleasingly close to low-level.
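The lookup-table idea in miniature – rainbow() here is a stand-in I invented for the demo’s real colour function:

```javascript
// A stand-in for an expensive per-column colour calculation.
function rainbow(i) {
  return [
    Math.floor(128 + 127 * Math.sin(i * 0.1)),
    Math.floor(128 + 127 * Math.sin(i * 0.1 + 2)),
    Math.floor(128 + 127 * Math.sin(i * 0.1 + 4)),
  ];
}

// Pay the cost once, up front, for all 400 possible inputs...
const TABLE = Array.from({ length: 400 }, (_, i) => rainbow(i));

// ...so that per frame, a "calculation" is just an array access.
function colourAt(i) {
  return TABLE[i];
}
```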

And what didn’t have much effect:

  • Tidying up to remove nested if statements and repeated checkings of the same condition. This improved the legibility of the program but I don’t think it did much for performance.
  • Removing at least a dozen multiplications from loops. (Taking stuff out of inner loops is another classic optimisation approach.) I really thought this would make a difference, but I couldn’t see it in the stats.

So… the conclusion is pretty clear. Modern computers are stupidly fast at arithmetic. Manual low-level cleaning up doesn’t seem to change much. I’ll still do it for elegance. And I expect I’ll start seeing the performance benefits as I get better at implementing algorithms and doing my own profiling and benchmarking.

But more important than removing those multiplications and ifs, was checking which external functions are taking up time. Using the profiler I could spot that my colour calculations as well as the ImageData manipulations were actually taking up the most time. In both cases there was no need to call the function 400x a frame and I got massive improvements by fixing that.

So that was fun. Am I any closer to sustainable computing? A tiny bit.

Calculating byte offsets is getting into the realm of pointer arithmetic, a skill needed when programming simpler and older machines – increasing the range of systems I could do useful work on.

I learned about profiling and I did some quite major optimising, if mostly by fixing previous poor decisions.

And I got some clarity about what values are important to me, and a good dose of idealism. I’ll keep Viznut’s ideas of communally stewarded, resourceful not resource-intensive, locally appropriate, and aesthetic computing close to my heart as I decide how to direct my energies in projects and job-hunting.

There are other of Viznut’s ideas that I could elucidate through my own practice. How about developing the communal appreciation and understanding of technology in my own family home? Or working on feedback and visualisation of complex system state (I love interactive and live feedback from running systems and that’s how I tackle a lot of coding problems). Or the classic yet delicious challenges familiar from the 80s and 90s: making impressive graphics on slow processors! Perhaps even finding styles and tricks so that the imperfections of low-fidelity enhance the aesthetic affect.

Anyhow. Thanks for reading!

Harping On

I made an online toy in JavaScript, called TextHarp. Try it out (it needs a computer rather than a phone because it uses mouse movements).

The idea popped into my head a few weeks ago. It won’t leave the prototype stage, because this combination of technologies – pure HTML, CSS and JS (although I do use one library to synthesise sounds) – doesn’t robustly support what I wanted.

I aimed to turn a piece of text into an instrument, where moving the cursor over any letter which corresponds to a musical note – so, ‘c’, ‘d’, ‘e’, ‘f’, ‘g’, ‘a’, ‘b’ – would pluck the letter like the string of a harp, and play that note as a sound!

At the time, I was thinking of a few possibilities:

  • adding audio feedback for testing web pages, so that a developer/designer could hear if an element was malformed or missing information (aspects which are often invisible to the eye)
  • sonification, a concept which I think is rapidly going out of date as people realise its enormous limitations, but which was about turning reams of data into continuous sound patterns that would somehow reveal something within the data, but which I think were usually just third-rate electronic music or else showed no more than a good graph could, and basically made clear that the humanities PhD system sucks in people who’d be better off elsewhere… sorry I seem to have gotten into a rant here
  • simple enrichment and adding magic to the experience of visiting a webpage

That last is out of favour for web design nowadays. Instead, minimalism, accessibility and function are the buzz words. Fair enough… but also ominously envisaging the web as merely where stressed and harried folk get updates from a corporate or government source, staring down at their little phone screen.

Well. My little toy isn’t going to do anything to overturn that paradigm. Still, let’s take a short tour of the challenges in making it work.

I used basic JavaScript mouse events to change elements in the webpage, modifying what’s called the Document Object Model – which is nothing more than how your browser perceives a page: as a hierarchy of bits containing other bits, any of which can be scripted to do stuff.

My script caused each paragraph to detect when the mouse was over it. Then it cut the paragraph into one-character chunks and placed each of these single letters into a <span></span> HTML tag, so that it became its own card-carrying member in the Document Object Model.

Not very elegant at all! Also, despite span tags being supposedly invisible, throwing in so many of them causes the paragraphs to twitch a little, expanding by a couple of pixels, which wouldn’t be good enough for a production page.

However, it works. I set each of the single-letter chunks to play a synthesized tone when the mouse goes over them, and that’s it. Also, when the mouse leaves that paragraph’s zone, I stitch the letters back together the way they were.
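The span trick, roughly – playNote() is assumed to be defined elsewhere (by the synthesis library, say):

```javascript
// Wrap every character of a paragraph in its own <span>, so single
// letters can receive mouse events. Note letters get a mouseover handler.
function explode(paragraph) {
  const text = paragraph.textContent;
  paragraph.textContent = '';               // clear, then rebuild char by char
  for (const ch of text) {
    const span = document.createElement('span');
    span.textContent = ch;
    if ('abcdefg'.includes(ch.toLowerCase())) {
      span.addEventListener('mouseover', () => playNote(ch));
    }
    paragraph.appendChild(span);
  }
}

// The inverse: flatten all the spans back into plain text.
function stitch(paragraph) {
  paragraph.textContent = paragraph.textContent;
}
```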

The downsides are that any HTML tags used to format or structure the text tend to get damaged by the process, usually resulting in piles of gibberish or in text disappearing cumulatively. It would be possible to improve that, but only with a lot of manual work. And the browser’s attempts to be clever by healing broken tags here actually cause a lot of difficulties.

Defining some new kind of object that held the text and knew the location of each letter would be a better bet.

However, I’m turned off this avenue of enquiry for the moment, because dealing with audio in browsers is a pain. Not for the first time, musical and sensual uses of technology have been left in the gutter while visuals get all the investment.

There are two big problems with web audio. First, JavaScript doesn’t offer precise timing. I see precision, whether in first person computer games, input devices, or in this case reactivity of audio, as inherently empowering – inviting us as humans to raise our game and get skilled at something. Sadly, much of our most vaunted current technology crushes this human drive to excel and be stylish, with delays and imprecision: touchscreens, cloud services, bloated webpages…

Where was I? Yes, the second problem is that Google Chrome made it standard that web sites can’t play sound upon loading up, but only after the user interacts with them. Well meaning, but really shit for expressivity – and quite annoying to work around. My skillz are the main limitation of course, but even trying out two libraries meant to fix the issue, I couldn’t make my audio predictably avoid error messages or start up smoothly.

No tech company would forbid web pages from showing images until the user okays it. But sound is the second class citizen.

When I know my JS better, I’ll hopefully find a solution. But the sloppy timing issue is discouraging. Some demos of the library I used show that you can do some decent stuff, although the one I experimented with took a good idea – depict rhythm as a cycle – and managed to fluff it with two related interface gripes. They made the ‘swing’ setting adjustable for each bar of a multi-bar pattern – pointless and unmusical. And they made the sequencer switch from bar to bar along with the sound being played – theoretically simple and intuitive, but – especially with the above-mentioned time imprecision of web interfaces – actually resulting in loads of hits being clicked into the wrong bar. (And if I say a drum machine is hard to use, it probably is – I’ve spent so much time fooling around with software drum machines I ought to put it at the top of my CV.)

But what am I saying! That demo’s way more polished than mine.

Perhaps even a little too polished! Visually anyway. All of the examples on that site are rather slick and clean-looking, perhaps because, I believe, the whole library has some affiliation with Google.

Ah, I’m being a prick now and a jealous one too, but one scroll down that demos page would make any human sick. The clean grids. The bright colours. The intuitive visualisations – yes, technology now means that you too can learn music, it’s just a bit of fun! Practice, time feel, listening, gear, human presence – nah!! And then the serious, authoritative projects – generated-looking greyscale, science-y formal patterns and data…. bleh.

My next JavaScript project is an exploration of a visual style which I explicitly intend to raise a middle finger to those kind of polished, bland graphics. I’ll be taking lessons from my 90s/00s gaming past to experiment with pixel art but without the hammed-up console-nostalgia cutesiness.

And I’ll be using standard web technologies – JS, SVG – to make anything I come up with 100% reusable by non-programmers.

Thanks for reading!