Coding for sight and sound

In the last of a three-part article exploring artistic practices using Free Software, Martin Howse examines the specific concerns of software artists creating real-time visuals

Vision On
Nearly all the artists whose work we've delved into in previous installments straddle both audio and visual realms, feeling equally at home with both media and cross-pollinating or feeding through visual sources and structures with the physical and audible. If, as Randall Packer, artist and theorist, argues, contemporary multimedia work expresses Wagner's notion of the Gesamtkunstwerk (or total work of art), then Free Software coder-artists are definitely masters of this form.

As we saw last month, Erich Berger's Tempest project, making excellent use of GEM (a graphical extension to Pd, or Pure Data, the supremely extendible artistic environment), presents one extremely novel, conceptually clean integration of both audio and visual material, using electromagnetic radiation from the monitor displaying images to generate live sound. Indeed, in the case of artists such as ap (Martin Howse, Jonathan Kemp), or applications such as Pd, little, if any, differentiation is made between the two fields, with sound and video material handled solely as data. And in heavily code-based initiatives such as Farmers Manual, any audio or visual output is considered merely as the rather irrelevant by-product of systems and processes.

Despite this flattening of fundamentally different materials under the wheels of machinic process or the watchword of raw data, it's important to assess the specific demands and concerns of hacking visuals. Video does admit of a different way of working, and presents both artistic concerns and technical demands which perhaps make it a tougher medium to work with successfully. Above all, real-time video still presents quite specific technological challenges when it comes down to grabbing, processing and throwing out live material. The sheer volume of data which uncompressed video presents at reasonable resolutions and colour depths, coupled with the demanding, complex mathematics performed across such large data sets, makes of video coding a daunting task. The 25 frames per second clock is always ticking in the developer's head, and her code quite simply has to keep up with this frenetic pace. Hardware can only do so much, and artist-coders need to be up to speed with specifications and instruction sets, pushing machines to the limit with tricksy low-level code. Semi-interpreted scripting languages such as Python, Lisp or Scheme are definitely out of the question in this hardcore real-time arena, but such limitations can often produce artistic workarounds which re-examine computational methodologies. On the other hand, technical issues do perhaps stifle the flexibility of some of the apps we'll examine, in contrast to the supremely JACKed or piped pluggability of Free Software audio. Thankfully, Free Software initiatives such as the Livido project (see Jacking Video), launched during last year's Piksel conference at Bek in Norway, aim to bring such flexibility into video work, seriously raising the game for video software on open platforms.
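A quick back-of-the-envelope calculation makes the scale of the problem concrete. The sketch below assumes 24-bit colour at the PAL rate of 25 frames per second; the function name is our own, not drawn from any particular app.

```python
# Rough data rates for raw, uncompressed video at 25 fps.
def video_rate(width, height, bytes_per_pixel=3, fps=25):
    """Bytes per second of raw, uncompressed video."""
    return width * height * bytes_per_pixel * fps

# A modest 320x240 stream at 24-bit colour...
small = video_rate(320, 240)   # 5,760,000 bytes/s, roughly 5.5 MiB/s
# ...and full PAL resolution, an order of magnitude more.
pal = video_rate(768, 576)     # 33,177,600 bytes/s, roughly 31.6 MiB/s
```

Even the modest stream runs to over 5 MiB every second, before a single effect has touched it.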

Early artists, such as Woody and Steina Vasulka, working with video in the 60s and 70s confined their experiments purely to the analogue realm. And artist-coders working outside budgets which could stretch to buying time on a sweet blue fridge-sized SGI box would have to wait until the birth of the Amiga Video Toaster (now open sourced) for affordable real-time video manipulation. Indeed, the multimedia capabilities of the well-designed Amiga opened up the way for a whole new generation of artists to begin working creatively with visuals. Around low-cost machines such as the Amiga, an underground movement, known as the Demoscene, emerged, providing a platform and community for coders to show off skills and ideas.

Just how do artists interact with computers and what are the most creatively fulfilling means?

52 LinuxUser & Developer




The particular aesthetics and concerns of this scene shine through in the work of contemporary coders such as Jaromil of FreeJ fame. Video coding is still very much a skilled, virtuosic affair. Even modern hardware struggles to keep pace with the complex data manipulations dreamt up by today's artist-coders. Artists are always bound to some degree by available technologies, pushing hard against these barriers and in some instances breaking through to find novel ways of working which then feed back into visualisation or creative industries. Free Software and a shared codebase support this sometimes symbiotic relation, which shares much in common with academic research models. Indeed, many of the computational issues associated with the wide spectrum of video have been addressed by academics, with historic papers presented at the annual Siggraph conference, a peculiar marriage of commerce and art.

Live coders, grouped under the TOPLAP (Live Algorithm Programming and a Temporary Organisation for its Promotion) banner, have also explored the use of the early children's programming language Logo, devised by Seymour Papert, for the production of live visuals. Dave Griffiths, a major player on the SuperCollider scene, uses Scheme, a relative of Logo via a shared Lisp-like grammar, to produce somewhat formalist, live coded 3D work within his own Fluxus environment. Logo's turtle graphics and instruction set, designed to command a robot or screen-based turtle to draw often recursive patterns, crop up again in SuperCollider, and this neat, low-tech approach also fits easily with Tom Schouten's work on Packet Forth, within PDP, which he describes as “turtle graphics on steroids.”
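The turtle idea reduces drawing to a tiny, stateful instruction set: a position, a heading, and a handful of movement commands. A minimal, text-only sketch of that state machine (the class and method names are our own shorthand, not actual Logo, SuperCollider or Fluxus syntax):

```python
import math

# A minimal Logo-style turtle: position plus heading, driven by
# a tiny instruction set (forward / left / right).
class Turtle:
    def __init__(self):
        self.x, self.y, self.heading = 0.0, 0.0, 0.0  # heading in degrees

    def forward(self, dist):
        rad = math.radians(self.heading)
        self.x += dist * math.cos(rad)
        self.y += dist * math.sin(rad)

    def left(self, angle):
        self.heading = (self.heading + angle) % 360

    def right(self, angle):
        self.left(-angle)

# Walk a square: four sides and four 90-degree turns bring the
# turtle back to its starting point, facing the way it began.
t = Turtle()
for _ in range(4):
    t.forward(10)
    t.left(90)
```

Recursive patterns fall out of exactly this kind of loop once the step and turn are varied per iteration.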

If the digital audio sequencer is the rather dull mainstay of every sound studio, then Non Linear Editors (NLEs) are the bread and butter of video work; a solid, dependable tool used by necessity rather than for true artistic exploration. As such, apps such as the ambitious Cinelerra, coded more or less solely by Andraz Torri, unfortunately fall outside the remit of our investigation. However, as a good many natural NLE operations, such as transitions, apply equally well to live video processing, there does seem to be a growing trend towards crossover of NLE and live video apps, particularly suites which are geared more around the art of VJing; mixing, matching and creating visual material as an adjunct to DJs or live electronic audio. Andraz Torri is heavily involved in the Piksel initiative, and the LiVES (LiVES is a Video Editing System) application, coded by Gabriel Finch, aka Salsaman, is another case in point. LiVES started life as a simple, small GNU/Linux NLE, and it still fulfills this function admirably, offering a decently intuitive GUI with cut and paste facilities, and support for a good range of input and encoded formats. However, to some extent LiVES has recently morphed into something of a well featured VJ tool, with a reasonable range of extendible effects, easy selection of sets of frames, and good pluggability with apps such as PDP. An NLE environment may not present the most intuitive interface for VJ work, but Salsaman, a major player in the Livido project, does seem more than adept at running live with LiVES. Hardcore VJs should also check out Veejay, from a team headed by crack coder Niels Elburg. Veejay again presents an NLE approach to VJing, centring on the cut and paste of clips in edit lists. Clips can be recorded from existing clips, live streams or multiple sources. Veejay is particularly provocative when it comes down to switching clips, feeding back frame selections and changing playback speeds.

Farmers Manual push Pd extension GEM to the limit, flattening audio and visual data in glitched virtuosic performance

To quickly recap, Pd presents a visual environment allowing artists to connect math, control and generator objects in complex arrangements called patches, which represent a work in progress. Patches can easily be re-edited, re-run, and controlled by a huge range of external and networked sources. A flattened approach to audio, video and all data is the key to Pd, providing artists with the essential ability to throw stuff together, connect disparate data sources to complex processes, control structures

Artists are always bound to some degree by available technologies

Throw nearly 100 mixes, effects and frame blends into the toolkit, alongside chaining with EffecTV, and Veejay adds up to a powerful package which also boasts a decent level of control, with a handy hotkeyed console and remote access using either its own VIMS protocol or OSC (Open Sound Control). FreeJ, from master coder Jaromil of dyne:bolic fame, is another such more accessible, VJ-oriented app, which offers a pleasant old school approach to the problems of working live with visuals. The latest FreeJ 0.7 iteration presents a range of interfaces to the well written underlying realtime video manipulation engine. A GTK2 interface is accessible, but not recommended for live use. The Vi-style console, with well documented hotkeys and completion courtesy of S-Lang, is probably the most pleasant interface, with tab completion of basic commands and concise online help. FreeJ can also be scripted using Javascript, accessed remotely with VJoE (VeeJay over Ethernet) or controlled by MIDI or joysticks. Hotkeys probably provide the fastest path to the VJ action, and FreeJ presents a flexible yet easy to use approach to video, simply breaking down the whole mix into layers and effects, both of which can be easily juggled and manipulated. Layers are pretty much self explanatory and can be drawn from live, pre-recorded (DivX/AVI), text or generated sources. Blits or blends can be selected for each layer to allow mixing, and chains of effects are applied to selected layers. Effects, which throw in nearly all of Kentaro Fukuchi's venerable EffecTV filters, can be positioned and re-positioned in the chain live, to alter the final result. FreeJ is fast, neat and reasonably expressive, though the choice of scripting language may not suit all tastes.
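The layers-and-effect-chains model is simple enough to sketch in a few lines. None of the names below come from FreeJ itself; this is only an illustration of the structure: each layer yields a frame, a chain of effects transforms it in order, and the layers are blended down into the final mix.

```python
# Two toy effects operating on a frame (here, a flat list of
# 8-bit pixel values).
def invert(frame):
    return [255 - p for p in frame]

def darken(frame):
    return [p // 2 for p in frame]

def apply_chain(frame, chain):
    for effect in chain:        # effects run in chain order,
        frame = effect(frame)   # each feeding the next
    return frame

def mix(layers):
    # A trivial additive blit: sum corresponding pixels, clipped.
    out = [0] * len(layers[0][0])
    for frame, chain in layers:
        processed = apply_chain(frame, chain)
        out = [min(255, a + b) for a, b in zip(out, processed)]
    return out

layers = [
    ([100, 200, 50], [invert]),         # one source, one effect
    ([10, 10, 10], [invert, darken]),   # chain order matters
]
result = mix(layers)
```

Re-positioning an effect in a chain live, as FreeJ allows, amounts to reordering the list handed to the chain.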

and outlets. Pd itself is more or less exclusively audio-angled, though of course this doesn't stop artists projecting their own patches as they re-patch live, or even making use of Pd's decidedly old school graphing functions. Until recently GEM, kicked off by Mark Danks and now developed primarily by Gunter Geiger and IOhannes m zmoelnig, was the choice for real-time graphics under Pd. GEM throws in a full multimedia toolkit, but does impose some limitations, dividing the audio and video worlds in a slightly unhealthy manner. Sure, users can control one medium from the other, but this doesn't add up to a pure data approach. That said, artists such as Erich Berger make good use of GEM in masterfully straddling both media. GEM, an external or add-on library for Pd, integrates a vast array of math, control and effect objects with a decent, complex OpenGL-oriented set of objects. You can play with polygonal graphics, lighting, texture mapping, image processing, and camera motion. GEM boasts a huge number of mature objects, with recent iterations of the library adding a complete wrapper around the OpenGL set of functions which throws more than 250 new

The 25 frames per second clock is always ticking in the developer’s head, and her code quite simply has to keep up with this frenetic pace
objects into the mix. Objects are logically divided into groups: controls include mouse, keyboard and tablet access. The Gem-window object is also here, and this must be used to provide the rendering context for every patch. Gemhead specifies the start of the rendering chain. Manipulator objects provide for such things as scaling and vector translation, and self explanatory Geos offer building blocks such as spheres or polygons. You'll also find plenty of particle objects, Nongeos for lighting and powerful pix objects supporting a huge range of operations on pixel data. These include effects, composites, filters, and access to sources such as image or movie files. Abstract vector and maths operations round out this powerful external. As you can imagine, GEM can take some getting used to, and a knowledge of the OpenGL way of doing things surely helps. As with nearly all Pd externals, the main source of

Technical constraints can work wonders for artistic creativity, and with an often common code base and set of programming concerns shared by Free Software artist-coders, it's less likely that the bigger artistic picture will become obscured by bits, bytes and low level implementations. Collective projects such as Livido (Linux Video Dynamic Objects), which aims to create a standard, flexible plugin API for video apps, ensure that the burden of complex coding is well shared, with developers beavering away in their own fields of expertise, and sharing new concepts and ways of working. Indeed, much of the contemporary codework with video does share a common base, with Kentaro Fukuchi of EffecTV fame swapping code with Andreas Schiffler, coder of SDL_gfx, back in the early days when Kentaro was working within the Japanese entertainment industry, producing VJ-style tools. As this code-base developed, Jaromil, producer of FreeJ and the excellent dyne:bolic distro, shared and developed material, and EffecTV code even finds its way into ap's work, Yves Degoyon's PiDiP extensions to PDP and the GEM external. Of course, there are many such strands of development, and this is just one history, which is best expressed through the Piksel initiative at Bek. The roots of large-scale projects stretch far and wide, with numerous packages, from MPlayer, through GStreamer, to SDL acknowledged as inspirations, models or code donors. And shared or common concerns simply don't deny the existence of mavericks, with playful artist-coders such as Tom Schouten providing a completely different take on cross media chicanery, yet one which integrates well with a host of other apps and approaches.
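The idea behind a shared plugin API is worth spelling out: every effect exports the same shape of thing, some metadata plus a per-frame process function, so a host can load and chain plugins it has never seen. The sketch below mirrors the concept only; it is emphatically not the actual Livido API or its calling convention, and all names are our own.

```python
# Each "plugin" is a dict with metadata and a per-frame process
# function -- a toy stand-in for a real dynamic-object plugin.
def make_negate_plugin():
    return {
        "name": "negate",
        "version": 1,
        "process": lambda frame: [255 - p for p in frame],
    }

def make_threshold_plugin(cutoff=128):
    return {
        "name": "threshold",
        "version": 1,
        "process": lambda frame: [255 if p >= cutoff else 0 for p in frame],
    }

def host_run(plugins, frame):
    """The host needs no knowledge of individual effects: it just
    calls each plugin's process entry point in turn."""
    for plugin in plugins:
        frame = plugin["process"](frame)
    return frame

chain = [make_negate_plugin(), make_threshold_plugin()]
out = host_run(chain, [0, 100, 200])
```

The pay-off is exactly the interoperability discussed above: any host honouring the contract can run any plugin.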

At the other end of the spectrum many artists favour a low-level approach above the slick graphics of more demanding applications. The ascii aesthetic, images produced from a basic character set, is ever popular and ever portable. Originally conceived as a means of outputting rough, low bitrate graphics on slow hardware, and of distributing visuals at supremely low bandwidth, the ascii approach still finds its way into all the major Free Software packages for visual work, from PiDiP and gdapp to FreeJ. Full length ascii videos have been produced by the Ascii Art Ensemble, based in Slovenia, and the British artistic group ap have also produced an ascii video conferencing tool. Live coding, as explored in the last issue, also presents a supremely raw, low-level approach to visuals. Exposing the interface, or working methodology, of the artist is a common trope, but successful nonetheless.
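At its core the ascii approach is just a brightness-to-glyph mapping. Real tools are far cleverer about glyph shapes and aspect ratios; this minimal sketch (our own, not drawn from any of the packages above) shows only the central idea.

```python
# Map each pixel's brightness to a character from a
# dark-to-light ramp.
RAMP = " .:-=+*#%@"   # ten steps, dark (space) to light (@)

def pixel_to_ascii(value):
    """Map a 0-255 brightness to one ramp character."""
    return RAMP[value * len(RAMP) // 256]

def frame_to_ascii(frame, width):
    rows = [frame[i:i + width] for i in range(0, len(frame), width)]
    return "\n".join("".join(pixel_to_ascii(p) for p in row) for row in rows)
```

Run per frame, this is cheap enough to keep up with live video, which is precisely why the aesthetic has survived.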

Jacking video
Throwing sound around, feeding it through a host of effects and apps, or even piping it across networks, is a trivial affair under GNU/Linux. The toolkit is mature and very much in place, allowing for all manner of playful artistic plumbing, with JACK, OSC and simple Unix tools providing a solid infrastructure. Audio is so much easier to deal with: try piping raw video data across a local network and you'll see the issues immediately. The networked transparency which audio enjoys is in tatters when it comes to working with video. With no common transports (vloopback is now deprecated), protocols or infrastructure, visual data is often imprisoned within one ringfenced app. Fine, if you're a lone artist working solely with Pd, PDP and PiDiP, but what if you need to jam with others using apps such as Veejay? The problem can roughly be broken down into the twin issues of interoperability and plugin architecture. Given these, infrastructures can easily be created which provide that much needed transparency for applications. Thankfully, the Piksel video framework, a Free Software initiative which kicked off at last year's conference at Bek, is attempting to address these thorny issues, pooling some of the best minds in the video coding scene on intensely active mailing lists and hardcore CVS. The Piksel framework aims to implement a library of plugins for dynamically loaded video processing and colourspace transforms. At the same time, a common set of control commands, a video version of OSC as it were, is under discussion and a library implementation is being attempted.
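Putting numbers on why audio is so much easier: a sketch comparing CD-quality stereo with uncompressed full-PAL video (our own arithmetic, not drawn from any particular app).

```python
# Raw data rates: CD-quality stereo audio versus uncompressed
# PAL video frames.
audio_rate = 44_100 * 2 * 2        # 44.1 kHz, stereo, 16-bit: 176,400 B/s
video_rate = 768 * 576 * 3 * 25    # PAL, 24-bit colour, 25 fps: 33,177,600 B/s

ratio = video_rate // audio_rate   # video needs well over 100x the bandwidth
```

A pipe or socket that comfortably carries a JACK-style audio stream simply drowns under two orders of magnitude more data.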

Flattening data with green screened efficiency, ap’s gdapp gets nodal with visuals

As we’ve seen throughout this series, Pd, accompanied by a huge raft of externals, really does present the most flexible artistic environment for constructing standalone apps, prototyping designs or purely for artistic exploration and experimentation with structures, concepts and methodologies.




Erich Berger whips up a storm, with image as pure sound thanks once more to the ever so versatile Pd and GEM

Desktop publishing

documentation is example patches, which in this instance cover such areas as lighting, video, particles, and textures. Help patches are also readily accessible for each object, and a number of primers exist.
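For a taste of the render chain, here is a minimal GEM patch sketched in Pd's text patch format: a gemwin provides the rendering context, and a gemhead feeds a manipulator and a Geo. Coordinates and arguments are arbitrary, and the create message must be clicked to open the window.

```
#N canvas 0 0 450 300 10;
#X msg 30 30 create, 1;
#X obj 30 70 gemwin;
#X obj 160 30 gemhead;
#X obj 160 70 rotateXYZ 30 45 0;
#X obj 160 110 cube 1;
#X connect 0 0 1 0;
#X connect 2 0 3 0;
#X connect 3 0 4 0;
```

The chain gemhead, rotateXYZ, cube renders a tilted cube once the window is created and rendering is switched on.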

Though currently only a handful of packet formats exist, for encoded images, textures, render buffers, matrices and even cellular automata, there's no reason that PDP is restricted purely to the visual realm. It's worth understanding author Tom Schouten's intriguing take on data packets, as this does give a good idea of how versatile PDP can be. Sure, you can dive into PDP and play with well documented sample patches, but these are just the tip of the conceptual iceberg. Data packets can store all sorts and types of data, offering both a raw buffer for bits and an all-important symbol describing how it should be interpreted. For example, the symbol image/P411/16/320/240/3 describes the video frame packets PDP uses. As we'll see, the very latest PDP pushes this concept even further, embedding Packet Forth (PF) code in the packet. The 0.13.0 version of PDP implements some radical changes in a brand new libpdp, though PDP can still be compiled using the older PDP kernel, and indeed this is the default. PDP 0.13.0 packs in the usual huge array of objects providing for all the sinks, sources, filters, transforms and pixel effects you'd expect from a fully featured multimedia library. PDP offers much more though, with optional support for embedded Guile, and a new take on data packets with the embeddable Packet Forth interpreter. Libpdp is highly experimental, but if you're prepared to take risks and work through often dense documentation on developer mailing lists, the resulting artistic explorations can be extremely rewarding. PF really does push Pd's extendibility to the limit.

Indeed, PF leaves Pd behind, in that it can easily be used as an embedded scripting language in your own C apps, in the same manner as Guile, for example. As Tom Schouten puts it, PF turns PDP into a tool for writing PDP. PF is a tough nut to crack, combining a nonstandard Forth, a highly minimal stack-based language, with some Lisp list operators and actions on pure data packets. What's more important to understand is the beauty of the PF concept: it throws code into the data soup and enhances the supreme pluggability of Pd. Within data packets, meta-data acts as PF code. PF can be accessed as an interpreter from the command line, or you can easily shoot code to it via Emacs, or, of course, run it from Pd. PF communicates with Pd using standard Pd messages, and now implements all the PDP functionality, such as those new data types and objects, for Pd. PF could be described as just another take on Pd, and all these issues of pluggability open up the fundamental question of interface: just how do artists interact with computers and what are the most creatively fulfilling means? These are questions which nearly all these Free Software apps attempt to answer, through diversity, community and mutual inspiration.
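A packet symbol such as image/P411/16/320/240/3 is simply a slash-separated self-description sitting next to the raw buffer, so pulling it apart is trivial. In the sketch below, the meaning assigned to the numeric fields is our assumption for illustration only, not PDP's documented layout.

```python
# Split a PDP-style packet symbol into its descriptive parts.
def parse_packet_symbol(symbol):
    kind, encoding, *numbers = symbol.split("/")
    return {
        "type": kind,                       # e.g. "image"
        "encoding": encoding,               # e.g. "P411"
        "fields": [int(n) for n in numbers] # dimensions etc. (assumed)
    }

info = parse_packet_symbol("image/P411/16/320/240/3")
```

The point of the design is that any object can inspect the symbol and decide whether, and how, it can interpret the buffer.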

Scribus is the open source desktop publishing tool that comes with many Linux distributions. Martin Althoff takes Scribus out for a trial run

If GEM doesn’t serve to keep you busy with multimedia experimentation, PDP (Pure Data Packet), another huge Pd extension library which has itself spawned other externals, should certainly stretch your imagination. PDP, though grown from the fertile ground of GEM, perhaps presents a clearer way of working which is more in tune with Pd’s central philosophy. PDP is also more ambitious and offers an interesting and active development model. Though not exclusively dealing with visual material, PDP is more oriented towards working with real-time video, though OpenGL operations are supported using 3dp objects in PDP. It’s all about aesthetic choice and working philosophy, with GEM seeming perhaps more formal and more suited towards abstract, systems driven work. PDP inspires a messier, more ad-hoc way of working. Of course, with Yves Degoyon’s huge PiDiP extension library for PDP, you can have the best of both worlds, making use of gem2pd and pdp2gem objects to connect both PDP and GEM. PiDiP (the deliciously recursive PiDiP Is Definitely In Pieces) is all about adding connectivity, by way of diverse streaming objects, to the already powerful PDP platform, alongside video manipulation tools (again the full EffecTV roster) and control structures such as motion detection and colour tracking. PDP really maxes out Pd’s already insane pluggability, allowing for generalised objects which can be flattened using the PDP type conversion system. Indeed PDP takes Pd’s data flattening philosophy and runs with it, providing a brand new Pd atom called a data packet. What this means is that any data object now exists on the same footing as a float or

Key Links
Erich Berger, Piksel, ap, Ascii Art Ensemble, TOPLAP, Cinelerra, LiVES, Veejay, FreeJ, Pd, GEM, PDP, PiDiP, EffecTV

Page layout with Scribus
Conventionally, the role of laying out the pages of a magazine or journal is fulfilled by the graphics designer, working with the de facto standard tools, QuarkXPress or Adobe InDesign. QuarkXPress is relatively conservative, somewhat dated, and has been around for many years. Despite its long history and wide adoption, QuarkXPress has not evolved to meet the ease-of-use criteria expected of modern software tools, and suffers from awkward workflow. Over recent years, Adobe InDesign has gained favour, benefitting from a fresh approach to workflow and access to a vast range of features. The open source alternative to these packages, Scribus, stands up well to direct comparison. Scribus has the additional attraction of being Free Software, and is available on a more economic platform, which should appeal to many publishing firms.

Page layout programs are a vital element of the publishing process. Text is pasted on the page, and presented in different fonts in the context of illustrations and graphics. The role of the layout program is to allow manipulation of the ingredients, text, graphics and fonts, into the desired format. The result is saved as a PDF/X file. PDF has become the standard format for print production; virtually every magazine is delivered to its printer as a high-resolution PDF file that contains a mix of vector and raster data. For those without a direct involvement in desktop publishing, such software holds little of interest. People who do work with layout programs such as Scribus on Linux, whether out of sheer interest or to earn a living, treasure a clear workflow and precision in all aspects of the document design. Both of these criteria are met by Scribus, currently at version 1.2, the aKademy Edition.

An underground movement, known as the Demoscene, emerged, providing a platform and community for coders to show off skills and ideas

So what does Scribus have to offer? Like other DTP programs, Scribus is frame orientated. Unlike a word processor, all document content such as text, tables and graphics is held in frames. The page is a container onto which these frames are placed, but cannot hold text by itself. You can visualise this by imagining a cork message-board to which you attach bits of paper, notes containing text or photographs, onto which you cannot write directly. For the sake of convenience Scribus optionally allows the auto-creation of a text frame covering the whole page when a new document is created. While it is possible to write and edit the text in the frames with the Scribus text editor, the more usual method would be to import the content from an external file. Scribus supports plain text and CSV files. Graphic files are similarly imported into graphic frames. Scribus supports a variety of standard graphic formats such as PNG, JPEG and TIF. To edit images, Scribus accesses The Gimp. The graphic frames allow the scaling of images without the need to actually modify the image, but for quality reasons, large modifications should be left to a complementary graphics package such as The Gimp.

There are plenty of features to make Scribus a good workhorse application

The frames on a page can be visually placed, possibly aided by the adjustable snap-to-grid function, or measured exactly to the horizontal X and vertical Y coordinates needed. Measurements can be given in points, millimeters, inches and pica. The precise placement of objects, up to an accuracy of 1 micrometer (0.001 mm), is one of the strengths of the application.
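The units on offer all interconvert through the PostScript point, using the standard DTP values (72 points to the inch, 12 points to the pica, 25.4 mm to the inch); the helper names below are our own, not Scribus functions.

```python
# Standard DTP unit conversions, all routed through the point.
POINTS_PER_INCH = 72.0
POINTS_PER_PICA = 12.0
MM_PER_INCH = 25.4

def mm_to_points(mm):
    return mm / MM_PER_INCH * POINTS_PER_INCH

def picas_to_points(picas):
    return picas * POINTS_PER_PICA

# An A4 page width of 210 mm comes out at roughly 595.28 points.
a4_width_pt = mm_to_points(210)
```

At Scribus's stated 0.001 mm placement accuracy, that is a resolution of well under a hundredth of a point.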
