echelon

A personal technology blog about software development, art and music.

Technology and Art

Oftentimes the merger of technology and art produces wonderful and fascinating things. It is maybe the kind of art I personally enjoy most since I can easily relate to it. Here are some examples I recently discovered:

Alexandre Farto is “Vhils”, a Portuguese-born artist who amongst other things carves faces into wall. This alone, while really great, does not really merge technology and art. In this video one can guess how the artist works.

But this video depicts Vhils’ art in an even cooler light: it gives the illusion, that the artwork is literally blasted in the wall. Truly amazing.

I discovered this through a “hack a day” article. The article does not say much about how it is done. There’s a lively discussion in the comment section of article.

People seem to agree, that the artist in some cases carves the sculpture in the wall, covers the work and later unmasks it. He might use rubberized explosive: flat sheets of solid but flexible material that can be cut to specific shape, bent around solid surfaces and glued or taped in place (Source: Wikipedia).

I have to agree with reader’s opinions that is is rather unlikely that these are controlled explosions and that the artists creates his work through designing the explosions. The work is nonetheless impressive. You can find more images like this an other works on the artist’s website.

The merger of art and technology is more apparent in the work of generative artists such as Michael Hansmeyer. In the work presented here, the artist designs columns using subdivision processes. He feeds his algorithm parameters that result in a distinct local variance in the patterns that are generated. This results in columns with millions of faces (image on the left).

The artists calls his work Computational Architecture. But generating a column model is only the first step of his process. In a second step the columns are actually produced as a layered model using 1mm sheet. That is 2700 sheets for a 2.7 meter high column. Cutting paths are calculated by intersecting the column with a plane and sheets are cut using a mill or a laser. The image on the right shows part of a fabricated column. I guess the color is a result of the fabrication process.

Read more about the artist’s process here including a video of a rendered column. More column images can be found here.

Good? Music?

I added an “art” and “music” section to the blog. I’d like to share two albums I currently enjoy a lot and listen to almost exclusively. There’s “Black Rainbow” by Aucan and “Peanut Butter Blues & Melancholy Jam” by Ghostpoet. Two very different but ingenious albums.

My favorite track from the Aucan album is “Heartless”. It’s really incredible. Listen to it here:

From the Ghostpoet album I currently most enjoy “Run Run Run” but I expect this to change over time since it’s an album that grows on you and that you have to listen a couple of times to really appreciate it. I don’t want to go into why I like these albums, others are light-years better at music critique than I am. I just want to say: it’s really good music. You should check it out.

A simple lock-free queue in C++

A while ago I needed a simple lock-free queue in C++ that supports a single producer as well as one consumer. The reason to use a lock-free data structure instead of a regular std::queue guarded by mutex and condition variable for me was to avoid the overhead of exactly those mechanisms. Also since I never came in contact with the interesting concept of lock-free data types, I thought this was an excellent opportunity to look into the matter.


You can never go too far, image by Baboon

Unfortunately (or fortunately) I could not find a production ready lock-free queue in C++ for my platform (Linux). There is the concurrent_queue class available in Visual Studio 2010, there’s the unofficial Boost.Lockfree library (code is here) but those options were not satisfying. I needed something simple that worked right out of the box.

Herb Sutter wrote an excellent article a while back about designing a lock-free queue. It’s straight forward and functional. The only flaw it has from my point of view is the use of C++0x atomics which I wanted to refrain from including in my project for now. There’s of course the unofficial Boost.Atomic library, but again I wanted something simple.

So I reread Herbs article, especially the passage that explains the requirements for a “lock-free-safe” variable: atomicity, order and the availability of a compare and swap operation. Digging into the Boost.Atomic code I found that it is rather simple to get atomicity and order when using gcc. I’m using version 4.4.3.

Atomicity can be ensured using the atomic functions of gcc and order can be ensured using a memory barrier trick:

type __sync_lock_test_and_set (type *ptr, type value)
asm volatile("" ::: "memory")

The first line is an atomic test and set operation and the second line is equivalent to a full memory barrier. With this functionality I was all set. Here is the modified code for the method the producer calls:

void produce(const T& t)
{
  last->next = new Node(t);                      // add the new item
  asm volatile("" ::: "memory");                   // memory barrier
  (void)__sync_lock_test_and_set(&last, last->next);
  while(first != divider)                       // trim unused nodes
  {
    Node* tmp = first;
    first = first->next;
    delete tmp;
  }
}

I replaced the atomic members with regular pointers but exchanged the assignments that require atomicity and order by calls to __sync_lock_test_and_set preceded by the full memory barrier trick. The function call must be cast to void to prevent a “value computed is not used” warning.

bool consume(T& result)
{
  if(divider != last)                        // if queue is nonempty
  {
    result = divider->next->value; 	              // C: copy it back
    asm volatile("" ::: "memory");                 // memory barrier
    (void)__sync_lock_test_and_set(&divider, divider->next);
    return true;          	                       // report success
  }
  return false;           	                    // else report empty
}

The full class can be downloaded here. Please note that the class lacks some important functionality if your use-case is non-trivial (a proper copy-constructor comes to mind).

What it is all about

There is a collection of definitions of lock-free and non-blocking and wait-free algorithms and data structures that can be interesting to read.

The idea of lock-free data structures is that they are written in a way that allows simultaneous access by multiple threads without the use of critical sections. They must be defined in a way that at any point in time any thread that has access to the data sees a well defined state of the structure. A modification was either done or not done, there is no in-between, no inconsistency.

This is why all critical modifications to the structure are implemented as atomic operations. It also explains why order is important: before critical stuff can be done, preparations have to be finished and must be seen by all participants.

On Supercomputers

Supercomputers are an integral part in today’s research world as they are able to solve highly calculation-intensive problems. Supercomputers are used for climate research, molecular modeling and physical simulations. They are important because of their capacity as they provide immense but cost-effective computing power to researchers. It’s power can be split across users and tasks, allowing many researchers simultaneous access for their various applications. Supercomputers are also essential because of their capability. Some applications might require the maximum computing power to solve a problem – running the same application on a less capable system might not be feasible.

Cray 1 Core
Cray 1 Core, image by carrierdetect

Supercomputers themselves are an interesting research topic and excite scientists of many branches of computer science. I myself have been working in the area of tool development for high performance computing, an area closely related to supercomputers. I therefore follow the current development in this area and even dare to speculate about what is going to happen next.

Recent Development

We have seen how the gaming industry and high performance computing stakeholders started working together to develop the Cell microprocessor architecture that is being used both for game consoles and supercomputers. Recently, a similar collaboration caused an outcry in the global supercomputing community when China finished their Tianhe-1A system that achieved a peak computing rate of 2.507 petaFLOPS making it the fastest computer in the world. It is a heterogeneous system that utilizes Intel XEON processors, Nvidia Tesla general purpose GPUs and NUDT FT1000 heterogeneous processors (I could not find any information about the latter).

While these computers are impressive, they are built in what I would call the conventional way. These systems are really at the limit of what is feasible in terms of density, cooling and most importantly energy requirements. Indeed, the biggest challenge for even faster systems is energy.

Upcoming Systems

Right now the Chinese system got the upper hand but IBM is building a new system in Germany called the SuperMUC (German). While using commodity processors, this new system will be cooled with water – a concept I suspect we will see a lot in the future. It is expected to drastically reduce energy consumption.

Another computer that is probably going to dwarf everything we have seen so far is Sequoia system by IBM. Utilizing water cooling like the German system, it is a Blue Gene/Q design with custom 17-core 64bit power architecture chips. The supercomputer will feature an immense density with 16,384 cores and 16 TB RAM per rack. As this system is aimed to reach 20 petaFLOPS it is a great step toward the exascale computing goal. Personally I would love to get my hands on a rough schematic of the Blue Gene/Q chip as it somewhat reminds me of the Cell processor: it also has 1 core to run the Linux kernel and a number of other cores for computation. What I would like to know is if the cores share memory or if they have dedicated user programmable caches that are connected via a bus.

Frostburg, The Connection Machine CM-5
The Connection Machine CM-5, image by Ryan Somma

The role of ARM

I’ve been wondering for a while now if ARM will try to get a foot into the HPC-market. ARM Holdings is a semiconductor (and software) company that licenses its technology as intellectual property (IP), rather than manufacturing its own CPUs. There are a few dozen companies making processors based on ARM’s designs including Intel, Texas Instruments, Freescale and Renesas. In 2007 almost 3 billion chips based on ARM designs were manufactured.

While ARM is considered to be market dominant in the field of mobile phone chips, they just recently started to break into other markets, especially servers and cloud-computing – also including high performance computing. Their new Cortex-A15 processor features up to 4 cores, SIMD extensions and a clock frequency of up to 2.5GHz while being very power-efficient. We will have to see how this plays out but I can imagine that for example the Exascale Initiative is very interested in something like this.

Your own embedded supercomputer (sort of)

A processor similar to the new Cortex-A15 is the OMAP 4 CPU by Texas Instruments. It is a Cortex-A9 dual core processor with SIMD extensions. While targeted at mobile platforms, this CPU is very similar to the upcoming high performance chip since it supports the same instruction set. There exists a cheap, open source development board for aforementioned Coretex-A9 called the Pandaboard. It sells for less than $200 and is a complete system that runs Ubuntu. In my opinion, this can be seen as single node of a high-performance but power efficient computer cluster.

Open Kinect Prize

As stated in my 3D Vision for Embedded Systems post, Microsoft’s Kinect could be an extremely useful tool to give robots 3D vision. Now the fine people from Adafruit offer a prize for it. If you can provide open source software that gives access to the Kinect video signal (RGB) and distance values they will pay you 1000$. A tempting offer. I’m looking forward to seeing the results.

Update: The guys from ifixit.com got their hands on a Kinect and took it apart. The article shows detailed images of everything and even identifies all integrated circuit chips found inside the device.

Second update: Adafruit increased their offer to now $3000 and it looks like the problem isalready solved, at least partially. Check out this youtube video of the Kinect displaying RGB-D information on Windows 7. The depth information looks somewhat weird though. New updates can be found here. This video shows what the infrared projection of Kinect looks like. Pretty cool!

Third update: It is done! There now exists an open source driver for Microsoft’s Kinect. Adafruit announced the winner of the $3000. And more importantly the code can be found here.

Diffie-Hellman key exchange in Javascript

The Diffie-Hellman key exchange protocol (RFC2641) is an efficient way for two parties (that do not know each other beforehand) to establish a shared secret. This secret can then be used for encrypted communication between the two parties. Because of the simplicity of the algorithm it is a feasible to implement it in Javascript. It could serve as a way to communicate between to participants of a social network in a host-proof way (meaning the host can not access the messages that are exchanged between Alice and Bob).

The following straight forward implementation is based on the ingenious BigInt Javascript library created by Leemon Baird:

function d_h()
{
  // Alice tries to send to Bob, they agree on a 256 bit prime
  var p = str2bigInt(
    "10883804965994030335610375728683210724614077" +
    "5791152305372594007835539736018383", 10, 80);
  var g = str2bigInt("2", 10, 80);       // and on a generator

  // Alice chooses her secret randomly
  var a = str2bigInt(
    "3103830701216420526524336327874353922197141754" +
    "5159778383387464250", 10, 80);

  // Bob chooses his secret randomly
  var b = str2bigInt(
    "26264611999731746945409581849451656136966431" +
    "516270225144913718599547", 10, 80);

  // calculate the public key
  var A = powMod(g,a,p); var B = powMod(g,b,p);

  // A and B are exchanged publicly
  // Alice and Bob can compute the same key from A and B
  var a_sec = powMod(B, a, p);
  var b_sec = powMod(A, b, p);

  alert(bigInt2str(a_sec, 10) + '\n ' + bigInt2str(b_sec, 10));
}

Update: Please note that the following code might produce an insecure generator. An alert reader pointed out that the values it returns for g are too small.

In case you wonder how to choose p and g – so did I. Turns out a generator for those numbers is part of the Crypto++ library. The following application uses this library to calculate p and g with different length.

#include <iostream>
#include <vector>
#include <tclap/CmdLine.h>
#include "dh.h"
#include "osrng.h"

using namespace std;
using namespace CryptoPP;
using namespace TCLAP;

int main(int argc, char** argv)
{
                                                   // read command line
  CmdLine cmd("Diffie Hellman p g generator", ' ', "0.1");
  UnlabeledMultiArg<unsigned int> m("bv", "bits and values", true, "");
  cmd.add( m ); cmd.parse(argc, argv);
  vector<unsigned int> mv = m.getValue(); mv.push_back(1);
  unsigned int bv = mv[0]; unsigned int nv = mv[1];

  AutoSeededRandomPool arngA;                       // generate p and g
  RandomNumberGenerator& rngA =
    *dynamic_cast<RandomNumberGenerator *>(&arngA);
  for(unsigned short i=0; i<nv; i++)
  {
    DH dhA(rngA, bv);
    cout << "p=" << dhA.GetGroupParameters().GetModulus() << " g=";
    cout << dhA.GetGroupParameters().GetSubgroupGenerator() << "\n";
  }
  return 0;
}

The application uses the TCLAP library to parse the command line. The first parameter should be the width of the prime number in bits, the second optional parameter specifies how many primes and generators should be calculated.

Here’s a list of p and g I generated with different widths:

128 bits:
p=197221152031991558322935568090317202983 g=2

256 bits:
p=11245712998331706449413325803449175679094351102802336690101496856041
0379195027 g=3

512 bits:
p=85591083890203292422929753448312002013227700164244056883767436387780
3988137714894694780639823858404151944019992782250585325661689168497741
6276395927762259 g=3

1024 bits:
p=10756096062425499020105845514001321706994933124811434157429824859705
5584898126321093330768462486995125563469663163251404817709163750190539
8111908385982976186475492023836046174671146008757104643109463540714363
4989690852152224155785251701387082100575086970999263664414942032660939
2283245879753842682719140993487 g=2

2048 bits:
p=28401819833194672263220074318568239913240629340614134316461681527129
7400190276361407578313119949940155208450277590596255133024483132860231
2683364685437571458516690608818388566379408009734093619889676676688482
4330230759478149039933887325452487112982841213248410034360015394086822
8492443771741195501623918057531604439427869721645027989688343068984123
6647850119363273567137904693671110310979494733094653291727319950103948
4827099108375018229688415849095088940888827138097109161920916553046346
9800257941704054817481593649080741474714321422084101095199976587101477
59474207550469840261684005212634493995003979085986141372247 g=2

Those numbers do not have to be calculated every time but in fact can be hard-coded in your application.

3D Vision for Embedded Systems

3D Vision is an integral part of every autonomous robot. Getting it right is not trivial and certainly not cheap. We struggled with this problem ourselves when trying to build an autonomous Mikrokopter drone. Cheap systems are not very reliable, good systems are disproportionately expensive and often very heavy. However I believe there may be hope yet for usable and cheap 3D vision systems for hobbyist and hackers alike. Granted, Microsoft is a rather unexpected reason for such hope but their Project Natal (now Kinect) might just be what we have been waiting for (image by litheon).

It’s a structured-light 3D scanner developed by PrimeSense. While being used for measuring the three-dimensional shape of an object I think it should be possible to use it as a 3D vision system for indoor robots. So I would really like to see hackers taking this cheap device apart and re-purposing it for home-brew 3D vision. In their paper “A low cost 3D scanner based on structured light“, Rocchini et al. describe how to build one and this incredible instructables article explains how to build one also and even provides source code.

Of course there are other exiting technologies out there, a comprehensive list of 3D vision devices can be found at acroname.com (including prices). The devices range from expensive but effective and impressive Time-of-flight cameras (vendors: [1] [2] [3]) to LIDAR systems. The latter were very successfully used in DARPA’s Urban Challenge autonomous driving competition. A very interesting video that describes how LIDAR works in detail is available at blip.tv.

Of course there’s also the good old ultrasonic and infrared sensors that are cheap and have low resolution and are moderately reliable but they can certainly be very useful.

Sphinx Cheatsheet

I find myself working with the Sphinx Documentation Generator a lot recently. It is a great tool to easily create intelligent and beautiful documentation and it is based on reStructuredText. I put together a cheat sheet of all the commands I frequently use. There are a number of very good ones out there ([1] [2] [3] [4]) but I find it more convenient to work with my own.

You can access the PDF from here and the source (OpenOffice Writer document) here. If you find it useful or would like an item added, feel free to leave a comment.

Infobox Icons

I like to scribble with Inkscape while listening to great music or radio plays to relax and clear my head. Today I created a couple of icons that I plan to use in documents to mark special boxes:

It’s a question, an error, a warning, an information and an idea icon. I currently use them in software documentation using the Shpinx Python Documentation Generator. It’s a very good tool to create beautiful documentation especially for software projects but I also use it to publish my game design document. Here’s an example of how the icons could be used:

I based the icons upon images from the Tango Icon Gallery. You may download an archive containing the source (Inkscape .svg) and PNGs of different size here. Use them as you wish, if you find them useful I’d be very happy if you left a comment here.

Back to Browser Games

I’m back thinking about games, browser games to be exact. It’s been a while since I thought about games or wrote code for one – it’s been about 1 year since I stopped developing a prototype I called ‘Prototype 4′ which is a web based Master of Orion clone. I was working on the game full time for about a month but then had to stop because I had to take up a job to get some money and I also had to finish my diploma thesis. It progressed fairly far further than any other of my prototypes in fact: I created some nice graphics and a layout, a cool javascript user interface, a very efficient yet flexible back end library in PHP and a C++ battle calculator. Here are 2 screenshots of what the game looks like:

Prototype 4 Screenshot 1 Prototype 4 Screenshot 2

This was supposed to be a very literal copy of Master of Orion that you could play online with up to 15 other players on one map. Yet now I have something different in mind and I also have a different approach. It will only be a hobby project and I don’t plan to get immediate results, I want it to progress and evolve in my mind until I’m at a point where I can start writing code. Writing the code should be just handcraft at that point as I have most of the skills and knowledge to develop such a game.

What I’m doing now is writing a design document. The design will be highly influenced by the Foundation Series, one of the greatest science fiction series ever written. The first version of the design document can be found here. I still want the game to have a Master of Orion look and feel to it and it will certainly be a 4X type of game but I want different factions to be completely different (comparable to the different races in Starcraft) in the way they are played. I would like to see a “First Foundation” faction that relies on technology, economy and firepower and a “Second Foundation” faction that relies solely on the power of their minds, e.g. mind control. More about that can be found int the design document. A third faction is also possible but I have no concrete ideas about that yet. As the game design and the game itself progresses I will occasional write articles about it here on this blog.



« Newer PostsOlder Posts »