September 8, 2009

Visualizing Music: New Blog Launched!


Paul and I have just launched a new blog:
http://visualizingmusic.com/

From the about page:

As the world of online music grows, tools for helping people find new and interesting music in these extremely large collections become increasingly important. In this blog we survey the state-of-the-art in visualization for music discovery in commercial and research systems.

Paul and I wanted to create an ongoing, up-to-date resource of all the different instances of musical “discovery” visualizations that we have come across. We’re defining this as broadly as we can, while avoiding musical signal visualizations, such as the iTunes and Winamp visualizer plugins. Essentially, music discovery visualizations should help you understand relationships between songs, and ideally help you identify new and relevant content that you had not heard or considered before.

There is an incredible variety of techniques used to express musical relationships with images rather than words. Many people believe the best way to communicate about music is not through words. As the famous saying goes: “Writing about music is like dancing about architecture”.

Be sure to check it out, and if you have any suggestions for new visualizations, or comments in general, don’t hesitate to contact either Paul or me.

July 15, 2009

sculpt3d

Update: This tool is now available on CRAN.

Researchers in academia and industry consistently use visualizations to better understand their data. The standard two-dimensional scatter plot is a staple of many exploratory data analyses. However, there are many cases where two dimensions are not enough. For instance, in the plot below, I’ve set up a pairwise plot between three distributions (x, y, and z). The plots are duplicated on opposite sides of the diagonal (showing x vs. y, or y vs. x). I find it extraordinarily difficult to perceive the three-dimensional structure of this data solely from the series of two-dimensional plots shown here.

3d scatter plot

Luckily, R has a package called rgl that lets me plot in three dimensions using OpenGL.  It even allows for simple interactions with the mouse, like rotation and zooming.

rgl

This allows for a much better sense of the underlying data… but it’s still a bit limiting. Many times, there is “interesting structure” in the three-dimensional representation that I would like to focus in on, or perhaps there’s just a lot of junk that I want to get rid of. Often this structure is not isolated in one or two dimensions, and only becomes apparent through rotating and zooming the three-dimensional display.

rgl has a useful function called “select3d” for selecting points in the plot, but it’s a bit tricky to use. Using select3d involves typing into the R console, then clicking and dragging on the plot. This produces output from the select3d function. However, the output of the function is… another function! It is then necessary to apply this returned function to the original data to determine which data points fall inside the selection range… and then it’s completely up to you what you do at that point. Do you filter them out? Crop them? Color them differently than the others? It becomes necessary to keep a further supply of functions at hand that can perform these routines on the output of select3d.

This is still all a bit too much to keep in my head for extended periods of time.  It’s also a pain to constantly switch back and forth between the three dimensional plot environment and the console in order to pare down the data.

So, I’ve cooked up a little GUI tool bar that makes selecting, labeling, cropping, and deleting points in the RGL plot much simpler.  It’s called “sculpt3d”, since it’s focused on altering and shaping the underlying data.  In order to launch the toolbar, you first install a small library, and then make a method call:   sculpt3d(x,y,z) where x, y, and z are the three dimensions you are interested in plotting.

I put together a small video that shows me playing around with it:

In the video, you can see how I choose a selection color (a mint green color that I thought would be noticeable against the bright rainbow colored points).  Then I crop the data on this selection.  This is a bit jarring because the crop function automatically zooms in on the selected points, and reverts them back to their original color.  The delete function works in a similar way, and is a little easier to follow.

Once I have a smaller collection of data points, I might be interested in labelling them (assuming I passed labels as an argument to sculpt3d). I can toggle the labels with a click of the “Label” button.

If I’m interested in saving the results of the pruning and cropping, I can access the currently selected points by calling the function sculpt3d.selected(), or the currently visible scene by calling sculpt3d.current(). Each returns the same sort of logical vector that you get by applying the function returned by select3d to the data, so I can use it to filter my dataset, save it under a new name, and come back to it later on. Furthermore… since it’s in R, and uses a cross-platform GUI (GTK+ via RGtk2 and Glade), it’s possible to use this tool and the data on any platform I want to.

Currently, I only have this tool on a local webserver, and it’s still a bit rough.  However, I thought I’d make it available.  To try it out, enter the following in R’s console:

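# package names must be quoted strings
install.packages("rgl")
install.packages("RGtk2")
# sculpt3d is fetched from my own repository rather than the default mirror
rep <- 'http://ethos.informatics.indiana.edu/~jjdonald/r'
install.packages("sculpt3d", repos = rep)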

It’s currently source-only, so if you’re on Windows, you’ll need to install Rtools in order to compile everything. You’ll also need the GTK+ framework and Glade (which should get installed automatically with RGtk2).

However, once that is done, you can check the demo out by entering:

library(sculpt3d)
demo(sculpt3d)

Thanks to Daniel Adler and Duncan Murdoch for rgl, and Michael Lawrence and Duncan Temple Lang for RGtk2.  Between the four of them, they’ve produced a lot of great stuff for R.

June 18, 2009

HaXe Demonstration

I recently gave a presentation on haXe at the Strands offices in Corvallis and Seattle.  I think it raised a few eyebrows, but probably ended up raising more questions than it answered.  This was to be expected, as haXe is pretty unique in its ability to target multiple web-related platforms.  The devil is in the details, and so I thought it was best to post everything here on the blog so that interested individuals can go over the specifics in a more involved way.

The demo that I showed was actually two parts: A simple “hello world” trace example, which I compile to  swf, php, and javascript.  Then, a more complex example that compiles some code to each of those targets, and then uses conditional compilation to build a web page that displays output from each target all at once (The PHP target references the javascript source, a javascript method references the swf file).

This, to me, is perhaps the most satisfying aspect of haXe: the ability to treat each target as a minor variation of an API, rather than as a set of totally unrelated platforms. Developing, targeting, and displaying results simultaneously is a strange feeling. All of a sudden, javascript, php, and swf are operating off of the same script, and you are totally in control.
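To give a flavor of the conditional compilation mentioned above, here is a minimal sketch (it is not the demo source, and the targetName function is a name I made up for illustration). It assumes the file is saved as Main.hx and uses only the standard js, php, and flash compiler defines.

```haxe
// Main.hx
class Main {
    // Returns a label for whichever target this build was compiled for.
    // Only one of the branches below survives compilation.
    public static function targetName():String {
        #if js
        return "javascript";
        #elseif php
        return "php";
        #elseif flash
        return "swf";
        #else
        return "other";
        #end
    }

    public static function main():Void {
        trace("Hello world from " + targetName());
    }
}
```

The same file can then be handed to the compiler once per target (for example with -js, -php, or -swf output flags in separate .hxml files), and each build picks up its own branch.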

You can download the source as an archive here, or browse the source here.  Once you compile the Demo2 file, you should see something like this.

There’s a README in the source that’ll explain what’s going on.

If you’re totally new to haXe, you’ll want to download the compiler from the haxe.org download page.  After installing, you can run the build1.hxml – build4.hxml examples by simply typing in:

haxe build1.hxml

Do the same for each of the numbered build files. For the build.hxml file (non-numbered), you will need to edit it so that it points and outputs to a working web folder. If you use TextMate, you can download the TextMate bundle for haXe, and use the editor to browse and build files.

May 23, 2009

Injecting Methods into HaXe with "using"

Nicolas just added a new keyword to the haXe compiler called “using”.  In short, this keyword lets you inject additional methods into instances of types.

“using” explained

The haXe “using” keyword  is a special import command for classes.   The command will import the given class normally.  However, it has special behavior if the imported class has any static functions accepting an argument.

The “using” keyword takes any static functions defined in the class, and adds those static methods to the argument type, as if the argument type defined the method itself. In this sense, it behaves like a sort of implicit mixin, or trait. However, it’s also quite different from these approaches; the closest analogue in another language is probably C#’s extension methods, which likewise let a static function be called as if it were a method of its first argument.

Explaining what’s happening in simple prose is actually a bit cumbersome, so here’s some example code.  First, we’ll give a simple haXe example without “using”.  The first class is a Demo class that contains the main function:

// in Demo.hx
class Demo {

    public static function main():Void
    {
        // call the static function directly on its class
        trace(BarStringClass.addBar('foo'));
    }

}

The second is a class that contains additional methods:

// in BarStringClass.hx
class BarStringClass {
    // appends ' bar' to whatever string is passed in
    public static function addBar(str:String):String
    {
        return str + ' bar';
    }
}

If we compile the code using main() from Demo, we’ll get “foo bar”. We’re simply using the static function from BarStringClass to process the string, and then tracing the result. Pretty simple.

Now let’s look at what we can do to Demo.hx with “using”.  We’ll use the same BarStringClass from before, but we’ll change Demo slightly:

// in Demo.hx
using BarStringClass; // here's the 'using' keyword in use
class Demo {

    public static function main():Void
    {
        // addBar() now appears to be a method of String
        trace('foo'.addBar());
    }

}

haXe now lets us access BarStringClass static functions as if they were functions of the String class.  In this fashion, the standard base classes can be augmented with additional methods without the need to extend them explicitly.

Notice how in this example it wasn’t even necessary to pass the “str:String” argument required by addBar(). When the haXe compiler injects the methods from BarStringClass, it automatically fills in the current instance as the first argument, so that 'foo'.addBar() is compiled as BarStringClass.addBar('foo'). Every type that appears as the first argument of a static function in a “using” class is augmented in this way.

So in this case, BarStringClass had one static function (addBar), and the first argument type was String. Therefore, every String instance now gets the addBar() function. You could have many different static functions in a “using” class, and each of their first argument types will gain the corresponding method in the same way.
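As a rough sketch of that last point, here is a “using” class with two static functions whose first arguments have different types. The Extras class and its twice() function are made up for illustration, and the example assumes a recent Haxe compiler (3 or later) with everything saved in a single file named Main.hx, which is why the using line reads Main.Extras.

```haxe
// Main.hx (the module name matters for the `using` line below)
using Main.Extras;

// Hypothetical "using" class: one static function per first-argument type.
class Extras {
    // First argument is a String, so every String gains addBar().
    public static function addBar(str:String):String {
        return str + ' bar';
    }

    // First argument is an Int, so every Int gains twice().
    public static function twice(n:Int):Int {
        return n * 2;
    }
}

class Main {
    public static function main():Void {
        var n = 21;
        trace('foo'.addBar());        // compiled as Extras.addBar('foo')
        trace(n.twice());             // compiled as Extras.twice(n)
        trace('foo'.addBar().length); // injected calls chain like ordinary methods
    }
}
```

Strings only see addBar() and Ints only see twice(); the compiler picks candidates by matching the type of the value against the first argument of each static function.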

“using” with types and/or classes

The other nice thing about this approach is that it works with the more broadly defined Types/typedefs, and not just Classes. So, my IterTools functions that work (mainly) with Iterable Types can be injected with “using”, and then you can do things like:

for (i in [1, 2, 3, 4].cycle(5)) {
    trace(i);
}

…to iterate through the numbers 1 through 4 five times. “cycle()” and the rest of the functions that operate on Iterables would also then work with Lists, FastLists, Hashes, and so forth.  When you add “using IterTools;” at the top of a class file, anything that can iterate can now “cycle()”.
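For the curious, a cycle() along these lines could be sketched as follows. This is not the actual IterTools code; CycleTools is a hypothetical stand-in, and the sketch assumes a recent Haxe compiler (3 or later) with everything in a single Main.hx file.

```haxe
// Main.hx
using Main.CycleTools;

// Hypothetical stand-in for the cycle() function described above.
class CycleTools {
    // Walks through `it` from start to finish, `times` times in a row.
    public static function cycle<T>(it:Iterable<T>, times:Int):Iterator<T> {
        var remaining = times;      // passes not yet started
        var cur:Iterator<T> = null; // iterator for the current pass
        return {
            hasNext: function():Bool {
                // Start a fresh pass whenever the current one is exhausted.
                while ((cur == null || !cur.hasNext()) && remaining > 0) {
                    remaining--;
                    cur = it.iterator();
                }
                return cur != null && cur.hasNext();
            },
            next: function():T {
                return cur.next();
            }
        };
    }
}

class Main {
    public static function main():Void {
        var seen:Array<Int> = [];
        for (i in [1, 2, 3, 4].cycle(5)) {
            seen.push(i);
        }
        trace(seen.length); // 20 values: four numbers, five passes
    }
}
```

Because the first argument is typed as Iterable<T> rather than Array<T>, anything with an iterator() method qualifies, which is the whole point of injecting into a typedef instead of a class.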

Caveats

There are two significant caveats with “using”. The first is that the injected methods are not available through reflection. This means they’re not part of the “run-time” class instance. The second caveat is that it is very easy to end up with several candidates for a single method name (from multiple “using” classes). Currently, haXe’s behavior is to only use the first matching static method from a “using” class.
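The first caveat is easy to see for yourself. In this small sketch (Point and PointTools are made-up names, and a recent Haxe compiler with a single Main.hx file is assumed), the injected norm() method can be called on a Point, yet it never shows up in the list of instance fields that the Type API reports at run time.

```haxe
// Main.hx
using Main.PointTools;

class Point {
    public var x:Float;
    public var y:Float;

    public function new(x:Float, y:Float) {
        this.x = x;
        this.y = y;
    }
}

// Hypothetical "using" class that adds norm() to Point.
class PointTools {
    public static function norm(p:Point):Float {
        return Math.sqrt(p.x * p.x + p.y * p.y);
    }
}

class Main {
    public static function main():Void {
        var p = new Point(3, 4);
        trace(p.norm()); // 5

        // The class itself was never changed, so reflection knows nothing about norm().
        var fields = Type.getInstanceFields(Point);
        trace(fields.indexOf("norm")); // -1
    }
}
```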

“using” can be habit forming

Some of the positives to the “using” keyword approach are:

  1. You can manage classes externally from code that defines them, seamlessly adding in functionality by augmenting classes and typedefs.
  2. Anything you can do through the “using” keyword can be done by calling the static functions directly, so there is very little ‘magic’ going on… just some clever reformatting of function calls.
  3. There’s a ton of static classes out there (in addition to my own) that already contain a lot of good practical functionality.  You can easily reuse these classes as “using” classes, or write your own that handle specific needs.
  4. Method completion works with --display. So once you include the “using” keyword with an appropriate class, you can instantly see the extra available methods by triggering a code-completion request inside your editor (assuming the editor does --display code completion).

I’m assuming “using” will be added in the next version of haXe (2.04).  Till then, you can play around with it by downloading and building from the CVS sources.

7/26 Update: “using” is now officially part of haXe (2.04).

May 10, 2009

Tim Ferriss and the 4 Hour Work Week

So, there’s this Tim Ferriss guy that’s some kind of productivity guru.  He’s written a book called “The 4 Hour Work Week” which gives you tips on how to streamline the amount of “work” that you do in a day.

In the interest of full disclosure, I didn’t read the book, because I tend to be more interested in what people do than what they say to do. The CliffsNotes versions I’ve seen online seem to imply that he has rediscovered the Pareto Principle, or the incredibly well-worn 80/20 rule, along with a host of other efficiency principles that seem incredibly hackneyed. I’m not sure what his unique twist is; maybe he’s good at telling stories.

It turns out Tim’s claim to fame is that he’s gotten very good at finding success in niche activities on technicalities, and then marketing the hell out of them.

  1. He’s a Chinese Kickboxing champion, which he achieved by shoving the smaller opponents out of the ring, and winning through their disqualification.
  2. He’s an expert at Argentine tango, where he has set a record for spinning.  This isn’t really a technical feat, it was more of a novelty act for the Kelly Ripa show.
  3. He lectures at Princeton’s electrical engineering group (on entrepreneurship).

I think that very few people can actually benefit from what he talks about, mainly because he proposes “outsourcing” minute details of daily planning (to personal assistants in India… seriously!).  I don’t know if he’s thought about marketing his book in India (who would they outsource to?).  He also is against liberal use of cell phones and e-mail.

I want to chalk this up as another “the secret to success is selling secrets to success” type of situation. I couldn’t care less about yet another productivity guide; I’d rather hear about how he was able to network and promote himself. That seems to be something he actually knows a thing or two about. From what I hear, his self-promotion drive irked a lot of people, and certainly took much more than 4 hours a week. However, I have to hand it to him, it seems to have been very successful.

May 7, 2009

Duck Typing and "Roles", Continued

My previous article talked a bit about Perl’s new “roles” method for managing typing. In the article, I tried to address some of the criticism leveled against duck typing, and in doing so perhaps misrepresented Perl Roles as being “rigid”.

Another blogger by the name of Sam Crawley gives an overview of the way that roles provide more flexibility in a recent post; however, there is still a significant difference between the two approaches.

Duck Typing and Reflection

In both the original post on duck types and Perl Roles, as well as Sam’s rebuttal of my previous article, there is mention of the “can” method.  This method is used like:

$dog_or_tree->can( 'bark' ); 

This method tests whether an object has the requisite field (presumably so that you can call a corresponding method, or access a property), and is very similar in intent to duck typing. However, it’s important to note that using this “can” method is not the same as duck typing. The method is actually a form of runtime reflection, and not a compile-time type check. This is an important distinction, because true duck typing, in the sense I use it here, has the compiler check the types once at compile time, while reflection requires that the check be repeated every time the code is run. On many platforms (VMs such as Adobe’s AVM2), reflection calls are quite slow and should be avoided.
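To make the distinction concrete, here is a small haXe sketch of the two kinds of check side by side. The names Barker, speak, and canBark are made up for illustration, canBark is only a rough analogue of Perl’s can(), and a recent Haxe compiler with a single Main.hx file is assumed.

```haxe
// Main.hx
typedef Barker = {
    function bark():String;
}

class Main {
    // Compile-time check: the argument must structurally match Barker.
    static function speak(b:Barker):String {
        return b.bark();
    }

    // Runtime check in the spirit of can(): the field is looked up by name
    // every time this function is called.
    static function canBark(o:Dynamic):Bool {
        return Reflect.isFunction(Reflect.field(o, "bark"));
    }

    public static function main():Void {
        var dog = { bark: function():String return "woof" };
        var tree = { bark: "rough" };

        trace(speak(dog));    // accepted by the compiler
        // speak(tree);       // rejected at compile time: bark is a String, not a function
        trace(canBark(dog));  // true
        trace(canBark(tree)); // false, but only discovered while the program runs
    }
}
```

The speak() call costs nothing extra at run time, because the compiler has already settled the question; canBark() has to ask it again on every call.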

It seems that both authors are conflating these approaches, and unfairly criticizing duck typing for the inefficiencies of managing types through reflection:

If you want to get stricter, and if your language supports this, you can even check the arity or types of the allowed signatures of the method — but look at all of the boilerplate code you have to write to make this work. That’s also code to check only a single method.

You don’t need to insert “boilerplate code” in your classes for conventional duck typing approaches.  You write one “typedef” of required methods, as in a haXe example:

typedef Duck = {
    function walks(location:Position, direction:Vector<Float>):Waddle<Step>;
    function talks(say:String):Quack<Sound>;
}

You typically put this typedef somewhere in your compiler class path. Now you can accept “Ducks” as arguments anywhere, and they can be checked by the compiler. From a typedef, we can assert that the Duck walks and talks appropriately (producing “Waddle” and “Quack” collections of “Steps” and “Sounds”). The arity, return, and argument types let us describe the behavior in great detail. The level of discrimination that is available depends only on the specificity of the classes, and any duck type assertions occur externally to class-specific code. After we write our Main classes, and create the Duck typedef, we could perhaps differentiate between different ducks, such as those that “molt() : Void”, and we wouldn’t need to update any existing class code; we would only need to create a new typedef:

typedef MoltingDuck = {
    > Duck,
    function molt():Void;
}

Now we can make sure we’re dealing with molting ducks, that behave exactly like ducks in every other way.
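Here is a self-contained sketch of the same idea that can actually be compiled. It swaps the illustrative Position, Waddle, and Quack types for plain Int, Array, and String, and the Mallard class plus the parade and groom functions are made up for the example. It assumes a recent Haxe compiler (3 or later) and a single Main.hx file.

```haxe
// Main.hx
typedef Duck = {
    function walks(steps:Int):Array<String>;
    function talks(say:String):String;
}

typedef MoltingDuck = {
    > Duck,
    function molt():Void;
}

// Mallard never mentions Duck or MoltingDuck; it just has the right fields.
class Mallard {
    public var feathers:Int = 100;

    public function new() {}

    public function walks(steps:Int):Array<String> {
        return [for (i in 0...steps) "waddle"];
    }

    public function talks(say:String):String {
        return "Quack: " + say;
    }

    public function molt():Void {
        feathers = 0;
    }
}

class Main {
    // Accepts anything that walks and talks.
    public static function parade(d:Duck):Int {
        return d.walks(3).length;
    }

    // Accepts only ducks that can also molt.
    public static function groom(d:MoltingDuck):String {
        d.molt();
        return d.talks("done");
    }

    public static function main():Void {
        var m = new Mallard();
        trace(parade(m));  // 3
        trace(groom(m));   // Quack: done
        trace(m.feathers); // 0
    }
}
```

Adding MoltingDuck required no change to Mallard at all, which is the property I care about here.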

Roles vs. Duck Typing, revisited

However, for a final comparison of Roles and Duck Typing, it is perhaps fair to say that Roles are more powerful in a sense. Roles actually cover a lot of ground; it’s just different ground than what Duck Typing covers. Roles encompass two major forms of class specification: interfaces and mixins. The former lets you define methods and properties that must be implemented by the host class, while the latter defines methods and properties that are implemented by the Role itself (the methods are specified in the Role description). In this way, you can even use Roles as building blocks for class functionality in lieu of inheritance. It seems that this is a pretty revolutionary way to approach OOP, so more power to them.

However, the  type checking performed by Roles is more rigid and structured than duck typing, at least in terms of what happens at the compiler level. You still must declare a Role inside of a class description (a la interfaces) for compile time checking.  For interpreted languages, it may not be that important to distinguish between type checking that occurs at the compiler vs. the checks that occur at run-time.  However, this distinction can be very important for other languages.

I can appreciate the need for articles to promote the idea of Roles within and without the Perl community, but I think it can also be done without unfairly criticising other approaches.  Despite the inaccuracies, the discussion of Roles has gotten me interested in Perl again, and nothing else has done that in years.

May 5, 2009

Duck Typing and "Roles" in Object Oriented Programming

Update:  I have a more recent post on this subject here.

I recently read an article talking about the new “roles” system in Perl. Roles are classes of objects that perform the same set of functions.  So, if you had an “Eagle” object, or a “Sparrow” object, perhaps both would have the field “feathers”, and the function “fly”.   Each object might then differ in the size of the beak, or maybe only the Eagle has the method “swoop”.  Perl roles would let you tag a collection of methods (in this case, “feathers” and “fly”) as performing a certain role (in this case, probably “flying_bird”).  This is pretty useful, because if you wanted to write a (simple) bird watching simulator, you could handle any type of bird object as long as it handled that method and that property.  If the method names are slightly different (for instance, perhaps the “Condor” object  has the method “soar” instead of “fly”), but behave the same, Roles can resolve this.

Know Your Role

Perl’s role system contrasts with the duck typing that I use a lot in haXe. Duck typing skips the process of setting up explicit roles, and just tries to match the field/method names directly. So, any object that can “fly” and has “feathers” can be considered a “flying bird” without the need to declare roles in each object. The matching parameters for a duck type are typically given in a typedef, which is just a list of field names and types.

The article states that roles have the following advantages:

  1. Roles are explicitly given for classes… in their case they use the example of a field “bark”, which might belong to a “tree” or a “dog” object… duck typing perhaps could confuse them.  Roles would not.
  2. Roles are more flexible… you have more fine grained control over what constitutes or is compatible with a role.

Better Examples

However, I’m not sold. The example they give (“bark” belonging to both “trees” and “dogs”) is not a good one.  “Tree bark” seems like it would be a property, while “Dog bark” seems like it would be a method/action.  This could easily be detected by most duck type checking languages, and would clear up a great deal of field naming confusion right away.

The “old cliché” that they reference (if it walks like a duck, and talks like a duck, then it is a duck) actually gives a better example. In this case, duck typing is occurring on two fields, rather than one. The chances that two unrelated classes share two identically named fields are much lower. In the article they say that “roles add context back”, but I would argue that context can get added back by being more specific with typedefs.

Using Natural Language Instincts

This ability to manage specificity through the selection and addition of representative symbols is an integral part of our natural language abilities. We use it every time we need to search the web (if I wanted to specify that I’m looking for dog-like bark things, I might query “bark woof”, rather than just “bark”). The specification is only as complex as it needs to be, and it relies on human knowledge to decide which terms best describe/discriminate the necessary object properties. It also doesn’t rely on specifically describing the roles beforehand, or on explicitly indicating which objects perform which role. Duck typing does rely on programmers to differentiate their classes semantically through field names, which I consider good coding practice (i.e. don’t have two fundamentally different classes that have the same field names, and don’t have similar classes with completely different field names).

In the end, it boils down to managing complexity.  Perl roles will force you to specify all of your roles explicitly, and require that appropriate object class code indicate a role behavior where necessary.  This is “bad” complexity imho, since it involves a lot of boilerplate and class code markup.  On the other hand, duck typing will force you to be very careful and specific with how you define a typedef, and occasionally force you to change how fields are named.  In certain cases, it might keep you from treating certain classes as related when necessary, or it might confuse two unrelated classes… all on the basis of whether fields match or not.

Managing Type Context with Duck Typing

Certain duck type situations can cause false-negative or false-positive  typing conflicts:

  1. Classes should have the same basic typedef, but have mismatched field names (false-negative typing):  This is usually the result of bad coding style for the classes.  Use this as an opportunity to rename the mismatched field names to something more consistent.  Sometimes getter/setter mechanisms interfere with a field match.  If one class uses a getter/setter for a property field, consider adopting the same behavior on the other class.
  2. Classes are not related, but the typedef selects both of them (false positive typing):  See if there’s another field that can be added to the duck type check.  Usually this will let you discriminate between the two classes.
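The second situation can be sketched in a few lines of haXe. All of the names here (Barks, BarksAndFetches, Dog, Seal, loud, play) are made up for illustration, and a recent Haxe compiler with a single Main.hx file is assumed. A one-field typedef lets an unrelated class slip through, and adding a second field to the check keeps it out.

```haxe
// Main.hx
typedef Barks = {
    function bark():String;
}

// Same check with one more field added to discriminate.
typedef BarksAndFetches = {
    function bark():String;
    function fetch(item:String):String;
}

class Dog {
    public function new() {}
    public function bark():String return "woof";
    public function fetch(item:String):String return "fetched " + item;
}

// Unrelated class that happens to share a field name with Dog.
class Seal {
    public function new() {}
    public function bark():String return "arf";
}

class Main {
    public static function loud(b:Barks):String return b.bark() + "!";

    public static function play(d:BarksAndFetches):String return d.fetch("stick");

    public static function main():Void {
        trace(loud(new Dog()));  // woof!
        trace(loud(new Seal())); // arf! (the false positive)
        trace(play(new Dog()));  // fetched stick
        // play(new Seal());     // rejected at compile time: Seal has no fetch()
    }
}
```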

There still could be certain situations where duck typing fails, and in these cases, I typically fall back on class reflection and other methods to get things working.  However, these cases are very rare for me.

In conclusion, both approaches have drawbacks, but I find that duck typing allows me to handle complexity in more familiar natural language terms, while roles are more rigid and top-down.  The former is better if I’m working on my own projects, but perhaps roles really shine when you have to work with larger classes that you don’t have control over or want to modify extensively.

I’m surprised that Perl went with such a strong top-down approach for determining object class types.  It really seems to go against its loosely-styled ad-hoc nature.  Perl has become somewhat stagnant in recent years, so perhaps this is an attempt to shake things up a bit.  If so, I have doubts whether it’s a step in the right direction.
