State of the Intern
It’s been so long since I’ve written a blog post (a month, I think) that I’m literally at a loss as to how best to write this massive update. So I’ll just start. I’ve been working hard at my internship out in Oregon with MusicStrands, and I’ve been making good progress on manipulating the data that MusicStrands has in order to make sense of it and use it to drive recommendations for other songs. The system stems in part from the visualization methods of my “Goggle” iterative search engine (currently down), but it has grown into something much larger and more complex.
I’ve been unable to post about exactly what I’m doing because of my NDA. However, the work has generated enough interest that MusicStrands is going to start using these visualizations to promote the company, and since the visualizations are eventually going to be featured in the L.A. Times, I can share an example with you as a sort of sneak preview.
The visualizations themselves are really only the tip of the iceberg. The method of actually rendering nodes (songs, in this case) using multi-dimensional scaling techniques is well known. It’s the underlying algorithm determining the nature of their relationships that’s really interesting (imho). The motivation behind the relationships between songs borrows heavily from Chris Anderson’s “Long Tail” articles on the nature of an “optimal” recommender system. Basically, the visualization technique generates an easily recognizable tapestry of popular music according to “Long Tail” aspects of music, but it can be configured to render a “personal” tapestry of the kinds of popular music that an individual enjoys, including songs that may not be present in his or her own library. You can see in the picture above how the songs automatically arrange themselves into easily identifiable “pseudo-genres” of different musicians and musical styles.
Because these cluster-like genres are generated dynamically, they won’t restrict the labelling of artists or songs, and will instead reflect the natural clustering of music at any point in time. For instance, you don’t have to say “I like Jennifer Lopez and Hip Hop Music”; you could theoretically find songs that mix these two types of music. The recommender would recommend a duet between Jennifer Lopez and 50 Cent rather than, say, a duet between Jennifer Lopez and Marc Anthony (which no one would want anyway). I’ve highlighted the noteworthy clusters in the picture, and their positions next to each other will often expose a “grey area” of collaboration or a mix of styles.
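For anyone curious about the well-known part, here is a minimal sketch of classical multi-dimensional scaling in Python with NumPy. It is not the MusicStrands code, and it says nothing about the relationship algorithm, which I can't share. The song names and dissimilarity numbers are made up, and `classical_mds` is just a name I picked for the illustration. All it shows is how pairwise "how different are these two songs" scores can be turned into 2D positions, so that similar songs land near each other.

```python
import numpy as np

def classical_mds(dissimilarity, dims=2):
    """Embed items in `dims` dimensions from a symmetric dissimilarity matrix."""
    d = np.asarray(dissimilarity, dtype=float)
    n = d.shape[0]
    # Double-center the squared dissimilarities to get a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # eigh handles symmetric matrices; keep the `dims` largest eigenvalues.
    eigenvalues, eigenvectors = np.linalg.eigh(gram)
    top = np.argsort(eigenvalues)[::-1][:dims]
    # Clip small negative eigenvalues from rounding or non-Euclidean input.
    scale = np.sqrt(np.clip(eigenvalues[top], 0.0, None))
    return eigenvectors[:, top] * scale

# Hypothetical songs with made-up dissimilarities (0 = identical, 1 = unrelated).
songs = ["hip_hop_a", "hip_hop_b", "latin_pop_a", "latin_pop_b"]
dissimilarity = np.array([
    [0.0, 0.2, 0.9, 0.8],
    [0.2, 0.0, 0.8, 0.9],
    [0.9, 0.8, 0.0, 0.1],
    [0.8, 0.9, 0.1, 0.0],
])

coords = classical_mds(dissimilarity)
for name, (x, y) in zip(songs, coords):
    print(f"{name}: ({x:.2f}, {y:.2f})")
```

With these toy numbers the two hip hop songs end up close together and far from the two latin pop songs, which is the same effect that produces the “pseudo-genres” in the picture. The exact coordinates can come out mirrored from run to run or machine to machine, since only the distances between points are meaningful.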