Make sure to get your point across
One of the problems with writing cross-discipline research papers is figuring out “how to sell” your idea to the reviewers. Lately, I’ve been getting questions and criticisms on some of my techniques that imply that I’m not getting some of the important points across. I thought I’d try and address them here as practice 🙂
What is unique, novel or innovative about this recommendation mapping technique?
First of all, it is important to realize the technique comprises two (arguably) novel components: the network visualization/embedding process, and the interaction process.
MDS as a general visualization technique for networks is not novel. However, the technique I’m working on does not visualize the entire network in the conventional sense, but rather a small subgraph. Conventional subgraph visualization techniques tend to treat the subgraph no differently than the global network; my technique handles the node weighting and embedding process quite differently. It weights nodes by their participation ratio, a quantity akin to entropy in information theory. The resulting visualization emphasizes significant local features of the subgraph, rather than features that belong to the global network (hubs and the like).
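Roughly, the idea can be sketched as follows. This is a toy NumPy version for illustration only: the entropy-based formula in `participation_weights` and the way the weights are folded into classical MDS are simplified stand-ins, not the exact implementation.

```python
import numpy as np

def participation_weights(A):
    """Weight each node by the entropy of its edge-weight distribution --
    an illustrative stand-in for the 'participation ratio' idea, normalized
    to [0, 1]. A is a (symmetric) weighted adjacency matrix."""
    P = A / A.sum(axis=1, keepdims=True)            # row-normalized edge weights
    with np.errstate(divide="ignore", invalid="ignore"):
        H = -np.nansum(np.where(P > 0, P * np.log(P), 0.0), axis=1)
    return H / np.log(A.shape[0])                   # max possible entropy ~ log(n)

def weighted_mds(D, w, dims=2):
    """Classical MDS on distance matrix D, with node weights w scaling the
    Gram matrix so high-entropy (locally significant) nodes dominate the
    embedding. The weighting scheme here is an assumption for illustration."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n             # double-centering matrix
    B = -0.5 * J @ (D ** 2) @ J                     # Gram matrix from distances
    B = B * np.outer(np.sqrt(w), np.sqrt(w))        # emphasize weighted nodes
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:dims]             # top eigenpairs
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))
```

The subgraph’s pairwise distances (e.g. shortest-path lengths) go in as `D`; the embedding then stretches around nodes whose connections are spread evenly across the local neighborhood rather than concentrated on a single hub.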
The “interaction feature” of my work is also fairly new, although perhaps not completely novel. Even with the different embedding, this technique suffers from the same “occlusion” problem prevalent in MDS. I handle this with dynamic node “repulsion” driven directly by the user’s mouse or cursor. This lets the user spread out dense clusters of nodes without switching to a separate interaction mode (zooming, clicking/dragging, etc.). While the technique won’t work at arbitrarily high densities, it has proven useful for the “nuisance” occlusion that occurs with sets of around 2,000 network nodes. In general, I think that bringing these two techniques together holds a lot of promise, especially in the field of music recommendation (my focus).
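The repulsion behavior can be sketched in a few lines. This is a simplified, framework-free version; the radius and linear falloff are arbitrary illustrative choices, not the values used in the actual interface.

```python
import numpy as np

def repel_from_cursor(positions, cursor, radius=0.15, strength=0.05):
    """Push nodes within `radius` of the cursor outward along the
    cursor-to-node direction, with a force that decays linearly from
    full strength at the cursor to zero at the rim. Called once per
    frame as the cursor moves; distant nodes are untouched."""
    d = positions - cursor                          # vectors cursor -> node
    dist = np.linalg.norm(d, axis=1)
    near = (dist < radius) & (dist > 1e-9)          # skip exact overlaps
    falloff = 1.0 - dist[near] / radius             # 1 at cursor, 0 at rim
    out = positions.copy()
    out[near] += (d[near] / dist[near, None]) * (strength * falloff[:, None])
    return out
```

Hovering over a dense cluster nudges its members apart frame by frame, so the occluded labels become readable without a zoom or drag gesture.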
Why is the visualization more informative/useful to a user than a ranked list of results?
This is an argument I’m going to back up with a quantitative study of user behavior. However, the general idea is that these embeddings are a far richer source of information than a simple list of items. With this technique, the user gains a better understanding of the higher-level thematic contexts (clusters/shaped features) that exist in the results, and these features are specific to the context the user is investigating. This kind of clustering information obviously cannot appear in a list, and it is valuable both for identifying specifically relevant information and for determining parameters to include/exclude in subsequent searches. As a side note, the main problem here is that people seem to want “categorizations” to guide their search. I think this is an idea whose time has more or less passed, especially in the realm of exploratory search. For one thing, categorizations of novel and important content are often not available or even possible. Categorizations are useful as condensed, partitioned generalizations of content, but they can become obstacles or shibboleths to exploratory search. Music information retrieval suffers from this problem. Genres are the obvious categorical descriptors for music. However, ignorance of the arbitrary terminology and labels of genres can impede an individual who is trying to find music they like. Furthermore, genre labels like “rock/pop” that most tracks carry are virtually useless as categories, since they cover such a broad array of music (try filtering on rock/pop using my recommender interface… most of the time it’s not that useful). In the end I’m arguing for a more free-form association method for the underlying data.
This sort of data is easy to collect (by analyzing sets of associated songs in playlists), arrange into essential data structures (networks of songs/people/artists, etc.), and visualize (using many network viz tools), and it arguably presents a clearer picture of how our society understands and responds to popular music as a cultural artifact. It’s important to think of these networks as aggregates of global listening behavior. As an example, consider another comment:
… Can you explain to me why “Snoop Dogg” is more like “Kanye West” than “Nelly”? – they’re all African American rappers
The answer is: In this context and at this particular point in time, Kanye West is associated more strongly with Snoop Dogg than with Nelly on associated playlists and listening behavior.
Wouldn’t it be better to give users control over this rather than just munging everything together?
One of the things I’d like to add to these interfaces is more control over non-dimensional/non-relational pieces of information related to the individual tracks: the obvious features of track, artist, album, year of release, genre, bpm, etc. These would be pretty useful, especially since they would give folks some sort of “bearing” on the results using words they are familiar with. However, searching/filtering by “genre/year/artist” is not necessarily a new concept, and therefore it hasn’t been a focus of my implementation of this interface.
However, I think it’s important to realize that many of the separations and distinctions we make between songs are by and large arbitrary, both when it comes to how we associate songs on our playlists, and when we analyze and compare the underlying acoustic features of songs. Therefore, the notion of “control” over search/exploration using these features is tenuous at its core, since it can end up limiting information retrieval as well. Enforcing categorical descriptors in an exploratory tool “hardwires” them into its use pattern. What happens when these categorical descriptors change? The notion of “rock and roll” is very different now than it was 50 years ago.
An alternate method being utilized by many is the “folksonomy” approach to categorization. This method creates associations between arbitrary songs based on terms that individual users have applied, rather than partitions/hierarchies of songs based on categories prescribed by experts/authorities. However, this method suffers from many of the same problems as genre descriptors. Terms don’t do any good unless people know what they mean. Furthermore, the meaning of terms can change independently of the content they describe, and the meaning of the content can change independently of the terms used to describe it. Twenty years from now, I will wince as all my favorite songs growing up are labeled with the term “oldies”. I’m not arguing for “stability” in the method by which songs are considered and indexed, but rather that we not rely wholly on a layer of abstraction (categorical or even “folksonomical” terminologies) for indexing songs, and that the notion of exploring “song associations” via network associations is worthy of further research.
The user does have a surprising amount of control over what information is returned, given the underlying network neighborhood extraction model. Even though there are millions of songs, the dominant selection patterns for playlists fall into a much smaller set of “trends” at any given point in time. Identifying which pattern the user subscribes to (via their playlists), and then giving them a “local view” of the related data (the unrealized neighbors with strong connections), yields likely candidates for a valid recommendation. People are like snowflakes in that they’re all different… they just often happen to be “more” similar to a distinct group of people… at least in music listening behavior. In fact, the similarities they share are the basis for the evolution and dynamics of genres. In the end, these genres will reflect the attitudes and attentions of our culture… in other words, the genre will always follow the culture. As a side note, Jay-Z and Linkin Park (hip-hop and modern rock/metal) realized that they shared the same fan base and decided to do a crossover album. I think this is a perfect example of the genre “coming to the people”, rather than vice versa.
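As a toy illustration of the neighborhood extraction idea: score songs adjacent to the user’s seed set by summed co-occurrence weight, excluding songs already in the set — the “unrealized neighbors with strong connections”. The dictionary representation and function names are illustrative assumptions; the real system operates over a large playlist network.

```python
from collections import Counter

def recommend_neighbors(graph, seed_songs, top_n=5):
    """Rank the unrealized neighbors of a seed set by total edge weight.
    `graph` maps each song to {neighbor: co-occurrence weight}; this toy
    stand-in assumes weights come from shared playlist appearances."""
    scores = Counter()
    for song in seed_songs:
        for neighbor, weight in graph.get(song, {}).items():
            if neighbor not in seed_songs:
                scores[neighbor] += weight
    return [song for song, _ in scores.most_common(top_n)]
```

For example, with `graph = {"a": {"b": 3, "c": 1}, "b": {"a": 3, "c": 2}, "c": {"a": 1, "b": 2, "d": 5}, "d": {"c": 5}}`, seeding on `{"a", "b"}` surfaces `"c"` — the strongly connected neighbor the user hasn’t realized yet.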
As a final note, criticism always serves the important role of illuminating the critic’s point of view. Several other articles that either support or relate to some of the ideas I’m getting at were pointed out to me. For instance, this paper uses MDS as a user interaction analysis technique for information retrieval. Even though they don’t use their MDS technique as a visualization interface, they make a claim for prominent “user profiles of behavior” that exist among the innumerable choices that are possible.
This paper uses a scatter plot of images returned from an image retrieval result, with differing layout and interaction strategies. The most interesting result (for me) was the poor qualitative user response given for their fish-eye lens approach, and the high qualitative user response given for their slider approach. Since my method is very similar to fish-eye, this is something I’ll need to test as well.