I had the miserable luck of coming down with the flu recently, and I’m still recovering. However, it’s given me two days of relative reprieve from most of my usual duties, and all things considered, I don’t think I really missed much. I did get a chance to do a lot of reading: I finished Pinker’s excellent The Language Instinct and got through most of Barabasi’s Linked.
I also managed to get a better understanding of Laplacian matrices, and finally brushed up on partial derivatives and partial differentials. Most of this reading is geared towards a better understanding of network theory, both in terms of visualization and general network metrics. The main problem I keep bumping into is constant confusion in terminology: I read one pertinent definition, only to find a key equation expressed in general “physics” notation, in which I’m not versed. Luckily, I can usually find an explanation of the terminology on another page (Wikipedia gets high marks from me based on its usefulness here).
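For my own reference, the combinatorial graph Laplacian is just the degree matrix minus the adjacency matrix, L = D − A. A minimal sketch in Python (the triangle graph is a toy example of my own, not from any of the readings):

```python
# Sketch: the combinatorial graph Laplacian, L = D - A,
# built from a dense adjacency matrix given as a list of lists.

def laplacian(adjacency):
    """Return the Laplacian L = D - A for an undirected graph."""
    n = len(adjacency)
    lap = [[0] * n for _ in range(n)]
    for i in range(n):
        degree = sum(adjacency[i])  # D[i][i]: number of neighbors of i
        for j in range(n):
            lap[i][j] = (degree if i == j else 0) - adjacency[i][j]
    return lap

# Toy example: a triangle (every node connected to the other two).
triangle = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
L = laplacian(triangle)
# A defining property: each row of the Laplacian sums to zero.
assert all(sum(row) == 0 for row in L)
```

The row-sums-to-zero property is what makes the Laplacian useful for diffusion-style calculations on a network.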
Network theory is really sort of an odd bird. Networks are difficult for computers, as well as people, to parse. Scale-free networks, of the type commonly described by Barabasi et al., are incredibly difficult to parse due to the existence of “hub” nodes, which have an extremely large number of connections. As a result, it is difficult to “flatten” a network for visualization in the same way one would flatten a lattice. Furthermore, it is inefficient to describe the network’s connections as one or more matrices: the hub nodes prevent the graph from becoming (even marginally) disjoint, yet the vast majority of (row, column) pairs turn out to be null. On the other hand, there are piles and piles of operators and coding conventions for matrices, particularly routines that uncover salience and orthogonality in the data. Physicists usually work in terms of Euclidean or Minkowski space, but the inherent dimensionality of networks just does not lend itself to presentation on a two- or even three-dimensional lattice (read: a two-dimensional screen or sheet of paper). Yet that is the medium networks must be presented in to take advantage of our most powerful multidimensional sense organ: stereoscopic vision.
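Since most (row, column) pairs in a sparse network’s adjacency matrix are null, an adjacency-list representation stores only the edges that actually exist. A quick sketch (the edge list, with its single made-up “hub” node, is purely illustrative):

```python
# Sketch: storing a sparse network as an adjacency list (dict of sets)
# rather than a mostly-null matrix. The edge list below is a made-up
# example with one hub node connected to everything else.
from collections import defaultdict

edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("a", "b")]

adjacency = defaultdict(set)
for u, v in edges:
    # Undirected graph: record the edge in both directions.
    adjacency[u].add(v)
    adjacency[v].add(u)

# Degree is just the neighbor-set size; the hub stands out immediately.
degrees = {node: len(neighbors) for node, neighbors in adjacency.items()}
```

Storage here grows with the number of edges, not with the square of the number of nodes, which is exactly the property the matrix form lacks for sparse graphs.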
I guess I’m glad I took these couple of days “off”, as they’ve helped underline some of the “impasses” surrounding network theory, and I still feel that the network plotting technique I’ve developed for MusicStrands is validated.
The main notion that I don’t believe network theorists have latched onto yet is “entropy”. Researchers will uncover (yet another) scale-free network, offer some summary statistics of the data (clustering coefficient, log-log degree-distribution plots, etc.), and basically leave it at that. I think we need to move past that to get a better understanding of a network under consideration. Once we’ve identified a network’s structural character (random, small-world, or scale-free), we’ve essentially laid down a baseline expectation for the behavior of such a network. We can exploit this fact by finding out how and why the observed network deviates from those expectations in certain localized cases. Currently, I believe this can best be accomplished through good ol’ fashioned matrix calculations, but I can definitely see an opportunity for a network-centered algorithmic approach. I am aware of the Boost Graph Library for C++, and I suppose that’s the tool I should be trying to use. If only someone would write a Perl wrapper for it 😦
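To make “localized deviation from expectation” concrete: the per-node (local) clustering coefficient is the fraction of a node’s neighbor pairs that are themselves connected, and comparing it across nodes shows which parts of the network depart from the aggregate statistic. A sketch, with a toy graph of my own invention:

```python
# Sketch: local clustering coefficient per node,
# C_i = 2 * (links among neighbors of i) / (k_i * (k_i - 1)).
from collections import defaultdict
from itertools import combinations

def local_clustering(adjacency, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # undefined for degree < 2; use 0 by convention
    links = sum(1 for u, v in combinations(neighbors, 2)
                if v in adjacency[u])
    return 2.0 * links / (k * (k - 1))

# Toy graph: a triangle (a, b, c) plus a pendant node d hanging off a.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")]
adjacency = defaultdict(set)
for u, v in edges:
    adjacency[u].add(v)
    adjacency[v].add(u)

coeffs = {n: local_clustering(adjacency, n) for n in adjacency}
# b and c sit in fully connected neighborhoods (coefficient 1.0),
# while a's pendant neighbor d drags a's coefficient down to 1/3.
```

Nodes whose local coefficient sits far from the network-wide average are exactly the “localized cases” worth a closer look.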