Archive for April, 2007

Visualizing graphs based on adjacency matrices using GraphViz and Perl

Saturday, April 28th, 2007

Here is another chapter of my love-hate relationship with Perl. A couple of people recently asked me about drawing graphs, like the guide-tree diagrams I’m generating as part of my diploma thesis, and they were amazed at how nice and easy to use the GraphViz tools and the corresponding Perl module are. So I’ve put together an example. It doesn’t really tell you anything that you couldn’t get from the excellent documentation, but people don’t like reading documentation no matter how well written it is, and I suppose it does solve a rather common problem. Here is the code:

use GraphViz;

#create your adjacency matrix and nodes, do whatever else you need done.
my @admat = (
[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0],
[0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0],
[0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
);

my @nodes = ('1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12');

#create a new graph
my $graph = GraphViz->new();

#add all your nodes to the graph
foreach (@nodes) {
    $graph->add_node($_);
}

#create edges as defined by your adjacency matrix
#(matrix entries are numbers, so compare with == rather than eq)
for my $i (0 .. $#nodes) {
    for my $j (0 .. $#nodes) {
        if ($admat[$i][$j] == 1) {
            $graph->add_edge($nodes[$i] => $nodes[$j]);
        }
    }
}

#render the graph to a png-file
$graph->as_png("pretty.png");


It produces this picture:
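
One thing to be aware of: the example matrix is symmetric and has ones on the diagonal, so the double loop adds every connection once per direction plus a self-loop on each node. If you would rather have a single edge per pair, one option is to walk only the upper triangle of the matrix. The following is just a minimal sketch under that assumption; `upper_triangle_edges` is a hypothetical helper name, not something the GraphViz module provides.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the GraphViz module):
# returns one [from, to] pair for every non-zero entry above the diagonal.
# Assumes the matrix is square, symmetric, and has one row per node.
sub upper_triangle_edges {
    my ($admat, $nodes) = @_;
    my @edges;
    for my $i (0 .. $#$nodes) {
        # start at $i + 1: skips the diagonal and the mirrored lower half
        for my $j ($i + 1 .. $#$nodes) {
            push @edges, [ $nodes->[$i], $nodes->[$j] ] if $admat->[$i][$j];
        }
    }
    return @edges;
}
```

To use it with the code above, you could create the graph with GraphViz->new(directed => 0) so the edges are drawn without arrowheads, and then call $graph->add_edge($_->[0] => $_->[1]) for each pair returned by upper_triangle_edges(\@admat, \@nodes).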

Hello Planet SBLUG!

Monday, April 16th, 2007

Now that I’m on the planet, I’ll try to be less lazy when it comes to blogging. I’ve had a pretty neat idea about projecting one distance matrix onto another one. This allows you to cluster two partially overlapping datasets with different distance measures. In my case, it should allow me to cluster protein sequences and structures into the same tree. The hope is that I can use distance information from (more reliable) structure alignments to compensate for the noise in alignments of remotely similar sequences. Once I’ve come up with a more formal description of the algorithm and a first implementation, I’ll post the details.
Also, it looks like wurst will be uploaded to CPAN soon, so the SALAMI-GUI might be released after all.