[BiO BB] help !

jeffames ames at alpha.ces.cwru.edu
Fri Dec 7 12:15:50 EST 2001

On Fri, 7 Dec 2001, Tanya Vavouri wrote:

> The problem is : I have sequences in groups that may overlap and I am trying
> to get all the groups that overlap together.

I'm not sure this would be fast, but....  Create some ordering for the
elements of the group (e.g., A = 1, B = 2, ...).  Then convert each group
into a binary string representing the membership of the group.  So your
first group, A B C, becomes 1110000... (to the end of your alphabet).
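As a rough illustration of the encoding step, here is a minimal Python sketch. It assumes each group is a list of hashable labels, and the function name encode_groups is made up for this example. Python integers serve as the binary strings, so A maps to the lowest bit rather than the leftmost character.

```python
def encode_groups(groups):
    """Map each distinct element to a bit position and each group to an int bitmask."""
    # Sort the elements so the ordering (A = bit 0, B = bit 1, ...) is reproducible.
    alphabet = sorted({member for group in groups for member in group})
    bit_of = {member: position for position, member in enumerate(alphabet)}

    masks = []
    for group in groups:
        mask = 0
        for member in group:
            mask |= 1 << bit_of[member]  # set the bit for this member
        masks.append(mask)
    return alphabet, masks


alphabet, masks = encode_groups([["A", "B", "C"], ["C", "D"], ["E", "F"]])
print([bin(m) for m in masks])
```

Two groups share a member exactly when the bitwise AND of their masks is nonzero.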

Then the main algorithm would be something like this:  (pseudocode)

loop: for each group i
	for each group j > i
		if (binarystring[i] & binarystring[j])
			join(i, j)
			goto loop

and the join function would just be something like binarystring[newgroup]
= binarystring[i] | binarystring[j], followed by deleting the ith and jth
groups.
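The loop and join described above might look roughly like this in Python. It is only a sketch: it assumes the groups are already encoded as integer bitmasks, and merge_overlapping is a hypothetical name. The "goto loop" restart is expressed with a flag and an outer while loop.

```python
def merge_overlapping(masks):
    """Repeatedly join any two overlapping bitmasks until none overlap."""
    masks = list(masks)  # work on a copy so the caller's list is untouched
    merged = True
    while merged:  # plays the role of "goto loop"
        merged = False
        for i in range(len(masks)):
            for j in range(i + 1, len(masks)):
                if masks[i] & masks[j]:  # the two groups share a member
                    joined = masks[i] | masks[j]
                    # Delete the larger index first so index i stays valid.
                    del masks[j]
                    del masks[i]
                    masks.append(joined)
                    merged = True
                    break
            if merged:
                break  # restart the scan from the beginning
    return masks


print(merge_overlapping([0b0011, 0b1100, 0b0110]))
```

Each restart rescans from the first group, so the worst case is slow when many joins are needed.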

This assumes you have a finite alphabet, and that the number of members
per group won't be much smaller than the alphabet size (otherwise the
binary strings would be mostly zeros).

It might be faster to record that i and j should be joined, but keep
processing the loops instead of doing the joining right away and
having to restart the loops.
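One possible way to record the joins and keep going is a union-find (disjoint-set) structure. That technique is my substitution for illustration, not something the approach above prescribes. The sketch assumes the groups are integer bitmasks, and merge_deferred is a hypothetical name.

```python
def merge_deferred(masks):
    """Note which groups overlap in one pass, then combine each connected set."""
    parent = list(range(len(masks)))  # each group starts as its own root

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Single pass over all pairs: record the join instead of performing it.
    for i in range(len(masks)):
        for j in range(i + 1, len(masks)):
            if masks[i] & masks[j]:
                parent[find(j)] = find(i)

    # OR together the masks that ended up under the same root.
    combined = {}
    for index, mask in enumerate(masks):
        root = find(index)
        combined[root] = combined.get(root, 0) | mask
    return list(combined.values())


print(merge_deferred([0b0011, 0b1100, 0b0110]))
```

This still compares every pair once, but it never restarts the loops.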
