[BiO BB] help !

Fri Dec 7 13:15:19 EST 2001

The following Python code is one implementation of the algorithm you're
looking for:

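def add_component(components, c):
    # 'components' holds mutually disjoint supergroups; returns a new list
    # in which 'c' has been merged with every supergroup it overlaps.
    for i, comp in enumerate(components):
        if intersects(comp, c):
            # Merge c into the first overlapping supergroup, then re-add the
            # merged group to the remaining ones, since it may overlap them
            # too.  Supergroups before index i cannot overlap it.
            merged = combine(comp, c)
            return components[:i] + add_component(components[i+1:], merged)

    # No overlap with any existing supergroup: c starts a new one.
    return components + [c]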

Here, 'components' is a list of 'supergroups', i.e. lists of objects
(whatever you're grouping), and 'c' is a list you want to add to it.
The function calls two helpers that you need to supply: 'intersects(a, b)',
which should return true if list a and list b share at least one element,
and 'combine(a, b)', which should return the union of list a and list b.
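
A minimal sketch of those two helpers, in case it is useful. It assumes
the objects are hashable (e.g. strings), so that Python sets can do the
membership tests; the bodies are my own illustration, and any pair of
functions with the behaviour described above would do.

```python
def intersects(a, b):
    # True if the two lists share at least one element.
    # Assumes the elements are hashable.
    return not set(a).isdisjoint(b)


def combine(a, b):
    # Union of the two lists, without duplicates, keeping the order in
    # which elements first appear (a's elements first, then b's new ones).
    seen = set(a)
    result = list(a)
    for item in b:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result
```

The order-preserving union is only a convenience for readable output;
returning list(set(a) | set(b)) would work just as well for the grouping
itself.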

E.g.:

supergroups = []  # empty list
for group in groups_to_combine:
    supergroups = add_component(supergroups, group)
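
If that turns out to be too slow for a large number of groups, a
different technique, union-find (disjoint sets), is a common alternative
for this kind of problem. The sketch below is not the function above but
a separate approach: 'merge_groups' is a hypothetical name, the objects
are assumed to be hashable, and the example data are the four groups
from the quoted message below.

```python
def merge_groups(groups):
    # parent[x] points towards the representative of x's supergroup.
    parent = {}

    def find(x):
        # Walk up to the root, then point every node on the path at it.
        root = x
        while parent[root] != root:
            root = parent[root]
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    for group in groups:
        for item in group:
            parent.setdefault(item, item)
        # Link every member of the group to the group's first member.
        for item in group[1:]:
            ra, rb = find(group[0]), find(item)
            if ra != rb:
                parent[rb] = ra

    # Collect the members of each supergroup under its root.
    merged = {}
    for item in parent:
        merged.setdefault(find(item), []).append(item)
    return [sorted(members) for members in merged.values()]


groups = [
    ["A", "B", "C"],
    ["D", "E", "F", "K"],
    ["G", "H", "A"],
    ["L", "M", "H", "R", "S"],
]
for supergroup in merge_groups(groups):
    print(" ".join(supergroup))
# A B C G H L M R S
# D E F K
```

Each object is looked at only a few times, so the running time grows
roughly with the total number of objects rather than with the number of
pairs of groups.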

On Fri, 2001-12-07 at 07:33, Tanya Vavouri wrote:
> Help !
> 
> I have a very simple problem that I'm trying to solve using Perl...but I've 
> spent quite a bit of time thinking about it and I have only come up with a 
> very slow solution (too slow!). I'm sure that there must be a fast and easy 
> solution to this problem so if you have any ideas please let me know.
> 
> The problem is : I have sequences in groups that may overlap and I am trying 
> to get all the groups that overlap together. So, let's say that every
> sequence is represented by a letter and I have groups such as:
> 
> Group 1 : A B C
> Group 2 : D E F K
> Group 3 : G H A
> Group 4 : L M H R S
> 
> So, I want to write a program to give me only 2 "super" groups from the 
> above, ie
> 
> Supergroup I  : A B C G H L M R S
> Supergroup II : D E F K
> 
> Any ideas would be very VERY welcome !
> 
> Thanks in advance,
> Tanya
> 
> _______________________________________________
> BiO_Bulletin_Board maillist  -  BiO_Bulletin_Board at bioinformatics.org
> http://bioinformatics.org/mailman/listinfo/bio_bulletin_board