[BiO BB] About clustering genes to gene family

Zheng Fu zfu at cs.ucr.edu
Fri Aug 8 13:10:27 EDT 2003


How can I differentiate the first case (a complex cluster) from the
second (distantly related homologs)?

And where can I find information about GENEFAMMER?

Thank you.


On Thu, 7 Aug 2003, Dan Bolser wrote:

> What you describe can occur for 2 good reasons...
>
> You are forming a 'complex cluster', created by *multiple domain*
> proteins...
>
> A has domains in common with B,
> B has domains in common with C.
>
> A and C have no domains in common, and hence no homology.
>
> I.e.
>
> A: |------W------/-----X-----|
> B:                       |------x-----/-----Y-------|
> C:                                         |------y-------/--------hello mum!------|
>
> OR
>
> A and C are too distantly related for sequence searches to uncover their
> true homology. However, sequence B is *intermediate* to A and C,
> having homology to both...
>
>          B
>         /   \
>       /       \
>     /           \
>  A              C
>
> NB: Sequence similarity is not a metric, as it does not obey the triangle
> inequality.
> (I think it is metric at high levels of similarity though?)
>
> In this case you have used the transitive nature of sequence similarity
> to uncover distant homology via an intermediate sequence.
>
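
One rough way to tell the two cases above apart is to look at where on B
the two alignments fall: in the multi-domain case A and C hit different
parts of B, while in the intermediate-sequence case they hit roughly the
same part. A minimal sketch, assuming each hit is reduced to the
(start, end) region it covers on B. The function name and the 0.5 overlap
cutoff are illustrative assumptions, not anything taken from GENEFAMMER or
DIVCLUS:

```python
def classify_bridge(hit_a_on_b, hit_c_on_b, min_overlap=0.5):
    """Classify how B links A and C from the regions of B each one aligns to.

    hit_a_on_b, hit_c_on_b: (start, end) residue coordinates on B, inclusive.
    min_overlap: hypothetical cutoff, as a fraction of the shorter region.
    """
    a_start, a_end = hit_a_on_b
    c_start, c_end = hit_c_on_b
    # Number of residues of B covered by both alignments (<= 0 if disjoint).
    shared = min(a_end, c_end) - max(a_start, c_start) + 1
    shorter = min(a_end - a_start, c_end - c_start) + 1
    if shared <= 0:
        return "different domains"   # looks like a complex cluster
    if shared / shorter >= min_overlap:
        return "same region"         # B may be an intermediate sequence
    return "ambiguous"


print(classify_bridge((1, 100), (150, 300)))
print(classify_bridge((10, 200), (20, 210)))
```

A "same region" result only suggests that A and C might be distant
homologs; it does not prove it.
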
> Jong Park and Sarah Teichmann worked on both these ideas, and have
> created a family clustering package called GENEFAMMER. Specifically,
> DIVCLUS breaks up complex clusters into domain families. Transitivity is
> implemented (kinda) in psiblast / hmm models, all three of which are used
> in PFAM, so you might want to look there for your families.
>
> Or you could insist your alignments cover 90% of the shorter sequence,
> and then cluster using single linkage.
>
> Dan.
>
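
The coverage-plus-single-linkage suggestion above can be sketched as
follows. This is only an illustration under stated assumptions: the input
format (pairs of sequence names with an aligned length, plus a dictionary
of sequence lengths) and the function name are made up for the example,
and the hits are assumed to have already passed whatever score threshold
you use for two genes.

```python
def single_linkage(hits, lengths, min_coverage=0.9):
    """Cluster sequences by single linkage over sufficiently covering hits.

    hits: iterable of (seq1, seq2, aligned_length) tuples (hypothetical format).
    lengths: dict mapping each sequence name to its length.
    min_coverage: required fraction of the shorter sequence that is aligned.
    """
    parent = {name: name for name in lengths}

    def find(x):
        # Follow parent links to the cluster representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for s1, s2, aligned in hits:
        shorter = min(lengths[s1], lengths[s2])
        if aligned / shorter >= min_coverage:
            parent[find(s1)] = find(s2)  # one good link merges two clusters

    clusters = {}
    for name in lengths:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


lengths = {"A": 100, "B": 100, "C": 100, "D": 300}
hits = [("A", "B", 95), ("B", "C", 92), ("C", "D", 40)]
print(single_linkage(hits, lengths))
```

Here A and C end up in the same family through B even though they have no
direct hit, while D is kept out because its alignment covers only part of
the shorter sequence.
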
>
> Zheng Fu wrote:
>
> >Hi everyone,
> >
> >Does anyone know how to cluster genes into a gene family based on
> >sequence alignments?
> >For two genes, we can define a threshold to separate homologs from
> >non-homologs. But for three or more genes, how do we define the homologs?
> >(For example, if Gene A and Gene B have a high alignment score, and A and C
> >also have a high score, but B and C do not, can we say A, B and C are
> >homologs?)
> >
> >Thank you.
> >
> >Carol

-- 
Love & Peace
Http://www.cs.ucr.edu/~zfu



