[Biodevelopers] Re: splice advice...
Joseph Landman
landman at scalableinformatics.com
Wed Aug 27 15:43:47 EDT 2003
Hi Dan:
I am assuming that your arrays contain not the HSPs themselves, but some
sort of object representing each HSP, à la BioPerl. Is this correct?
See
http://doc.bioperl.org/releases/bioperl-1.2/Bio/Tools/BPlite/HSP.html
for a way to handle some of the HSP processing. This might help you
simplify the expression of what you are doing...
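If the splicing itself turns out to be the bottleneck, here is a minimal
sketch of one way to avoid it altogether: walk the sorted list once and
remember, per SCOP family, the index of the last HSP kept. It assumes the
hash keys from your snippet (SCCS, P_START, P_END, E_VALUE) and input that
is already sorted by P_START. The function name best_hsps_per_family and
the sample data are made up for illustration. Treat it as a greedy
approximation of your loop rather than an exact equivalent, since each HSP
is only compared against its family's most recent survivor.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): keep the best HSP among
# overlapping HSPs of the same SCOP family, without ever calling splice.
# Expects hash refs with SCCS, P_START, P_END and E_VALUE keys,
# already sorted by P_START.
sub best_hsps_per_family {
    my ($thresh, @hsps) = @_;

    my @result;       # surviving HSPs
    my %last_kept;    # SCCS => index in @result of that family's latest survivor

    for my $hsp (@hsps) {
        my $idx = $last_kept{ $hsp->{SCCS} };

        if (defined $idx) {
            my $kept    = $result[$idx];
            # Same overlap expression as the original loop.
            my $overlap = $kept->{P_END} - $hsp->{P_START};

            if ($overlap > $thresh) {
                # Conflict: the lower E-value wins. Overwrite the survivor
                # in place, or simply skip the newcomer.
                $result[$idx] = $hsp if $hsp->{E_VALUE} < $kept->{E_VALUE};
                next;
            }
        }

        # No conflict with this family's last survivor: keep the HSP.
        push @result, $hsp;
        $last_kept{ $hsp->{SCCS} } = $#result;
    }
    return @result;
}

# Made-up sample data, sorted by P_START.
my @hsps = (
    { SCCS => 'a.1.1.1', P_START => 1,   P_END => 100, E_VALUE => 1e-10 },
    { SCCS => 'a.1.1.1', P_START => 20,  P_END => 120, E_VALUE => 1e-30 },
    { SCCS => 'b.2.1.1', P_START => 30,  P_END => 90,  E_VALUE => 1e-5  },
    { SCCS => 'a.1.1.1', P_START => 115, P_END => 200, E_VALUE => 1e-8  },
);

my @best = best_hsps_per_family(10, @hsps);
printf "%s %d-%d %g\n", @{$_}{qw(SCCS P_START P_END E_VALUE)} for @best;
```

Each HSP is visited once and nothing gets shifted around, so the pass is
O(N) overall. A replaced survivor keeps its slot in @result, so the output
may not be strictly ordered by P_START; re-sort afterwards if that matters.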
Joe
On Wed, 2003-08-27 at 14:56, Dan Bolser wrote:
> Hello, splice to see you etc.
>
> I am trying to write a *simple* "best HSP in family"
> algorithm in Perl.
>
> My raw materials are SCOP queries against target sequences.
>
> I get each set of hits for each protein in turn, sorted
> by P_START (Hsp_query-from).
>
> I then go through the list and remove any pair of sequences
> with more than $THRESH AA overlap (if they come from the same
> scop family).
>
> This list removal involves lots of splicing, which is O(N) with
> list size.
>
> I figure I could avoid all that splice if I just use pointers
> to array positions, but I can't work out how to do this...
>
> Maybe splicing is the least of my optimization problems....
>
> __SKIP__
>
> preamble
>
> @hsps = array of HSP hashes, for a particular protein
> each HSP can be from several SCOP sequences.
>
> __RESUME__
>
> my @result; # Final HSPs
>
> TOP: while (@hsps) {    # NB: ordered by Hsp_query-from
>                         # (for optimization).
>
>     my $p = 0;          # Current HSP pointer.
>
>     MID: for (my $j = $p + 1; $j < @hsps; $j++) {   # Overlap slider.
>
>         # Family overlap only! SCCS is a string, so compare with 'ne'.
>
>         next MID if
>             $hsps[$p]->{SCCS} ne $hsps[$j]->{SCCS};
>
>         # Optimization: the list is sorted by start, so once the
>         # overlap drops below $THRESH no later HSP can conflict with $p.
>
>         if ( $THRESH >
>              $hsps[$p]->{P_END} - $hsps[$j]->{P_START} ) {
>
>             last MID;   # Fall through and keep $hsps[$p].
>         }
>
>         # Pick best of pair (removing the other from the list).
>
>         if ( $hsps[$p]->{E_VALUE} > $hsps[$j]->{E_VALUE} ) {
>             splice(@hsps, $p, 1);
>             $j--;
>             $p = $j;
>         }
>         else {
>             splice(@hsps, $j, 1);
>             $j--;
>         }
>     }
>     push @result, splice(@hsps, $p, 1);
> }
> print "OK\n\n";
>
> __END_ISH__
>
> Whaddya think?
> Any better way?
>
> Cheers,
>
>
>
> On Wed, 27 Aug 2003, sekhar kavuru wrote:
>
> > Dear Joseph,
> >
> > I am a Perl developer with a bioinformatics certification.
> >
> > Recently I developed a software package using BioPerl/Ensembl to create a Perl/HTML-based database interface for accessing genome data from Ensembl and SwissProt.
> > The browser I developed lets users query the Ensembl database by either clone ID or chromosome number.
> >
> > If you need any assistance or help please feel free to write to me.
> >
> > Regards
> >
> > Sekhar
> >
> > biodevelopers-request at bioinformatics.org wrote:
> >
> > Today's Topics:
> >
> > 1. Re: [BiO BB] perl scripting assistance (Joseph Landman)
> >
> > --__--__--
> >
> > Message: 1
> > From: Joseph Landman
> > To: BiO BB
> > Cc: biodevelopers
> > Date: 26 Aug 2003 21:06:11 -0400
> > Subject: [Biodevelopers] Re: [BiO BB] perl scripting assistance
> > Reply-To: biodevelopers at bioinformatics.org
> >
> > Try the biodevelopers group on bioinformatics.org ...
> >
> > On Tue, 2003-08-26 at 13:06, Tristan J. Fiedler wrote:
> > > Are any bulletin boards / discussion groups available for obtaining tips
> > > in scripting with perl?
> > >
> > > Thank you.
> >
--
Joseph Landman, Ph.D
Scalable Informatics LLC
email: landman at scalableinformatics.com
web: http://scalableinformatics.com
phone: +1 734 612 4615