[BiO BB] how to work on two txt files simultaneously by handle corresponding lines from each file

Alex Zhang mayagao1999 at yahoo.com
Wed Jul 20 21:19:17 EDT 2005


Dear Jose,

Thank you very much!

Best Regards,
   Alex

--- Jose Maria Gonzalez Izarzugaza <biopctgi at yahoo.es>
wrote:

> Hello Alex,
> 
> I think that what you want is to modify long1 with short1, long2 with
> short2, and so on.
> 
> I recommend that you replace your two loops with this one:
> 
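> for (my $seq = 0; $seq < scalar @long; $seq++) {
>     my $short = $short[$seq];
>     my $long  = $long[$seq];
>     # rand(N) yields 0 <= x < N, so the valid 0-based offsets are
>     # 0 .. length($long) - length($short) (0 .. 192 for 200 and 8).
>     $offset = int(rand(length($long) - length($short) + 1));
>     substr($long, $offset, length($short), $short);
>     printf "%3d", $offset + 1;   # report the 1-based start site
>     print "\n", $long, "\n";
> }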
> 
> Good Luck!
> Txema
> 
>    
> Alex Zhang wrote:
> 
> >Dear All,
> >
> >Sorry to bother you again.
> >
> >I have two txt files to handle. One is "short_sequences" and the
> >other is "long_sequences". "short_sequences" holds 100 short
> >sequences (8 nucleotides long) and "long_sequences" holds 100 long
> >sequences (200 nucleotides long).
> >
> >For example, the first short sequence is "TTGACATA"
> >and the first long sequence is
> >"GAATCATATATTAGTCTCCACATACTCCGTTCGTGACCCATTACCCTTTCGGGAGA
> >GCCACAGCAACTGTAGATCTCGAAGTTGACAGGGGCAACTAGAGGCCTCAGAATTCT
> >CACTCTTGAGGAGAGAAGTCTAAGACCTACAGTATGGTCGGGTTAGTTTTTGTTCCGTC
> >GAACCTTGGACTAACCACTGTCTGGATA".
> >
> >Basically, I want to generate a random position as a starting site
> >and replace a substring of the long sequence with the short
> >sequence. In this example, if we choose the 5th nucleotide of the
> >long sequence as the starting site, then after replacing with
> >"TTGACATA" the long sequence becomes
> >"GAATTTGACATAAGTCTCCACATACTCCGTTCGTGACCCATTACCCTTTCGGGAGA
> >GCCACAGCAACTGTAGATCTCGAAGTTGACAGGGGCAACTAGAGGCCTCAGAATTCT
> >CACTCTTGAGGAGAGAAGTCTAAGACCTACAGTATGGTCGGGTTAGTTTTTGTTCCGTC
> >GAACCTTGGACTAACCACTGTCTGGATA".
> >
> >Then I want to do the same for the 2nd long sequence using the 2nd
> >short sequence, and so on until the last long sequence has been
> >modified. I think the only problem is that the starting site should
> >not be larger than 193. Otherwise, there are not enough nucleotides
> >left in the long sequence for the replacement.
> >
> >Furthermore, I want to keep track of the starting replacement site
> >for each long sequence.
> >
> >
> >I am copying my code below.
> >******************************************
> >use strict;
> >use warnings;
> >
> >my (@short, @long, $offset); # the 'short' array will hold the short
> >                             # sequences and the 'long' array the
> >                             # long sequences
> >
> >open(FILE1, '<', "short_sequences.txt")
> >    || die "Can't open short_sequences.txt: $!\n";
> >while(<FILE1>){
> >   chomp;
> >   push(@short, $_);
> >}
> >close FILE1; #Close the file
> >
> >open(FILE2, '<', "long_sequences.txt")
> >    || die "Can't open long_sequences.txt: $!\n";
> >while(<FILE2>){
> >   chomp;
> >   push(@long, $_);
> >}
> >close FILE2; #Close the file
> >
> >
> ># replacement
> >foreach my $short(@short){
> >   foreach my $long(@long){
> >       $offset = int(rand(length($long)%193));
> >       substr($long,$offset,length($short),$short);
> >       printf "%3d", $offset+1;
> >       print "\n", $long, "\n";
> >
> >   }
> >}
> >********************************************
> >
> >But I just realized that there is a problem with the two loops:
> >each short sequence is used to replace all of the long sequences,
> >not just the corresponding one.
> >
> >So I seek your suggestions on how to handle the two files
> >simultaneously in my case.
> >
> >Thank you very much and look forward to your reply!
> >
> >Best Regards,
> >    Alex
> >
> >_______________________________________________
> >Bioinformatics.Org general forum  -  BiO_Bulletin_Board at bioinformatics.org
> >https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
> >
> >  
> >
> 
> 
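
For anyone finding this thread later: the pairing can also be done while reading, without first loading both files into arrays. The following is only a rough sketch of that idea, not the code from either message above. The sub names replace_at_random and process_pairs are made up for illustration, and it assumes each file holds one sequence per line and that every long sequence is at least as long as its short partner.

```perl
use strict;
use warnings;

# Hypothetical helper: overwrite part of $long with $short at a random
# 0-based offset. Returns the 1-based start site and the modified sequence.
sub replace_at_random {
    my ($long, $short) = @_;
    # Valid 0-based offsets run from 0 to length($long) - length($short),
    # so a 200-mer and an 8-mer give 1-based start sites 1 .. 193.
    my $offset = int(rand(length($long) - length($short) + 1));
    substr($long, $offset, length($short), $short);
    return ($offset + 1, $long);
}

# Hypothetical helper: read one line from each file per iteration, so that
# short sequence N is always paired with long sequence N.
sub process_pairs {
    my ($short_file, $long_file) = @_;
    open(my $short_fh, '<', $short_file) or die "Can't open $short_file: $!\n";
    open(my $long_fh,  '<', $long_file)  or die "Can't open $long_file: $!\n";
    my @results;
    while (1) {
        my $short = <$short_fh>;
        my $long  = <$long_fh>;
        # Stop as soon as either file runs out of lines.
        last unless (defined($short) && defined($long));
        chomp($short, $long);
        push @results, [ replace_at_random($long, $short) ];
    }
    close $short_fh;
    close $long_fh;
    return @results;
}

# Runs only when two file names are given on the command line.
if (@ARGV == 2) {
    for my $pair (process_pairs(@ARGV)) {
        printf "%3d\n%s\n", @$pair;   # start site, then the modified sequence
    }
}
```

If this were saved as, say, replace.pl (the name is arbitrary), it could be run as "perl replace.pl short_sequences.txt long_sequences.txt". Reading the two handles in lockstep keeps memory use small, while the array-index loop quoted above is just as correct for 100 sequences.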




