[BiO BB] fasta to hash table bioperl

Zhong Huang zhong.huang at jefferson.edu
Tue Feb 28 00:41:40 EST 2006


Hi,

Can anyone suggest a simple way to convert a multi-sequence FASTA file
(read through a Bio::SeqIO object) into a hash table (sequence annotation
as key, sequence as value)?


The fasta file looks like this: 



>gi|9049352|dbj|BAA99407.1| 3-methylcrotonyl-CoA carboxylase biotin-containing subunit [Homo sapiens]
MAAASAVSVLLVAAERNRWHRLPSLLLPPRTWVWRQRTMKYTTATGRNITKVLIANRGEIACRVMRTAKKLGVQT
VAVYSEADRNSMHVDMADEAYSIGPAPSQQSYLSMEKIIQVAKTSAAQAIHPGCGFLSENMEFAE


>gi|4504067|ref|NP_002070.1| aspartate aminotransferase 1 [Homo sapiens]
MAPPSVFAEVPQAQPVLVFKLTADFREDPDPRKVNLGVGAYRTDDCHPWVLPVVKKVEQKIANDNSLNHEYLPIL
GLAEFRSCASRLALGD

I want the hash %seqharsh to hold the sequences like this:


#       my %seqharsh = ('seq1', 'MAAASAVSVL......',
#                       'seq2', 'MAPPSVFAEVPQ......');


My code is like this: 


use Bio::SeqIO;

my $seqio = Bio::SeqIO->new(-format => $format,
                            -file   => $file);

# declare the hash once, outside the loop
my %seqharsh;

while ( my $seq = $seqio->next_seq ) {
  if( $seq->alphabet ne 'protein' ) {
                # warn rather than confess: confess dies, so next was never reached
                warn("Skipping non protein sequence...");
                next;
  }

#write code here to assign each entry into hash %seqharsh
        $seqharsh{$seq->primary_id} = $seq->seq();
                # bla bla bla
}
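
One way the loop above might be completed is sketched below. It assumes the
key should be the whole header text, built from display_id (the first word
after '>') and desc (the rest of the header line). seqio_to_hash is a
made-up helper name, not a BioPerl function, and the input file name is
assumed to come from the command line.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Hypothetical helper (not part of BioPerl): read every record from a
# Bio::SeqIO stream and return a hash of header text => sequence string.
sub seqio_to_hash {
    my ($seqio) = @_;
    my %seqharsh;
    while ( my $seq = $seqio->next_seq ) {
        if ( $seq->alphabet ne 'protein' ) {
            # warn and carry on; confess would end the program here
            warn "Skipping non protein sequence ", $seq->display_id, "\n";
            next;
        }
        # Assumption: the "annotation" key is display_id plus description.
        my $key  = $seq->display_id;
        my $desc = $seq->desc;
        $key .= " $desc" if defined $desc && length $desc;
        $seqharsh{$key} = $seq->seq;
    }
    return %seqharsh;
}

# Usage: perl script.pl sequences.fasta
if (@ARGV) {
    my $seqio = Bio::SeqIO->new( -format => 'fasta', -file => $ARGV[0] );
    my %seqharsh = seqio_to_hash($seqio);
    for my $key ( sort keys %seqharsh ) {
        print "$key => ", length( $seqharsh{$key} ), " residues\n";
    }
}
```

Hash keys are unique, so if two records carry the same header the later
sequence overwrites the earlier one.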


Thank you very much for your help! 


zhong 





