[BiO BB] Poly A tail length - script help please
Joseph Landman
landman at scalableinformatics.com
Tue Sep 9 19:57:34 EDT 2003
First one is free ...
#!/usr/bin/perl
use strict;
my ($directory,$directory_handle,$file,@files,$sequence);
my ($file_handle,$poly_a_tail,$rseq);
$directory = "./"; # directory to open
if (!(opendir $directory_handle,$directory))
{
die "FATAL ERROR: Unable to open directory = ".$directory."\n";
}
# select only the .seq files
@files = grep { /\.seq$/ } readdir($directory_handle);
# loop over these selected files
foreach $file (@files)
{
# try to open the file
if (!(open($file_handle,"< ".$file)))
{
# if we cannot open it, warn the user, and skip to the next file
warn "Warning: unable to open file = ".$file.". Skipping.\n";
next;
}
else
{
# assume one line per file, or we will have to modify this
chomp($sequence=<$file_handle>);
# now time to bring out the heavy artillery
$rseq=reverse $sequence; # poly-a is now at the head
# match A's and/or N's at the front; use "" if none, so a stale $1 is never reused
$poly_a_tail = ($rseq =~ /^([AN]+)/) ? $1 : "";
printf "%i %s\n",length($poly_a_tail),$file; # tell the world ...
close($file_handle);
}
}
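
The script above assumes one line per file. If some of the .seq files turn out to span several lines, or carry DOS line endings, something along these lines may help. This is only a sketch: tail_length is a hypothetical helper name (not part of the script above), and it assumes the bases are upper case and that the tail is the run of A's and/or N's that ends the sequence.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# tail_length is a hypothetical helper, not part of the script above.
# It assumes upper-case bases and counts only the run of A's and/or N's
# that ends the sequence.
sub tail_length
{
my ($text) = @_;
$text =~ s/\s+//g; # drop newlines, carriage returns and spaces
return $text =~ /([AN]+)$/ ? length($1) : 0;
}

# the example sequence from the original question
print tail_length("AGTAGTCGATCATNATANCTANTACNACTACTAACTATGCTAGNNAATATAAAAAAAAANAAA"), "\n";
```

To use it in the loop, slurp the whole file (for example with join("", <$file_handle>)) and pass the result to tail_length instead of reading a single line.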
On Tue, 2003-09-09 at 17:00, Tristan Fiedler wrote:
> Thanks for the scripting tips! I have a 'counting' issue which I need to
> quickly resolve. A typical sequence input file (5 - 700 bases) looks like
> :
>
> AGTAGTCGATCATNATANCTANTACNACTACTAACTATGCTAGNNAATATAAAAAAAAANAAA
>
> I have over 500 files, named *.seq. I would like to create a script which :
>
> a. runs through all the files,
> b. counts the length of the 'poly A' tail (defined as the longest stretch
> of A or N)
> c. sends the output to a file, eg.
>
> 25 1.seq
> 87 2.seq
> 13 3.seq
>
> Example valid poly A tails :
>
> AAAANANANANAAANNAAAAAA
>
> AAAAAAAAAAAAAA
>
> NNNNNNNNNNNNN
>
> AAANNNNNNNNNNNAAAAAAAAA
>
> Thank you so much for your expertise!
>
> Tristan
--
Joseph Landman, Ph.D
Scalable Informatics LLC
email: landman at scalableinformatics.com
web: http://scalableinformatics.com
phone: +1 734 612 4615