[BiO BB] remove CTL-M and Buying a bioinformatics workstation
Iddo Friedberg
idoerg at burnham.org
Wed Sep 3 17:00:18 EDT 2003
Tristan Fiedler wrote:
> Dear Bio Gurus!
>
> Two quick questions :
>
> 1. could someone please assist me in writing a shell script (awk, sed,
> etc.) which would use a loop to run thru about 1000 files (filenames all
> end in '.seq') and remove all occurrences of control-M, resulting in a file
> containing the sequence on a single line.
>
> Currently each file looks similar to :
>
> % cat -v seq_018_G05.seq
> AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA^M
> AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGGGGGGGG^M
> TTTTTTTTTTTTTTTTCCCAAAAAAAAAAAAA^M
>
It sounds like you need the dos2unix utility, which strips the control-M
(carriage return) characters. It comes bundled with Linux; if you are
working on another OS, you can download it for free. Use Google to find it.
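
If you would rather do it with a plain shell loop, as you asked, here is a
minimal sketch. It assumes a POSIX shell and the standard tr utility, and
the function name join_seq is just something I made up for illustration.
Note that removing the control-M characters alone still leaves one sequence
fragment per line, so the sketch also deletes the newlines and then puts a
single newline back at the end of the file. It overwrites the files in
place, so try it on a copy first.

```shell
#!/bin/sh
# join_seq FILE...  (hypothetical helper name)
# For each file given, delete every carriage return (control-M) and
# every newline, so the whole sequence ends up on a single line.
join_seq() {
    for f in "$@"; do
        [ -f "$f" ] || continue      # skip anything that is not a regular file
        tmp="$f.tmp.$$"              # temporary file next to the original
        # Strip CR and LF, add one trailing newline, then replace the original.
        tr -d '\r\n' < "$f" > "$tmp" &&
            printf '\n' >> "$tmp" &&
            mv "$tmp" "$f"
    done
}
```

Run from the directory holding the files, the call would be:

    join_seq *.seq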
>
> 2. We are planning to buy a workstation for our local (~3 labs producing
> sequences from an ABI sequencer) genomics needs (lots of blast runs,
> database management, standard bioinformatics software), and were planning
> on getting something like :
>
> 4 GB RAM (is this enough for doing local blast searches against genbank?)
Definitely. That is what I have, and I haven't had any issues.
BLAST/PSI-BLAST is actually not that memory-intensive.
> 2 x 3 GHz Xeon processors (how about Mac OSX?)
The more processors, the merrier. BLAST parallelizes nicely. Regarding
OS: I'm partial to Linux, but that's me.
> 400 GB storage
>
You can always add more, and 400 GB is ample for starters.
>
> Thank you - and feel free to reply directly to me (not waste bb resources).
>
> Cheers!
>
>
>
--
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037
USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 646 3171
http://ffas.ljcrf.edu/~iddo