[BiO BB] remove CTL-M and Buying a bioinformatics workstation

Iddo Friedberg idoerg at burnham.org
Wed Sep 3 17:00:18 EDT 2003

Tristan Fiedler wrote:
> Dear Bio Gurus!
> Two quick questions :
> 1.  could someone please assist me in writing a shell script (awk, sed,
> etc.) which would use a loop to run thru about 1000 files (filenames all
> end in '.seq') and remove all occurrences of control-M, resulting in a file
> containing the sequence on a single line.
> Currently each file looks similar to :
> % cat -v seq_018_G05.seq

Sounds like you need the dos2unix utility. It comes bundled with most
Linux distributions; if you are working on another OS, you can download
it for free (a Google search will find it).
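
If dos2unix is not at hand, a plain shell loop around tr should do the
same job. What follows is only a minimal sketch: strip_cr is a made-up
helper name, and it assumes the files sit in a single directory and can
be rewritten in place (try it on a copy first).

```shell
#!/bin/sh
# Strip every carriage return (control-M, shown as ^M by cat -v) from each
# .seq file in a directory, rewriting the files in place.
# strip_cr is a hypothetical helper name, not part of any package.
strip_cr() {
    dir=${1:-.}                          # default to the current directory
    for f in "$dir"/*.seq; do
        [ -f "$f" ] || continue          # skip if the glob matched nothing
        tmp="$f.tmp.$$"                  # temporary file next to the original
        # tr -d '\r' deletes carriage returns only. Use '\r\n' instead to
        # also delete line feeds and leave the whole sequence on one line.
        tr -d '\r' < "$f" > "$tmp" && mv "$tmp" "$f"
    done
}
```

Call it as "strip_cr /path/to/seqs" (the path is a placeholder). If the
files use bare control-M as the line separator, removing it already
leaves the sequence on one line; if they also contain line feeds, the
'\r\n' variant mentioned in the comment is the one to use.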

> 2.  We are planning to buy a workstation for our local (~3 labs producing
> sequences from an ABI sequencer) genomics needs (lots of blast runs,
> database management, standard bioinformatics software), and were planning
> on getting something like :
> 4 GB RAM  (is this enough for doing local blast searches against genbank?)

Definitely. That's what I have, and I haven't had any issues.
BLAST/PSI-BLAST is actually not that memory-intensive.

> 2 x 3 GHz Xeon processors (how about Mac OSX?)

The more processors, the merrier. BLAST parallelizes nicely. Regarding 
OS: I'm partial to Linux, but that's me.

> 400 GB storage

You can always add more, and 400 GB is ample for starters.

> Thank you - and feel free to reply directly to me (not waste bb resources).
> Cheers!

Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 646 3171

