gp_seq2prot

GP

2000

NAME

gp_seq2prot - translate DNA/RNA to protein sequence

SYNOPSIS

gp_seq2prot [-p] [-c code_file] [-q] [-v] [-d] [-h] [inputfile] [outputfile]

OPTIONS

-p

print the codon table and exit

-c code_file

load codon table modifications from file code_file

-v

Prints the version information.

-d

Prints lots of debugging information.

-h

Shows usage information.

inputfile

file to process; if not given, will use standard input

outputfile

file to write the data to; if not given, will use standard output

DESCRIPTION

gp_seq2prot translates a DNA or RNA sequence into a protein sequence. By default it uses the standard translation table common to most organisms, but you can load your own modifications to that table from a file with the option -c. The format of this file is simple: empty lines and lines starting with a "#" are ignored; every other line must contain a three-letter codon, a space, and a one-letter amino acid code, or '0' for a STOP codon. You can see the full codon table with the option -p.
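The file format described above is simple enough to sketch a reader for it. The following Python function is only an illustration of the format, not the code gp_seq2prot itself uses; the name parse_codon_modifications is hypothetical, and tolerating extra whitespace and lowercase letters is an assumption.

```python
def parse_codon_modifications(text):
    """Parse codon table modifications: one 'CODON X' pair per line.

    Returns a dict mapping three-letter codons to a one-letter amino
    acid code, or to '0' for a STOP codon.
    """
    table = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Empty lines and lines starting with '#' are ignored.
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()
        if len(fields) != 2 or len(fields[0]) != 3 or len(fields[1]) != 1:
            raise ValueError(f"line {lineno}: expected 'CODON X', got {line!r}")
        codon, amino = fields
        # Assumption: input is normalized to upper case.
        table[codon.upper()] = amino.upper()
    return table
```

A file holding only a comment and the line "UGA W" would yield a table with the single entry UGA mapped to W.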

The sequences are expected to begin with a start codon, but their ends may vary: if no stop codon is found before the sequence ends, a warning message is generated, but the output is still produced.
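The behaviour described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual implementation: the names STANDARD and translate are hypothetical, the sketch assumes that translation ends at the first STOP codon and that the STOP itself is not written out, and it does not check for a start codon. The table is the standard genetic code, with '0' marking STOP as in the codon file format.

```python
import itertools
import warnings

# Standard genetic code in U, C, A, G order; '0' marks a STOP codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY00CC0WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): amino
    for codon, amino in zip(itertools.product(BASES, repeat=3), AMINO)
}


def translate(seq, table=STANDARD):
    """Translate a DNA or RNA sequence, reading codons from position 0."""
    rna = seq.upper().replace("T", "U")  # treat DNA input as RNA
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino = table[rna[i:i + 3]]  # KeyError on an invalid codon
        if amino == "0":
            # Assumption: stop at the first STOP codon.
            return "".join(protein)
        protein.append(amino)
    # No STOP codon was seen: warn, but still return the translation.
    warnings.warn("no stop codon found before the end of the sequence")
    return "".join(protein)
```

Passing a modified table, for example `{**STANDARD, "UGA": "W"}`, makes UGA code for tryptophan instead of STOP, which is the kind of change the option -c is meant for.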

EXAMPLES

Translate the sequences in file all.orfs.fasta into protein, using the Mycoplasma translation table modifications stored in file myco.cdn:

gp_seq2prot -c myco.cdn all.orfs.fasta

Sample file containing modifications of the translation table:



# This is the codon table for Mycoplasma pneumoniae
# As you see, you have only to put down the codons which differ from the
# standard code. Blank lines, tabs, and lines starting with a '#' are
# always skipped.
UGA W

DIAGNOSTICS

All Genpak programs complain in situations where you would complain too, such as when they cannot find a sequence you gave them or when the sequence is not valid.

The Genpak programs do not write over existing files. I have found this feature very useful :-)

BUGS

I'm sure there are plenty left, so please mail me if you find them. I tried to clean up every bug I could find.

AUTHOR

January Weiner III <january@bioinformatics.org>