gp_seq2prot tries to translate a DNA / RNA sequence into a protein sequence. By default, it uses the standard translation table common to most organisms, but you can load your own table from a file using option -c. The format of this file is quite simple: all empty lines and lines starting with a "#" are ignored; every other line is expected to contain a three-letter codon, a space, and a one-letter amino acid code, or '0' for a STOP codon. You can see the full codon table with the option -p.
The sequences are expected to begin with a start codon, but their ends may vary: if no stop codon is found before the sequence ends, a warning message is generated, but the program does not abort and the translation is still written out.
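The behaviour described above can be illustrated with a short Python sketch. This is not the gp_seq2prot source, only a minimal model of it under stated assumptions: the function name translate, the use of 'X' for unknown codons, and the choice to leave the stop codon out of the returned protein are all hypothetical. The table is the standard genetic code, with '0' marking STOP as in the codon file format.

```python
import itertools
import warnings

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
# '0' marks a STOP codon, following the codon file convention.
BASES = "UCAG"
AMINO = "FFLLSSSSYY00CC0WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO)
}


def translate(seq, table=STANDARD):
    """Translate a DNA or RNA string codon by codon until the first STOP.

    Assumptions of this sketch: unknown codons become 'X', the stop codon
    itself is not included in the result, and the start codon is not checked.
    """
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA input
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = table.get(rna[i:i + 3], "X")
        if aa == "0":
            return "".join(protein)
        protein.append(aa)
    # No stop codon was seen: warn, but still return what was translated.
    warnings.warn("no stop codon found before the end of the sequence")
    return "".join(protein)
```

For example, translate("ATGGCCTAA") returns "MA", while translate("ATGGCC") returns the same string and additionally emits a warning, because the sequence ends without a stop codon.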
gp_seq2prot -c myco.cdn all.orfs.fasta
# This is the codon table for Mycoplasma pneumoniae
# As you see, you have only to put down the codons which differ from the
# standard code. Blank lines, tabs, and lines starting with a '#' are
# always skipped.
UGA W
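To make the file format concrete, here is a minimal Python sketch of how such a codon file could be read. It is an illustration only, not the parser used by gp_seq2prot: the function name parse_codon_overrides is hypothetical, and raising an error on malformed lines and uppercasing the entries are assumptions of the sketch.

```python
def parse_codon_overrides(text):
    """Return a {codon: amino_acid} dict from the contents of a codon file.

    Empty lines and lines starting with '#' are skipped. Every other line
    must hold a three-letter codon, whitespace, and a one-letter amino acid
    code ('0' stands for a STOP codon).
    """
    overrides = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 2 or len(fields[0]) != 3 or len(fields[1]) != 1:
            # Assumption: this sketch rejects malformed lines outright.
            raise ValueError(f"line {number}: expected 'CODON X', got {raw!r}")
        overrides[fields[0].upper()] = fields[1].upper()
    return overrides


example = """\
# This is the codon table for Mycoplasma pneumoniae
# Only codons which differ from the standard code are listed.

UGA W
"""

print(parse_codon_overrides(example))
```

Since only the differing codons are listed, the resulting dictionary would be laid over a full standard table, for instance with {**standard_table, **overrides}, where standard_table is whatever complete codon dictionary you already have.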
All Genpak programs complain in situations where you would also complain, such as when they cannot find a sequence you gave them or when the sequence is not valid.
The Genpak programs do not write over existing files. I have found this feature very useful :-)
I tried to clean up every bug I could find, but I'm sure there are plenty left, so please mail me if you find any.
January Weiner III <january@bioinformatics.org>