gp_randseq

GP

2000

NAME

gp_randseq - generate random DNA sequences from a master sequence

SYNOPSIS

gp_randseq [-n number] [-l length] [-m] [-q] [-v] [-d] [-h] [inputfile] [outputfile]

OPTIONS

-n value

set the number of random sequences to value

-l value

set the random sequence length to value

-m

Instead of cutting out sequences at random positions in the genome, compute the Markov chain probabilities for nucleotides and generate sequences based on them.

-v

Prints the version information.

-d

Prints lots of debugging information.

-h

Shows usage information.

inputfile

file to process; if not given, will use standard input

outputfile

file to write the data to; if not given, will use standard output
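The -m mode described above can be illustrated with a short Python sketch. This is not the Genpak implementation: the order of the Markov chain is not stated here, so the sketch assumes a first-order chain (each nucleotide depends only on the previous one), and the function names transition_counts and generate_markov_sequence are hypothetical.

```python
import random
from collections import Counter, defaultdict


def transition_counts(master):
    """Count how often each nucleotide is followed by each other nucleotide."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(master, master[1:]):
        counts[prev][nxt] += 1
    return counts


def generate_markov_sequence(master, length, rng=None):
    """Generate one sequence from a first-order chain (an assumption) fitted to master."""
    rng = rng or random.Random()
    if not master or length <= 0:
        raise ValueError("master must be non-empty and length positive")

    counts = transition_counts(master)
    base_freq = Counter(master)  # overall nucleotide frequencies

    def draw(counter):
        # Pick one symbol with probability proportional to its count.
        symbols = list(counter)
        weights = [counter[s] for s in symbols]
        return rng.choices(symbols, weights=weights, k=1)[0]

    current = draw(base_freq)  # first base from the overall frequencies
    out = [current]
    while len(out) < length:
        followers = counts.get(current)
        # Fall back to overall frequencies if this base was never followed by anything.
        current = draw(followers) if followers else draw(base_freq)
        out.append(current)
    return "".join(out)


if __name__ == "__main__":
    print(generate_markov_sequence("ACGTTGCAAGCTTAGGCTA", 12, random.Random(1)))
```

Unlike cutting fragments out of the master sequence, sequences generated this way need not occur anywhere in the input; they only share its nucleotide transition statistics.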

DESCRIPTION

gp_randseq cuts out random sequences from a larger sequence. It is useful in genomic comparisons (e.g. how does the distribution of a certain parameter in my set of sequences compare with its distribution in random sequences?). The probability distribution of the sequence start should be linear.
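The default behaviour can be sketched in Python as follows. This is a minimal illustration, not the Genpak code: it assumes that every start position at which a full-length fragment fits is equally likely, and the function name cut_random_sequences is hypothetical.

```python
import random


def cut_random_sequences(master, length, number, rng=None):
    """Cut `number` fragments of `length` bases from `master` at random starts."""
    rng = rng or random.Random()
    if length <= 0 or length > len(master):
        raise ValueError("length must be between 1 and len(master)")

    last_start = len(master) - length  # last position where a full fragment fits
    fragments = []
    for _ in range(number):
        # Assumption: each valid start position is drawn with equal probability.
        start = rng.randint(0, last_start)  # randint is inclusive on both ends
        fragments.append(master[start:start + length])
    return fragments


if __name__ == "__main__":
    master = "ACGTTGCAAGCTTAGGCTA"
    for fragment in cut_random_sequences(master, length=5, number=3):
        print(fragment)
```

Here length and number play the roles of the -l and -n values; fragments may overlap or repeat, since each start position is drawn independently.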

DIAGNOSTICS

All Genpak programs complain in situations where you would also complain, such as when they cannot find a sequence you gave them or the sequence is not valid.

The Genpak programs do not write over existing files. I have found this feature very useful :-)

BUGS

I'm sure there are plenty left, so please mail me if you find them. I tried to clean up every bug I could find.

AUTHOR

January Weiner III <january@bioinformatics.org>