> ...
> 2) If I have two large genomes that need a lengthy blast, how can I
> split that up?
> ...
> Even a valid hit can have some repeat in it ...
> ...
> However, I'm after a generalized solution that doesn't require special
> knowledge of the sequences.
> ...

Disclaimer first: I don't know if this comment applies to your
particular situation. So much for apologies.

I've had several mid-sized script-n-hack projects start with exactly
this question: "How do I BLAST one genome against another?" When we got
to the root of it, the biological questions of interest demanded a
variety of approaches. Here are two examples:

1) Find me putative orthologs between these two chromosomes.
----------------------------------------------------------

 - This broke down into
     1) Find the genes
     2) Find the orthologs.

 - In this case it makes a lot of sense to filter out low complexity
   sequence up front, hit each chromosome with a suite of
   gene-finders...including a blastx vs. a well annotated protein
   dataset like swissprot. From that, we get a set of possible genes
   in each chromosome. Now the problem is more recognizable as a job
   that BLAST might be good at.

2) Show me the large scale genomic events that provide evidence for
   evolutionary relation between these two specific chromosomes.
-----------------------------------------------------------------

 - Here, we do NOT want to get rid of low complexity or repetitive
   elements. A straight-ahead "overlapping chunks -> blastn ->
   dot-plot" approach gives what is wanted.

3) Show me the paralogs (duplicated genes within a single genome) and...

You get the idea.

-Chris Dwan
 Center for Computational Genomics and Bioinformatics
 University of Minnesota
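
P.S. For what it's worth, the "overlapping chunks" step from example 2
could be sketched roughly like this in Python. This is only an
illustration, not the scripts we actually used: the function names
(overlapping_chunks, chunks_to_fasta), the chunk size, and the overlap
are all made up for the example. Each chunk keeps its start offset in
the FASTA header so that blastn hits can be mapped back to chromosome
coordinates before plotting.

```python
def overlapping_chunks(seq, chunk_size, overlap):
    """Yield (start, chunk) pairs covering seq with the given overlap.

    Hypothetical helper: chunk_size and overlap are in bases.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    if not seq:
        return
    step = chunk_size - overlap
    start = 0
    while True:
        yield start, seq[start:start + chunk_size]
        # Stop once this chunk reaches the end of the sequence.
        if start + chunk_size >= len(seq):
            break
        start += step


def chunks_to_fasta(name, seq, chunk_size, overlap):
    """Format the chunks as FASTA text, one record per chunk.

    The header encodes the 0-based start offset (an assumed naming
    scheme) so hit coordinates can be shifted back later.
    """
    records = []
    for start, chunk in overlapping_chunks(seq, chunk_size, overlap):
        records.append(">%s_offset_%d\n%s\n" % (name, start, chunk))
    return "".join(records)


if __name__ == "__main__":
    # Toy sequence; real chunks would be thousands of bases long.
    print(chunks_to_fasta("chrA", "ACGTACGTAC", 4, 2), end="")
```

The resulting FASTA file would then be the query for blastn against the
other chromosome. The overlap is there so that a hit straddling a chunk
boundary still shows up whole in at least one chunk, provided the
overlap is longer than the hits you care about.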