On Thu, 17 Mar 2005 07:37:40 -0800, Ian Korf <iankorf at mac.com> wrote:
> What genome does the BAC come from? What are you trying to do exactly?

The data are from tomato and potato, and as there is no way to predict
genes well, we use BLAST to get a first rough look at the data.

> You didn't answer that. By the way, there's a really good book on BLAST
> from O'Reilly & Associates that discusses these issues in great detail.

I know of your book, I just haven't had a chance to buy and read it yet.

Maybe I should explain myself better so you all can help me better.

What we are trying to do is get a rough idea of which genes are present on a
newly sequenced and assembled BAC. The normal way would be to use gene
prediction software to predict the genes, and then BLAST those genes. But
because there aren't good models (yet) for these genomes, we need another
way to get a quick look.

When one BLASTs a large query, in our case 65K, the probability of hitting
a well-conserved gene is large. As those genes give a lot of hits, the
rest of the genes will not show up unless you set the number of hits to
show very high. But setting the number of results high makes the end user
unhappy, as they have to wade through a lot of the same data to see the
more interesting bits.

What I would like is a method to limit the number of hits per region, so
that for every region of the query you only see the first 10 or so hits.
NCBI BLAST has such an option (-K), but as I already said, it doesn't work
and apparently never will.

I haven't been able to find a solution yet; maybe somebody can point me in
the right direction?

--
With kind regards,
Jan
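
To make the request concrete, here is a rough sketch of the kind of per-region filter described above. It is a post-processing step, not a BLAST option, and it assumes the search was run with tabular output (blastall -m 8, twelve tab-separated columns). The names `Hit`, `parse_tabular` and `cap_hits_per_region` are made up for this illustration, and "region" is taken to mean "any stretch of the query that already-kept hits overlap", which is only one possible definition.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    subject: str      # subject (database) sequence id
    q_start: int      # start on the query (the BAC), 1-based
    q_end: int        # end on the query, always >= q_start here
    bit_score: float


def parse_tabular(lines: Iterable[str]) -> List[Hit]:
    """Parse 12-column tabular BLAST output (as produced by blastall -m 8)."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):   # skip blanks and -m 9 comments
            continue
        fields = line.split("\t")
        # Columns 7 and 8 are the query start/end; normalise their order.
        start, end = sorted((int(fields[6]), int(fields[7])))
        hits.append(Hit(fields[1], start, end, float(fields[11])))
    return hits


def cap_hits_per_region(hits: List[Hit], max_per_region: int = 10) -> List[Hit]:
    """Keep a hit only if fewer than max_per_region better-scoring hits
    that were already kept overlap it on the query."""
    kept: List[Hit] = []
    # Best hits first, so each region keeps its strongest hits.
    for hit in sorted(hits, key=lambda h: h.bit_score, reverse=True):
        overlapping = sum(
            1 for k in kept
            if k.q_start <= hit.q_end and hit.q_start <= k.q_end
        )
        if overlapping < max_per_region:
            kept.append(hit)
    # Report along the query, strongest hit first within ties.
    return sorted(kept, key=lambda h: (h.q_start, -h.bit_score))
```

With something like this, the search itself could be run with a very high hit limit, and the end user would only see the filtered list, in which a weakly hit region is no longer crowded out by one well-conserved gene. The overlap check is quadratic in the number of hits, which should be acceptable for a single BAC but would need an interval index for larger jobs.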