On Thursday, March 10, 2005 at 12:48 -0600, Dinanath Sulakhe wrote:
> Thank You everyone for the suggestions ..
>
> I have decided to use -b 100 option with blastall, and removing -F F option
> as we currently don't need that.
> and -v 200 option for formatdb.
>
> We are using the NR (non-redundant database from NCBI) as the DB for blast
> which is currently about 1 GB with ~2.3 Million sequences in it, so what
> number would be appropriate for "-v" option in formatdb ?

The lowest value so that you no longer run out of memory.

>
> Does fragmenting the DB affect the speed of blast computations?

This is from a benchmark I did in November 2002. I don't remember what
query/db this was, most likely blastn, or possibly blastp.

#frag   'wall time'
1       787.29
2       791.69
4       794.67
5       798.521
8       794.01
10      805.812
15      809.324
16      803.7
20      814.032
25      821.86
30      827.643
31      821.38
35      838.594
40      856.84
45      879.683
50      910.147
55      933.658
60      949.782
63      940.81
65      963.216
69      987.196
74      985.41
80      996.06
83      1008.153
86      1014.527
90      1026.124
94      1042.814

Splitting up queries involves much less overhead. I put up a figure here;
I have no idea where the original data is.

http://flyex.ams.sunysb.edu/~lcarey/nquery_files.gif

-Lucas
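
For readers who want to try the query-splitting approach mentioned above, here is a minimal Python sketch. It is not the script used for the benchmark: the function name split_fasta and the round-robin assignment of records to chunks are assumptions made for illustration. It only divides multi-FASTA text into pieces without breaking any record; each piece could then be written to its own file and submitted as a separate BLAST job.

```python
def split_fasta(text, n_chunks):
    """Split multi-FASTA text into at most n_chunks pieces, keeping records whole.

    Hypothetical helper for illustration; records are dealt out round-robin.
    """
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")

    # Group lines into records: a header line starting with '>' plus its sequence lines.
    records = []
    for line in text.splitlines():
        if line.startswith(">"):
            records.append([line])
        elif records and line.strip():
            records[-1].append(line)

    # Never create more chunks than there are records.
    chunks = [[] for _ in range(min(n_chunks, len(records)))]
    for i, rec in enumerate(records):
        chunks[i % len(chunks)].append("\n".join(rec))

    return ["\n".join(chunk) + "\n" for chunk in chunks]


if __name__ == "__main__":
    demo = ">a\nACGT\n>b\nGG\nTT\n>c\nAAAA\n"
    for number, part in enumerate(split_fasta(demo, 2), start=1):
        print(f"--- chunk {number} ---")
        print(part, end="")
```

Round-robin assignment keeps the number of sequences per chunk even, but not necessarily the total sequence length, so chunks with unusually long queries may still take longer to run.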