In maintaining a library of BLAST databases, a major obstacle is the lack of embedded version info.

For example, I have a database as a FASTA file, and there is another version (perhaps?) on the local biomirror in tar.gz format. (Curse them.) For most databases, both formats are also available from NCBI. Are they the same? Which is newest? What is the difference? Sizes and dates give some indication, but actually comparing the contents is a non-trivial bit of computing.

What is needed is some label string inside the files, even if it's just "NCBI nr 03/02/04". The FASTA file format needs to allow comments so that this info can be attached to the data indivisibly.

I propose that the FASTA format be extended so that programs using it:

1) Strip, and store as a comment, anything on a line after a # sign.
2) Ignore lines with nothing [but whitespace] left after stripping.

As far as BLAST is concerned, this would involve modifying formatdb so that it collects all such comments and includes them in the existing ".nal" file. No change is needed to the main blastall binary, as this file already contains # comments.

Once this mechanism exists to attach them, we are free to develop header fields.

Your comments?

michaelj

-- 
Michael James                          michael.james@csiro.au
System Administrator                   voice: 02 6246 5040
CSIRO Bioinformatics Facility          fax:   02 6246 5166
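
P.S. To make the two proposed rules concrete, here is a minimal sketch in Python. It is only an illustration, not a patch to formatdb: the function name strip_fasta_comments and the sample records are hypothetical. It also assumes that a # never occurs legitimately inside a definition line or sequence, which would have to be checked against existing databases before adopting the rule.

```python
def strip_fasta_comments(lines):
    """Apply the two proposed rules to an iterable of FASTA lines.

    Returns (cleaned_lines, comments). Hypothetical helper for illustration.
    """
    cleaned = []
    comments = []
    for line in lines:
        line = line.rstrip("\r\n")
        # Rule 1: anything after a '#' is a comment; strip it and store it.
        body, sep, comment = line.partition("#")
        if sep:
            comments.append(comment.strip())
        # Rule 2: ignore lines with nothing but whitespace left after stripping.
        if body.strip():
            cleaned.append(body.rstrip())
    return cleaned, comments


# Made-up sample input: a version label, one record, and a blank line.
example = [
    "# NCBI nr 03/02/04\n",
    ">seq1 example protein  # hypothetical trailing comment\n",
    "MKTAYIAKQR\n",
    "   \n",
    "QISFVKSHFS\n",
]
cleaned, comments = strip_fasta_comments(example)
```

With this input, cleaned holds the definition line and the two sequence lines with the comment and blank line removed, and comments holds the two stored comment strings, which a tool such as formatdb could then write out as # lines of its own.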