Some biological databases actually come in versions. For example, we are up to the TIGR4 rice genome and UniProtKB/Swiss-Prot Release 50.0 of 30-May-2006. Others just change daily: NCBI:nr, NCBI:nt, etc.

All this updating creates a problem for repeatability: the BLAST results you get next week won't quite be the ones you got today.

It seems to me that the situation would be improved by tagging results, e.g.

    BLAST against ncbi.nih.gov nr 2006-06-05 000

This means we need to come up with a versioning scheme, and for any database that doesn't already have one I'd suggest something as simple as

    issuing_authority database date 3_digit_release_number

e.g.

    ncbi.nih.gov nr 2006-06-05 000

  - For uniqueness, use the internet name for issuing_authority.
  - The database is the filename stripped of all qualifiers. Remove things like .gz and .00.tar.gz
  - The date is in ISO format!
  - 3 more digits to ensure uniqueness.

Such a scheme would also be a big win for us database administrators. We could start to weave it through the tangled web of different providers and formats, so that we actually know the original issuing authority for the file we are downloading.

What do you think?

michaelj

-- 
Michael James                     michael.james at csiro.au
System Administrator              voice: 02 6246 5040
CSIRO Bioinformatics Facility     fax:   02 6246 5166

No matter how much you pay for software, you always get less than you hoped. Unless you pay nothing, then you get more.
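P.S. For concreteness, here is a minimal Python sketch of how the proposed tag could be built and read back. The function names (database_name, version_tag, parse_version_tag) are hypothetical, and "stripped of all qualifiers" is assumed here to mean "everything before the first dot in the filename"; a real scheme might need a smarter rule.

```python
import datetime
import os


def database_name(filename):
    """Strip the directory and all dot-separated qualifiers.

    Assumption: the database name is whatever precedes the first dot,
    so 'nr.00.tar.gz' and 'nr.gz' both become 'nr'.
    """
    base = os.path.basename(filename)
    return base.split(".", 1)[0]


def version_tag(issuing_authority, filename, date, release=0):
    """Build 'issuing_authority database date 3_digit_release_number'."""
    if not 0 <= release <= 999:
        raise ValueError("release number must fit in 3 digits")
    return "{} {} {} {:03d}".format(
        issuing_authority,        # internet name, e.g. 'ncbi.nih.gov'
        database_name(filename),  # filename without qualifiers
        date.isoformat(),         # ISO date, YYYY-MM-DD
        release,                  # zero-padded to 3 digits
    )


def parse_version_tag(tag):
    """Split a tag back into (authority, database, date, release)."""
    authority, database, date_text, release = tag.split()
    return (authority, database,
            datetime.date.fromisoformat(date_text), int(release))


if __name__ == "__main__":
    tag = version_tag("ncbi.nih.gov", "nr.00.tar.gz", datetime.date(2006, 6, 5))
    print("BLAST against " + tag)
    # prints: BLAST against ncbi.nih.gov nr 2006-06-05 000
```

Because the four fields are separated by single spaces and none of them contains a space, the tag can be split back apart without any special parsing.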