Dan Bolser once suggested the use of a software packaging system like RPM for providing updates to DBs containing multiple flat files. It's especially appealing if it's done in combination with a downloader like yum, and I think it's something that Bioinformatics.Org might pursue. It may be relevant to your suggestion, since package managers are aware of version numbers and can revert an installed package to an old version. Large DBs contained in a single file would be problematic, though.

Cheers,
Jeff

Michael James wrote:
> Some biological databases actually come in versions,
> for example; we are up to the TIGR4 rice genome and
> swisprot UniProtKB/Swiss-Prot Release 50.0 of 30-May-2006
>
> Others just change daily, NCBI:nr NCBI:nt etc.
>
> All this effort creates a problem for repeatability,
> the blast results you get next week
> won't quite be the ones you got today.
>
> It seems to me that the situation would be improved
> by tagging results "BLAST against ncbi.nih.gov nr 2006-06-05 000"
>
> This means we need to come up with a versioning scheme
> and for anything without, I'd suggest something as simple as
> issuing_authority database date 3_digit_release_number
> eg ncbi.nih.gov nr 2006-06-05 000
>
> For uniqueness, use the internet name for issuing_authority.
>
> The database is the filename stripped of all qualifiers
> Remove things like .gz .00.tar.gz
>
> The date in ISO format!
>
> 3 more digits to ensure uniqueness.
>
>
> Such a scheme would also be
> a big win for us database administrators.
> We could start to weave it through the tangled web
> of different providers and formats
> so we actually know the original issuing authority
> for the file we are downloading.
>
> What do you think?
> michaelj
>
>

--
J.W. Bizzaro
Bioinformatics Organization, Inc. (Bioinformatics.Org)
E-mail: jeff at bioinformatics.org
Phone: +1 508 890 8600
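
P.S. For illustration, the tagging scheme in the quoted message (issuing_authority database date 3_digit_release_number) could be sketched in Python roughly as below. The function names `database_name` and `version_tag` are hypothetical, and cutting the filename at its first dot is only one possible reading of "stripped of all qualifiers"; it would mangle a database whose name itself contains a dot.

```python
from datetime import date
from pathlib import PurePosixPath


def database_name(filename: str) -> str:
    """Drop any directory part and every dot-separated qualifier.

    Assumption: everything after the first dot (.gz, .00.tar.gz, ...)
    counts as a qualifier.
    """
    base = PurePosixPath(filename).name
    return base.split(".", 1)[0]


def version_tag(authority: str, filename: str, released: date, release: int = 0) -> str:
    """Build 'issuing_authority database ISO-date NNN' as one string."""
    if not 0 <= release <= 999:
        raise ValueError("release number must fit in 3 digits")
    return f"{authority} {database_name(filename)} {released.isoformat()} {release:03d}"


# The example from the quoted message:
print(version_tag("ncbi.nih.gov", "nr.00.tar.gz", date(2006, 6, 5)))
# prints: ncbi.nih.gov nr 2006-06-05 000
```

A tag like this could be stored alongside each BLAST result, so a later run can state exactly which snapshot it searched.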