It seems much of this could be addressed by a svn repository. I know I'd
sure appreciate typing 'svn update nt'.

What was in your prototype?

----- Original Message ----
From: Joe Landman <landman at scalableinformatics.com>
To: "Clustering, compute farming & distributed computing in life science informatics" <bioclusters at bioinformatics.org>
Sent: Sunday, June 4, 2006 10:33:40 PM
Subject: Re: [Bioclusters] Versioning databases

Sounds nice. I had thought of also (somehow) saving diffs in a db so you
could generate the test db you used previously. Don't know if there is
interest in this, but we had a prototype of this a few years ago.

Joe

Michael James wrote:
> Some biological databases actually come in versions,
> for example; we are up to the TIGR4 rice genome and
> swisprot UniProtKB/Swiss-Prot Release 50.0 of 30-May-2006
>
> Others just change daily, NCBI:nr NCBI:nt etc.
>
> All this effort creates a problem for repeatability,
> the blast results you get next week
> won't quite be the ones you got today.
>
> It seems to me that the situation would be improved
> by tagging results "BLAST against ncbi.nih.gov nr 2006-06-05 000"
>
> This means we need to come up with a versioning scheme
> and for anything without, I'd suggest something as simple as
> issuing_authority database date 3_digit_release_number
> eg ncbi.nih.gov nr 2006-06-05 000
>
> For uniqueness, use the internet name for issuing_authority.
>
> The database is the filename stripped of all qualifiers
> Remove things like .gz .00.tar.gz
>
> The date in ISO format!
>
> 3 more digits to ensure uniqueness.
>
>
> Such a scheme would also be
> a big win for us database administrators.
> We could start to weave it through the tangled web
> of different providers and formats
> so we actually know the original issuing authority
> for the file we are downloading.
>
> What do you think?
> michaelj
>

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615

_______________________________________________
Bioclusters maillist  -  Bioclusters at bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bioclusters
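
For illustration only, the tag format Michael proposes in the quoted message (issuing_authority database date 3_digit_release_number) could be built along these lines. This is a rough sketch, not anything from the thread: the function names `database_name` and `version_tag` are made up here, and it assumes that everything after the first dot in a filename counts as a "qualifier" to strip, which may be too aggressive for database names that themselves contain dots.

```python
from datetime import date
from pathlib import PurePosixPath


def database_name(filename: str) -> str:
    """Return the filename with directories and dotted qualifiers removed."""
    # Assumption: everything after the first dot is a qualifier
    # (.gz, .00.tar.gz, ...), so "nr.00.tar.gz" becomes "nr".
    return PurePosixPath(filename).name.split(".", 1)[0]


def version_tag(authority: str, filename: str, day: date, release: int = 0) -> str:
    """Build 'issuing_authority database ISO-date 3-digit-release'."""
    if not 0 <= release <= 999:
        raise ValueError("release number must fit in 3 digits")
    return f"{authority} {database_name(filename)} {day.isoformat()} {release:03d}"


tag = version_tag("ncbi.nih.gov", "nr.00.tar.gz", date(2006, 6, 5))
print(tag)  # ncbi.nih.gov nr 2006-06-05 000
```

A result set could then be labelled "BLAST against " followed by that tag, and a second download from the same authority on the same day would simply bump the release number to 001.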