Don, we exclude FASTA from our BLAST database mirror. We use rsync because
of its --no-whole-file option, which transfers only the changed parts of
each file. Why download the entire 50+ GB database every night when all we
need are the changes? Is there an FTP client that can transfer just the
changes within files?

Don Gilbert said:
>
> How high is demand for mirroring the FASTA/ subfolder of
> ftp://ftp.ncbi.nlm.nih.gov/blast/db/ ? I'll be happy
> to consider adding it to bio-mirror.net. On the other hand,
> it is a large data chunk, and will add to network copy load
> which now is stretched for the blast-format tar files.
> We have almost continuous ftp copying from ncbi:/blast/db/ now
> due to the almost daily data turnover, ftp timeouts, and such.
>
> Those who want source data could instead use the
> Genbank dataset -> fasta, at a lower cost. E.g. only
> a few Genbank/WGS subsets are updated daily, whereas
> the whole 18 GB blast wgs.fasta is updated daily.
>
> Rsync is a nice tool, but has a much higher server side
> CPU cost than FTP. Those of you running into rsync errors at NCBI
> would probably have better luck using an FTP mirroring
> tool.
>
> - Don Gilbert
> -- d.gilbert--bioinformatics--indiana-u--bloomington-in-47405
> -- gilbertd at indiana.edu--http://marmot.bio.indiana.edu/
> _______________________________________________
> Bioclusters maillist - Bioclusters at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters
>

--
Jeremy Mann
jeremy at biochem.uthscsa.edu

University of Texas Health Science Center
Bioinformatics Core Facility
http://www.bioinformatics.uthscsa.edu
Phone: (210) 567-2672