BIRCHv3.50
From Bioinformatics.Org Wiki
BIRCH
customdoc.py - Should always ignore changes in lines beginning with http, https, ftp, etc. Otherwise, it breaks links to things like ftp:///ftp.cc.umanitoba.ca/psgendb..... Does htmldoc.py also need to be fixed? No.
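A minimal sketch of the rule described above, assuming customdoc.py rewrites documentation line by line with plain string substitution. The function names, the prefix tuple and the sample paths are illustrative only and are not taken from customdoc.py itself.

```python
# Hypothetical sketch: skip substitution on lines that begin with a URL scheme.
URL_PREFIXES = ("http:", "https:", "ftp:")  # assumed list; extend as needed


def is_url_line(line):
    """Return True if the line, ignoring leading whitespace, starts with a URL scheme."""
    return line.lstrip().lower().startswith(URL_PREFIXES)


def customize_lines(lines, old, new):
    """Replace `old` with `new` in every line except those that begin with a URL."""
    return [line if is_url_line(line) else line.replace(old, new) for line in lines]


if __name__ == "__main__":
    # Invented sample lines; only the first one should be rewritten.
    sample = [
        "BIRCH is installed in /old/birch",
        "ftp://ftp.example.org/old/birch/README",
    ]
    for out in customize_lines(sample, "/old/birch", "/new/birch"):
        print(out)
```

Matching on the start of the line keeps the check cheap, but it would not protect a URL that appears in the middle of a line.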
Upgrade to Java 1.8
BioLegato 1.0.4 - recompiled using Java 1.8
- Can flamingo and albacore be upgraded to Java 1.7?
Introducing blreads
Mechanism for setting max. CPU
General table functions
- File/Directory functions
Delete
Unzip/bzunzip
cd - change directory
- Sequence trimming and quality
trim_galore - rewrite scripts and PCD so that paired read files can be used as input.
fastq_pair - add binary and PCD menu
- Seqkit
- For big files, seqkit stats can take a long time.
Add email notification to this function. Also, add a note to the hints that as the number of CPUs increases, the load on RAM also increases, because SeqKit uses pigz to decompress each file through an I/O stream. Things may go faster with a smaller number of CPUs. Some experimentation is in order.
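The experimentation suggested above could be sketched as follows: time seqkit stats at several thread counts and compare. This assumes seqkit is on the PATH and relies on its -j/--threads option; the helper names are hypothetical and not part of BIRCH.

```python
import subprocess
import time


def seqkit_stats_cmd(files, threads):
    """Build the argument list for `seqkit stats` with a given thread count."""
    return ["seqkit", "stats", "-j", str(threads)] + list(files)


def time_seqkit_stats(files, thread_counts):
    """Run seqkit stats once per thread count; return {threads: seconds}."""
    timings = {}
    for n in thread_counts:
        start = time.perf_counter()
        # Discard the stats table; only the elapsed time is of interest here.
        subprocess.run(seqkit_stats_cmd(files, n), check=True,
                       stdout=subprocess.DEVNULL)
        timings[n] = time.perf_counter() - start
    return timings
```

For example, time_seqkit_stats(["reads.fastq.gz"], [1, 2, 4, 8]), where reads.fastq.gz is a placeholder file name, returns the wall-clock time for each thread count. Memory use would have to be watched separately, for instance with top.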
- Genome assembly and evaluation
- Pollux - automatically running FastQC on corrected reads doesn't work
- Transcriptome assembly and evaluation
- SOAPdenovo-Trans -
Installed for linux-x86_64. There are major problems on osx-x86_64 involving static compilation and the inability to find libraries, and the authors appear to have no interest in supporting OSX. For now, SOAPdenovo-Trans will only be on linux-x86_64.
- Tutorials
Hints - short help items in menus that tell you what to select, what is required
Can we get BioLegato overwrite function to work?
Hisat, Bowtie2, Cufflinks, StringTie - Current state of Tuxedo tools (Hisat supersedes TopHat)
http://www.ccb.jhu.edu/software/hisat/index.shtml
Testing
Dataset: Genome tutorial
Linux | Mac OSX | ||||
brassica (16 Gb) | flamingo | ccxx | peacock (8 Gb) | aliana (8 Gb) | |
Genome | |||||
Trimmomatic | + | + | + | ||
trim_galore | + | + | + | ||
pollux | + | + | + | ||
SOAPdenovo2 | + | NA | NA | ||
ABySS | + | + | + | ||
Spades | + | through K55 | through K55 | ||
Quast | + | + | + |
Dataset: Transcriptomics tutorial
Linux | Mac OSX | ||||
brassica (16 Gb) | flamingo | ccxx | peacock (8 Gb) | aliana (8 Gb) | |
Transcriptome | |||||
Trimmomatic | + | + | |||
trim_galore | + | + | + | ||
Rcorrector | + | + | + | ||
SOAPdenovo-Trans | + | NA | NA | ||
Trinity | + | NA | NA | ||
RNASpades | + | through K49 | |||
transrate | + | + |
Dataset: Differential expression tutorial
Linux | Mac OSX | ||||
brassica (16 Gb) | flamingo | ccxx | peacock (8 Gb) | aliana (8 Gb) | |
Differential Expression | |||||
gff2gtf | + | + | |||
extract exons | + | + | |||
extract splice sites | + | + | |||
Hisat2-build | + | + | |||
Hisat2 | + | + | |||
stringtie | + | + | |||
stringtie merge | + | + | |||
gffcompare | + | + | |||
stringtie -B | + | + |
BLAST/FASTA overhaul
birchdb - some of the command line arguments for blastdbkit.py listed in the database are wrong, e.g. --addfiles should be --add. Fix these.
blpalign
None of the phylogeny programs work. They run, but we get an "unexpected end of file" error. WTF?
- Other tasks, such as REFORM, work, so it appears to be limited to phylogenies
- all phylogeny programs work on CCL. Is this something specific to brassica?
- Does this affect blnalign as well?
This is an old, old bug left over from GDE. The first part of the shell command for all the Phylip programs reads as follows:
tr ""~"" ""-"" < %in1% | sed ""s/[\:\_]CDS/_/"" | readseq -a -f12 -pipe | sed ""s/ YF//1"" > %in1%.tmp;
The | sed ""s/ YF//1"" part would delete the first " YF" on each line of the Phylip alignment file. This occurs only rarely, i.e. when YF is preceded by a space, which is why it was not discovered until now. Anyway, the code has been fixed for all Phylip programs in blnalign and blpalign.
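To illustrate the bug, here is a small Python equivalent of the sed substitution, applied to a made-up line of a Phylip alignment. It assumes sequences are written in space-separated blocks of 10 residues, which is how a block beginning with YF ends up preceded by a space; the sequence itself is invented.

```python
import re


def strip_first_yf(line):
    """Mimic sed 's/ YF//1': delete the first ' YF' on the line."""
    return re.sub(r" YF", "", line, count=1)


# Invented protein alignment line; the second block happens to begin with YF.
line = "seq1      MKLVAAGHWQ YFDEAGHKLM"
print(strip_first_yf(line))
# prints: seq1      MKLVAAGHWQDEAGHKLM
```

The affected sequence silently loses two residues, so it no longer matches the sequence length declared in the Phylip header. That would be consistent with the "unexpected end of file" error.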