[BiO BB] CD-HIT Support Request
Silas Labedo
silaslabedo at gmail.com
Fri Mar 12 02:10:57 EST 2010
Hi Everyone,
I am writing a clustering program in Java that calls cd-hit for New,
Incremental, and Hierarchical clustering. The program works fine for New
clustering; however, when I attempt to call cd-hit from within the Java
code for Incremental clustering, I get errors. The error logs are
included below. Three different errors occurred that I don't understand,
which is why I am seeking your assistance.
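For reference, here is a minimal sketch of how an external cd-hit call can be made from Java with ProcessBuilder. It is not the actual program: the class name CdHitLauncher and its methods are hypothetical, and the only flags used are the ones that appear in the logged commands (-i, -i2, -o, -c, -n, -d). Passing each argument as a separate list element means paths with spaces need no manual quoting.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; class and method names are illustrative only.
class CdHitLauncher {

    // Builds the argument list. Every path is its own element, so a path such
    // as "C:/cluster files/x.fasta" stays one argument despite the space.
    static List<String> buildCommand(String exe, String db1, String db2, String out,
                                     double identity, int wordLen, int descLen) {
        List<String> cmd = new ArrayList<>();
        cmd.add(exe);
        cmd.add("-i");  cmd.add(db1);
        cmd.add("-i2"); cmd.add(db2);
        cmd.add("-o");  cmd.add(out);
        cmd.add("-c");  cmd.add(String.valueOf(identity));
        cmd.add("-n");  cmd.add(String.valueOf(wordLen));
        cmd.add("-d");  cmd.add(String.valueOf(descLen));
        return cmd;
    }

    // Runs the command and returns its exit value. Standard error is merged
    // into standard output and read to the end before waiting, so the child
    // process cannot block on a full pipe.
    static int run(List<String> cmd, StringBuilder output)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.redirectErrorStream(true);
        Process p = pb.start();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        return p.waitFor();
    }

    public static void main(String[] args) {
        // Placeholder file names; only prints the command, does not run it.
        List<String> cmd = buildCommand("cd_hit_2d.exe", "C:/cluster files/db1.fasta",
                "C:/cluster files/db2.fasta", "C:/cluster files/out", 0.9, 5, 50);
        System.out.println(String.join(" ", cmd));
    }
}
```

A non-zero return value from run(...) corresponds to the "Process Exit Value" lines in the logs.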
The following errors occurred when I tried to run cd-hit for hierarchical
clustering from the Linux command prompt.
****************************************************************************************
[root@ip-10-194-215-223 cd-hit]# ./psi-cd-hit-local.pl -i hierarchical93 -o hierarchical90 -c 0.90
Name "main::formatdb_no" used only once: possible typo at ./psi-cd-hit-local.pl line 1712.
Name "main::known_singles" used only once: possible typo at ./psi-cd-hit-local.pl line 1873.
Name "main::longest_ide" used only once: possible typo at ./psi-cd-hit-local.pl line 966.
Name "main::known_single" used only once: possible typo at ./psi-cd-hit-local.pl line 1873.
[root@ip-10-194-215-223 cd-hit]#
[root@ip-10-194-215-223 cd-hit]# perl psi-cd-hit.pl -i hierarchical93 -o hierarchical90 -c 0.3
Name "main::reformat_seg" used only once: possible typo at psi-cd-hit.pl line 65.
Name "main::restart_seg" used only once: possible typo at psi-cd-hit.pl line 62.
Can't exec "formatdb": No such file or directory at .//psi-cd-hit-local.pl line 1723.
Can not formatdb at .//psi-cd-hit-local.pl line 1724.
[root@ip-10-194-215-223 cd-hit]# vi hierarchical90.log
[root@ip-10-194-215-223 cd-hit]# vi hierarchical90.out
[root@ip-10-194-215-223 cd-hit]#
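The message Can't exec "formatdb": No such file or directory suggests that no executable named formatdb (a tool from NCBI's legacy BLAST package) could be found when the script tried to launch it. Since the clustering is driven from Java, a check like the one below could be run before calling the script. This is only a sketch: the class PathCheck and the method findOnPath are hypothetical names, and it looks only in the directories of a PATH-style string.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper; not part of cd-hit or BLAST.
class PathCheck {

    // Returns the first executable regular file called `name` found in the
    // directories listed in `pathEnv` (same format as the PATH variable).
    static Optional<Path> findOnPath(String name, String pathEnv) {
        if (pathEnv == null || pathEnv.isEmpty()) {
            return Optional.empty();
        }
        for (String dir : pathEnv.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue; // skip empty PATH entries
            }
            Path candidate = Paths.get(dir, name);
            if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<Path> hit = findOnPath("formatdb", System.getenv("PATH"));
        System.out.println(hit.map(p -> "found: " + p)
                              .orElse("formatdb is not on PATH"));
    }
}
```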
The following output and errors were produced during incremental
clustering, run from within the Java code.
****************************************************************************************
Cluster CMD: C:\cd-hit-windows\cd_hit_2d.exe -i "C:/cluster files/unipath_2010-3-9.clstr" -i2 "c:/cluster files/unipath_2010-3-9.fasta" -o "C:/cluster files/unipath_2010-3-9" -c 0.9 -n 5 -d 50
Mar 9, 2010 6:55:31 AM Here is the standard output of the command:
total seq in db1: 0
total seq in db2: 15732
longest and shortest : 0 and 99999999
Total letters: 0
Mar 9, 2010 6:55:57 AM Process Exit Value : 1
Mar 9, 2010 6:55:57 AM Here is the standard error of the command (if any):
Fatal Error
Memory
Program halted !!
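Because this run reports "total seq in db1: 0", one cheap safeguard on the Java side would be to count the records in each input file before launching cd-hit. The sketch below is an assumption about how that could look, not existing code: FastaCheck and countRecords are hypothetical names. It only counts lines that start with '>', so it does not validate the sequences and cannot tell a FASTA file from another text format that uses the same marker.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical pre-flight check; not part of cd-hit.
class FastaCheck {

    // Counts header lines, i.e. lines whose first character is '>'.
    static int countRecords(Reader in) throws IOException {
        int count = 0;
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith(">")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Made-up two-record input, used only to demonstrate the call.
        String sample = ">seq1\nMKVLAT\n>seq2\nGGAVLK\n";
        int n = countRecords(new StringReader(sample));
        if (n == 0) {
            System.out.println("no records found, skipping cd-hit");
        } else {
            System.out.println("records found: " + n);
        }
    }
}
```

For a real file, the same method can be given a java.io.FileReader opened on the path passed to -i or -i2.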
**************************************************************************************
Cluster CMD: C:\cd-hit-windows\cd_hit_2d.exe -i "C:/cluster files/clusteroutput.clstr" -i2 "c:/cluster files/aric_2010-3-10.fasta" -o "C:/cluster files/aric_2010-3-10" -c 0.9 -n 5 -d 50
Mar 10, 2010 6:51:01 AM Here is the standard output of the command:
total seq in db1: 79
total seq in db2: 4776
longest and shortest : 48 and 11
Total letters: 1061
Sequences have been sorted
longest and shortest : 34350 and 11
Total letters: 3167574
compute index table for first database
Reading swap
Comparing with SEG 0
..........1000 compared 0 clustered
..........2000 compared 0 clustered
..........3000 compared 0 clustered
..........4000 compared 0 clustered
.......
4776 compared 0 clustered
writing non-redundant sequences from db2
writing clustering information
program completed !
Total CPU time 103
Mar 10, 2010 6:51:23 AM Process Exit Value : 0
Mar 10, 2010 6:51:23 AM Here is the standard error of the command (if any):
The attached fasta file was generated along with the last part of the
output/error log. Its contents are unfamiliar to me, and I would be very
grateful if someone could help me understand them.
Thank you all.
Regards.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: aric_2010-3-10.clstr
URL: <http://www.bioinformatics.org/pipermail/bbb/attachments/20100312/b95f6113/attachment.ksh>