From christoph.gille at charite.de Wed Feb 9 06:49:42 2005
From: christoph.gille at charite.de (Dr. Christoph Gille)
Date: Wed, 9 Feb 2005 12:49:42 +0100 (MET)
Subject: [BiO BB] MacOSx : I need to compile some programs
Message-ID: <34103.141.42.56.114.1107949782.squirrel@webmail.charite.de>

I am developing STRAP, a program for protein alignments. It uses many other programs. Some of them are written in C/C++ and must be compiled for each target machine.

If you have Mac OS X, could you please compile a few programs? This should not take long, because they have Makefiles.

Or does somebody know how I can build executables for Mac on my Linux i386 computer?

Thanks for your help,
Christoph

From virajj at lycos.com Wed Feb 9 18:23:56 2005
From: virajj at lycos.com (vijaya raj)
Date: Wed, 09 Feb 2005 18:23:56 -0500
Subject: [BiO BB] sign-against-software-patents
Message-ID: <20050209232356.09453E5BC7@ws7-2.us4.outblaze.com>

Hi guys,

The European Commission is trying to legalise software patents, which means unimaginable consequences. Check it out here, please sign against this dangerous venture, and pass it on to your friends:
http://www.noepatents.org/index_html?LANG=en

Read this article (Knoppix under threat):
http://www.knoppix.org/

Do sign against this monster attempt.
vijay -- _______________________________________________ Find what you are looking for with the Lycos Yellow Pages http://r.lycos.com/r/yp_emailfooter/http://yellowpages.lycos.com/default.asp?SRC=lycos10 From Susan_Braunhut at uml.edu Thu Feb 10 17:15:50 2005 From: Susan_Braunhut at uml.edu (Susan Braunhut) Date: Thu, 10 Feb 2005 17:15:50 -0500 Subject: [BiO BB] Bioinformatics Conference- Keynotes, Exciting New Research-Free Message-ID: <002401c50fbe$14e85320$6501a8c0@shiloh> Laboratories of Innovation: Bioinformatics at The University of Massachusetts April 29th, 2005 8:30 AM- 5:00 PM with a reception following the conference Please join us for a day conference highlighting the innovative work in bioinformatics taking place on the five campuses of the University of Massachusetts Keynote speakers Dr. John N. Weinstein, M.D., Ph.D. Head, Genomics & Bioinformatics Group, LMP, CCR, National Cancer Institute Dr. Mark Gerstein, Ph.D. Associate Professor of Biomedical Informatics, Yale University Attendance is free but registration is limited Lunch and reception will be provided Deadline for registration is April 15th, 2005. Register at: http://cs.uml.edu/ivpr/meeting.html For additional information contact Dr. Georges Grinstein 978-934-3627 or grinstein at cs.uml.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From nuraini at cs.usm.my Wed Feb 16 04:01:27 2005 From: nuraini at cs.usm.my (Nur'Aini Abdul Rashid) Date: Wed, 16 Feb 2005 17:01:27 +0800 Subject: [BiO BB] NEIGHBOUR program from PHYLIP suite References: <1095872087.4151ae57d6da3@hongyu.org> Message-ID: <001501c51406$1cee0f50$4482cf0a@MyBox2> Dear All, I'm trying to generate a phylogenetic tree using NEIGHBOUR, a public domain program from the PHYLIP software suite... but I get the error the diagonal element of row 1 of distance matrix is not zero.. but it is zero... When I try to use upper triangular data matrix I got the message The infile is of the wrong type.. 
I created the file using Microsoft .NET C++ software. It is a text file.

Please help.

Nora.

From idh at poulet.org Fri Feb 18 09:15:56 2005
From: idh at poulet.org (Yannick Wurm)
Date: Fri, 18 Feb 2005 15:15:56 +0100
Subject: [BiO BB] Refined nucleotide BLAST matrix
Message-ID: <9ADB6198-81B7-11D9-A931-000D93712582@poulet.org>

Hi,

for a specific need in my lab, we are looking for an implementation of a nucleotide sequence alignment program that is more flexible than standard BLAST. The reason is that we have sequenced DNA fragments that were subjected to chemical modifications which affect different nucleotides differently.

To help identify these sequences, we need to be able to fine-tune the matrix used for scoring. Thus, for example, when calculating the "score" of an alignment, C->A and C->T could be given different weights.

According to ebi.ac.uk, the WU-BLAST matrix is:

       A    T    G    C
  A    5
  T   -4    5
  G   -4   -4    5
  C   -4   -4   -4    5

We want to be able to specify individual values, something like the following example:

       A    T    G    C
  A    2
  T   -4    8
  G   -8  -10    3
  C   -2   -1   -3   10

To my surprise, BLAST does not have this liberty, despite the fact that different scoring matrices are used for proteins. I couldn't find anything on Google either.

Would anyone on the list have a clue? Or do I need to get dirty messing with BLAST's source?

Thanks in advance,

Yannick

. . . . . . . . . . . . . . . . . .
yannick.wurm at unil.ch
+41.21.692.4157
PhD student, Department of Ecology and Evolution
Université
de Lausanne, Switzerland From danny at amelang.net Fri Feb 18 12:05:29 2005 From: danny at amelang.net (Daniel Amelang) Date: Fri, 18 Feb 2005 10:05:29 -0700 Subject: [BiO BB] Refined nucleotide BLAST matrix In-Reply-To: <9ADB6198-81B7-11D9-A931-000D93712582@poulet.org> References: <9ADB6198-81B7-11D9-A931-000D93712582@poulet.org> Message-ID: <42162059.5070503@amelang.net> Hi Yannick, Forgive me if I misunderstood your question, but I think what you're looking for is the '-DNAMATRIX=custom_matrix.mat' option of clustalw. I'm using it for a project of mine and it works great. Let me know if you need help getting it to work. Sometimes clustalw can be picky about the format of the matrix. Dan Amelang Yannick Wurm wrote: > Hi, > for a specific need in my lab, we are looking for an implementation of > nucleotide sequence alignment program which would be more flexible > than standard BLAST. > The reason is that we have sequenced dna fragments which have been > submitted to chemical modifications which differentially affects > different nucleotides. > > To help identify these sequences, we need to be able to fine-tune the > matrix used for scoring. Thus, for example when calculating the > "score" of an aligment, C->A and C->T could be given different weights. > > According to ebi.ac.uk, the WU-blast matrix is: > A T G C > A 5 > T -4 5 > G -4 -4 5 > C -4 -4 -4 5 > > We want to be able to specifiy inidividual values to something like > the following example: > A T G C > A 2 > T -4 8 > G -8 -10 3 > C -2 -1 -3 10 > > > To my surprise, BLAST does not have this liberty, despite the fact > that different scoring matrices are used for proteins. I couldn't find > anything on Google either. > > Would anyone one the list have a clue? Or do I need to get dirty > messing with BLAST's source? > Thanks in advance, > > Yannick > > . . . . . . . . . . . . . . . . . . > yannick.wurm at unil.ch > +41.21.692.4157 > PhD student, Departement of Ecology and Evolution > Universit? 
de Lausanne, Switzerland > _______________________________________________ > Bioinformatics.Org general forum - > BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > From idh at poulet.org Fri Feb 18 12:33:05 2005 From: idh at poulet.org (Yannick Wurm) Date: Fri, 18 Feb 2005 18:33:05 +0100 Subject: [BiO BB] Refined nucleotide BLAST matrix In-Reply-To: <42162059.5070503@amelang.net> References: <9ADB6198-81B7-11D9-A931-000D93712582@poulet.org> <42162059.5070503@amelang.net> Message-ID: <25C9CB04-81D3-11D9-A931-000D93712582@poulet.org> Thanks for your feedback Dan, I guess that could be what I'm looking for. However, I don't want multiple alignments so I don't think clustal will do the trick. Basically, I have one interesting sequence for which I would love to find a homologue in the Honey bee or Drosophila genome. A standard BLAST does not show a significant homology. My fragment was submitted to chemical treatment which replaces As by Cs. My sequence thus has many Cs, some of which were already C in the original sequence, some of which were As. This makes finding a perfect alignment difficult. So when scoring the "distance" between my sequence and a sequence from a genome database, I would like to have a lower penalty for A-C than for T-C. Doing this should increase my chance of finding a homologue. -yannick. On 18 f?vr. 05, at 18:05, Daniel Amelang wrote: > Hi Yannick, > > Forgive me if I misunderstood your question, but I think what you're > looking for is the '-DNAMATRIX=custom_matrix.mat' option of clustalw. > I'm using it for a project of mine and it works great. Let me know if > you need help getting it to work. Sometimes clustalw can be picky > about the format of the matrix. > > Dan Amelang . . . . . . . . . . . . . . . . . . yannick.wurm at unil.ch +41.21.692.4157 PhD student, Departement of Ecology and Evolution Universit? 
de Lausanne, Switzerland

From pmr at ebi.ac.uk Fri Feb 18 12:40:32 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Fri, 18 Feb 2005 17:40:32 +0000
Subject: [BiO BB] Refined nucleotide BLAST matrix
In-Reply-To: <42162059.5070503@amelang.net>
References: <9ADB6198-81B7-11D9-A931-000D93712582@poulet.org> <42162059.5070503@amelang.net>
Message-ID: <42162890.2040205@ebi.ac.uk>

> Yannick Wurm wrote:
>> for a specific need in my lab, we are looking for an implementation of
>> nucleotide sequence alignment program which would be more flexible
>> than standard BLAST.
>>
>> To help identify these sequences, we need to be able to fine-tune the
>> matrix used for scoring. Thus, for example when calculating the
>> "score" of an aligment, C->A and C->T could be given different weights.
>>
>> To my surprise, BLAST does not have this liberty, despite the fact
>> that different scoring matrices are used for proteins. I couldn't find
>> anything on Google either.
>>
>> Would anyone one the list have a clue? Or do I need to get dirty
>> messing with BLAST's source?

Avoid messing with the BLAST sources!

You can get around this. You also need this trick to handle nucleotide ambiguity codes (for example, to compare patent sequences which use codes other than 'N').

You have to cheat, though.

1. Build your BLAST database as protein.

2. Give your matrix a name that matches one of the BLAST protein matrix names (!)

3. Put in the matrix values you want.

4. Remember that you are now using blastp (protein search), so you can only use a short word size. I am guessing you have short sequences anyway, so this should not be a problem.

5. Remember that BLAST does local alignment.

6. Remember that the scores will rest on some wrong assumptions, because BLAST treats the sequences as proteins. You should still find the hits you are looking for.

This cheat was published (by NCBI, if I recall correctly) some time back. Sorry, I can't track down the reference.
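To make points 3 and 5 above concrete, here is a minimal Python sketch of local alignment scoring with a custom nucleotide matrix. It is not BLAST and not code from this thread: it is a plain Smith-Waterman score with a linear gap penalty, and the names local_score, UNIFORM and RELAXED, the gap value of -10 and the A/C score of -1 are all assumptions chosen for illustration. UNIFORM reproduces the +5/-4 matrix quoted earlier in the thread; RELAXED lowers only the A/C penalty, as one might for reads in which As were chemically converted to Cs.

```python
ALPHABET = "ATGC"

# The +5/-4 matrix quoted in the thread, as a full symmetric lookup table.
UNIFORM = {(a, b): (5 if a == b else -4) for a in ALPHABET for b in ALPHABET}

# Hypothetical relaxed matrix: only the A/C mismatch is cheaper (assumed value).
RELAXED = dict(UNIFORM)
RELAXED[("A", "C")] = RELAXED[("C", "A")] = -1


def local_score(query, subject, matrix, gap=-10):
    """Return the best Smith-Waterman local alignment score.

    Uses a linear gap penalty (assumed, not BLAST's affine scheme) and keeps
    only two rows of the dynamic-programming table in memory.
    """
    prev = [0] * (len(subject) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, s in enumerate(subject, start=1):
            cell = max(
                0,                             # start a new local alignment
                prev[j - 1] + matrix[(q, s)],  # align q with s
                prev[j] + gap,                 # gap in the subject
                curr[j - 1] + gap,             # gap in the query
            )
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


genomic = "GATAGT"
read = "GCTCGT"  # the same sequence with every A converted to C

print(local_score(read, genomic, UNIFORM))  # 12
print(local_score(read, genomic, RELAXED))  # 18
```

With the relaxed matrix the converted read scores closer to the 30 that an unconverted, identical read would get, which is the effect a custom matrix is meant to achieve in a database search.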
Hope this helps,

Peter Rice

From kalita at rabbit.uccs.edu Mon Feb 14 16:14:52 2005
From: kalita at rabbit.uccs.edu (J. Kalita)
Date: Mon, 14 Feb 2005 14:14:52 -0700 (MST)
Subject: [BiO BB] Announcing BIOT-05: Second Bioinformatics and Biotechnology Symposium
In-Reply-To: <20041210170009.41E7DD1F2F@www.bioinformatics.org>
References: <20041210170009.41E7DD1F2F@www.bioinformatics.org>
Message-ID: <49813.128.198.172.116.1108415692.squirrel@www.cs.uccs.edu>

Location: Colorado Springs
Symposium Date: August 15-16, 2005
Paper or Abstract Submission Date: March 14, 2005

-------

BIOT-05: The Second Biotechnology and Bioinformatics Symposium will be held in Colorado Springs, Colorado on August 15 and 16, 2005. The Web site is http://bioinfo.uccs.edu . Please click on "Symposium 2005 (BIOT-05)" on the top left menu to get to the Web site for the symposium.

The symposium's objective is to showcase research and development activities in:

* Bioinformatics and Computational Biology,
* Biotechnology, and
* Impacts of Biotechnology, Bioinformatics and Computational Biology

and to promote future interdisciplinary activity and research in these areas.

Abstracts and papers are solicited in these three areas, broadly interpreted. Efforts that attempt to solve biology-based problems using computational, mathematical, engineering and other means are suitable. In addition, abstracts and papers dealing with community impacts of Bioinformatics, Computational Biology and Biotechnology are appropriate. Thus, abstracts and papers in areas such as technology transfer, legal, business, and social impact issues are also invited.

BIOT-05 will accept abstracts and papers for either podium presentation or poster presentation. Please look at the Web site for more details. All submissions of abstracts and papers, whether for podium or poster presentation, will be reviewed. Printed proceedings will be published containing all accepted abstracts and papers.
You can see a complete copy of last year's proceedings at http://bioinfo.uccs.edu/index.php?module=pagemaster&PAGE_user_op=view_page&PAGE_id=49&MMN_position=79:41 . A partial list of topics for the Symposium include Analysis of complex biological systems, Artificial or synthetic biological systems, Bioenergetics, Biomedical research, Biotechnology, Cellular function, Commercial applications of biotechnology and bioinformatics, Comparative genomics, Data mining, Databases, Evolution models, Functional genomics, Genetics, Genomics, High content analysis, High performance computing, Industrial applications of biotechnolgy and bioinformatics, Legal impacts of biotechnology and bioinformatics, Mathematical and computational models of cellular systems, Mathematical models of biophysical processes, Mathematical physiology, Microarray analysis, Molecular function, Molecular sequence and structure, Neural circuits modeling, Pathways, Pattern recognition, Phylogenetics, Physiology, Population biology, Promoter analysis and discovery, Protein structure and analysis, RNAi analysis, Sequence alignments, SNPs, Social impacts of bioinformatics and biotechnology, Systems biology, Technology transfer, Theoretical and mathematical biology, Venture capital for biotechnolgy and bioinformatics industry, Visualization. Like BIOT-04, BIOT-05 will bring together scientists, engineers and other practitioners from diverse fields. Different fields have different requirements for conference submissions. For example, in Computer Science, it is traditional to require submission of full papers for review several months prior to a conference or symposium. In Mathematics, often even an abstract is usually not required. In Biology, usually short abstracts are required. However, most other fields simply request that an abstract or an extended abstract be submitted for review. 
Since this cross-over meeting will bring together people from fields where the publishing paradigms are different, authors are encouraged to submit abstracts or papers, as appropriate within their subfield. The Symposium proceedings will contain abstracts as well as papers. The length of your submission determines the maximum length of your publication in the proceedings; if you want a paper in the proceedings then you must submit a paper for review.

Thus, there are two possible ways to participate in BIOT-05. First, you must decide whether you want to submit

* an extended abstract for review (2 pages), or
* a full paper for review (up to 6 pages).

An extended abstract or a paper must report significant research results, findings or advances within its own field. However, since the symposium is geared toward a diverse audience of biologists, computer scientists, chemists, engineers, technology transfer individuals, graduate students, professors, industry individuals, etc., the papers or extended abstracts must be presented in a lucid manner accessible to such individuals. Please look at the Web site for more details.

IMPORTANT DATES

* Submission Deadline: March 14, 2005 (two pages for Extended Abstracts or 6 pages for Full Papers). This may be extended by two or three weeks if needed.
* Acceptance Decision: April 25, 2005
* Revised Camera Ready Extended Abstracts and Full Papers due after revision: June 6, 2005
* Symposium Date: August 15-16, 2005

Please contact Dr. J. Kalita at jkalita at uccs.edu if you have any questions about BIOT-05.

From sve02594 at yahoo.com Wed Feb 16 09:45:04 2005
From: sve02594 at yahoo.com (Seema Verma)
Date: Wed, 16 Feb 2005 06:45:04 -0800 (PST)
Subject: [BiO BB] MS computer science(bioinformatics)
Message-ID: <20050216144505.94822.qmail@web50507.mail.yahoo.com>

I am wondering if anyone can advise me on the advantage of doing an MS in computer science with a specialization in bioinformatics
if one already has a PhD in biology (molecular biology). I would appreciate your response.

Thanks

From Yannick.Wurm at unil.ch Fri Feb 18 09:13:20 2005
From: Yannick.Wurm at unil.ch (Yannick Wurm)
Date: Fri, 18 Feb 2005 15:13:20 +0100
Subject: [BiO BB] Refined Blast?
Message-ID: <3E591300-81B7-11D9-A931-000D93712582@unil.ch>

Hi,

for a specific need in my lab, we are looking for an implementation of a nucleotide sequence alignment program that is more flexible than standard BLAST. The reason is that we have sequenced DNA fragments that were subjected to chemical modifications which affect different nucleotides differently.

To help identify these sequences, we need to be able to fine-tune the matrix used for scoring. Thus, for example, when calculating the "score" of an alignment, C->A and C->T could be given different weights.

According to ebi.ac.uk, the WU-BLAST matrix is:

       A    T    G    C
  A    5
  T   -4    5
  G   -4   -4    5
  C   -4   -4   -4    5

We want to be able to specify individual values, something like the following example:

       A    T    G    C
  A    2
  T   -4    8
  G   -8  -10    3
  C   -2   -1   -3   10

To my surprise, BLAST does not have this liberty, despite the fact that different scoring matrices are used for proteins. I couldn't find anything on Google either.

Would anyone on the list have a clue? Or do I need to get dirty messing with BLAST's source?

Thanks in advance,

Yannick

. . . . . . . . . . . . . . . . . .
yannick.wurm at unil.ch
+41.21.692.4157
PhD student, Department of Ecology and Evolution
Université
de Lausanne, Switzerland From maria.mirto at unile.it Sat Feb 19 08:25:12 2005 From: maria.mirto at unile.it (Maria Mirto) Date: Sat, 19 Feb 2005 14:25:12 +0100 (CET) Subject: [BiO BB] final CFP: HiPCoMB-2005 Message-ID: <3331.193.204.74.245.1108819512.squirrel@webmail2.unile.it> ******************************************************************************** We apologize if you received multiple copies of this Call for Papers. Please feel free to distribute it to those who might be interested. ******************************************************************************** ___________________________________________________________________ CALL FOR PAPERS 1st IEEE Workshop on High Performance Computing in Medicine and Biology (HiPCoMB-2005) held in conjunction with The 11th International Conference on Parallel and Distributed Systems (ICPADS 2005) Fukuoka Institute of Technology (FIT), Fukuoka, Japan July 20-22, 2005 HiPCoMB-2005 Home Page: http://www.pdcl.wayne.edu/HiPCoMB-2005 (Apologies if you receive multiple copies of this Call for Papers) _______________________________________________________________________ IMPORTANT DEADLINES: Paper submission: February 28, 2005 Author Notification: April 07, 2005 Camera-Ready Papers: April 21, 2005 WORKSHOP INFORMATION: The First Workshop on High Performance Computing in Medicine and Biology (HiPCoMB-05), held in conjunction with ICPADS 2005 in Fukuoka City, Japan brings together researchers in computer science and engineering, medicine, and biology that use high performance computing to solve computationally expensive problems in medicine and biology. The workshop will provide a forum for presenting and exchanging new ideas and experiences in this area. 
Topics of interest include high performance algorithms, systems, architecture, and tools for the following: (but are not limited to the following list) * Microarray Analysis * RNAi Analysis * Systems Biology * Computational Genomics * Comparative Genomics * DNA Assembly, Clustering, and Mapping * Gene identification and annotation * Computational Proteomics * Evolution and Phylogenetics * Protein Structure Predication and Modeling * Medical Image Processing * Computer Assisted Surgery * Computational Medicine Modeling * Computational Biology Modeling * Augmented Reality * Medical Informatics SUBMISSION INFORMATION: Talks will be accepted on the basis of a paper of approximately 15 single-column pages that describes the work, its significance, and the current status of the research. Submit one electronic copy of the paper in PostScript or PDF format by February 15, 2005. Please visit the workshop home page for submission instructions. Notification of acceptance will be given by March 22, 2005, and camera-ready papers will be due April 21, 2005. Accepted papers will be given guidelines in preparing and submitting the final manuscript(s) together with the notification of acceptance. All accepted papers will be presented at the workshop and included in proceedings that will be distributed at the workshop. In addition, authors of selected papers from the workshop will be invited to submit extended versions of their papers for publication in a special issue of the International Journal of Bioinformatics Research and Applications. GENERAL INFORMATION: GENERAL CO-CHAIRS: Laurence T. Yang St. 
Francis Xavier University, Canada email: lyang at stfx.ca Albert Zomaya University of Sydney, Australia email: zomaya at it.usyd.edu.au PROGRAM CO-CHAIRS: Vipin Chaudhary Wayne State University, USA email: vipin at wayne.edu Andrei Doncescu LAAS, National Center for Scientific Research, France email: adoncesc at laas.fr Yi Pan Georgia State University, USA email: pan at cs.gsu.edu PROGRAM COMMITTEE MEMBERS: David Abramson, Monash University, Australia davida at csse.monash.edu.au Enrique Alba, University of Malaga, Spain eat at lcc.uma.es Srinivas Aluru, Iowa State University, USA aluru at iastate.edu http://www.ece.iastate.edu/~aluru Shahid H. Bokhari, University of Engineering & Technology, Pakistan shb at acm.org Vincent Breton, CNRS/IN2P3, LPC Clermont-Ferrand, France breton at clermont.in2p3.fr Kevin Burrage, University of Queensland, Australia kb at maths.uq.edu.au Amitava Data, University of Western Australia, Australia datta at csse.uwa.edu.au Hans de Sterck, University of Waterloo, Canada hdesterck at math.uwaterloo.ca Mario Rosario Guarracino, ICAR-CNR, Italy mario.guarracino at cps.na.cnr.it http://pixel.dma.unina.it/~mariog/ Ryoko Hayashi, Advanced Institute of Science and Technology (JAIST), Japan ryoko at jaist.ac.jp Matthew He, Nova Southeastern University, USA hem at nsu.nova.edu Alfons Hoekstra, University of Amsterdam, The Netherlands alfons at science.uva.nl Xiaohua (Tony) Hu, Drexel University, USA thu at cis.drexel.edu http://www.cis.drexel.edu/faculty/thu Chun-Hsi Huang University of Connecticut, USA huang at engr.uconn.edu Arun Krishnan, Bioinformatics Institute, Singapore arun at bii.a-star.edu.sg Joseph Landman, Scalable Informatics, LLC landman at scalableinformatics.com Wenjun Li, UT Southwestern Medical Center, USA liwenjun2k at yahoo.com Yiming Li, National Chiao Tung University, Taiwan ymli at mail.nctu.edu.tw Robert L. 
Martino, National Institutes of Health, USA Robert.Martino at nih.gov Maria Mirto University of Lecce, Italy maria.mirto at unile.it Michael Mascagni, Florida State University, USA mascagni at fsu.edu Martin Middendorf, University of Leipzig, Germany middendorf at informatik.uni-leipzig.de Giri Narasimhan, Florida International University, USA giri at cs.fiu.edu Jun Ni, University of Iowa, USA jun-ni at uiowa.edu Sergei Petoukhov, Russian Academy of Sciences, Russia petoukhov at hotmail.com Pascal Poulet, French West Indies UNiversity, France Pascal.Poullet at univ-ag.fr Youxing Qu, Univresity of Georgia, USA youxing at csbl.bmb.uga.edu Nagiza Samatova, Oak Ridge National Lab, USA samatovan at ornl.gov Bertil Schmidt, Nanyang Technological University, Singapore ASBSchmidt at ntu.edu.sg Tony Solomonides, University of the West of England, UK Tony.Solomonides at uwe.ac.uk El-Ghazali Talbi, LIFL, France talbi at lifl.fr Daming Wei, University of Aizu, Japan dm-wei at u-aizu.ac.jp Tiffani Williams, University of New Mexico, USA tlw at cs.unm.edu C. M. Yang, Nankai University, China yangchm at nankai.edu.cn Yanqing Zhang, Georgia State University, USA yzhang at cs.gsu.edu Bingbing Zhou, University of Sydney, Australia bbz at it.usyd.edu.au Information related to ICPADS 2005 is available at the official ICPADS 2005 Web site: http://www.takilab.k.dendai.ac.jp/conf/icpads/2005/ ICPADS 2005 is Co-sponsored by IEEE Computer Society TCDP and TCPP, and Fukuoka Institute of Technology (FIT), in cooperation with Fukuoka City, IPSJ (Information Processing Society of Japan) SIGDPS, IEEE Taipei Section, IEEE HonKong Section, SCAT, and AOARD/AOR. 
-- ============================================================ Maria Mirto PhD student, Center for Advanced Computational Technologies via per Monteroni, 73100 Lecce (Le), ITALY ph: +39 0832 297304 fax: +39 0832 297279 ============================================================ From yogeshshetty2000 at yahoo.com Sun Feb 20 00:28:26 2005 From: yogeshshetty2000 at yahoo.com (yogesh shetty) Date: Sat, 19 Feb 2005 21:28:26 -0800 (PST) Subject: [BiO BB] help Message-ID: <20050220052826.66428.qmail@web61102.mail.yahoo.com> hello can any body help me. I need a source code written in java for a clustering tool. thanking you --------------------------------- Do you Yahoo!? Yahoo! Search presents - Jib Jab's 'Second Term' -------------- next part -------------- An HTML attachment was scrubbed... URL: From Yannick.Wurm at unil.ch Mon Feb 21 08:33:38 2005 From: Yannick.Wurm at unil.ch (Yannick Wurm) Date: Mon, 21 Feb 2005 14:33:38 +0100 Subject: [BiO BB] Refined nucleotide BLAST matrix In-Reply-To: <42162890.2040205@ebi.ac.uk> References: <9ADB6198-81B7-11D9-A931-000D93712582@poulet.org> <42162059.5070503@amelang.net> <42162890.2040205@ebi.ac.uk> Message-ID: Thank you Peter - what a surprising hack! Thats the way we'll go. Thanks again, -yannick On 18 f?vr. 05, at 18:40, Peter Rice wrote: >> Yannick Wurm wrote: >>> for a specific need in my lab, we are looking for an implementation >>> of nucleotide sequence alignment program which would be more >>> flexible than standard BLAST. >>> >>> To help identify these sequences, we need to be able to fine-tune >>> the matrix used for scoring. Thus, for example when calculating the >>> "score" of an aligment, C->A and C->T could be given different >>> weights. >>> >>> To my surprise, BLAST does not have this liberty, despite the fact >>> that different scoring matrices are used for proteins. I couldn't >>> find anything on Google either. 
>
> You can get around this - you also need this trick to handle
> nucleotide ambiguity codes (for example to compare patent sequences
> which use codes other than 'N'.
>
> You have to cheat though.
>
> 1. Build your blast database as protein
>
> 2. Give your matrix a name that matches one of the blast protein
> matrix names (!)
>
> 3. Put in the matrix values you want
>
> 4. Remember that you are now using blastp (protein search) so you can
> only use a short wordsize - I am guessing you have short sequences
> anyway so this should not be a problem
>
> 5. Remember that BLAST does local alignment.
>
> 6. Remember that your scores will be making some wrong assumptions
> about using proteins. You should still find the hits you are looking
> for.
>
> This cheat was published (by NCBI if I recall correctly) some time
> back. Sorry, I can't track down the reference.
>
> Hope this helps,
>
> Peter Rice

. . . . . . . . . . . . . . . . . .
yannick.wurm at unil.ch
+41.21.692.4157
PhD student, Department of Ecology and Evolution
Université
de Lausanne, Switzerland From idoerg at burnham.org Mon Feb 21 12:55:07 2005 From: idoerg at burnham.org (Iddo Friedberg) Date: Mon, 21 Feb 2005 09:55:07 -0800 Subject: [BiO BB] help In-Reply-To: <20050220052826.66428.qmail@web61102.mail.yahoo.com> References: <20050220052826.66428.qmail@web61102.mail.yahoo.com> Message-ID: <421A207B.6030001@burnham.org> Weka is a machine learning tool in Java which I believe has some clustering algorithm as well: http://www.cs.waikato.ac.nz/ml/weka/ ./I yogesh shetty wrote: > hello > can any body help me. I need a source code written in java for a > clustering tool. > > thanking you > > ------------------------------------------------------------------------ > Do you Yahoo!? > Yahoo! Search presents - Jib Jab's 'Second Term' > > > >------------------------------------------------------------------------ > >_______________________________________________ >Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org >https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > -- Iddo Friedberg, Ph.D. The Burnham Institute 10901 N. Torrey Pines Rd. La Jolla, CA 92037 USA Tel: +1 (858) 646 3100 x3516 Fax: +1 (858) 713 9930 http://ffas.ljcrf.edu/~iddo ========================== The First Automated Protein Function Prediction SIG Detroit, MI June 24, 2005 http://ffas.burnham.org/AFP From sourangshu at csa.iisc.ernet.in Mon Feb 21 12:48:40 2005 From: sourangshu at csa.iisc.ernet.in (Sourangshu Bhattacharya) Date: Mon, 21 Feb 2005 23:18:40 +0530 (IST) Subject: [BiO BB] help In-Reply-To: <20050220052826.66428.qmail@web61102.mail.yahoo.com> References: <20050220052826.66428.qmail@web61102.mail.yahoo.com> Message-ID: Hi, Check out : www.cs.waikato.ac.nz/ml/weka/ Sourangshu Sourangshu Bhattacharya PhD Student, Dept. of Computer Science & Automation, IISc, Bangalore. http://people.csa.iisc.ernet.in/sourangshu On Sat, 19 Feb 2005, yogesh shetty wrote: > hello > can any body help me. 
I need a source code written in java for a clustering tool. > > thanking you > > > --------------------------------- > Do you Yahoo!? > Yahoo! Search presents - Jib Jab's 'Second Term' From milimetr at webmail.cmdik.pan.pl Tue Feb 22 04:01:10 2005 From: milimetr at webmail.cmdik.pan.pl (MACIEJ PIETRZAK) Date: Tue, 22 Feb 2005 10:01:10 +0100 Subject: [BiO BB] Re: BiO_Bulletin_Board Digest, Vol 4, Issue 8 In-Reply-To: <20050221173421.50974D1FE8@www.bioinformatics.org> References: <20050221173421.50974D1FE8@www.bioinformatics.org> Message-ID: <20050222085349.M13674@webmail.cmdik.pan.pl> Hi, U can also try TESS: http://www.cbil.upenn.edu/tess/ with your own strings or matrix Regards Maciej Pietrzak dep. of Endocrinology MRC PAS WARSAW, POLAND milimetrcmdik.pan.pl ---------- Original Message ----------- From: bio_bulletin_board-request at bioinformatics.org To: bio_bulletin_board at bioinformatics.org Sent: Mon, 21 Feb 2005 12:34:21 -0500 (EST) Subject: BiO_Bulletin_Board Digest, Vol 4, Issue 8 > ------------------------------ > > Message: 3 > Date: Fri, 18 Feb 2005 15:13:20 +0100 > From: Yannick Wurm > Subject: [BiO BB] Refined Blast? > To: bio_bulletin_board at bioinformatics.org > Message-ID: <3E591300-81B7-11D9-A931-000D93712582 at unil.ch> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > Hi, > for a specific need in my lab, we are looking for an implementation of > nucleotide sequence alignment program which would be more flexible than > standard BLAST. > The reason is that we have sequenced dna fragments which have been > submitted to chemical modifications which differentially affects > different nucleotides. > > To help identify these sequences, we need to be able to fine-tune the > matrix used for scoring. Thus, for example when calculating the "score" > of an aligment, C->A and C->T could be given different weights. 
>
> According to ebi.ac.uk, the WU-blast matrix is:
>
>      A    T    G    C
> A    5
> T   -4    5
> G   -4   -4    5
> C   -4   -4   -4    5
>
> We want to be able to specify individual values, as in the
> following example:
>
>      A    T    G    C
> A    2
> T   -4    8
> G   -8  -10    3
> C   -2   -1   -3   10
>
> To my surprise, BLAST does not have this liberty, despite the fact that
> different scoring matrices are used for proteins. I couldn't find
> anything on Google either.
>
> Would anyone on the list have a clue? Or do I need to get dirty
> messing with BLAST's source?
> Thanks in advance,
>
> Yannick
>
> . . . . . . . . . . . . . . . . . .
> yannick.wurm at unil.ch
> +41.21.692.4157
> PhD student, Department of Ecology and Evolution
> Université de Lausanne, Switzerland
>

From idh at poulet.org Wed Feb 23 03:25:18 2005
From: idh at poulet.org (Yannick Wurm)
Date: Wed, 23 Feb 2005 09:25:18 +0100
Subject: [BiO BB] Fwd: [blast-help] Refined nucleotide BLAST matrix
Message-ID: <74ba7e5b39c948f2fc204aca2c7a402c@poulet.org>

And so this is the reference Peter mentioned, as kindly indicated by Wayne Matten at NCBI.

@article{States1991Improved-Sensit,
    Abstract = {Scoring matrices for nucleic acid sequence comparison that are based on models appropriate to the analysis of molecular sequencing errors or biological mutation processes are presented. In mammalian genomes, transition mutations occur significantly more frequently than transversions, and the optimal scoring of sequence alignments based on this substitution model differs from that derived assuming a uniform mutation model. The information from sequence alignments potentially available using an optimal scoring system is compared with that obtained using the BLASTN default scoring. A modified BLAST database search tool allows these, or other explicitly specified scoring matrices, to be utilized in computationally efficient queries of nucleic acid databases with nucleic acid query sequences.
Results of searches performed using BLASTN's default score matrix are compared with those using scores based on a mutational model in which transitions are more prevalent than transversions.},
    Author = {David J. States and Warren Gish and Stephen F. Altschul},
    Date-Added = {2005-02-23 09:14:28 +0100},
    Date-Modified = {2005-02-23 09:15:41 +0100},
    Journal = {METHODS: A Companion to Methods in Enzymology},
    Url = {http://blast.wustl.edu/doc/ntmats.pdf},
    Month = {August},
    Number = {1},
    Pages = {66-70},
    Title = {Improved Sensitivity of Nucleic Acid Database Searches Using Application-Specific Scoring Matrices},
    Volume = {3},
    Year = {1991}}

Thanks again!
-yannick

Begin forwarded message:

> From: "Matten, Wayne (NIH/NLM)"
> Date: 22 février 2005 21:23:29 GMT+01:00
> To: 'Yannick Wurm',
> "'blast-help at ncbi.nlm.nih.gov'"
> Subject: RE: [blast-help] Refined nucleotide BLAST matrix
>
> Hello,
>
> I believe the reference that Peter mentions is this one:
>
> http://blast.wustl.edu/doc/ntmats.pdf
>
> Peter summed up the "hack" very well. You might need other command-line
> options; turning off the low complexity filter comes to mind. But you
> can get blastp, within blastall, to run as long as you format the
> database as a protein database and use a matrix name already in the
> /data directory. You might also get some ideas from here:
>
> ftp://ftp.ncbi.nlm.nih.gov/blast/matrices/
>
> e.g., NUC4.4.
>
> Best regards,
> Wayne
>
> <><><><>>><>>>>><><>>><>
> Wayne Matten
> NCBI User Services

. . . . . . . . . . . . . . . . . .
yannick.wurm at unil.ch
+41.21.692.4157
PhD student, Department of Ecology and Evolution
Université
de Lausanne, Switzerland

From hjm at tacgi.com Fri Feb 25 16:23:40 2005
From: hjm at tacgi.com (Harry Mangalam)
Date: Fri, 25 Feb 2005 13:23:40 -0800
Subject: [BiO BB] tacg ver 4.1 now at sourceforge
In-Reply-To: <20050220052826.66428.qmail@web61102.mail.yahoo.com>
References: <20050220052826.66428.qmail@web61102.mail.yahoo.com>
Message-ID: <200502251323.40295.hjm@tacgi.com>

People have been bugging me to release recent work on tacg and while I've been happy to oblige on a case-by-case basis, I haven't updated the SourceForge release for a (very long) while. Now I have.

tacg is a GPL'ed command-line program for *nix that performs many of the common routines in pattern matching in biological strings. It was originally designed for restriction enzyme analysis and while that still forms a core of the program, it has been expanded to fill more roles, sort of a 'grep' for DNA. Like grep, it's relatively small, fast (5-30x faster than comparable GCG apps), memory efficient (>10x more efficient than the EMBOSS apps), and has tons of options, but has relatively crude text output (except for the PDF plasmid maps).

See http://tacg.sourceforge.net for details.

--
Cheers, Harry
Harry J Mangalam - 949 856 2847 (vox; email for fax) - hjm at tacgi.com
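
For readers new to this kind of analysis, the 'grep for DNA' idea in the announcement above can be sketched in a few lines of Python. This is not tacg's code or its algorithm, only a minimal illustration under stated assumptions: the three-enzyme table is a small hand-picked sample (a real tool reads a full enzyme database), and the names `ENZYMES`, `reverse_complement` and `find_sites` are hypothetical, chosen for this example.

```python
# Hypothetical illustration only: NOT tacg's implementation.
# A tiny hand-picked table of recognition sites (assumption for this sketch).
ENZYMES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}

# Translation table used to complement unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq, enzymes=ENZYMES):
    """Map each enzyme name to the sorted 0-based start positions of its
    recognition site, searching both strands of `seq`.

    Positions are reported on the given (top) strand. Overlapping
    occurrences are found because the search restarts one base after
    each hit.
    """
    seq = seq.upper()
    hits = {}
    for name, site in enzymes.items():
        # A palindromic site equals its own reverse complement, so the set
        # collapses to one pattern and nothing is counted twice.
        patterns = {site, reverse_complement(site)}
        positions = set()
        for pattern in patterns:
            start = seq.find(pattern)
            while start != -1:
                positions.add(start)
                start = seq.find(pattern, start + 1)
        hits[name] = sorted(positions)
    return hits


if __name__ == "__main__":
    for enzyme, positions in find_sites("ttGAATTCaaGGATCCccGAATTC").items():
        print(enzyme, positions)
```

Plain substring search is enough here because the sample sites contain no ambiguity codes; sites with degenerate bases (N, R, Y and so on) would need a pattern translation step that this sketch leaves out.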