[Biodevelopers] Urbigene Package

Pierre LINDENBAUM plindenbaum at yahoo.fr
Mon May 26 04:56:38 EDT 2003


hello all,

The Urbigene Package contains modest C++ tools for
molecular biology that I wrote at the INTEGRAGEN company.
A subset of those tools is of no commercial interest,
so I have been allowed to release it to the scientific
community as an open source package under the GNU
General Public License (GPL). You'll find sources for
parsing BLAST results in XML format, the new versions
of the CloneIt program, and filters for FASTA sequences
and for PRIMER3 output, among others.

(There are also programs that are not dedicated to
biology but may be of general interest. For example
PIVOT creates cross tables from delimited files,
GeneticProg tries to find an equation that fits
experimental values, etc...)

The package is available at:

                  http://www.urbigene.com


Usage Example
Consider the following script:


#This script takes as input chromosome 22 from the goldenpath.
#It then digests the whole chromosome with NotI,
#crops the boundaries by 6 bases,
#keeps fragments between 100 bases and 100 kb,
#keeps fragments containing a CA repeat,
#keeps fragments where %GC is between 40 and 60%,
#keeps just the first 10 sequences,
#converts the sequences into input for primer3,
#launches primer3,
#converts the amplified fragments to FASTA,
#blasts those fragments against the whole goldenpath,
#retains BLAST HSPs whose score is lower than 10 or greater than 50
#(this step is commented out below),
#converts the output to text,
#transforms this text to XML,
#keeps the first 50 lines.
#

BIN=./bin/
${BIN}/fastaretrieve -chr 22 -entry /env/ig/pubdb/mirror/golden_path/14nov2002/chromosomes/entry_points.csv |
${BIN}/fastadigest -e NotI |
${BIN}/fastacrop -5 6 -3 6 |
${BIN}/fastasize -m 100 -M 100000 |
${BIN}/fastaslice -e 5000 -n 10000 |
${BIN}/fastafind -s CACACACACACA -print T |
${BIN}/fastagc -min 40 -max 60 -sort T |
${BIN}/fastahead |
${BIN}/fasta2primer3 -max-stgy 1 -gc-min 20 -gc-max 80 -max-size 2000 |
primer3 |
${BIN}/primer3tofasta |
blastall -e 10 -p blastn -d /env/ig/pubdb/blastdb/GP10apr2003/gp10apr2003 -m 7 |
#${BIN}/blastlisp -e 'or(lt(hsp.score(),10),gt(hsp.score(),50))' |
${BIN}/blast2txt |
${BIN}/text2xml |
head -n 50 > demo.txt
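
For readers who want a feel for what one of these filters does before
downloading the package, here is a rough standalone sketch of a %GC
filter written with awk. It is not the fastagc source: the function
name gc_filter, its two positional arguments and the "|gc(...)" suffix
appended to the header (modelled on the output shown in this message)
are all assumptions made for illustration.

```shell
# Hypothetical helper, NOT part of the Urbigene package.
# Reads FASTA on stdin and keeps records whose GC percentage lies
# within [min, max]. Each kept sequence is printed on a single line.
gc_filter() {
    awk -v min="$1" -v max="$2" '
        # Print the pending record if its %GC is in range.
        function emit(   s, gc, pct) {
            if (header == "") return
            s = toupper(seq)
            gc = gsub(/[GC]/, "", s)      # gsub returns the number of G/C removed
            pct = (length(seq) > 0) ? 100 * gc / length(seq) : 0
            if (pct >= min && pct <= max)
                printf "%s|gc(%.2f%%)\n%s\n", header, pct, seq
        }
        /^>/ { emit(); header = $0; seq = ""; next }   # new record starts
             { seq = seq $0 }                          # accumulate sequence lines
        END  { emit() }                                # flush the last record
    '
}
```

It could be used as "gc_filter 40 60 < fragments.fa", where
fragments.fa is a placeholder file name.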


The result will be:



//iteration
######################################################################
22:0-47748584(+)|restriction_fragment[NotI(35516844)-NotI(35580318):63482]|crop_5(6)crop_3(6)|size_filter(100-100000)|slice(50000-59999)|gc(41.53%)|pcr_0(34-884:851 pb) primer_left(TTCCAAAGTGCTGGGATTATAG) primer_right(TCTGGGATTTTCCAGAGGTATAG)	len:851

####################################################################>	build33|chr22|slice(37101000-37250999)	len:150000	Object:94394-95244	Query:1-851	830	0
                                                        .....>       	build33|chr22|slice(37101000-37250999)	len:150000	Object:47208-47286	Query:682-761	32	1.48067e-07
 ..>                                                                 	build33|chr22|slice(37101000-37250999)	len:150000	Object:138640-138677	Query:3-40	30	2.31183e-06
 <..                                                                 	build33|chr22|slice(37101000-37250999)	len:150000	Object:88694-88734	Query:43-3	29	9.13492e-06
 <..                                                                 	build33|chr22|slice(37101000-37250999)	len:150000	Object:113943-113983	Query:43-3	29	9.13492e-06
 <.                                                                  	build33|chr22|slice(37101000-37250999)	len:150000	Object:16320-16349	Query:32-3	26	0.000563575
 <..                                                                 	build33|chr22|slice(37101000-37250999)	len:150000	Object:76801-76838	Query:40-3	26	0.000563575
 <..                                                                 	build33|chr22|slice(37101000-37250999)	len:150000	Object:104040-104080	Query:43-3	25	0.0022269
 <.                                                                  	build33|chr22|slice(37101000-37250999)	len:150000	Object:132737-132766	Query:32-3	22	0.137388
 <.                                                                  	build33|chr22|slice(37101000-37250999)	len:150000	Object:71167-71196	Query:32-3	22	0.137388
 <.                                                                  	build33|chr22|slice(37101000-37250999)	len:150000	Object:142101-142130	Query:32-3	22	0.137388
 <.                                                                  	build33|chr22|slice(37101000-37250999)	len:150000	Object:127879-127904	Query:28-3	22	0.137388
 .>                                                                  	build33|chr22|slice(37101000-37250999)	len:150000	Object:82934-82963	Query:3-32	22	0.137388
 <.                                                                  	build33|chr22|slice(37101000-37250999)	len:150000	Object:53573-53602	Query:32-3	22	0.137388
 <.                                                                  	build33|chr22|slice(37101000-37250999)	len:150000	Object:62190-62219	Query:32-3	22	0.137388
 .>                                                                  	build33|chr22|slice(37101000-37250999)	len:150000	Object:33547-33576	Query:3-32	22	0.137388
 ..>                                                                 	build33|chr22|slice(37101000-37250999)	len:150000	Object:94134-94171	Query:3-40	22	0.137388
 <..                                                                 	build33|chr22|slice(37101000-37250999)	len:150000	Object:95472-95509	Query:40-3	22	0.137388
 .>                                                                  	build33|chr22|slice(37101000-37250999)	len:150000	Object:121477-121506	Query:3-32	22	0.137388
(...
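
The blastlisp step in the script is commented out, which is why hits
with scores between 10 and 50 still appear in this listing. As a rough
illustration of the predicate or(lt(hsp.score(),10),gt(hsp.score(),50)),
the sketch below applies the same test to text output with awk. It
assumes one hit per line, tab-separated, with the score and the e-value
as the last two fields; the function name score_filter is hypothetical
and this is not how blastlisp itself is implemented.

```shell
# Hypothetical helper, NOT part of the Urbigene package.
# Keeps hit lines whose score is lower than $1 or greater than $2.
# Assumes tab-separated lines ending in: <score> TAB <e-value>.
score_filter() {
    awk -F '\t' -v lo="$1" -v hi="$2" '
        # Only consider lines whose second-to-last field is an integer score.
        NF >= 3 && $(NF-1) ~ /^[0-9]+$/ {
            score = $(NF-1) + 0
            if (score < lo || score > hi) print
        }
    '
}
```

For example, "score_filter 10 50 < hits.txt" (hits.txt being a
placeholder name) would keep the hit scoring 830 above and drop the
ones scoring 22 to 32.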


Enjoy
Pierre Lindenbaum PhD




