[ViewVC] Log of: owl/trunk/proteinstructure/CiffilePdb.java

(Current path doesn't exist after revision 950)

Revision 356
Modified Sat Oct 13 15:55:41 2007 UTC (17 years, 4 months ago) by duarte
File length: 28335 byte(s)
Diff to previous 355

Fixed bug: tokenisation failed when the first field in a line was quoted

Revision 355
Modified Fri Oct 12 18:42:37 2007 UTC (17 years, 4 months ago) by duarte
File length: 28315 byte(s)
Diff to previous 336

FIXED BUG: now doesn't fail with records that are delimited with \n; ;\n
Method tokeniseFields is now completely rewritten: it handles all the parsing of the oddities of the mmCIF format
Using RandomAccessFile to open the file only once and then seek to the positions we need to scan at each point. This might be slower because RandomAccessFile does no buffering, and possibly also because the new tokenisation is not yet optimised
parseCifFile now does the whole parsing, including the calls to the submethods, which were previously called from the constructor
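
The "open once, then seek" approach described in this revision can be sketched roughly as follows. This is a minimal illustration, not the code from CiffilePdb.java: the class SeekSketch and the methods indexCategories and firstLineOf are hypothetical names, and the sketch assumes each category name (such as _atom_site) can be recognised by a line starting with "_" and containing a ".".

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: index category positions in one pass, then seek to them.
class SeekSketch {

    // Scans the file once and records the byte offset of the first line
    // belonging to each category (the text before the '.' in "_category.field").
    static Map<String, Long> indexCategories(RandomAccessFile raf) throws IOException {
        Map<String, Long> offsets = new HashMap<>();
        raf.seek(0);
        long pos = raf.getFilePointer();
        String line;
        while ((line = raf.readLine()) != null) {
            if (line.startsWith("_")) {
                int dot = line.indexOf('.');
                if (dot > 0) {
                    offsets.putIfAbsent(line.substring(0, dot), pos);
                }
            }
            // Offset of the next line; readLine() is unbuffered, which is
            // the likely source of the slowdown mentioned in the log.
            pos = raf.getFilePointer();
        }
        return offsets;
    }

    // Jumps to a recorded offset and reads the line found there.
    static String firstLineOf(RandomAccessFile raf, long offset) throws IOException {
        raf.seek(offset);
        return raf.readLine();
    }
}
```

Because the offsets are collected up front, the categories can then be parsed in whatever order the parser needs, regardless of their order in the file.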

Revision 336
Modified Tue Oct 2 16:14:20 2007 UTC (17 years, 4 months ago) by stehr
File length: 27667 byte(s)
Diff to previous 326

extracted constant NULL_CHAIN_CODE from ...Pdb classes, added copy() methods to NodeSet and EdgeSet, added some functionality to NodesAndEdges, new class SimilarityGraph

Revision 326
Modified Thu Sep 20 14:49:55 2007 UTC (17 years, 5 months ago) by duarte
File length: 27654 byte(s)
Diff to previous 319

Removed class AA and replaced it with AAinfo, which reads contact types from a separate file, contactTypes.dat
New class ContactType which contains atoms for each contact type and residue. A static object for each contact type is loaded into AAinfo upon reading the contactTypes.dat file
Changed all references accordingly

Revision 319
Modified Mon Sep 17 16:10:32 2007 UTC (17 years, 5 months ago) by stehr
File length: 27885 byte(s)
Diff to previous 317

added constructors for loading from online pdb

Revision 317
Modified Thu Sep 13 16:09:10 2007 UTC (17 years, 5 months ago) by duarte
File length: 22872 byte(s)
Diff to previous 315

Fixed some comments

Revision 315
Modified Thu Sep 13 08:13:40 2007 UTC (17 years, 5 months ago) by duarte
File length: 22939 byte(s)
Diff to previous 314

Now parsing each element in different methods (re-opening the file). Parsing first pdbx_poly_seq_scheme so we get the chainCode that we can use for reading the rest
Now taking care of cases where struct_sheet_range is not a loop element
In tokeniseFields now also unquoting double-quoted strings
Tested on a set of 12000 entries

Revision 314
Modified Wed Sep 12 14:50:48 2007 UTC (17 years, 5 months ago) by duarte
File length: 21070 byte(s)
Diff to previous 311

Checking number of fields per line in loop elements and throwing exception if count is not correct
Doing tokenisation of lines through new function that takes care of possible quoted string with spaces
New exception CiffileFormatError
Checking 1st line of cif file has correct format: data_1xxx, if not throwing exception
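
The checks listed for this revision (quote-aware tokenisation, field counting and the data_ header test) could look roughly like the sketch below. It is an illustration under stated assumptions, not the repository's code: CifTokeniserSketch and its method names are hypothetical, IllegalArgumentException stands in for CiffileFormatError (whose constructor is not shown in the log), and the header pattern assumes a PDB-style code of one digit followed by three alphanumeric characters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of quote-aware tokenisation and the related format checks.
class CifTokeniserSketch {

    // Splits one data line into fields; a quoted string counts as a single field.
    static List<String> tokenise(String line) {
        List<String> fields = new ArrayList<>();
        int n = line.length();
        int i = 0;
        while (i < n) {
            char c = line.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '\'' || c == '"') {
                // The closing quote is the same character followed by
                // whitespace or end of line, so O5' inside "..." survives.
                int j = i + 1;
                while (j < n && !(line.charAt(j) == c
                        && (j + 1 == n || Character.isWhitespace(line.charAt(j + 1))))) {
                    j++;
                }
                if (j >= n) {
                    throw new IllegalArgumentException("Unterminated quote in: " + line);
                }
                fields.add(line.substring(i + 1, j)); // store without the quotes
                i = j + 1;
            } else {
                int j = i;
                while (j < n && !Character.isWhitespace(line.charAt(j))) {
                    j++;
                }
                fields.add(line.substring(i, j));
                i = j;
            }
        }
        return fields;
    }

    // Tokenises a loop line and rejects it if the field count is wrong.
    static List<String> tokeniseChecked(String line, int expectedFields) {
        List<String> fields = tokenise(line);
        if (fields.size() != expectedFields) {
            throw new IllegalArgumentException("Expected " + expectedFields
                    + " fields but found " + fields.size() + " in: " + line);
        }
        return fields;
    }

    // Assumed first-line format: "data_" plus a 4-character PDB-style code.
    static boolean isDataHeader(String firstLine) {
        return firstLine.matches("data_[0-9][A-Za-z0-9]{3}");
    }
}
```

The sketch handles the case of a quoted first field as well, since the quote branch does not depend on the position of the field in the line.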

Revision 311
Modified Thu Aug 30 17:31:38 2007 UTC (17 years, 5 months ago) by duarte
File length: 18680 byte(s)
Diff to previous 310

Fixed bug: struct_conf can sometimes be a non-loop element; that case is now handled as well

Revision 310
Modified Thu Aug 30 16:00:08 2007 UTC (17 years, 5 months ago) by duarte
File length: 17003 byte(s)
Diff to previous 309

Bug with '?' in auth_seq_num was not really fixed. Now should be fine: behaviour is the same as PdbasePdb

Revision 309
Modified Thu Aug 30 15:55:53 2007 UTC (17 years, 5 months ago) by duarte
File length: 16999 byte(s)
Diff to previous 308

Fixed bug: alt locs must be read in advance, in a separate scan of the file, because the order of the elements in the cif file is not guaranteed. Reading atom_site requires the alt locs, so atom_sites_alt has to be parsed first

Revision 308
Modified Thu Aug 30 14:54:38 2007 UTC (17 years, 5 months ago) by duarte
File length: 16090 byte(s)
Diff to previous 307

Fixed bugs:
- was reading HETATM lines as well as ATOM in atom_site
- auth_seq_num values of '?' are now skipped when populating the pdbresser2resser map (same behaviour as in PdbasePdb)
- now using chainCodeStr and auth_asym_id to identify chains in pdbx_poly_seq_scheme, struct_conf and struct_sheet_range. atom_site is not guaranteed to appear in file before all the others so we can't rely on having read a chainCode (asym_id) when parsing the other elements

Revision 307
Modified Thu Aug 30 10:41:55 2007 UTC (17 years, 5 months ago) by duarte
File length: 15526 byte(s)
Diff to previous 306

Now taking indices for fields from parsed field names. Still only minimal testing
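
Taking field indices from the parsed field names, rather than hard-coding column positions, might be sketched as below. The class FieldIndexSketch and the method indexFields are hypothetical, and the sketch assumes the header of a loop element is available as a list of lines, each holding one "_category.field" name.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: map each loop field name to its column index.
class FieldIndexSketch {

    // Field names are numbered in the order they appear in the loop header;
    // lines that are not field names (e.g. "loop_") are ignored.
    static Map<String, Integer> indexFields(List<String> headerLines) {
        Map<String, Integer> indices = new LinkedHashMap<>();
        for (String raw : headerLines) {
            String name = raw.trim();
            if (name.startsWith("_")) {
                indices.put(name, indices.size());
            }
        }
        return indices;
    }
}
```

A data line split into fields can then be read by name, for example fields[indices.get("_atom_site.id")], so the parser keeps working if a file lists the fields in a different order.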

Revision 306
Added Thu Aug 30 09:09:24 2007 UTC (17 years, 5 months ago) by duarte
File length: 12814 byte(s)

First implementation of mmCIF file parser. Tested minimally.
