On Tuesday 06 May 2003 06:47 pm, nicos@itsa.ucsf.edu wrote:
> Only had a quick glance, but the methods:
>   isFromFile()
>   setSource()
>   findNextLabel()
>   readtoLabel()
> might be needed (unchanged) in many other parsers. Should we have a
> parser class that can be extended by the specific parsers, or is that
> getting too ugly?

I was thinking about that - it might be a good idea overall. A lot of the
parsers will end up doing very similar things, so I imagine there'll be a
fair amount of code re-use that we can do. I don't know how much of the
code will necessarily be the same, but the methods could be included in an
"abstract class" (i.e. the functions empty but existing) if nothing else.

> I'll try to get started on the Genbank parsers soon. That will extend
> the array returned by the parsers dramatically, so we should have a
> careful look at that array once the Genbank parser is done. Also, I'll
> probably take the approach to have the findNextInfile in that parser read
> a whole record into an array, and do the actual parsing only in the array.

I think MOST of the fields in GenBank have a direct representative in
Serge's seq object, so for that at least I don't think there'll be too
much to do in seq_factory and so on.

> Sean, do you want to have a go at the clustalw parser too?

Heh...just committed it - I BELIEVE both clustalw and clustalx use the
same file structure (at least, as far as the way the parser can tell).
There should now be an updated parse.inc.php (with clustal auto-detect),
the filetype parser for clustal, and an updated test.php which also tests
clustal. Oh, and a lamin.aln file to test with...

Let me know if you spot any problems.
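
For what it's worth, here is a rough sketch of how the shared parser class discussed above might look. This is NOT the committed code. Only the four method names (isFromFile, setSource, findNextLabel, readtoLabel) come from this thread; the class names SeqParser and FastaLikeParser, the isLabel() and parseRecord() hooks, the method bodies and the sample data are all assumptions made up for illustration. It also uses the "read a whole record into an array, then parse the array" approach, and it relies on the abstract keyword, which needs PHP 5 or later; on older PHP the same idea works with empty method bodies in the base class.

```php
<?php
// Hypothetical sketch: only the four shared method names come from the thread.
// Everything else (class names, hooks, bodies) is assumed for illustration.

abstract class SeqParser
{
    protected $fromFile = false; // true if the source was given as a file path
    protected $lines = [];       // the source text, split into lines
    protected $pos = 0;          // index of the current line

    // Load the text to parse, either from a file path or from a raw string.
    public function setSource($source, $fromFile = false)
    {
        $this->fromFile = $fromFile;
        $text = $fromFile ? file_get_contents($source) : $source;
        $this->lines = preg_split('/\r\n|\r|\n/', (string) $text);
        $this->pos = 0;
    }

    public function isFromFile()
    {
        return $this->fromFile;
    }

    // Advance to the next line that starts a record.
    // Returns that line, or false when no further record exists.
    public function findNextLabel()
    {
        while ($this->pos < count($this->lines)) {
            if ($this->isLabel($this->lines[$this->pos])) {
                return $this->lines[$this->pos];
            }
            $this->pos++;
        }
        return false;
    }

    // Read the current record (label line included) into an array,
    // stopping just before the next label or at the end of the input.
    public function readtoLabel()
    {
        $record = [];
        if ($this->pos < count($this->lines)) {
            $record[] = $this->lines[$this->pos++];
        }
        while ($this->pos < count($this->lines)
            && !$this->isLabel($this->lines[$this->pos])) {
            $record[] = $this->lines[$this->pos++];
        }
        return $record;
    }

    // Format-specific parts that each concrete parser has to supply.
    abstract protected function isLabel($line);
    abstract public function parseRecord(array $record);
}

// Toy format for the example: records start with a ">" line.
class FastaLikeParser extends SeqParser
{
    protected function isLabel($line)
    {
        return strncmp($line, '>', 1) === 0;
    }

    public function parseRecord(array $record)
    {
        $id = trim(substr(array_shift($record), 1));
        $sequence = implode('', array_map('trim', $record));
        return ['id' => $id, 'sequence' => $sequence];
    }
}

// Usage: collect every record from an in-memory string.
$parser = new FastaLikeParser();
$parser->setSource(">seq1\nACGT\nTTGA\n>seq2\nGGCC\n");
$records = [];
while ($parser->findNextLabel() !== false) {
    $records[] = $parser->parseRecord($parser->readtoLabel());
}
```

With something like this, a format-specific parser would only have to say what a record label looks like and how to turn one record's lines into the result array; the file handling and record splitting would live in one place.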