[Biodevelopers] NCBI XML

Alex Milowski alex at milowski.com
Thu Jan 16 00:11:40 EST 2003


On Wednesday, January 15, 2003, at 08:56 PM, Joe Landman wrote:

> Hi Alex:
>
>   I just googled about for it.  Here is the NCBI page, they say to get
> the toolbox, and read some of the data within it.
>
> http://www.ncbi.nlm.nih.gov/Sitemap/Summary/asn1.html
>

Thanks.

Sheesh... this stuff is really verbose.  For example, the following
encodes a single date instance *every* time you use a date:


     <Date>
       <Date_std>
         <Date-std>
           <Date-std_year>2002</Date-std_year>
           <Date-std_month>8</Date-std_month>
           <Date-std_day>14</Date-std_day>
         </Date-std>
       </Date_std>
     </Date>


Why not:

    date="2002/8/14"

    or

    <date>2002/8/14</date>

    or

    <date year="2002" month="8" day="14"/>

?
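
As a rough illustration of the third alternative, here is a minimal Python
sketch that collapses the nested date element shown above into the
attribute form. It uses only the standard library's xml.etree.ElementTree.
The function name compact_date and the sample string are hypothetical,
made up for this example, and it assumes every date carries a year with
month and day optional, which may not hold for every record in the NCBI
data.

```python
import xml.etree.ElementTree as ET

# Sample input copied from the verbose encoding quoted above.
VERBOSE = """<Date>
  <Date_std>
    <Date-std>
      <Date-std_year>2002</Date-std_year>
      <Date-std_month>8</Date-std_month>
      <Date-std_day>14</Date-std_day>
    </Date-std>
  </Date_std>
</Date>"""


def compact_date(date_elem):
    """Collapse a verbose <Date> element into <date year= month= day=/>.

    Hypothetical helper: assumes the year is always present and that
    month and day may be missing.
    """
    def field(name):
        # Look up the first descendant named Date-std_<name>, if any.
        node = next(date_elem.iter("Date-std_" + name), None)
        if node is None or node.text is None:
            return None
        return node.text.strip()

    year = field("year")
    if year is None:
        raise ValueError("no Date-std_year found in element")

    compact = ET.Element("date", {"year": year})
    for name in ("month", "day"):
        value = field(name)
        if value is not None:
            compact.set(name, value)
    return compact


if __name__ == "__main__":
    result = compact_date(ET.fromstring(VERBOSE))
    print(ET.tostring(result, encoding="unicode"))
```

The nine-line element becomes a single empty element with three
attributes, which is where most of the size saving would come from.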

Then this XML wouldn't be over 3GB in size!  Actually, 8GB... but I
stripped the ignorable whitespace (5GB of pretty printing...)
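
For reference, stripping pretty-printing whitespace can be sketched in a
few lines of Python with the standard library. This is only an
illustration under stated assumptions, not the tool used for the numbers
above: the function name strip_ignorable_whitespace is hypothetical, and
without a DTD it guesses that whitespace-only text between elements is
ignorable, which is wrong for mixed content. It also loads the whole tree
into memory, so a multi-gigabyte file would need a streaming parser
instead.

```python
import xml.etree.ElementTree as ET

# Small pretty-printed sample in the style of the data discussed above.
PRETTY = (
    "<Date>\n"
    "  <Date_std>\n"
    "    <Date-std_year>2002</Date-std_year>\n"
    "  </Date_std>\n"
    "</Date>"
)


def strip_ignorable_whitespace(root):
    """Remove whitespace-only text that sits between elements.

    Hypothetical helper: assumes no mixed content, so whitespace-only
    text in an element that has children, and whitespace-only tails,
    are treated as pretty-printing.
    """
    for node in root.iter():
        # Leading whitespace inside an element that contains child elements.
        if len(node) > 0 and node.text is not None and not node.text.strip():
            node.text = None
        # Whitespace following an element's end tag.
        if node.tail is not None and not node.tail.strip():
            node.tail = None
    return root


if __name__ == "__main__":
    tree = ET.fromstring(PRETTY)
    before = len(ET.tostring(tree, encoding="unicode"))
    strip_ignorable_whitespace(tree)
    after = len(ET.tostring(tree, encoding="unicode"))
    print(before, "characters before,", after, "after")
```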

People can't possibly use this data in XML format...

Alex Milowski                FAX: (707) 598-7649
alex at milowski.com

"The excellence of grammar as a guide is proportional to the paucity of
the inflexions, i.e. to the degree of analysis effected by the language
considered."

Bertrand Russell in a footnote of Principles of Mathematics




