The Database Database
The MetaBase will be an on-line database of biological databases (a database database).
- As a basic requirement, the MetaBase will contain a list of the most commonly used biological databases available on the WWW, together with their URLs and short descriptions.
- The MetaBase should be flexible, allowing many users to contribute, update and maintain the data.
- The MetaBase should be extensible, allowing, for example, scientific papers to be linked to the databases through a variety of relationships.
- The MetaBase should be 'classifiable'. That is to say that it should not contain just one fixed database classification scheme, but many. Potentially any user could create his or her own classification of databases according to any criteria, and those classifications should be available to everyone.
- The MetaBase and its user management software should be built using PHP and MySQL for flexibility, allowing a continuous growth of the range and scope of databases and database information.
- Potentially the MetaBase could be used as a source of 'genuine' meta-data for integration projects.
- The MetaBase should link into the core requirements of the BioNeeds Group, providing an up-to-date source of biological database information, as well as tools for gathering database usage information from the biological community.
- The MetaBase should be published in a suitable open access journal.
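The 'classifiable' requirement above implies a data model in which classification schemes are data rather than fixed columns. The following is only a minimal sketch of that idea. The requirements name PHP and MySQL; Python with the standard sqlite3 module is used here purely so the example is self-contained. All table names, column names and sample rows are hypothetical and not part of any agreed MetaBase schema.

```python
import sqlite3

# Hypothetical schema: every classification scheme belongs to a user, and a
# database can be placed in categories from any number of schemes.
SCHEMA = """
CREATE TABLE users (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE bio_databases (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE,
    url         TEXT NOT NULL,
    description TEXT NOT NULL
);
CREATE TABLE schemes (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    owner_id INTEGER NOT NULL REFERENCES users(id)
);
CREATE TABLE categories (
    id        INTEGER PRIMARY KEY,
    scheme_id INTEGER NOT NULL REFERENCES schemes(id),
    label     TEXT NOT NULL
);
CREATE TABLE assignments (
    database_id INTEGER NOT NULL REFERENCES bio_databases(id),
    category_id INTEGER NOT NULL REFERENCES categories(id),
    PRIMARY KEY (database_id, category_id)
);
"""


def build_demo() -> sqlite3.Connection:
    """Create an in-memory database holding one made-up entry and two schemes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)",
                     [(1, "alice"), (2, "bob")])
    conn.execute(
        "INSERT INTO bio_databases (id, name, url, description) VALUES (?, ?, ?, ?)",
        (1, "ExampleDB", "http://example.org/db", "A made-up entry"))
    # Two users each contribute their own scheme.
    conn.executemany("INSERT INTO schemes (id, name, owner_id) VALUES (?, ?, ?)",
                     [(1, "by data origin", 1), (2, "by organism", 2)])
    conn.executemany("INSERT INTO categories (id, scheme_id, label) VALUES (?, ?, ?)",
                     [(1, 1, "Primary database"), (2, 2, "Bacteria")])
    # The same database is classified once under each scheme.
    conn.executemany("INSERT INTO assignments (database_id, category_id) VALUES (?, ?)",
                     [(1, 1), (1, 2)])
    return conn


def categories_for(conn: sqlite3.Connection, database_name: str):
    """Return (scheme name, category label) pairs for one database."""
    return conn.execute(
        """SELECT s.name, c.label
           FROM assignments a
           JOIN bio_databases d ON d.id = a.database_id
           JOIN categories c ON c.id = a.category_id
           JOIN schemes s ON s.id = c.scheme_id
           WHERE d.name = ?
           ORDER BY s.name""",
        (database_name,)).fetchall()
```

Because a category always points at a scheme and a scheme at its owner, adding a new user-defined classification means inserting rows, not altering tables.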
A classification scheme
With regard to database classification (and the possibility of multiple classification schemes in the MetaBase), I would like to outline one high-level biological database classification scheme here. Please feel free to add your own proposed scheme to this site (until we have a better system for allowing the collaborative development and retrieval of classification systems).
One biological database classification scheme
- Primary database
- A database which is composed of the results of basic scientific experiments. Like a primary witness, it is a basic (first-hand) source of data.
- Secondary database
- A database including computationally derived information from the primary data. These databases apply processing in the form of various algorithms to produce 'secondary' data from the primary data. A secondary database may link several primary databases using hyperlinks, but no serious integration effort is involved.
- Ternary database
- An integrated database which combines primary and/or secondary databases into a derived 'classification' database.
- Middleware
- The technology for producing a ternary database should not be confused with the database itself. This is confusing because many middleware technologies develop a ternary database to show off the technology 'in action', and it is hard to distinguish the two. One example of this is the ECOCYC database.
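One way to read the scheme above is as a small decision rule. The sketch below is an interpretation, not a definitive encoding: the trait names and the order of precedence (integration first, then derived data, then experimental results) are assumptions made for illustration. Middleware is deliberately left out of the enumeration, since the scheme treats it as a technology rather than a kind of database.

```python
from dataclasses import dataclass
from enum import Enum


class DatabaseKind(Enum):
    PRIMARY = "Primary database"
    SECONDARY = "Secondary database"
    TERNARY = "Ternary database"


@dataclass(frozen=True)
class DatabaseTraits:
    """Hypothetical yes/no traits a curator might record for a database."""
    holds_experimental_results: bool
    holds_derived_data: bool
    integrates_other_databases: bool


def classify(traits: DatabaseTraits) -> DatabaseKind:
    """Apply the scheme, checking the most derived kind first (an assumption)."""
    if traits.integrates_other_databases:
        return DatabaseKind.TERNARY
    if traits.holds_derived_data:
        return DatabaseKind.SECONDARY
    if traits.holds_experimental_results:
        return DatabaseKind.PRIMARY
    raise ValueError("traits do not match any category in this scheme")
```

A database that merely hyperlinks to primary sources would still be classified as secondary here, because linking alone does not set `integrates_other_databases`.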
To get going we need a good MSc student, a database schema for basic database and user information and a login / data entry and editing system. Ideally the design will be modular, allowing different visualization, searching and browsing 'modules' to plug into the overall design.
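The modular design mentioned above could, for instance, take the form of a simple registry that searching and browsing 'modules' plug into. This is a rough sketch only, written in Python for brevity even though the project is planned in PHP. The registry, the module names and the record fields are all hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
# Assumed module interface: take the records and a query string, return records.
Module = Callable[[List[Record], str], List[Record]]

MODULES: Dict[str, Module] = {}


def register(name: str) -> Callable[[Module], Module]:
    """Decorator that plugs a module into the shared registry under a name."""
    def decorator(func: Module) -> Module:
        MODULES[name] = func
        return func
    return decorator


@register("search")
def search(records: List[Record], query: str) -> List[Record]:
    """Case-insensitive substring match on name and description."""
    q = query.lower()
    return [r for r in records
            if q in r["name"].lower() or q in r["description"].lower()]


@register("browse")
def browse(records: List[Record], query: str) -> List[Record]:
    """List records whose name starts with the given prefix, sorted by name."""
    prefix = query.lower()
    matches = [r for r in records if r["name"].lower().startswith(prefix)]
    return sorted(matches, key=lambda r: r["name"])


def run_module(name: str, records: List[Record], query: str) -> List[Record]:
    """Dispatch to a registered module; raises KeyError for unknown names."""
    return MODULES[name](records, query)
```

A new visualization module would only need to follow the same interface and register itself; the core code would not change.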
This section should not be static! It is very early on in the development process, so please let us know what we have missed.
This project is hosted at Bioinformatics.Org, and it is thanks to the open policy of BiO that projects like this can exist. The MetaBase is a project of the BioNeeds Group.
Comments and suggestions
If you have something to add, please do!