[Pipet Devel] Data Storage Interfaces

Mon Jun 14 17:56:55 EDT 1999

Humberto Ortiz Zuazaga wrote:
> 
> Paos makes me nervous too. It looks complex, and I can't see what it buys us
> over CORBA. Orbit is already a standard part of gnome, and we may as well
> leverage as much as we can from other efforts.

I guess we do need to determine just how much an object broker/server will be
used and what services are needed.

The problem with Orbit is that it is in C, and there are no Python bindings to
it.  In fact there is no GPL or LGPL licensed CORBA implementation for Python. 
So, are our requirements worth the effort of making bindings to Orbit?

If we are talking about only Python->Python communication for loci, it does seem
like a bit of overkill to use CORBA.  On the other hand, CORBA would let Loci
use non-Python components, at the price of anarchy IMHO.

> > It's completely language
> > independent, as well as "junction" independent (each end has a standard
> > interface, regardless of whether a C, Python, or Perl script is on the
> > other end, or whether the two are communicating via CORBA, TCP/IP, UDP/
> > IP, shared memory, a pipe, a dynamically-loaded plug-in interface).
> 
> This sounds good, and can help make sure we don't overcommit to PAOS.  We just
> need a simple way of communicating between loci, "here's this data, please run
> foo v2 on it", "have your results, formatted for bar v1"

I'd be satisfied if we come up with some system that is (1) small, (2) simple
and (3) language independent.  Don't get me wrong; Loci's design should not be
dictated by a toolset that was not made with Loci in mind.  IOW, I am flexible. 
I came across PAOS and thought that it (or something like it) would be suitable
for Python->Python communication.  This may be more important for the GUI than
for TCP/IP-based communication; I don't know.
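
To make "small, simple and language independent" a bit more concrete, here is a
rough sketch of what the two messages quoted above might look like as plain
text on the wire.  Everything in it is an assumption for illustration only: the
JSON encoding, the field names, and the helper functions make_request,
make_reply and parse_message are hypothetical and are not part of PAOS, CORBA,
or any existing Loci code.

```python
import json

# Hypothetical message kinds; nothing here is an agreed Loci protocol.
KNOWN_TYPES = ("run", "result")


def make_request(data_uri, program, version):
    """Encode "here's this data, please run <program> v<version> on it"."""
    return json.dumps({
        "type": "run",
        "data": data_uri,        # where the input data lives
        "program": program,      # e.g. "foo"
        "version": version,      # e.g. 2
    })


def make_reply(result_uri, fmt, fmt_version):
    """Encode "have your results, formatted for <fmt> v<fmt_version>"."""
    return json.dumps({
        "type": "result",
        "data": result_uri,
        "format": fmt,
        "format_version": fmt_version,
    })


def parse_message(text):
    """Decode a message and reject kinds this sketch does not know about."""
    msg = json.loads(text)
    if msg.get("type") not in KNOWN_TYPES:
        raise ValueError("unknown message type: %r" % msg.get("type"))
    return msg


request = parse_message(make_request("file:///tmp/seq.fa", "foo", 2))
reply = parse_message(make_reply("file:///tmp/out.txt", "bar", 1))
```

Because the message is just text, a C or Perl program on the other end could
read it as easily as a Python one, whatever transport carries it.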

> Data objects can be identified by URIs, with special URIs for data on a local
> disk (the locid will have to have some way to service requests for your local
> data, possibly from multiple loci).

I don't know if this is what you're addressing, but let's say this is how the
user will contact a remote command-line program:

    Workspace ------> Local Locid ----> Net -----> Remote Locid ----> CL Prog

I think the local and remote locids are really the same program.  So by a
network loopback, the Workspace contacts the local locid the same way any two
locids would communicate.  The above pathway then really looks like this:

    Workspace ---> Net ---> Local Locid ----> Net -----> Remote Locid ----> CL Prog

This is something like making an HTTP connection to Apache running on your
local machine.

This is the key:

    ----> Net -----> Locid ----> Net ---->

If all locids are the same and communicate the same way, the Loci network
becomes seamless and limitless.
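
A toy sketch of that idea, assuming a hypothetical Locid class with a single
handle() entry point: a locid either runs the requested program itself or
passes the same request on to a peer, and the Workspace talks to its local
locid exactly the way one locid talks to another.  The class, the request
fields, and the "upper" program are all made up for illustration, and the
network hop is replaced by a direct method call to keep the example short.

```python
class Locid:
    """One locid; every instance is the same program with the same interface."""

    def __init__(self, name, programs=None):
        self.name = name
        self.programs = programs or {}   # program name -> callable
        self.peers = {}                  # locid name -> Locid

    def connect(self, other):
        """Stand-in for opening a network connection to another locid."""
        self.peers[other.name] = other

    def handle(self, request):
        """Run the program here, or forward the unchanged request to a peer."""
        if request["host"] == self.name:
            program = self.programs[request["program"]]
            return program(request["data"])
        if request["host"] not in self.peers:
            raise LookupError("no route to locid %r" % request["host"])
        return self.peers[request["host"]].handle(request)


# The remote machine offers one command-line-style program.
remote = Locid("remote", {"upper": lambda text: text.upper()})
local = Locid("local")
local.connect(remote)

# The Workspace only ever talks to its local locid.
result = local.handle({"host": "remote", "program": "upper", "data": "acgt"})
```

The same handle() call works whether the program lives on the local locid or
on a peer, which is the sense in which the network is seamless.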

This would mean, of course, that command-line programs installed on the local
machine would be in something akin to an httpd cgi-bin directory.

> But now say we want to run a five step pipeline on 2GB worth of genomic
> sequences, each of the five loci may want a copy of the sequence, which means
> our machine will send the file five times. Try that over a modem!
> 
> Caching at loci hubs can help solve this problem.

Right, local and remote locids will have the same capabilities.
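
As a rough illustration of the caching idea, here is a minimal sketch that
assumes data is identified by URI and that a locid keeps what it has already
fetched.  The CachingLocid class, its fetch callback, and the example URI are
hypothetical; a real cache would also need eviction and some way to notice
that the data behind a URI has changed.

```python
class CachingLocid:
    """Keeps fetched data keyed by URI so each object crosses the net once."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable that actually transfers the data
        self._cache = {}         # URI -> data already held at this hub
        self.transfers = 0       # how many real transfers have happened

    def get(self, uri):
        """Return the data for uri, transferring it only on the first request."""
        if uri not in self._cache:
            self._cache[uri] = self._fetch(uri)
            self.transfers += 1
        return self._cache[uri]


def slow_fetch(uri):
    # Stand-in for sending a large sequence file over a slow link.
    return "sequence data for " + uri


hub = CachingLocid(slow_fetch)

# Five pipeline steps each ask the hub for the same sequence.
copies = [hub.get("loci://example/genome.fa") for _ in range(5)]
```

In this sketch the five steps all get their copy from the hub, and the
sequence is transferred only once.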

:-)
Jeff
-- 
J.W. Bizzaro                  mailto:bizzaro at bc.edu
Boston College Chemistry      http://www.uml.edu/Dept/Chem/Bizzaro/