[Bioclusters] Question about grid

Chris Dwan bioclusters@bioinformatics.org
Thu, 13 May 2004 08:41:32 -0500

> I heard about some clever people who started to 'steal' computation
> from unsuspecting web sites by hijacking the normal function of the
> site and co-opting its computations into a different program.

This is certainly an interesting idea, but I doubt that it will have 
practical application in terms of solving real scientific computing 
issues.  The problems are that the computing power available is rather 
small, the security and trust issues are rather large, and the overall 
gain (vs. buying a computer of one's own) is almost certainly negative.

I file this in the same bin with the people who want to store data for 
half an hour by bouncing radio waves off of Mars, people who do 
arithmetic using the sign bits on ethernet cards, people who store data 
for seconds at a time by modulating the signal on their long haul FC 
loop, and the like.  Fascinating games, but you get more juice out of 
one of those $900 desktops we were talking about earlier.

> If these stories are true, surely we could do this with a bit more
> civility, and set up a bunch of generic 'calculators' through the web
> which could then be used for grid computing.

The recent merger of the web services and grid communities gives me 
hope that something similar to this will be possible, but at the level 
of workflows which integrate applications, rather than applications 
which integrate instructions.

I don't know anyone who is willing to open up their servers to a truly 
anonymous user, for truly arbitrary computation.  Keep in mind the 
security concerns surrounding simple things like open network relays 
and anonymous ftp.  The real power will come as we converge on 
standards for Remote Method Invocation (RMI) (web services are an 
example), authentication and authorization (SSL certs seem to be the 
order of the day), resource specification and discovery (the GGF and 
bioMOBY folks have put a lot of thought into this), and the like.

I believe that many of us would be willing to, and in fact want to, 
offer our special algorithm, tool, or dataset via "the grid" rather 
than just as a web page.  We already have "federating" solutions to 
database interoperability, but it's difficult with current tools to 
specify things like expiration dates and update frequency, or to tell 
API changes apart from transient errors.  Many people are already 
offering very generic SOAP APIs to their applications.  This is a 
start, and there will be much more.

In terms of anonymous vs. authorized users:  As it becomes more and 
more costly for me to allow someone else to use my resources, I would 
like to be able to throttle their usage (or at least their priority) 
according to my relationship with them.  It's a lot easier to form 
collaborations once we have a standard way of allowing our software to 
work together.  This is the "virtual organization" idea spoken of by 
the grid folks.  Of course, this is far from a new idea:  Organizations 
realized thousands of years ago that standardizing communication 
channels and components made them much more efficient.
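
A minimal sketch of what throttling by relationship could look like, 
assuming three made-up relationship tiers with an hourly job quota and a 
scheduling priority each.  The tier names, the limits, and the admit() 
function are all hypothetical, for illustration only; they do not 
describe any existing grid middleware.

```python
from dataclasses import dataclass

# Hypothetical relationship tiers. Lower priority number = scheduled first.
# The limits are illustrative assumptions, not recommendations.
TIERS = {
    "collaborator": {"priority": 0, "max_jobs_per_hour": 100},
    "known_peer": {"priority": 1, "max_jobs_per_hour": 20},
    "anonymous": {"priority": 2, "max_jobs_per_hour": 0},
}


@dataclass
class Request:
    user: str
    tier: str
    jobs_this_hour: int  # jobs this user has already run in the current hour


def admit(req: Request) -> tuple[bool, int]:
    """Return (allowed, priority) for a request.

    Unknown tiers are treated as anonymous, so the default is to refuse.
    """
    policy = TIERS.get(req.tier, TIERS["anonymous"])
    allowed = req.jobs_this_hour < policy["max_jobs_per_hour"]
    return allowed, policy["priority"]
```

With these assumed limits, a collaborator who has run 5 jobs this hour 
is admitted at the highest priority, while an anonymous request is 
refused outright.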

I think that our sharing will need to be deliberate and mindful of the 
human issues involved.  It's much easier to justify a technology 
decision to a CIO on the grounds that it will help us collaborate with 
our peers than to justify installing a piece of software that will 
make our servers available for the entire planet to use anonymously.

All that said:  I'm quite excited to share the resources at my 
disposal.  I have two small clusters, a pile of software, and a bunch 
of databases.  If we define a few applications / data resources, I 
would be happy to try to get a web service / OGSA / SOAP / RMI version 
of them up.  The thing I won't do again is a "hello world" into the 
void.  We need to focus on the useful and the needed.

Are people sitting on their hands, wishing for some particular 
resource?  We've got bio-mirror and the like.  We've got net-blast.  
What's the next step, and how can I help?

-Chris Dwan