[Bioclusters] Question about grid

Arnon Klein bioclusters@bioinformatics.org
Thu, 13 May 2004 16:22:07 +0300

That's a really exciting concept, but why stop at dedicated cpu-service 
servers? Why not harness P2P for this, where each client can both 
donate cycles and use other people's cycles?
I can see this benefiting organisations such as universities, where 
there is a lot of unused desktop power, and many people who need that 
power, but only for short periods, and so never got dedicated facilities.

So imagine running these server/client hybrids, which accept code (in 
binary, bytecode, or source code format), on anywhere from thousands 
to millions of computers...
If the code is self-contained, then the bandwidth and latency issues are 
not as big as they would be for small chunks of instructions. I think 
that even with very fast networks, latencies will kill the benefits when 
scaling into something larger than a single LAN segment, so to avoid it, 
you have to batch the instructions together (i.e. send complete functions).
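
To put rough numbers on the batching argument, here is a back-of-the-envelope 
sketch. The figures are assumptions picked for illustration, not measurements: 
about 1 ns per instruction locally and about 0.1 ms per network round trip. 
The class and method names are made up for this example.

```java
public class LatencyEstimate {
    /**
     * Estimated slowdown factor relative to purely local execution,
     * assuming every batch of instructions pays exactly one round trip.
     */
    public static double slowdown(long instructions, long batchSize,
                                  double nsPerInstruction, double roundTripNs) {
        long batches = (instructions + batchSize - 1) / batchSize; // ceiling division
        double localNs = instructions * nsPerInstruction;
        double remoteNs = localNs + batches * roundTripNs;
        return remoteNs / localNs;
    }

    public static void main(String[] args) {
        double nsPerInstruction = 1.0;    // assumed: ~1 ns per instruction
        double roundTripNs = 100_000.0;   // assumed: ~0.1 ms round trip
        long n = 1_000_000_000L;          // one billion instructions

        // One instruction per request: every instruction pays a round trip.
        System.out.println(slowdown(n, 1, nsPerInstruction, roundTripNs));
        // Whole job sent as a single batch: one round trip in total.
        System.out.println(slowdown(n, n, nsPerInstruction, roundTripNs));
    }
}
```

Under these assumed figures, sending instructions one at a time comes out 
roughly five orders of magnitude slower than local execution, while shipping 
the whole job at once adds almost nothing. Different hardware changes the 
constants, but not the shape of the argument.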

Doing this in Java is actually pretty easy, since RMI lets you transport 
an object containing both code and data over the network.
You define a job interface and put up a server exposing a remote method, 
such as:

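// A job must be serializable so RMI can pass it by value.
interface Computable extends java.io.Serializable {
	public Object compute() throws Exception;
}

// Remote interface exposed by the server (the name ComputeServer is assumed).
interface ComputeServer extends java.rmi.Remote {
	public Object compute(Computable job) throws java.rmi.RemoteException;
}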

and, using the Java RMI facilities, call this method on the server with 
an object that implements a method called "compute" that does the 
actual computation.

Of course, like you said, security and accounting issues will pose 
problems for a widespread installation.
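
To make the RMI idea above a bit more concrete, here is a minimal sketch that 
runs the server and the client in one JVM. Everything beyond the "compute" 
methods is an assumption made for illustration: the names ComputeDemo, 
ComputeServer, ComputeServerImpl and SumJob, the registry name "compute", and 
the use of the default registry port. Both sides share one classpath here; 
shipping a class the server has never seen would need RMI's dynamic class 
loading, which this sketch leaves out.

```java
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

public class ComputeDemo {
    /** A self-contained job: code and data travel together. */
    public interface Computable extends Serializable {
        Object compute() throws Exception;
    }

    /** Remote interface exposed by the server (hypothetical name). */
    public interface ComputeServer extends Remote {
        Object compute(Computable job) throws RemoteException;
    }

    /** Server side: run whatever job arrives and return its result. */
    public static class ComputeServerImpl implements ComputeServer {
        @Override
        public Object compute(Computable job) throws RemoteException {
            try {
                return job.compute();
            } catch (Exception e) {
                throw new RemoteException("job failed", e);
            }
        }
    }

    /** Example job (hypothetical): sum the integers 1..n. */
    public static class SumJob implements Computable {
        private final int n;

        public SumJob(int n) { this.n = n; }

        @Override
        public Object compute() {
            long sum = 0;
            for (int i = 1; i <= n; i++) {
                sum += i;
            }
            return sum; // autoboxed to Long
        }
    }

    public static void main(String[] args) throws Exception {
        int port = Registry.REGISTRY_PORT; // default RMI registry port

        // Server: export the object and register its stub.
        ComputeServerImpl impl = new ComputeServerImpl();
        ComputeServer stub =
            (ComputeServer) UnicastRemoteObject.exportObject(impl, 0);
        Registry registry = LocateRegistry.createRegistry(port);
        registry.rebind("compute", stub);

        try {
            // Client: look the server up and hand it a job to run.
            ComputeServer server = (ComputeServer)
                LocateRegistry.getRegistry("localhost", port).lookup("compute");
            System.out.println(server.compute(new SumJob(100))); // 5050
        } finally {
            // Unexport everything so the JVM can exit.
            registry.unbind("compute");
            UnicastRemoteObject.unexportObject(impl, true);
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }
}
```

Note that the server in this sketch runs any job it is handed without 
restriction, which is exactly where the security and accounting problems 
come in.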


Dan Bolser wrote:

>I had an idea to do with grid computing, but it may be total garbage.
>I heard about some clever people who started to 'steal' computation from
>unsuspecting web sites by hijacking the normal function of the site and
>co-opting its computations into a different program. 
>If these stories are true, surely we could do this with a bit more
>civility, and set up a bunch of generic 'calculators' through the web
>which could then be used for grid computing.
>The way I imagine the system is this... 
>Program starts by searching the web for calculators, the code is compiled
>for the 'web-engine' so every single instruction is encoded as an HTTP /
>CGI / XML request, and all instructions are performed over the web on a
>shifting number of calculators.
>Actually, I found something similar here...
>I wanted to ask about the feasibility of such an idea. 
>For example if one machine sent all its instructions to another over a
>gigabit intranet, how much slower would this be than local computation?
>Is a gigabit LAN 1/2/3/10/100/1000 orders of magnitude slower than
>internal CPU communication channels?
>The power of an open source system like this would be if someone like
>Apache would take the idea on board and release it as part of its standard
>distribution. However, even if every web server on the web were running
>such a calculator (why not be ambitious), could the system be fast enough?
>Naturally there are a lot of issues regarding distribution / allocation /
>scheduling etc. but before we get into nasty details, is the idea remotely
>worth consideration?
>How difficult would it be to make a Java compiler accommodate such a 
>Thanks very much for any feedback,