[Pipet Devel] Distributed Filesystems WAS Choice of ORB implementations
Brad Chapman
chapmanb at arches.uga.edu
Sat Apr 8 10:36:04 EDT 2000
I'm going to try to summarize some of the stuff Jarl and Jeff have
been talking about and then offer my opinions. Please smack me if I
have misrepresented anyone.
My quick summary:
Okay, so I originally proposed a way to use the naming and trading
services of CORBA to manage remote vsh instances and "search" for
specific nodes within the registered instances.
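To make the naming/trading idea a bit more concrete, here is a rough sketch in plain Python. It is only a toy stand-in, not real CORBA: the Registry class, its method names, the host names and the placeholder reference strings are all made up for illustration. The point is just the split between "naming" (name -> object reference) and "trading" (search offers by properties).

```python
# Toy stand-in for CORBA-style naming and trading services.
# Every name here (Registry, bind, resolve, export, query) is a
# hypothetical illustration, not the API of any real ORB.

class Registry:
    def __init__(self):
        self._names = {}    # naming: instance name -> object reference
        self._offers = []   # trading: (instance name, node type, properties)

    def bind(self, name, reference):
        """Naming: register a vsh instance under a name."""
        self._names[name] = reference

    def resolve(self, name):
        """Naming: look up the reference stored for a registered name."""
        return self._names[name]

    def export(self, name, node_type, **properties):
        """Trading: advertise a node offered by an already registered instance."""
        if name not in self._names:
            raise KeyError(f"unknown instance: {name}")
        self._offers.append((name, node_type, properties))

    def query(self, node_type, **wanted):
        """Trading: names of instances offering a node with matching properties."""
        return [
            name
            for name, offered_type, props in self._offers
            if offered_type == node_type
            and all(props.get(key) == value for key, value in wanted.items())
        ]


registry = Registry()
registry.bind("sun1", "IOR:placeholder-for-sun1")   # placeholder strings,
registry.bind("sun2", "IOR:placeholder-for-sun2")   # not real references
registry.export("sun1", "blast", database="nr")
registry.export("sun2", "blast", database="est")

# "Search" for a node, then resolve the matching instances to references.
matches = registry.query("blast", database="nr")
references = [registry.resolve(name) for name in matches]
```

In a real setup the reference would be whatever the ORB hands out, and the registry itself would be a remote service rather than a local object.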
Jeff and Jarl have proposed an alternative way of doing things
using a distributed filesystem approach. The choices that were
mentioned for implementing this were:
Jungle Monkey -> http://www.junglemonkey.net/
Gnutmeg -> http://sourceforge.net/project/?group_id=3965
My thoughts:
I guess I sort of see what you guys are thinking about, but I
don't really understand exactly how you are proposing to work this
type of system into vsh. Let me try to think through it:
1. There is a central server (at TOL) that all instances of vsh will
register with.
2. This central server will allow services for browsing and searching
the registered vsh instances (like the screenshots on the Jungle
Monkey page).
3. Each vsh instance will make available files that remote users can
browse. Here, I don't really understand what kind of files you want to
make available--xml files describing available nodes and subnets?
4. A user looking through the files finds a subnet they want to run on
a remote vsh implementation. Then they need to connect to the remote
vsh system (via CORBA) and request the available subnet information.
How does the user get the object reference for the remote system?
5. Then things would work the same in both of the proposals. The local
user would incorporate the remote node into their work flow diagram
and submit the work flow diagram for processing, and then during
processing all of the dl -> bl and bl -> dl and dl -> dl stuff would
have to happen to do the proper processing (I won't go into that again
here since that's not what we are discussing).
So, do I have it semi-right? What are you guys' ideas for how this
will work?
I also have a couple of other issues:
1. As Jarl mentioned, the distributed filesystem plan depends heavily
on a central server to handle everything, and then we get into
problems with load on the server and what happens if it goes down.
Although I would propose that we start _initially_ with a central
naming and trading service, it is possible (i.e. has been done--not
that I can do it yet :) to distribute these services over multiple
computers. How does the distributed filesystem plan scale up?
2. How does authentication work for viewing everyone's files? Does all
this occur in the central server? If so, it seems like this might make
the central server a big target for cracking, since if you could do so
you would have access to everyone's files. For the CORBA services
plan, the authentication would occur at each vsh instance, which might
at least make things less of a target.
3. Jeff, you wrote this:
> Napster has an interesting security feature, even if it was
> unintentional: You don't connect directly to a remote system. You
> connect to the Napster server, and the Napster server connects to the
> remote system. How would something like that jive with our plans?
Does this imply that every connection and filesharing and everything
has to funnel through the main server? For what I want to use vsh for,
this would be a serious pain. For a biological example (sorry, that's
all I can think of :-), let's say I had three Suns running local BLAST
servers and had 30,000 sequences to query. I'd like to be able to use
vsh to "distribute" the request over the three Suns (i.e. 10,000 on
each) as a kind of cheap cluster or something. This is entirely a
local use (and could even be occurring on one local network), but would
I have to funnel all the connections through some remote central
server? I don't like this, and would rather not even have to register
my Suns with the remote server.
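The splitting itself is the easy part. A minimal sketch of the arithmetic in Python, assuming made-up host names and sequence IDs (nothing here talks to BLAST or vsh, it only shows how 30,000 queries divide into three batches):

```python
def split_evenly(items, n_workers):
    """Split items into n_workers contiguous batches whose sizes differ by at most one."""
    base, extra = divmod(len(items), n_workers)
    batches = []
    start = 0
    for i in range(n_workers):
        # The first `extra` workers take one additional item each.
        size = base + (1 if i < extra else 0)
        batches.append(items[start:start + size])
        start += size
    return batches


# Hypothetical host names and sequence identifiers, for illustration only.
hosts = ["sun1", "sun2", "sun3"]
sequences = [f"seq{i}" for i in range(30000)]

assignment = dict(zip(hosts, split_evenly(sequences, len(hosts))))
for host, batch in assignment.items():
    print(host, len(batch))
```

Each host ends up with 10,000 sequences, and none of this needs anything outside the local network.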
The thing I like about the CORBA services is that they are
voluntary and more of a convenience--if you could get hold of an
object instance in another way, you wouldn't even have to use them.
How will the distributed filesystem plan work?
Sorry to be so long--thanks if you read through all of this :-)
Brad