Sample Run
Ok, here's the way I think we've agreed stuff works. We still need a
good way of dealing with what happens when a server - or worse, the
hub - goes down. Perhaps we could deal with hub downage later, since
that actually involves low-level networking stuff (you have to grab
the hub's IP address). We
also need some way to add extensions - like an extension to encrypt
things differently, or an extension that lets the hub also act as a
router, letting clients talk to servers that aren't actually on the
internet, just on a subnetwork somewhere (so this reduces the number
of IP addresses needed for a modred network to 1, which is good).
System boot: the system boots when the hub comes up - until then, the
servers just sit quietly. So the hub comes up, contacts each of its
servers, and sends them a pgp-signed timestamp with a message: I'm
here. Talk to me. In response, the servers prove who they are, and
initialize necessary information from the hub. If a server's down,
the hub duly notes the fact, and issues warnings if it senses that any
data will be inaccessible. (So yes, the hub must have an index of all
data.)
Servers can join at later times, too, by contacting the hub.
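To make that concrete, here's a rough Python sketch of the hub's side
of boot. Every name in it (the send callables, sign/verify, data_index)
is a made-up stand-in, not anything we've actually written:

    import time

    def hub_boot(servers, sign, verify, data_index, log=print):
        # `servers` maps server name -> a send() callable returning the reply (or
        # None if the server is down); `sign`/`verify` stand in for the pgp layer;
        # `data_index` maps each data item to the name of the server it lives on.
        online = set()
        for name, send in servers.items():
            # "I'm here. Talk to me." - a signed timestamp so the server knows it's the hub.
            reply = send(sign({"msg": "hub-up", "time": time.time()}))
            if reply is not None and verify(name, reply):
                online.add(name)   # (this is also where we'd push initial info to the server)
            else:
                log("server %s is down - noting it" % name)
        # The hub's index of all data is what lets it warn about anything unreachable.
        for item, home in data_index.items():
            if home not in online:
                log("warning: %s lives on %s, which is down" % (item, home))
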
System shutdown: the hub informs all servers that it's time for bed.
The servers in turn tell this to their clients, who either (a) finish
or (b) save their state or (c) send their state to the server for
better safekeeping (to be implemented much later). Then each server
tells the hub that it's ready, but stays awake in case another
server wants data during the shutdown process. When all servers
report ready-to-go, the hub says byebye, and then goes into a dormant
(not off) state, where the network can still be remotely brought up
with the administrator's private key.
Yeah, that could take a few minutes.
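Here's a sketch of the hub's half of shutdown, with the same caveat
that every name in it is invented:

    def hub_shutdown(servers, sign, wait_for_ready, go_dormant, log=print):
        # `servers` maps name -> send() callable; `wait_for_ready` blocks until some
        # server reports ready-to-go and returns its name; `go_dormant` drops the hub
        # into the dormant (not off) state.
        for name, send in servers.items():
            # Time for bed - each server relays this to its clients, which finish,
            # save their state, or hand their state to the server for safekeeping.
            send(sign({"msg": "shutting-down"}))
        pending = set(servers)
        while pending:
            # Servers stay awake and keep answering data requests until everyone's ready.
            name = wait_for_ready()
            pending.discard(name)
            log("%s is ready to go; still waiting on: %s" % (name, ", ".join(pending) or "nobody"))
        for name, send in servers.items():
            send(sign({"msg": "byebye"}))
        # Dormant, not off: a request signed with the administrator's private key
        # can still bring the network back up remotely.
        go_dormant()
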
Client joins: the client contacts the hub, and either registers an
account or validates itself into an old account with a message signed
with its private key (naturally the hub had validated itself with the
client too). The client is then assigned to a server, with which it
establishes a secure connection, with the hub assuring the server that
yes, that client is supposed to talk to you, etc...etc.... The client
may express a preference for a certain type of server (I want to only
mess with my own account - give me the server with that info, or I
want to read documentation - give me the server with that), or the hub
may use some heuristic based on the user's account. The client may
attempt to switch later, or may be switched later, being informed of
the switch with a signed note from the hub (the hub, not the server,
should initiate the switch, so the switch should only happen if the
server has to ask the hub a lot of questions because of the client).
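Roughly, the client's side of joining could look like this (hub_send,
sign, verify, and connect are all made-up stand-ins for the real
transport and pgp layers):

    def client_join(hub_send, sign, verify, connect, preference=None):
        # The hub has to validate itself to the client too - check its signed greeting.
        greeting = hub_send({"msg": "hello"})
        if not verify("hub", greeting):
            raise RuntimeError("that's not really the hub")
        # Register a new account, or sign our way back into an old one, and mention
        # what kind of server we'd prefer ("my-account-data", "documentation", ...).
        reply = hub_send(sign({"msg": "join", "account": "scott", "prefer": preference}))
        # The hub picks a server (by preference, or by some heuristic on the account)
        # and vouches for us to it: "yes, that client is supposed to talk to you."
        if not verify("hub", reply):
            raise RuntimeError("bad assignment")
        return connect(reply["server"])   # establish the secure client<->server connection

A later switch would just be another signed note like that last reply,
naming the new server.
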
Client leaves appropriately: the server gleans whatever information it
can get from the client, maybe storing the state for the client, and
the client walks off.
Client leaves badly: the server notes that the client is no longer
checking in every 10 seconds and sends a "where are you" request. If
there's no response, the client is sent a "disconnect" and the server
assigns the client's task to somebody else (like itself).
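A sketch of that server-side check, using the 10-second figure from
above and a grace period I just made up:

    HEARTBEAT = 10           # clients are supposed to check in every 10 seconds
    GRACE = 3 * HEARTBEAT    # how long we poke at a silent client before giving up

    def check_clients(clients, now, reassign, log=print):
        # `clients` maps name -> {"last_seen": ..., "send": ..., "task": ...};
        # `reassign` hands a dead client's task to somebody else (maybe the server itself).
        for name, c in clients.items():
            quiet = now - c["last_seen"]
            if quiet > HEARTBEAT:
                c["send"]({"msg": "where-are-you"})
            if quiet > GRACE:
                c["send"]({"msg": "disconnect"})   # it probably won't hear this, but try
                reassign(c["task"])
                log("client %s left badly; reassigned %s" % (name, c["task"]))
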
A user wants something done: the user can do this in one of two ways.
Essentially, he can compile a binary on his computer, or he can submit
the source to the cluster for compiling there (actually, he starts
bash/msh/mash (modred shell, modred again shell, of _course_) on the
cluster and types "compile" - because the cluster acts as a computer).
The latter is preferable - if the former, _his_ computer, not a
server, acts as the coordinator. It essentially has to submit source
code to the server for everything it wants to do. Not a nice position
to be in, so this should only be done in cases where the only thing
the cluster's being used for is a gigantic hard drive. (We might add
more functionality to the local-compile option later, depending on how
things go - but we don't have to, because it's not the way the
cluster's meant to be used.)
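If it helps, the preferred (cluster-side) path might boil down to
something like this - shell_send and the "put" command are invented
for illustration only:

    def remote_build(shell_send, source_files):
        # Preferred path: push the source to the cluster and let a server coordinate
        # the build, instead of the user's own machine playing coordinator.
        for path in source_files:
            with open(path) as f:
                shell_send("put %s" % path, f.read())   # hypothetical upload command
        # The cluster acts as a computer - we just ask our mash session to compile.
        return shell_send("compile", None)
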
A task originates: a task could originate from a client, or from a
server. Doesn't matter. A task shouldn't originate from the hub.
If a client originates the task, it gets sent to the server. If the
server doesn't want to handle it due to load problems or the fact that
it simply doesn't have the information, it passes it off to the hub.
Otherwise, the server makes occasional requests to the hub saying "I
need /etc/xorg.conf". The hub then says, "here's xorg.conf, and here
are some relevant files too". The server caches them for an appropriate
period of time.
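A rough sketch of that fetch-and-cache behavior (the message format and
the five-minute figure are made up):

    import time

    CACHE_TTL = 300   # seconds - "an appropriate period of time", number invented

    def get_file(path, cache, hub_send, now=None):
        # `cache` maps path -> (contents, fetched_at); `hub_send` asks the hub for a
        # file and gets back that file plus whatever related files the hub throws in.
        now = time.time() if now is None else now
        hit = cache.get(path)
        if hit is not None and now - hit[1] < CACHE_TTL:
            return hit[0]
        # "I need /etc/xorg.conf" - the hub replies with it, plus some relevant files,
        # and we cache the whole batch.
        reply = hub_send({"msg": "need-file", "path": path})
        for p, contents in reply["files"].items():
            cache[p] = (contents, now)
        return cache[path][0]
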
The server sends computation requests to itself and to its clients.
These computation requests generally result from forks. When a client
forks, it may also spawn a computation request, sending that to the
server. These computation requests are the same as the original
request, except that they carry dependencies. Programs written for
modred may specify optimization info.
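For concreteness, a computation request could be as simple as this
(the field names are invented):

    from dataclasses import dataclass, field

    @dataclass
    class ComputationRequest:
        task: str                                       # what to run
        origin: str                                     # the client or server that spawned it
        depends_on: list = field(default_factory=list)  # requests this one has to wait for
        hints: dict = field(default_factory=dict)       # optional optimization info

    def fork_request(parent, parent_id):
        # A forked request is the same as the original, plus a dependency on its parent.
        return ComputationRequest(task=parent.task, origin=parent.origin,
                                  depends_on=parent.depends_on + [parent_id],
                                  hints=dict(parent.hints))
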
Did I hit everything?
--
Scott Lawrence
Webmaster
The Blair Robot Project
Montgomery Blair High School