modred team mailing list archive


Re: Fwd: Concept stuff

To: modred <modred@xxxxxxxxxxxxxxxxxxx>
From: Philip <thesmorpifier@xxxxxxxxx>
Date: Mon, 28 Dec 2009 00:34:52 -0500
In-reply-to: <53a52e1f0912272121o48c9daf1u9e41a949c7f3e5e7@mail.gmail.com>

What happens when a client disconnects with unfinished work? Is the work
immediately reassigned, or does the server wait for a specified period,
etc.? This could come up quite a lot because some clients will just
disconnect as soon as work they submitted is completed.
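One possible answer to the question above is a grace period: the hub holds a disconnected client's tasks for a fixed time and only then hands them to someone else. The sketch below is an illustration under that assumption, not a decision the thread reached. `WorkTracker`, its method names, and the 30-second default are all hypothetical.

```python
import time
from dataclasses import dataclass, field


@dataclass
class WorkTracker:
    """Hypothetical hub-side helper that remembers which client holds which task."""
    grace_seconds: float = 30.0  # assumed default, not a value from the thread
    assigned: dict = field(default_factory=dict)      # task_id -> client_id
    disconnected: dict = field(default_factory=dict)  # client_id -> disconnect time

    def assign(self, task_id, client_id):
        self.assigned[task_id] = client_id

    def on_disconnect(self, client_id, now=None):
        # Record when the client went away; its tasks are not touched yet.
        self.disconnected[client_id] = time.monotonic() if now is None else now

    def on_reconnect(self, client_id):
        # A client that comes back in time keeps its tasks.
        self.disconnected.pop(client_id, None)

    def tasks_to_reassign(self, now=None):
        """Return task ids whose owner has been gone for at least the grace period."""
        now = time.monotonic() if now is None else now
        expired = {c for c, t in self.disconnected.items()
                   if now - t >= self.grace_seconds}
        return sorted(t for t, c in self.assigned.items() if c in expired)
```

Setting `grace_seconds` to 0 gives the "immediately reassigned" behaviour, so the same structure covers both options raised in the question.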


A little knowledge is a dangerous thing. So is a lot.

On Dec 28, 2009, at 12:21 AM, Scott Lawrence <bytbox@xxxxxxxxx> wrote:

---------- Forwarded message ----------
From: Frederic Koehler <fkfire@xxxxxxxxx>
Date: Sun, 27 Dec 2009 23:21:28 -0500
Subject: Re: [Modred] Concept stuff
To: Scott Lawrence <bytbox@xxxxxxxxx>
* This is sort-of a solution (while obviously less-than-optimal security,
some grid-computing stuff does this, like BOINC), however, it turns out that
this may require custom validation methods - for example, it's normal for
floating point values to be different on different computers, and the same
could apply for other computations.
* A malicious client would only need to misbehave on certain problems that
a malicious user could designate (or recognize obvious fake programs),
allowing the fake program test to work.
- A better idea would be to randomly reduplicate some computations many
times - the malicious client wouldn't notice anything, but could easily be
singled out
* Thirdly is mostly the same thing I wrote before - only computations that
are said to be totally unimportant privacy wise could benefit from
client-side computing.

So I really think that client-side computation is only a good idea for a
small subset of problems (like the type that there already exist massive
grid computing solutions for, like SETI@HOME)
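The two points above (duplicated computations, and floating point results that differ slightly between machines) can be combined in a small sketch: results for one duplicated task are grouped by approximate equality, and clients outside the majority group are flagged. This is only an illustration. The function names, the tolerances, and the strict-majority rule are assumptions rather than anything agreed in the thread, and real computations may need a custom comparison instead of `math.isclose`.

```python
import math


def results_agree(a, b, rel_tol=1e-9, abs_tol=1e-12):
    """Treat two floating point results as equal if they are close enough."""
    return math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)


def find_outliers(results, rel_tol=1e-9):
    """Given {client_id: result} for one duplicated task, return
    (accepted_value, suspect_client_ids).

    accepted_value is None when no strict majority of clients agree;
    in that case every client is returned as a suspect.
    """
    if not results:
        return None, []
    groups = []  # list of (representative_value, [client_ids])
    for client, value in results.items():
        for rep, members in groups:
            if results_agree(rep, value, rel_tol=rel_tol):
                members.append(client)
                break
        else:
            groups.append((value, [client]))
    rep, members = max(groups, key=lambda g: len(g[1]))
    if len(members) * 2 <= len(results):
        return None, sorted(results)
    suspects = sorted(c for c in results if c not in members)
    return rep, suspects
```

Because the duplicated task looks like any other task, a client that only misbehaves on selected problems cannot tell which of its answers are being cross-checked.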
On Sun, Dec 27, 2009 at 10:57 PM, Scott Lawrence <bytbox@xxxxxxxxx> wrote:
I want clients to be used for computation, and I want maximum
privacy+security given that restriction.  Some ideas:

With a large network, two computers can perform the same computation.
Furthermore, a smart modred hub can give fake problems to clients,
just to make sure that they're operating correctly.  A client that
isn't operating correctly gets cut. (No second chances! A program
could exploit that!)

If a user specifies a certain bit of data (SSN, for instance) as
highly sensitive, modred should know not to hand off that computation
to a client. (If it does by accident, it certainly should never hand
off the data.) privacy++

In all cases, computations should be anonymous. privacy++
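A minimal sketch of the two privacy rules above, assuming the user marks a task as sensitive when submitting it: sensitive tasks are only offered to servers, and the payload sent to any node leaves out the submitter's identity. `Task`, `Node`, `eligible_nodes`, and `anonymized` are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    task_id: str
    owner: str       # user who submitted the task
    sensitive: bool  # set by the user, e.g. for data containing an SSN


@dataclass(frozen=True)
class Node:
    node_id: str
    kind: str        # "server" or "client"


def eligible_nodes(task, nodes):
    """Servers may run anything; clients are only offered non-sensitive tasks."""
    return [n for n in nodes if n.kind == "server" or not task.sensitive]


def anonymized(task):
    """Payload sent to a node: the owner's identity is deliberately left out."""
    return {"task_id": task.task_id}
```

Filtering before dispatch means a sensitive task never reaches a client, so the "by accident" case only arises if the flag itself is wrong.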

Other ideas?


On 12/27/09, Frederic Koehler <fkfire@xxxxxxxxx> wrote:
The idea for client-side computation implies that we have highly-trusted
clients... (we know they won't provide invalid answers) Otherwise,
client-side computation requires verifying answers and so is only useful
for a few NP-ish problems. In addition (assuming trusted clients), it
means that, since computations can contain sensitive data, the
ability to spread the computation is limited - unless we know the
computation is not user-sensitive, it can only try to use the user's
client(s). This way we also know that the client has no interest in
sabotaging answers to mess with other users (except to exploit
server-side weaknesses, which is inevitable).

On Sat, Dec 26, 2009 at 10:28 PM, Scott Lawrence <bytbox@xxxxxxxxx> wrote:
Here is what, as I envision it, will make modred unique (and hard):

* Support for clients who can come and leave, lending CPU time and
using CPU time as they choose. There are some clusters that support
this, but not very many.
* Support for computers participating across the internet. This goes
along with the previous part, but remember we need security to make
this worth anything. This also means that user data could potentially
be passed to untrusted computers - we need a way to prevent this.
* The ability for clients to run on any OS, using perl, python, java,
or (on unix systems) C and C++ (servers and the hub will need to run
on linux or at least another unix, or a dedicated OS which we may
decide to write)
* Modred has great ease of use because it acts as a single unified
computer - a special client program exists that allows one to log in,
access and edit files, etc... This is very close to unique - google
has it, though

Because of that last point, many OS design issues should come up when
we code modred. (I think Freddy pointed this out?) Thus, we have a
chance to fix flaws in standard unix, incorporating plan 9-type stuff
(google it and read about it - Plan 9 from Bell Labs, the way the
future of unix was) while also creating an actually usable user
interface. (No offense, but to a newbie non-super-technical user,
linux is a bit harsh...)

Some implementation questions and ideas:

- how will updates be handled?  Remember we've got 200 computers
potentially, some of which might be clients that want to participate
in multiple clusters.
- maybe we should have programs not include front ends. Instead, the
modred software creates a front-end from the program's self
description.  This would enforce a consistent user interface if we
could implement it well
- how can we keep users from being able to snoop on each others' data?

That's just a sample to get people thinking.
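To make the "front-end from the program's self description" idea concrete, here is a small sketch that builds a command-line interface from a description dictionary using Python's standard `argparse` module. The description format (`name`, `help`, `params`) and the `wordcount` example program are invented for illustration; the thread does not define what a self description would look like.

```python
import argparse

# Hypothetical self-description a program might ship instead of a front end.
DESCRIPTION = {
    "name": "wordcount",
    "help": "Count words in a file",
    "params": [
        {"name": "path", "type": str, "help": "file to read"},
        {"name": "--min-length", "type": int, "default": 1,
         "help": "ignore words shorter than this"},
    ],
}


def build_cli(description):
    """Create an argparse parser from a program's self-description."""
    parser = argparse.ArgumentParser(prog=description["name"],
                                     description=description["help"])
    for param in description["params"]:
        spec = dict(param)       # copy so the description is not modified
        name = spec.pop("name")
        parser.add_argument(name, **spec)
    return parser
```

The same description could in principle drive a graphical or web front end, which is where the consistent user interface would come from.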


On 12/26/09, David Tolnay <dtolnay@xxxxxxxxx> wrote:
Before diving in to specifics about the implementation I think we need
to decide how we want modred to be different from (read: better than)
existing bootable cluster environments. Here is a short list to check
out:

Bootable Cluster CD (http://bccd.net/) - folks presented this at SC09
in Portland, it was pretty neat stuff. Packed with education /
debugging / visualization features

Oscar (http://svn.oscar.openclustergroup.org/trac/oscar) - very
trivially simple way to transform an existing unix lab into a cluster
resource

Lnx-bbc (http://www.lnx-bbc.com/) - includes cowsay!

Perceus/warewulf (http://www.perceus.org/portal/) - a lot of other
sites made reference to this, haven't read too much about it

What specifically do you want to improve over any of these?


On 12/25/09, Frederic Koehler <fkfire@xxxxxxxxx> wrote:
So, as far as I understand this project, the idea is to build
both a client library and a program using the library to do clustering
stuff, along with matching server/hub foo (the library might be the same
or whatever, not important).
So from this understanding, it seems that the system should provide some
basic pseudo-operating system stuff and programs can build on that, just
like they would normally build on their local libc/kernel and stuff.
So (I sure like the word "so" today...) if we want the type of general
os-like stuff it seems there needs to be support for:
  * A simple message passing model - abstract away all the TCP-foo, maybe
    use existing foo here (obviously needs fleshing out)
  * Permanent storage IO (clone the unix write(), read(), open() and
    sync() model, or maybe just use one of the existing database-ish nosql
    things out there)
       - Unix-ish model - you create your data hunk, say you want all
         this stuff in it, then after sync() we know it's actually
         somewhere written on a hard-drive, and other things can read it
         too
       - Unless this isn't in fact needed (but I assume it is)
       - Also need to figure out if it's filesystem-ish foo
         (hierarchical) we want or more relational database-ish stuff

  * A task delegation model - some type of map/reduce-ish stuff
       - Servers have a few built-in computations, and client utilizes
         them?
       - Or more complex, servers run sandboxed computational code?
  * A security system?
       - Needs fleshing out
       - Presumably what the "hub" manages - it's the trusted thing
       - Obviously, not everybody is allowed to use the cluster for
         computation, not everybody can find out what everybody else is
         doing, etc.
       - But also, is there a limit on storage, are some things
         prioritized over others?
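The "servers have a few built-in computations" variant of the task delegation model in the list above can be sketched as follows. This is only a toy under stated assumptions: the `BUILTINS` registry and the `delegate` function are hypothetical, and a local thread pool stands in for the servers, so no networking, sandboxing, or security is shown.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical registry of computations a server is willing to run.
BUILTINS = {
    "square": lambda x: x * x,
    "sum": lambda a, b: a + b,
}


def delegate(map_name, reduce_name, items, workers=4):
    """Apply a built-in to every item in parallel, then reduce the results.

    Worker threads stand in for servers here. items must be non-empty,
    because reduce() is called without an initial value.
    """
    map_fn = BUILTINS[map_name]
    reduce_fn = BUILTINS[reduce_name]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_fn, items))  # results keep input order
    return reduce(reduce_fn, mapped)
```

Clients name the computation instead of shipping code, which sidesteps the sandboxing question at the cost of flexibility; the "servers run sandboxed computational code" option would replace the registry lookup with something far more involved.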
Theoretically, servers are written to provide the io backend and to allow
for task delegation, clients use the api, although hub has its work cut
out delegating all the file io and figuring out what the state of that is.
On top of some mixture of this, one could build a simple unix-ish
pseudo-cli, theoretically, as well as real software.

Anyway, before actually doing anything, people should read about PVM
(Parallel Virtual Machine) and the like (maybe also Hadoop and other
foo-ish stuff) so Modred isn't just a bad clone of it

Anyway, (yes, twice in a row!), I figured _someone_ had to respond to
Scott, otherwise he'd feel all lonely and sad :P Now he can have a warm
fuzzy feeling of deep confusion and uncertainty instead :P
On Fri, Dec 25, 2009 at 11:06 PM, Scott Lawrence <bytbox@xxxxxxxxx> wrote:
---------- Forwarded message ----------
From: Scott Lawrence <bytbox@xxxxxxxxx>
Date: Fri, 25 Dec 2009 19:20:13 -0500
Subject: Design Overview
To: modred <modred@xxxxxxxxxxxxxxxxxxx>

I'm going to assume that everyone understands the basic concepts for
modred: a set of networked computers (by 'networked' I mean, they're
all on the internet), divided for the sake of discussion into three
classes: the 'hub' (the dude in charge, who computers who want to
join connect to), the 'servers' (dedicated computers that can be
pretty much relied on not to go down, although redundancy is always
nice), and the 'clients' (computers that send in requests and can be
used for spare CPU cycles).

Ok, so much for assumptions... :-)

Things *I* think any design should emphasize:
* security.
* relative ease of use, while retaining significant power.
Challenging.  In particular, it should be possible to set up a modred
network in under an hour, provided the computers are already set up.
* along with the previous bullet point, having an interface that lets
one use the entire network like a single computer. This is sort of
like the way google docs works, except the cloud is private
* therefore, it should be a multi-user system with well-designed
privileges etc...

I'm not going to discuss my implementation ideas, let's hear others
first.

--
Scott Lawrence

Webmaster
The Blair Robot Project
Montgomery Blair High School



_______________________________________________
Mailing list: https://launchpad.net/~modred
Post to     : modred@xxxxxxxxxxxxxxxxxxx
Unsubscribe : https://launchpad.net/~modred
More help   : https://help.launchpad.net/ListHelp

References

Concept stuff
From: Frederic Koehler, 2009-12-26
Re: Concept stuff
From: David Tolnay, 2009-12-27
Re: Concept stuff
From: Scott Lawrence, 2009-12-27
Re: Concept stuff
From: Frederic Koehler, 2009-12-28
Re: Concept stuff
From: Scott Lawrence, 2009-12-28
Fwd: Concept stuff
From: Scott Lawrence, 2009-12-28