modred team mailing list archive

Re: Fwd: Concept stuff

To: Michael Cohen <gnurdux@xxxxxxxxx>
From: Scott Lawrence <bytbox@xxxxxxxxx>
Date: Mon, 28 Dec 2009 22:46:18 -0500
Cc: modred@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4B397B4C.10004@gmail.com>
Oh, that's interesting.  It might actually work... Other comments?


On 12/28/09, Michael Cohen <gnurdux@xxxxxxxxx> wrote:
> Wrong.  You need to have credits to pay people with.  And so you have to
> spend more credits if you want to pay people more.
>
> Michael Cohen
>
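A minimal sketch of the point above that a project needs credits on hand before it can pay anyone. The names `CreditLedger`, `deposit`, and `award` are hypothetical, and how credits first enter the system is not settled anywhere in this thread.

```python
class InsufficientCredits(Exception):
    """Raised when a project tries to award more credits than it holds."""


class CreditLedger:
    """Tracks balances. Awards are transfers between accounts, never minted."""

    def __init__(self):
        self.balances = {}

    def deposit(self, account, amount):
        # Assumption: some out-of-band process funds a project account.
        if amount < 0:
            raise ValueError("amount must be non-negative")
        self.balances[account] = self.balances.get(account, 0) + amount

    def award(self, project, worker, amount):
        # A project can only pay out what it actually holds, so inflating
        # task prices just drains the project's own balance faster.
        if amount < 0:
            raise ValueError("amount must be non-negative")
        if self.balances.get(project, 0) < amount:
            raise InsufficientCredits(f"{project} cannot pay {amount}")
        self.balances[project] -= amount
        self.balances[worker] = self.balances.get(worker, 0) + amount
```

Because every award is a transfer, the total number of credits in circulation stays fixed no matter what a project chooses to offer per task.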
> Scott Lawrence wrote:
>> No.  That leads to people trying to make their projects as valuable as
>> possible, and within 3 days, the whole system will be worthless.
>>
>> Please remember that the mailing lists at launchpad don't perform
>> reply-to mangling. Instruct your client accordingly.
>>
>> On 12/28/09, Michael Cohen <gnurdux@xxxxxxxxx> wrote:
>>> I would actually award credits on a per-project basis.  Each project
>>> simply chooses how many credits to award for each task.  Your computer
>>> is offered so many credits for so much work; if it finishes it gets the
>>> creds and otherwise not.  Trying to do it based on time is dumb because
>>> we don't care about time, we care about computational power contributed.
>>>   If someone gives me 5000 hours, but in a VM limited to run at .1% of
>>> their Pentium 3 CPU, then that isn't worth as much to me as 5 hours on
>>> someone's brand new quad core powerhouse.
>>>
>>> Michael Cohen
>>>
>>> Scott Lawrence wrote:
>>>> Some administrators will consider CPU time credits to be very
>>>> important, though. Especially if they want them to be
>>>> buyable/exchangeable for that stink green stuff.
>>>>
>>>> Let's deal with this later - we could just validate the same way we do
>>>> right answer/wrong answer validation.
>>>>
>>>>
>>>>
>>>>
>>>> On 12/28/09, Frederic Koehler <fkfire@xxxxxxxxx> wrote:
>>>>> ---------- Forwarded message ----------
>>>>> From: Frederic Koehler <fkfire@xxxxxxxxx>
>>>>> Date: Mon, Dec 28, 2009 at 10:17 PM
>>>>> Subject: Re: [Modred] Concept stuff
>>>>> To: Scott Lawrence <bytbox@xxxxxxxxx>
>>>>>
>>>>>
>>>>> Network capacity can be protected by batching responses. Normal clients can
>>>>> save several computations and send one big response (like when exiting) -
>>>>> this will have similar bandwidth to big computations but avoid the potential
>>>>> for a really long computation wasting time on a bunch of computers without
>>>>> being solved.
>>>>>
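The batching idea above could look roughly like this. It is only a sketch: `ResultBatcher`, the `send` callback, and the JSON payload shape are all assumptions, since the thread does not define a wire format.

```python
import json


class ResultBatcher:
    """Collects finished results and ships them in one message per batch."""

    def __init__(self, send, max_batch=10):
        self._send = send            # hypothetical callable taking one string
        self._max_batch = max_batch
        self._pending = []

    def add(self, task_id, result):
        self._pending.append({"task": task_id, "result": result})
        if len(self._pending) >= self._max_batch:
            self.flush()

    def flush(self):
        # Call this on exit as well, so partial batches are not lost.
        if not self._pending:
            return
        self._send(json.dumps(self._pending))
        self._pending = []
```

Each task stays small, so losing a client costs at most one unsent batch, while the server sees roughly one message per `max_batch` results.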
>>>>>
>>>>> On Mon, Dec 28, 2009 at 10:14 PM, Scott Lawrence <bytbox@xxxxxxxxx>
>>>>> wrote:
>>>>>
>>>>>> * Maybe. But that still makes it too easy to gain false credits.
>>>>>> * Yeah, I'm agreeing with this.  The hub delegates to the servers, and
>>>>>> the servers delegate to clients.
>>>>>> * This will overload network capacity, and I don't like it.  We should
>>>>>> be able to trust clients to make long computations (long=over 30
>>>>>> seconds).  Maybe clients could give servers hints on how long they'll
>>>>>> be on?
>>>>>>
>>>>>> On 12/28/09, Frederic Koehler <fkfire@xxxxxxxxx> wrote:
>>>>>>> Hah, my email died, wow...Wonder what happened to it....
>>>>>>> Anyway, this is something like what I wrote before:
>>>>>>>
>>>>>>> * CPU time credits can be very roughly estimated by averaging response time.
>>>>>>> It's not all _that_ important anyway if nothing is a behemoth task.
>>>>>>>
>>>>>>> * The hub can immediately, upon establishing connection, redirect the client to
>>>>>>> a server. The server will still have to communicate with the hub somewhat, but
>>>>>>> it can send only the stuff that necessarily pertains to the hub.
>>>>>>>
>>>>>>> * Jobs should probably be many small tasks to avoid the risk of losing a
>>>>>>> giant computation (since saving computation state is not
>>>>>>> easy/generalizable). Beyond that, sending keep-alive packets is enough to
>>>>>>> know when a client dies.
>>>>>>>
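A sketch of the keep-alive bookkeeping mentioned in the last bullet above. `KeepAliveMonitor` is a hypothetical name, and the timeout value is an assumption, not something agreed in the thread. Timestamps are passed in explicitly; a real server would probably feed it `time.monotonic()`.

```python
class KeepAliveMonitor:
    """Remembers when each client last checked in and reports silent ones."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_seen = {}

    def heartbeat(self, client_id, now):
        # Called whenever a keep-alive packet arrives from client_id.
        self.last_seen[client_id] = now

    def dead_clients(self, now):
        # A client counts as dead once it has been silent longer than timeout.
        return sorted(
            client
            for client, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
```

The server can then hand the tasks of every client returned by `dead_clients` to someone else.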
>>>>>>>
>>>>>>> On Mon, Dec 28, 2009 at 6:16 PM, Scott Lawrence <bytbox@xxxxxxxxx> wrote:
>>>>>>>> What?
>>>>>>>>
>>>>>>>> On 12/28/09, Frederic Koehler <fkfire@xxxxxxxxx> wrote:
>>>>>>>>> On Mon, Dec 28, 2009 at 12:45 AM, Scott Lawrence <bytbox@xxxxxxxxx> wrote:
>>>>>>>>>> "What happens when a client disconnects with unfinished work? Is the
>>>>>>>>>> work immediately reassigned, or does the server wait for a specified
>>>>>>>>>> period, etc. This could come up quite a lot because some clients will
>>>>>>>>>> just disconnect as soon as work they submitted is completed."
>>>>>>>>>>
>>>>>>>>>> Ouch.  Good question.  Here's one solution: small tasks (expected time
>>>>>>>>>> <2 seconds) are always assigned to two or more clients/servers. If
>>>>>>>>>> both disconnect, reassign; if one disconnects, use the other guy's
>>>>>>>>>> answer. For large tasks, if a computer stops regularly checking in every 5
>>>>>>>>>> or so seconds, give that computer's results to date to another
>>>>>>>>>> computer.  So yeah, I think a client should have to make regular
>>>>>>>>>> reports to a server.
>>>>>>>>>>
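The small-task rule above (assign to two or more machines, reassign only if all of them drop) can be sketched as follows. `RedundantTask` and its status strings are hypothetical, and the sketch deliberately ignores the large-task checkpointing case.

```python
class RedundantTask:
    """One small task handed to several clients at once."""

    def __init__(self, task_id, clients):
        clients = set(clients)
        if len(clients) < 2:
            raise ValueError("small tasks go to two or more clients")
        self.task_id = task_id
        self.assigned = clients
        self.answers = {}

    def report(self, client, answer):
        # Only accept answers from clients still assigned to the task.
        if client in self.assigned:
            self.answers[client] = answer

    def disconnect(self, client):
        self.assigned.discard(client)

    def status(self):
        if self.answers:
            # Any surviving answer will do: "use the other guy's answer".
            return ("done", next(iter(self.answers.values())))
        if not self.assigned:
            return ("reassign", None)
        return ("waiting", None)
```

This version trusts the first answer it receives; comparing the duplicate answers against each other would be a separate validation step.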
>>>>>>>>>> Here's another problem: how do we tell how many CPU time credits to
>>>>>>>>>> grant a client?  We can't always tell how long a problem should take
>>>>>>>>>> beforehand.
>>>>>>>>>>
>>>>>>>>>> Here's another problem: which computer should handle the clients? As
>>>>>>>>>> I've been thinking about this, there are three types of computers: the
>>>>>>>>>> single hub, the various dedicated servers (capable of storing
>>>>>>>>>> permanent data), and the clients.  (The hub is necessary - without it,
>>>>>>>>>> the performance of the cluster drastically decreases.) So clients
>>>>>>>>>> connect to the hub, and then the hub directs all computers.  But the
>>>>>>>>>> hub will get overloaded if 100 computers are checking in every 10
>>>>>>>>>> seconds to give it more data (and then the hub has to pass this on to
>>>>>>>>>> other servers for storage, etc...).  So at some point, the hub needs
>>>>>>>>>> to tell the client to talk to the server.  When?
>>>>>>>>>>
>>>>>>>>>> Who wants to create that prototype?
>>>>>>>>>>
>>>>>>>>>> On 12/28/09, Scott Lawrence <bytbox@xxxxxxxxx> wrote:
>>>>>>>>>>> This is where we start building prototypes. However, just to keep the
>>>>>>>>>>> theoretical side going: I disagree about the privacy issue.  Most
>>>>>>>>>>> operations that would benefit from the CPU time of a cluster (notice
>>>>>>>>>>> I'm not talking about the data storage and reliability benefits, which
>>>>>>>>>>> aren't affected by the presence of clients) are not very private.
>>>>>>>>>>> Rendering nice screensavers ("Electric Sheep", I think that one's
>>>>>>>>>>> called), and hefty data sifting aren't private - who cares about the
>>>>>>>>>>> screensavers, and the data is generally public anyway (of course if it
>>>>>>>>>>> wasn't, it would be marked so).
>>>>>>>>>>>
>>>>>>>>>>> Ray tracing and simulation could be more of an issue. Hypothetical
>>>>>>>>>>> situation: Alice is simulating how wind will affect her proprietary
>>>>>>>>>>> airplane design.  Naturally, she can't hand off the whole design, or
>>>>>>>>>>> even parts of the design, to random client computers.  This is where
>>>>>>>>>>> the windows programmer says, "so the client computers can't help
>>>>>>>>>>> Alice."  But that's not true - as a bad example, what if the Modred
>>>>>>>>>>> hub gave to a client computer 80 types of landing gear, and told the
>>>>>>>>>>> computer, not to simulate something, but to solve a general formula
>>>>>>>>>>> that could later be used in the computation in a trivial and quick
>>>>>>>>>>> way?  If that client is evil, it will learn that Alice's airplane has
>>>>>>>>>>> some sort of landing gear.
>>>>>>>>>>>
>>>>>>>>>>> Somebody needs to create a prototype of a server that can create
>>>>>>>>>>> arbitrary problems in some format, so we can all try to trick it.  I
>>>>>>>>>>> suggest lisp as the language, but it's up to the implementer.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 12/28/09, Scott Lawrence <bytbox@xxxxxxxxx> wrote:
>>>>>>>>>>>> ---------- Forwarded message ----------
>>>>>>>>>>>> From: Frederic Koehler <fkfire@xxxxxxxxx>
>>>>>>>>>>>> Date: Sun, 27 Dec 2009 23:21:28 -0500
>>>>>>>>>>>> Subject: Re: [Modred] Concept stuff
>>>>>>>>>>>> To: Scott Lawrence <bytbox@xxxxxxxxx>
>>>>>>>>>>>>
>>>>>>>>>>>>  * This is sort-of a solution (while obviously less-than-optimal security,
>>>>>>>>>>>> some grid-computing stuff does this, like BOINC), however, it turns out that
>>>>>>>>>>>> this may require custom validation methods - for example, it's normal for
>>>>>>>>>>>> floating point values to be different on different computers, and the same
>>>>>>>>>>>> could apply for other computations.
>>>>>>>>>>>>  * A malicious client would only need to misbehave on certain problems that
>>>>>>>>>>>> a malicious user could designate (or recognize obvious fake programs),
>>>>>>>>>>>> allowing the fake program test to work.
>>>>>>>>>>>>     - A better idea would be to randomly reduplicate some computations many
>>>>>>>>>>>> times - the malicious client wouldn't notice anything, but could easily be
>>>>>>>>>>>> singled out
>>>>>>>>>>>>
>>>>>>>>>>>>   * Thirdly is mostly the same thing I wrote before - only computations that
>>>>>>>>>>>> are said to be totally unimportant privacy wise could benefit from
>>>>>>>>>>>> client-side computing.
>>>>>>>>>>>>
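A rough sketch of the two validation points in the list above: randomly choosing real tasks to duplicate, and comparing floating point answers with a tolerance instead of exact equality. The function names, the median-based comparison, and the default tolerance are assumptions for illustration, not a method proposed in the thread.

```python
import math
import random
import statistics


def pick_spot_checks(task_ids, fraction, rng):
    """Randomly choose real tasks to send to many clients at once."""
    task_ids = list(task_ids)
    count = min(len(task_ids), max(1, round(len(task_ids) * fraction)))
    return set(rng.sample(task_ids, count))


def suspicious_clients(results, rel_tol=1e-9):
    """results maps client id -> float answer for one duplicated task.

    Answers are compared to the median with a relative tolerance, since
    honest machines may legitimately differ in the last few bits.
    """
    reference = statistics.median(results.values())
    return sorted(
        client
        for client, value in results.items()
        if not math.isclose(value, reference, rel_tol=rel_tol)
    )
```

Since the duplicated tasks are ordinary work, a client has nothing to recognize. The median only helps when most duplicates come from honest clients, and a relative tolerance alone will not behave well for answers near zero.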
>>>>>>>>>>>> So I really think that client-side computation is only a good idea for a
>>>>>>>>>>>> small subset of problems (like the type that there already exist massive
>>>>>>>>>>>> grid computing solutions for, like SETI@HOME)
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Dec 27, 2009 at 10:57 PM, Scott Lawrence <bytbox@xxxxxxxxx> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I want clients to be used for computation, and I want maximum
>>>>>>>>>>>>> privacy+security given that restriction.  Some ideas:
>>>>>>>>>>>>>
>>>>>>>>>>>>> With a large network, two computers can perform the same computation.
>>>>>>>>>>>>> Furthermore, a smart modred hub can give fake problems to clients,
>>>>>>>>>>>>> just to make sure that they're operating correctly.  A client that
>>>>>>>>>>>>> isn't operating correctly gets cut. (No second chances! A program
>>>>>>>>>>>>> could exploit that!)
>>>>>>>>>>>>>
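The "fake problem" check above might be sketched like this. `ClientVetter` is a hypothetical name, and `solve` stands in for whatever mechanism actually ships a problem to a client and returns its answer.

```python
class ClientVetter:
    """Feeds a client problems with known answers and bans it on any miss."""

    def __init__(self):
        self.banned = set()

    def check(self, client_id, solve, canaries):
        # canaries is a list of (problem, expected_answer) pairs that the
        # hub already knows the answer to.
        if client_id in self.banned:
            return False  # no second chances
        for problem, expected in canaries:
            if solve(problem) != expected:
                self.banned.add(client_id)
                return False
        return True
```

This only catches clients that misbehave on the known-answer problems themselves, so it is a baseline rather than a complete defence.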
>>>>>>>>>>>>> If a user specifies a certain bit of data (SSN, for instance) as
>>>>>>>>>>>>> highly sensitive, modred should know not to hand off that computation
>>>>>>>>>>>>> to a client. (If it does by accident, it certainly should never hand
>>>>>>>>>>>>> off the data.) privacy++
>>>>>>>>>>>>>
>>>>>>>>>>>>> In all cases, computations should be anonymous. privacy++
>>>>>>>>>>>>>
>>>>>>>>>>>>> Other ideas?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 12/27/09, Frederic Koehler <fkfire@xxxxxxxxx> wrote:
>>>>>>>>>>>>>> The idea for client-side computation implies that we have highly-trusted
>>>>>>>>>>>>>> clients... (we know they won't provide invalid answers) Otherwise,
>>>>>>>>>>>>>> client-side computation requires verifying answers and so is only useful for
>>>>>>>>>>>>>> a few NP-ish problems. In addition (assuming trusted clients).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also, it means that, since computations can contain sensitive data, the
>>>>>>>>>>>>>> ability to spread the computation is limited - unless we know the
>>>>>>>>>>>>>> computation is not user-sensitive, it can only try to use the user's
>>>>>>>>>>>>>> client(s). This way we also know that the client has no interest in
>>>>>>>>>>>>>> sabotaging answers to mess with other users (except to exploit server-side
>>>>>>>>>>>>>> weaknesses, which is inevitable).
>>>>>>>>>>>>>>
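To illustrate the "NP-ish" remark above: some problems are expensive to solve but cheap to check, so an untrusted client's answer can be verified without redoing the work. Integer factoring is used here purely as a stand-in example; `verify_factorization` is a hypothetical helper, not part of any modred design.

```python
def verify_factorization(n, factors):
    """Cheaply check a claimed non-trivial factorization of n.

    Finding the factors may take the client a long time; checking them is
    just a handful of multiplications on the server.
    """
    if not factors or any(f <= 1 for f in factors):
        return False  # reject empty or trivial answers such as [1, n]
    product = 1
    for f in factors:
        product *= f
    return product == n
```

The check does not confirm that each factor is prime, only that the claimed factors multiply back to `n`.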
>>>>>>>>>>>>>> On Sat, Dec 26, 2009 at 10:28 PM, Scott Lawrence <bytbox@xxxxxxxxx> wrote:
>>>>>>>>>>>>>>> Here is what, as I envision it, will make modred unique (and hard):
>>>>>>>>>>>>>>>  * Support for clients who can come and leave, lending CPU time and
>>>>>>>>>>>>>>> using CPU time as they choose.  There are some clusters that support
>>>>>>>>>>>>>>> this, but not very many.
>>>>>>>>>>>>>>>  * Support for computers participating across the internet. This goes
>>>>>>>>>>>>>>> along with the previous part, but remember we need security to make
>>>>>>>>>>>>>>> this worth anything. This also means that user data could potentially
>>>>>>>>>>>>>>> be passed to untrusted computers - we need a way to prevent this.
>>>>>>>>>>>>>>>  * The ability for clients to run on any OS, using perl, python, java,
>>>>>>>>>>>>>>> or (on unix systems) C and C++ (servers and the hub will need to run
>>>>>>>>>>>>>>> on linux or at least another unix, or a dedicated OS which we may
>>>>>>>>>>>>>>> decide to write)
>>>>>>>>>>>>>>>  * Modred has great ease of use because it acts as a single unified
>>>>>>>>>>>>>>> computer - a special client program exists that allows one to log in,
>>>>>>>>>>>>>>> access and edit files, etc...  This is very close to unique -
>>>>>>>>>>>>>>> google has it, though
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Because of that last point, many OS design issues should come up when
>>>>>>>>>>>>>>> we code modred. (I think Freddy pointed this out?) Thus, we have a
>>>>>>>>>>>>>>> chance to fix flaws in standard unix, incorporating plan 9-type stuff
>>>>>>>>>>>>>>> (google it and read about it - Plan 9 from Bell Labs, the way the
>>>>>>>>>>>>>>> future of unix was) while also creating an actually usable user
>>>>>>>>>>>>>>> interface. (No offense, but to a newbie non-super-technical user,
>>>>>>>>>>>>>>> linux is a bit harsh...)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Some implementation questions and ideas:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  - how will updates be handled?  Remember we've got 200 computers
>>>>>>>>>>>>>>> potentially, some of which might be clients that want to participate
>>>>>>>>>>>>>>> in multiple clusters.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  - maybe we should have programs not include front ends.  Instead, the
>>>>>>>>>>>>>>> modred software creates a front-end from the program's self
>>>>>>>>>>>>>>> description.  This would enforce a consistent user interface if we
>>>>>>>>>>>>>>> could implement it well
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  - how can we keep users from being able to snoop on each others'
>>>>>>>>>>>>>>> data?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That's just a sample to get people thinking.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 12/26/09, David Tolnay <dtolnay@xxxxxxxxx> wrote:
>>>>>>>>>>>>>>>> Before diving in to specifics about the implementation I think we need
>>>>>>>>>>>>>>>> to decide how we want modred to be different from (read: better than)
>>>>>>>>>>>>>>>> existing bootable cluster environments. Here is a short list to check
>>>>>>>>>>>>>>>> out:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Bootable Cluster CD (http://bccd.net/) - folks presented this at SC09
>>>>>>>>>>>>>>>> in portland, it was pretty neat stuff. Packed with education /
>>>>>>>>>>>>>>>> debugging / visualization features
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Oscar (http://svn.oscar.openclustergroup.org/trac/oscar) - very
>>>>>>>>>>>>>>>> trivially simple way to transform an existing unix lab into a cluster
>>>>>>>>>>>>>>>> resource
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Lnx-bbc (http://www.lnx-bbc.com/) - includes cowsay!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Perceus/warewulf (http://www.perceus.org/portal/) - a lot of other
>>>>>>>>>>>>>>>> sites made reference to this, haven't read too much about it
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What specifically do you want to improve over any of these?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 12/25/09, Frederic Koehler <fkfire@xxxxxxxxx> wrote:
>>>>>>>>>>>>>>>>> So, as far as I understand this project, the idea is to build
>>>>>>>>>>>>>>>>> both a client library and a program using the library to do clustering
>>>>>>>>>>>>>>>>> stuff, along with matching server/hub foo (the library might be the same
>>>>>>>>>>>>>>>>> or whatever, not important).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So from this understanding, it seems that the system should provide some
>>>>>>>>>>>>>>>>> basic pseudo-operating system stuff and programs can build on that, just
>>>>>>>>>>>>>>>>> like they would normally build on their local libc/kernel and stuff.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So (I sure like the word "so" today...) if we want the type of general
>>>>>>>>>>>>>>>>> os-like stuff it seems there needs to be support for:
>>>>>>>>>>>>>>>>>    * A simple message passing model - abstract away all the TCP-foo, maybe
>>>>>>>>>>>>>>>>> use existing foo here (obviously needs fleshing out)
>>>>>>>>>>>>>>>>>    * Permanent storage IO (clone the unix write(), read(), open() and sync()
>>>>>>>>>>>>>>>>> model,  or maybe just use one of the existing database-ish nosql things out
>>>>>>>>>>>>>>>>> there)
>>>>>>>>>>>>>>>>>            - Unix-ish model - you create your data hunk, say you want all
>>>>>>>>>>>>>>>>> this stuff in it, then after sync() we know it's actually somewhere written
>>>>>>>>>>>>>>>>> on a hard-drive, and other things can read it too
>>>>>>>>>>>>>>>>>            - Unless this isn't in fact needed (but I assume it is)
>>>>>>>>>>>>>>>>>            - Also need to figure out if it's filesystem-ish foo
>>>>>>>>>>>>>>>>> (hierarchical) we want or more relational database-ish stuff
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>    * A task delegation model - some type of map/reduce-ish stuff
>>>>>>>>>>>>>>>>>           - Servers have a few built-in computations, and client utilizes
>>>>>>>>>>>>>>>>> them?
>>>>>>>>>>>>>>>>>           - Or more complex, servers run sandboxed computational code?
>>>>>>>>>>>>>>>>>    * A security system?
>>>>>>>>>>>>>>>>>         - Needs fleshing out
>>>>>>>>>>>>>>>>>         - Presumably what the "hub" manages - it's the trusted thing
>>>>>>>>>>>>>>>>>         - Obviously, not everybody is allowed to use the cluster for
>>>>>>>>>>>>>>>>> computation, not everybody can find out what everybody else is doing,
>>>>>>>>>>>>>>>>> etc.
>>>>>>>>>>>>>>>>>       - But also, is there a limit on storage, are some things prioritized
>>>>>>>>>>>>>>>>> over others, ?
>>>>>>>>>>>>>>>>>
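A toy version of the "map/reduce-ish" task delegation item in the list above. Local threads stand in for remote servers here, which is purely an assumption to keep the sketch runnable; `delegate` and its parameters are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def delegate(map_fn, reduce_fn, chunks, initial, workers=4):
    """Run map_fn on every chunk in parallel, then fold the partial results.

    Each chunk is an independent small task, which fits the thread's
    preference for many small jobs over one giant computation.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so the fold is deterministic.
        partials = list(pool.map(map_fn, chunks))
    return reduce(reduce_fn, partials, initial)
```

For example, `delegate(sum, lambda a, b: a + b, [[1, 2], [3, 4], [5]], 0)` sums each chunk separately and then adds the partial sums together.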
>>>>>>>>>>>>>>>>> Theoretically, servers are written to provide the io backend and to allow
>>>>>>>>>>>>>>>>> for task delegation, clients use the api, although hub has its work cut out
>>>>>>>>>>>>>>>>> delegating all the file io and figuring out what the state of that is.
>>>>>>>>>>>>>>>>> On top of some mixture of this, one could build a simple unix-ish
>>>>>>>>>>>>>>>>> pseudo-cli, theoretically, as well as real software.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Anyway, before actually doing anything, people should read about PVM
>>>>>>>>>>>>>>>>> (Parallel Virtual Machine) and the like (maybe also Hadoop and other foo-ish
>>>>>>>>>>>>>>>>> stuff) so Modred isn't just a bad clone of it
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Anyway, (yes, twice in a row!), I figured _someone_ had to respond to Scott,
>>>>>>>>>>>>>>>>> otherwise he'd feel all lonely and sad :P Now he can have a warm fuzzy
>>>>>>>>>>>>>>>>> feeling of deep confusion and uncertainty instead :P
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, Dec 25, 2009 at 11:06 PM, Scott Lawrence <bytbox@xxxxxxxxx> wrote:
>>>>>>>>>>>>>>>>>> ---------- Forwarded message ----------
>>>>>>>>>>>>>>>>>> From: Scott Lawrence <bytbox@xxxxxxxxx>
>>>>>>>>>>>>>>>>>> Date: Fri, 25 Dec 2009 19:20:13 -0500
>>>>>>>>>>>>>>>>>> Subject: Design Overview
>>>>>>>>>>>>>>>>>> To: modred <modred@xxxxxxxxxxxxxxxxxxx>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'm going to assume that everyone understands the basic concepts for
>>>>>>>>>>>>>>>>>> modred: a set of networked computers (by 'networked' I mean, they're
>>>>>>>>>>>>>>>>>> all on the internet), divided for the sake of discussion into three
>>>>>>>>>>>>>>>>>> classes: the 'hub' (the dude in charge, who computers who want to
>>>>>>>>>>>>>>>>>> join connect to), the 'servers' (dedicated computers that can be
>>>>>>>>>>>>>>>>>> pretty much relied on not to go down, although redundancy is always
>>>>>>>>>>>>>>>>>> nice), and the 'clients' (computers that send in requests and can be
>>>>>>>>>>>>>>>>>> used for spare CPU cycles).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Ok, so much for assumptions... :-)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Things *I* think any design should emphasize:
>>>>>>>>>>>>>>>>>>  * security.
>>>>>>>>>>>>>>>>>>  * relative ease of use, while retaining significant power.
>>>>>>>>>>>>>>>>>> Challenging.  In particular, it should be possible to set up a modred
>>>>>>>>>>>>>>>>>> network in under an hour, provided the computers are already set up.
>>>>>>>>>>>>>>>>>>  * along with the previous bullet point, having an interface that lets
>>>>>>>>>>>>>>>>>> one use the entire network like a single computer.  This is sort of
>>>>>>>>>>>>>>>>>> like the way google docs works, except the cloud is private
>>>>>>>>>>>>>>>>>>  * therefore, it should be a multi-user system with well-designed
>>>>>>>>>>>>>>>>>> privileges etc...
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'm not going to discuss my implementation ideas, let's hear others
>>>>>>>>>>>>>>>>>> first.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>> Scott Lawrence
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Webmaster
>>>>>>>>>>>>>>>>>> The Blair Robot Project
>>>>>>>>>>>>>>>>>> Montgomery Blair High School
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>  Mailing list: https://launchpad.net/~modred
>>>>>>>>>>>>>>>>>  Post to     : modred@xxxxxxxxxxxxxxxxxxx
>>>>>>>>>>>>>>>>>  Unsubscribe : https://launchpad.net/~modred
>>>>>>>>>>>>>>>>>  More help   : https://help.launchpad.net/ListHelp
>>>>>>>>>>>>>>>>>


-- 
Scott Lawrence

Webmaster
The Blair Robot Project
Montgomery Blair High School
Follow ups

Re: Fwd: Concept stuff
From: Michael Cohen, 2009-12-29
References

Concept stuff
From: Frederic Koehler, 2009-12-26
Re: Concept stuff
From: Scott Lawrence, 2009-12-28
Re: Concept stuff
From: Frederic Koehler, 2009-12-29
Re: Concept stuff
From: Scott Lawrence, 2009-12-29
Fwd: Concept stuff
From: Frederic Koehler, 2009-12-29
Re: Fwd: Concept stuff
From: Scott Lawrence, 2009-12-29
Re: Fwd: Concept stuff
From: Scott Lawrence, 2009-12-29
Re: Fwd: Concept stuff
From: Michael Cohen, 2009-12-29