modred team mailing list archive

Re: Ideas

To: Michael Cohen <gnurdux@xxxxxxxxx>
From: Scott Lawrence <bytbox@xxxxxxxxx>
Date: Mon, 28 Dec 2009 22:52:55 -0500
Cc: modred@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4B397C52.3090208@gmail.com>
Oh. I see.

My first instinct is to say: "ban them!"  But it would be really nice
if most existing source code could run out-of-the-box on the cluster,
even if there wouldn't be a speedup.

I'm not planning on supporting C/C++ on Windows - that's way too much
trouble - so we only have to worry about Unix systems.  Are interrupts
the only things we would have to block?

On 12/28/09, Michael Cohen <gnurdux@xxxxxxxxx> wrote:
> You are simply incorrect here.  The issue isn't library calls, it's
> system calls.  Libc calls themselves use system calls, which are
> interrupts.  You can do everything without touching libc.  You just do
> the right stuff to the stack and do an interrupt or whatever.  The
> library doesn't have some special way to access the kernel.
>
> Michael Cohen
>
> Scott Lawrence wrote:
>> You're all missing the point.  I'm claiming that, properly
>> implemented, Modred should require no sandboxing outside of what is
>> necessary to implement its logic.
>>
>> So back to our good friends Alice, Bob, and Mallory.  Alice sends the
>> cluster (which means she directs it to the hub, but let's just
>> consider the cluster a big black box for now) some C source code.
>> This code does some strange stuff - lots of file i/o and memory
>> access.  What does the cluster do with this?
>>
>> It links the program with its own special libraries.  Even inline
>> assembly has to call functions to interface with the hard drive and
>> allocate memory and such. Malicious code that gets submitted to the
>> server will be sanitized in this fashion.  The only problem I see is
>> with illegal memory access - but I suspect this will be dealt with,
>> because the cluster has to analyze what data the client program
>> accesses anyway...
>>
>> Now Bob wants to compile and link his program on his own computer.
>> Fine.  He uses a different (smaller, incidentally) set of libraries.
>> These libraries don't intercept every call of malloc and stuff - those
>> are run on his computer.  But if he wants to access cluster data, he
>> has to use special functions.  And he can't actually run code on the
>> cluster.
>>
>> Now what does Mallory do again?
>>
>>
>> On 12/28/09, Michael Cohen <gnurdux@xxxxxxxxx> wrote:
>>> Server-side I don't see an issue.  Requiring client-sendable code to be
>>> written for some safe VM (java's, lua's, javascript's, .NET/mono, some
>>> other random thing) is basically what
>>> I already said.  There are other sandboxing systems that are designed to
>>> work on x86 native code, such as vx32 (I think I mentioned that also).
>>> Many of these schemes (with the exception of vx32) have the advantage
>>> that they also automatically make the code cross-platform.  Even vx32 is
>>> supposedly portable to Windows, but nobody has done it yet and I have no
>>> idea if any of us have the expertise to.
>>>
>>> Frederic Koehler wrote:
>>>> As far as sandboxing, server-side you can presumably rely on the
>>>> operating system's sandboxes (per-user or perhaps some more elaborate
>>>> mechanism like FreeBSD's jails).
>>>>
>>>> But as soon as the cluster sends code out to clients, obviously there is
>>>> a big issue if we let them do whatever the hell they want. Just
>>>> preventing assembly or anything like that simply doesn't work in C/C++
>>>> (not to mention it would be surprisingly hard/irritating), since the
>>>> code could still execute the system calls (you could try not linking
>>>> against libc, too, but then you _really_ have no portability :P).
>>>>
>>>> System-call controlling is possible, but is either pretty unportable
>>>> (lots of x86 assembly stuff) or slow-ish (virtual machines).
>>>>
>>>> That being said, if you completely separate client-sendable code from
>>>> server code, I think that allays a lot of the concerns. Requiring
>>>> client-sendable code to be written for some safe VM (java's, lua's,
>>>> javascript's, .NET/mono, some other random thing) could avoid this. In
>>>> addition, client-sendable code would intentionally be written with
>>>> knowledge of the sensitivity of the data it handles (i.e. not written at
>>>> all if the data is important).
>>>>
>>>> On Mon, Dec 28, 2009 at 7:49 PM, Michael Cohen <gnurdux@xxxxxxxxx> wrote:
>>>>
>>>>> I would still be happier if there were a sandbox, actually.  There are
>>>>> ways of getting around that sort of thing that are too complicated to
>>>>> prevent at the source level IMO.  For instance, you can use inline
>>>>> assembly.  So we block inline assembly.  That's all well and good, but
>>>>> now we've blocked people using legitimate assembly optimizations.
>>>>> Worse, what happens if they execute some shellcody stuff, allowing them
>>>>> to escape?  I don't really know how to block that at all.  On the other
>>>>> hand, a sandbox would not add much overhead since these tasks will most
>>>>> likely use lots of CPU time but few system calls or whatever.
>>>>>
>>>>>
>>>>> Michael Cohen
>>>>>
>>>>> Scott Lawrence wrote:
>>>>>
>>>>>> Ok, I'm going to build a prototype of my privacy model.  I'm not going
>>>>>> to implement the challenge-response stuff, I'll assume there's an
>>>>>> implementation of that and that it works.
>>>>>>
>>>>>> I think I've isolated the misunderstanding about the sandboxes.  You
>>>>>> don't submit binary code to the Modred cluster - you either submit
>>>>>> source, to be linked by the modred cluster with the relevant
>>>>>> libraries, or you link the code yourself with the libraries.  The
>>>>>> libraries that you would link with merely copy the program over to the
>>>>>> cluster, where it can be executed in a manner deemed fit by the code
>>>>>> there.
>>>>>>
>>>>>> I suppose you could say that that is a sandbox. ;-)
>>>>>>
>>>>>>
>>>>>> On 12/28/09, Michael Cohen <gnurdux@xxxxxxxxx> wrote:
>>>>>>
>>>>>>> If you read my email more carefully, you will see that I am not
>>>>>>> necessarily objecting to Scott's suggestion.  I say that it is not
>>>>>>> necessary, but that it would be the only thing necessary to allow
>>>>>>> more problem-specific privacy tasks to be used.  The need for a
>>>>>>> sandbox is pretty simple.  If we let untrusted users ask for tasks
>>>>>>> and upload code, then I don't want it running unsandboxed on my
>>>>>>> computer.  Otherwise, their code could steal my files, wipe my hard
>>>>>>> disk, install Windows or do other undesirable things.  If it is
>>>>>>> sandboxed, then arbitrary code can be executed safely, as long as we
>>>>>>> trust the sandbox.  Sandboxed environments are often also
>>>>>>> cross-platform, another plus, since they typically replace or
>>>>>>> intercept any kind of system call.
>>>>>>>
>>>>>>> Michael Cohen
>>>>>>>
>>>>>>> Scott Lawrence wrote:
>>>>>>>
>>>>>>>> Well, I'm glad someone expresses opinions I don't agree with...
>>>>>>>>
>>>>>>>> I think Mikey's objection to privacy concerns is that it's so
>>>>>>>> problem-specific, we can't reasonably expect to have a general
>>>>>>>> implementation.  But if the user specifies which parts of the data
>>>>>>>> are private, the Modred hub just has to be sure to divvy up tasks in
>>>>>>>> a way that gives those bits of information only to the trusted,
>>>>>>>> dedicated servers.
>>>>>>>>
>>>>>>>> For the purposes of clarity, I will be referring to dedicated servers
>>>>>>>> as simply "servers", and the central server as the "hub".
>>>>>>>>
>>>>>>>> I don't see the need for a sandbox.  Could you present some specific
>>>>>>>> attacks that a sandbox would fix?
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/28/09, Michael Cohen <gnurdux@xxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>>> It seems to me that dealing with privacy concerns is an extremely
>>>>>>>>> problem-specific issue.  In any given case you need to work out how
>>>>>>>>> much you can give to people without letting private information
>>>>>>>>> leak, but the details vary greatly from problem to problem.  That
>>>>>>>>> isn't our business, and I don't think we should concern ourselves
>>>>>>>>> with it too much.  The way I see it there are two options:
>>>>>>>>>
>>>>>>>>> 1. make this designed for stuff without privacy concerns
>>>>>>>>>        I think this is both the easiest and the best option.  I
>>>>>>>>> don't really like the idea of a public, free service doing
>>>>>>>>> computations for an evil corporation anyway; if it's being done BY
>>>>>>>>> the public it should be done FOR the public.
>>>>>>>>>
>>>>>>>>> 2. add in a small amount of functionality designed to facilitate
>>>>>>>>> dealing with privacy concerns
>>>>>>>>>        At the level of this project, that would probably just be
>>>>>>>>> the controls on what data gets sent to what people.  There might be
>>>>>>>>> reasons for adding such controls anyway; some tasks could be
>>>>>>>>> designated for only "trusted" users.
>>>>>>>>>
>>>>>>>>> Either way I doubt that this will be a big issue.  I think maybe a
>>>>>>>>> bigger issue is how to run arbitrary code efficiently and securely.
>>>>>>>>>
>>>>>>>>> I see only a few solutions:
>>>>>>>>>
>>>>>>>>>        Don't allow arbitrary code, but only a defined set of tasks.
>>>>>>>>> Or, similarly, allow some "trusted" set of tasks, each separately
>>>>>>>>> ported to each platform (like boinc).
>>>>>>>>>
>>>>>>>>>        Use Java.  This lets us easily sandbox it and is
>>>>>>>>> cross-platform, but sacrifices a bit on efficiency.  Also, Java can
>>>>>>>>> be annoying (although other JVM languages would also work in this
>>>>>>>>> situation).
>>>>>>>>>
>>>>>>>>>        There are ways of running cross-platform C/C++ code in a
>>>>>>>>> sandbox as well.  One possibility is to use LLVM, although the LLVM
>>>>>>>>> developers specifically say that LLVM is NOT designed to be used
>>>>>>>>> this way.  Another possibility is to use a sandboxed code system
>>>>>>>>> that works on multiple operating systems but only on x86.  This
>>>>>>>>> includes things like VX32, which is apparently portable to Windows,
>>>>>>>>> but hasn't been ported.  I don't know whether or not that sort of
>>>>>>>>> thing is within our abilities.  Another option might be Google
>>>>>>>>> Native Client; that is designed to be used in a web browser but I
>>>>>>>>> don't know how hard it would be to "rip out" the
>>>>>>>>> sandboxing/cross-OS x86 code stuff.
>>>>>>>>>
>>>>>>>>> Michael Cohen
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Mailing list: https://launchpad.net/~modred
>>>>>>>>> Post to     : modred@xxxxxxxxxxxxxxxxxxx
>>>>>>>>> Unsubscribe : https://launchpad.net/~modred
>>>>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>>>>
>>>>>>>>>


-- 
Scott Lawrence

Webmaster
The Blair Robot Project
Montgomery Blair High School