modred team mailing list archive

Re: Ideas

To: Michael Cohen <gnurdux@xxxxxxxxx>
From: Scott Lawrence <bytbox@xxxxxxxxx>
Date: Mon, 28 Dec 2009 23:05:39 -0500
Cc: modred@xxxxxxxxxxxxxxxxxxx
In-reply-to: <53a52e1f0912282004j638e630fj3cff77de84bb43cf@mail.gmail.com>
Actually, not.  Ignore that last message.

Can you build a prototype that calls a specified function in place of
the kernel?
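
For reference, a minimal sketch of what such a prototype might look like
on Linux, using ptrace as suggested further down the thread.  This is an
illustration under assumptions, not Modred code: the names run_traced,
syscall_hook, count_hook and demo_task are made up here, the system call
number is only decoded on x86-64, and the hook merely observes (or kills)
the task rather than fully standing in for the kernel's reply.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical hook type: return 0 to let the call proceed,
 * nonzero to kill the task. */
typedef int (*syscall_hook)(long nr);

/* Run task() in a traced child and invoke hook at every system-call entry.
 * Returns the child's exit status, or -1 on error or if the child was killed. */
static int run_traced(void (*task)(void), syscall_hook hook)
{
    assert(task != NULL && hook != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: ask to be traced, then stop until the parent is ready. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(127);
        raise(SIGSTOP);
        task();
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;
    /* Make system-call stops distinguishable from ordinary SIGTRAPs. */
    ptrace(PTRACE_SETOPTIONS, pid, NULL, (void *)(long)PTRACE_O_TRACESYSGOOD);

    int entering = 1;   /* stops alternate: entry, exit, entry, ... */
    for (;;) {
        if (ptrace(PTRACE_SYSCALL, pid, NULL, NULL) < 0)
            return -1;
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status))
            return WEXITSTATUS(status);
        if (WIFSIGNALED(status))
            return -1;
        if (!WIFSTOPPED(status) || WSTOPSIG(status) != (SIGTRAP | 0x80))
            continue;   /* not a system-call stop; resume the child */
        if (entering) {
            long nr = -1;   /* stays -1 where we do not decode registers */
#if defined(__x86_64__)
            struct user_regs_struct regs;
            if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) == 0)
                nr = (long)regs.orig_rax;
#endif
            if (hook(nr) != 0) {
                kill(pid, SIGKILL);
                waitpid(pid, &status, 0);
                return -1;
            }
        }
        entering = !entering;
    }
}

/* Demo hook and task: count the calls the task makes and allow all of them. */
static int calls_seen = 0;
static int count_hook(long nr) { (void)nr; calls_seen++; return 0; }
static void demo_task(void) { (void)getpid(); }
```

Calling run_traced(demo_task, count_hook) from main runs the task under
the tracer and returns its exit status.  A real sandbox would also have
to deal with signals, threads and exec, which this sketch ignores.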

On 12/28/09, Scott Lawrence <bytbox@xxxxxxxxx> wrote:
> Ok.  I think that for the most part, we should block system calls.
>
> On 12/28/09, Michael Cohen <gnurdux@xxxxxxxxx> wrote:
>> Very little to none at all if the code doesn't make many system calls.  I
>> wouldn't expect it to make many anyway; the tasks that this is good for
>> shouldn't be ones that require much communication (because the Internet
>> is fairly slow; if it's always sending stuff and requiring responses,
>> that gives probably a .1 second latency each step at least), so it's
>> mostly just running on the CPU.  It would certainly add less overhead
>> for CPU-intensive things than, say, Java.
>>
>> Michael Cohen
>>
>> Scott Lawrence wrote:
>>> And this is the only thing that needs to be done? How much will it
>>> slow the code down? More importantly
>>>
>>>
>>> On 12/28/09, Michael Cohen <gnurdux@xxxxxxxxx> wrote:
>>>> We can't actually block interrupts; that requires kernel-mode code.
>>>> Also, I think there are other mechanisms for system calls.
>>>>
>>>> BUT
>>>>
>>>> lucky for us, Linux (and other unixes, but with slightly different
>>>> implementations) has a built-in way to intercept system calls.  It's
>>>> called ptrace, and it is what is used for the USACO sandbox.
>>>>
>>>> Michael Cohen
>>>>
>>>> Scott Lawrence wrote:
>>>>> Oh. I see.
>>>>>
>>>>> My first instinct is to say: "ban them!"  But it would be really nice
>>>>> if most existing source code could run out-of-the-box on the cluster,
>>>>> even if there wouldn't be a speedup.
>>>>>
>>>>> I'm not planning on supporting C/C++ on Windows - that's way too much
>>>>> trouble - so we only have to worry about unix systems.  Are interrupts
>>>>> the only things we would have to block?
>>>>>
>>>>> On 12/28/09, Michael Cohen <gnurdux@xxxxxxxxx> wrote:
>>>>>> You are simply incorrect here.  The issue isn't library calls, it's
>>>>>> system calls.  Libc calls themselves use system calls, which are
>>>>>> interrupts.  You can do everything without touching libc.  You just
>>>>>> do the right stuff to the stack and do an interrupt or whatever.  The
>>>>>> library doesn't have some special way to access the kernel.
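
To make the point concrete, here is a small sketch of a write that
reaches the kernel without going through a libc wrapper.  It assumes
Linux, and the name raw_write is made up for illustration.  On x86-64
Linux the arguments travel in registers rather than on the stack, and
the trap is the syscall instruction (32-bit x86 historically used
int 0x80).  On other architectures the sketch falls back to libc's
generic syscall() helper, which is of course itself a libc call.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Write len bytes from buf to fd by trapping into the kernel directly.
 * x86-64 path: returns the byte count, or a negative errno on failure.
 * Fallback path: returns the byte count, or -1 with errno set. */
static long raw_write(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
#if defined(__linux__) && defined(__x86_64__)
    long ret;
    /* rax = call number, rdi/rsi/rdx = arguments; the syscall
     * instruction clobbers rcx and r11. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"((long)SYS_write), "D"((long)fd), "S"(buf), "d"(len)
                      : "rcx", "r11", "memory");
    return ret;
#else
    return syscall(SYS_write, fd, buf, len);
#endif
}
```

Code like this links fine against any replacement library, which is why
swapping libraries at link time cannot by itself sanitize a submitted
program.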
>>>>>>
>>>>>> Michael Cohen
>>>>>>
>>>>>> Scott Lawrence wrote:
>>>>>>> You're all missing the point.  I'm claiming that, properly
>>>>>>> implemented, Modred should require no sandboxing outside of what is
>>>>>>> necessary to implement its logic.
>>>>>>>
>>>>>>> So back to our good friends Alice, Bob, and Mallory.  Alice sends the
>>>>>>> cluster (which means she directs it to the hub, but let's just
>>>>>>> consider the cluster a big black box for now) some C source code.
>>>>>>> This code does some strange stuff - lots of file i/o and memory
>>>>>>> access.  What does the cluster do with this?
>>>>>>>
>>>>>>> It links the program with its own special libraries.  Even inline
>>>>>>> assembly has to call functions to interface with the hard drive and
>>>>>>> allocate memory and such. Malicious code that gets submitted to the
>>>>>>> server will be sanitized in this fashion.  The only problem I see is
>>>>>>> with illegal memory access - but I suspect this will be dealt with,
>>>>>>> because the cluster has to analyze what data the client program
>>>>>>> accesses anyway...
>>>>>>>
>>>>>>> Now Bob wants to compile and link his program on his own computer.
>>>>>>> Fine.  He uses a different (smaller, incidentally) set of libraries.
>>>>>>> These libraries don't intercept every call of malloc and stuff - those
>>>>>>> are run on his computer.  But if he wants to access cluster data, he
>>>>>>> has to use special functions.  And he can't actually run code on the
>>>>>>> cluster.
>>>>>>>
>>>>>>> Now what does Mallory do again?
>>>>>>>
>>>>>>>
>>>>>>> On 12/28/09, Michael Cohen <gnurdux@xxxxxxxxx> wrote:
>>>>>>>> Server-side I don't see an issue.  (java's, lua's, javascript's,
>>>>>>>> .NET/mono, some other random thing) is basically what I already
>>>>>>>> said.  There are other sandboxing systems that are designed to work
>>>>>>>> on x86 native code, such as vx32 (I think I mentioned that also).
>>>>>>>> Many of these schemes (with the exception of vx32) have the
>>>>>>>> advantage that they also automatically make the code cross-platform.
>>>>>>>> Even vx32 is supposedly portable to Windows, but nobody has done it
>>>>>>>> yet and I have no idea if any of us have the expertise to.
>>>>>>>>
>>>>>>>> Frederic Koehler wrote:
>>>>>>>>> As far as sandboxing, server-side you can presumably rely on the
>>>>>>>>> operating system's sandboxes (per-user or perhaps some more
>>>>>>>>> elaborate mechanism like FreeBSD's jails).
>>>>>>>>>
>>>>>>>>> But as soon as the cluster sends code out to clients, obviously
>>>>>>>>> there is a big issue if we let them do whatever the hell they want.
>>>>>>>>> Just preventing assembly or anything like that simply doesn't work
>>>>>>>>> in C/C++ (not to mention it would be surprisingly hard/irritating),
>>>>>>>>> since the code could still execute the system calls (you could try
>>>>>>>>> not linking against libc, too, but then you _really_ have no
>>>>>>>>> portability :P).
>>>>>>>>>
>>>>>>>>> System-call controlling is possible, but is either pretty
>>>>>>>>> unportable (lots of x86 assembly stuff) or slow-ish (virtual
>>>>>>>>> machines).
>>>>>>>>>
>>>>>>>>> That being said, if you completely separate client-sendable code
>>>>>>>>> from server code, I think that allays a lot of the concerns.
>>>>>>>>> Requiring client-sendable code to be written for some safe VM
>>>>>>>>> (java's, lua's, javascript's, .NET/mono, some other random thing)
>>>>>>>>> could avoid this.  In addition, client-sendable code would
>>>>>>>>> intentionally be written with knowledge of the sensitivity of the
>>>>>>>>> data it handles (i.e. not written at all if the data is important).
>>>>>>>>>
>>>>>>>>> On Mon, Dec 28, 2009 at 7:49 PM, Michael Cohen <gnurdux@xxxxxxxxx>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I would still be happier if there were a sandbox, actually.  There
>>>>>>>>>> are ways of getting around that sort of thing that are too
>>>>>>>>>> complicated to prevent at the source level IMO.  For instance, you
>>>>>>>>>> can use inline assembly.  So we block inline assembly.  That's all
>>>>>>>>>> well and good, but now we've blocked people using legitimate
>>>>>>>>>> assembly optimizations.  Worse, what happens if they execute some
>>>>>>>>>> shellcody stuff, allowing them to escape?  I don't really know how
>>>>>>>>>> to block that at all.  On the other hand, a sandbox would not add
>>>>>>>>>> much overhead since these tasks will most likely use lots of CPU
>>>>>>>>>> time but few system calls or whatever.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Michael Cohen
>>>>>>>>>>
>>>>>>>>>> Scott Lawrence wrote:
>>>>>>>>>>
>>>>>>>>>>> Ok, I'm going to build a prototype of my privacy model.  I'm not
>>>>>>>>>>> going to implement the challenge-response stuff, I'll assume
>>>>>>>>>>> there's an implementation of that and that it works.
>>>>>>>>>>>
>>>>>>>>>>> I think I've isolated the misunderstanding about the sandboxes.
>>>>>>>>>>> You don't submit binary code to the Modred cluster - you either
>>>>>>>>>>> submit source, to be linked by the Modred cluster with the
>>>>>>>>>>> relevant libraries, or you link the code yourself with the
>>>>>>>>>>> libraries.  The libraries that you would link with merely copy
>>>>>>>>>>> the program over to the cluster, where it can be executed in a
>>>>>>>>>>> manner deemed fit by the code there.
>>>>>>>>>>>
>>>>>>>>>>> I suppose you could say that that is a sandbox. ;-)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 12/28/09, Michael Cohen <gnurdux@xxxxxxxxx> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> If you read my email more carefully, you will see that I am not
>>>>>>>>>>>> necessarily objecting to Scott's suggestion.  I say that it is
>>>>>>>>>>>> not necessary, but that it would be the only thing necessary to
>>>>>>>>>>>> allow more problem-specific privacy tasks to be used.  The need
>>>>>>>>>>>> for a sandbox is pretty simple.  If we make untrusted users able
>>>>>>>>>>>> to ask for tasks, if they upload code, then I don't want it
>>>>>>>>>>>> running unsandboxed on my computer.  Otherwise, their code could
>>>>>>>>>>>> steal my files, wipe my hard disk, install Windows or do other
>>>>>>>>>>>> undesirable things.  If it is sandboxed, then arbitrary code can
>>>>>>>>>>>> be executed safely, as long as we trust the sandbox.  Sandboxed
>>>>>>>>>>>> environments are often also cross-platform, another plus, since
>>>>>>>>>>>> they typically replace or intercept any kind of system call.
>>>>>>>>>>>>
>>>>>>>>>>>> Michael Cohen
>>>>>>>>>>>>
>>>>>>>>>>>> Scott Lawrence wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Well, I'm glad someone expresses opinions I don't agree with...
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think Mikey's objection to privacy concerns is that it's so
>>>>>>>>>>>>> problem-specific, we can't reasonably expect to have a general
>>>>>>>>>>>>> implementation.  But if the user specifies which parts of the
>>>>>>>>>>>>> data are private, the Modred hub just has to be sure to divvy
>>>>>>>>>>>>> up tasks in a way that gives those bits of information only to
>>>>>>>>>>>>> the trusted, dedicated servers.
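
A rough sketch of that divvying rule, under the assumption that each
task carries a flag saying whether it touches private data and each
machine carries a flag saying whether it is a trusted, dedicated server.
The struct and function names (task, node, pick_node) are hypothetical
and not taken from any Modred code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptions of a machine in the cluster and a unit of work. */
struct node { const char *name; bool trusted; };
struct task { const char *name; bool touches_private; };

/* Return the index of the first node allowed to run the task, or -1 if
 * none qualifies.  Tasks that touch private data may only go to trusted
 * nodes; public tasks may go anywhere. */
static int pick_node(const struct task *t, const struct node *nodes, size_t n)
{
    assert(t != NULL && (nodes != NULL || n == 0));
    for (size_t i = 0; i < n; i++) {
        if (!t->touches_private || nodes[i].trusted)
            return (int)i;
    }
    return -1;
}
```

With one untrusted volunteer followed by one trusted server, a public
task lands on the volunteer and a private task skips it for the server.
If only untrusted machines are available, a private task is not handed
out at all.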
>>>>>>>>>>>>>
>>>>>>>>>>>>> For the purposes of clarity, I will be referring to dedicated
>>>>>>>>>>>>> servers as simply "servers", and the central server as the
>>>>>>>>>>>>> "hub".
>>>>>>>>>>>>>
>>>>>>>>>>>>> I don't see the need for a sandbox.  Could you present some
>>>>>>>>>>>>> specific attacks that a sandbox would fix?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 12/28/09, Michael Cohen <gnurdux@xxxxxxxxx> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> It seems to me that dealing with privacy concerns is an
>>>>>>>>>>>>>> extremely problem-specific issue.  In any given case you need
>>>>>>>>>>>>>> to work out how much you can give to people without letting
>>>>>>>>>>>>>> private information leak, but the details vary greatly from
>>>>>>>>>>>>>> problem to problem.  That isn't our business, and I don't
>>>>>>>>>>>>>> think we should concern ourselves with it too much.  The way I
>>>>>>>>>>>>>> see it there are two options:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. make this designed for stuff without privacy concerns
>>>>>>>>>>>>>>        I think this is both the easiest and the best option.
>>>>>>>>>>>>>>        I don't really like the idea of a public, free service
>>>>>>>>>>>>>>        doing computations for an evil corporation anyway; if
>>>>>>>>>>>>>>        it's being done BY the public it should be done FOR the
>>>>>>>>>>>>>>        public.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2. add in a small amount of functionality designed to
>>>>>>>>>>>>>>    facilitate dealing with privacy concerns
>>>>>>>>>>>>>>        At the level of this project, that would probably just
>>>>>>>>>>>>>>        be the controls on what data gets sent to what people.
>>>>>>>>>>>>>>        There might be reasons for adding such controls anyway;
>>>>>>>>>>>>>>        some tasks could be designated for only "trusted"
>>>>>>>>>>>>>>        users.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Either way I doubt that this will be a big issue.  I think
>>>>>>>>>>>>>> maybe a bigger issue is how to run arbitrary code efficiently
>>>>>>>>>>>>>> and securely.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I see only a few solutions:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>        Don't allow arbitrary code, but only a defined set of
>>>>>>>>>>>>>>        tasks.  Or, similarly, allow some "trusted" set of
>>>>>>>>>>>>>>        tasks, each separately ported to each platform (like
>>>>>>>>>>>>>>        BOINC).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>        Use Java.  This lets us easily sandbox it and is
>>>>>>>>>>>>>>        cross-platform, but sacrifices a bit on efficiency.
>>>>>>>>>>>>>>        Also, Java can be annoying (although other JVM languages
>>>>>>>>>>>>>>        would also work in this situation).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>        There are ways of running cross-platform C/C++ code in a
>>>>>>>>>>>>>>        sandbox as well.  One possibility is to use LLVM,
>>>>>>>>>>>>>>        although the LLVM developers specifically say that LLVM
>>>>>>>>>>>>>>        is NOT designed to be used this way.  Another
>>>>>>>>>>>>>>        possibility is to use a sandboxed code system that works
>>>>>>>>>>>>>>        on multiple operating systems but only on x86.  This
>>>>>>>>>>>>>>        includes things like VX32, which is apparently portable
>>>>>>>>>>>>>>        to Windows, but hasn't been ported.  I don't know
>>>>>>>>>>>>>>        whether or not that sort of thing is within our
>>>>>>>>>>>>>>        abilities.  Another option might be Google Native
>>>>>>>>>>>>>>        Client; that is designed to be used in a web browser but
>>>>>>>>>>>>>>        I don't know how hard it would be to "rip out" the
>>>>>>>>>>>>>>        sandboxing/cross-OS x86 code stuff.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Michael Cohen
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> Mailing list: https://launchpad.net/~modred
>>>>>>>>>>>>>> Post to     : modred@xxxxxxxxxxxxxxxxxxx
>>>>>>>>>>>>>> Unsubscribe : https://launchpad.net/~modred
>>>>>>>>>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>
>
> --
> Scott Lawrence
>
> Webmaster
> The Blair Robot Project
> Montgomery Blair High School
>


-- 
Scott Lawrence

Webmaster
The Blair Robot Project
Montgomery Blair High School
Follow ups

Re: Ideas
From: Michael Cohen, 2009-12-29
References

Ideas
From: Michael Cohen, 2009-12-28
Re: Ideas
From: Frederic Koehler, 2009-12-29
Re: Ideas
From: Michael Cohen, 2009-12-29
Re: Ideas
From: Scott Lawrence, 2009-12-29
Re: Ideas
From: Michael Cohen, 2009-12-29
Re: Ideas
From: Scott Lawrence, 2009-12-29
Re: Ideas
From: Michael Cohen, 2009-12-29
Re: Ideas
From: Scott Lawrence, 2009-12-29
Re: Ideas
From: Michael Cohen, 2009-12-29
Re: Ideas
From: Scott Lawrence, 2009-12-29