modred team mailing list archive

Re: Ideas

 

And this is the only thing that needs to be done? How much will it
slow the code down? More importantly


On 12/28/09, Michael Cohen <gnurdux@xxxxxxxxx> wrote:
> We can't actually block interrupts; that requires kernel-mode code.
> Also, I think there are other mechanisms for system calls.
>
> BUT
>
> lucky for us, Linux (and other unixes, but with slightly different
> implementations) has a built-in way to intercept system calls.  It's
> called ptrace, and it is what is used for the USACO sandbox.
>
> Michael Cohen
>
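A minimal sketch of the ptrace approach mentioned above, assuming Linux and
a C compiler.  The helper name count_syscall_stops is made up for
illustration; it is not taken from the USACO sandbox or from Modred.  It
only counts the stops at system-call boundaries.  A real sandbox would
inspect the registers at each stop and decide whether to allow the call.

```c
#include <assert.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `path` as a traced child and count its stops at system-call
 * boundaries (each call normally stops twice: on entry and on exit).
 * Returns the number of stops, or -1 if tracing could not be set up. */
static long count_syscall_stops(const char *path, char *const argv[])
{
    assert(path != NULL && argv != NULL);

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        /* Child: ask to be traced by the parent, then run the program. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(127);
        execv(path, argv);
        _exit(127);              /* only reached if execv failed */
    }

    int status;
    /* A successful exec under PTRACE_TRACEME stops the child with SIGTRAP. */
    if (waitpid(child, &status, 0) < 0)
        return -1;
    if (!WIFSTOPPED(status))
        return -1;               /* child exited: tracing or exec failed */

    /* Make system-call stops distinguishable: they report SIGTRAP | 0x80. */
    if (ptrace(PTRACE_SETOPTIONS, child, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) < 0) {
        kill(child, SIGKILL);
        waitpid(child, &status, 0);
        return -1;
    }

    long stops = 0;
    long deliver = 0;            /* signal to pass on when resuming */
    for (;;) {
        /* Resume the child until its next system-call entry or exit. */
        if (ptrace(PTRACE_SYSCALL, child, NULL, (void *)deliver) < 0 ||
            waitpid(child, &status, 0) < 0) {
            kill(child, SIGKILL);
            waitpid(child, &status, 0);
            return -1;
        }
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;
        deliver = 0;
        if (WSTOPSIG(status) == (SIGTRAP | 0x80))
            stops++;             /* a sandbox would check the call here */
        else if (WSTOPSIG(status) != SIGTRAP)
            deliver = WSTOPSIG(status);   /* forward unrelated signals */
    }
    return stops;
}
```

Calling count_syscall_stops("/bin/true", argv) with argv = { "true", NULL }
should return a positive count where ptrace is permitted.  Every stop costs
a switch to the tracer and back, so the slowdown grows with the number of
system calls the traced program makes rather than with its CPU time.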
> Scott Lawrence wrote:
>> Oh. I see.
>>
>> My first instinct is to say: "ban them!"  But it would be really nice
>> if most existing source code could run out-of-the-box on the cluster,
>> even if there wouldn't be a speedup.
>>
>> I'm not planning on supporting C/C++ on Windows - that's way too much
>> trouble - so we only have to worry about Unix systems.  Are interrupts
>> the only things we would have to block?
>>
>> On 12/28/09, Michael Cohen <gnurdux@xxxxxxxxx> wrote:
>>> You are simply incorrect here.  The issue isn't library calls, it's
>>> system calls.  Libc calls themselves use system calls, which are
>>> interrupts.  You can do everything without touching libc.  You just do
>>> the right stuff to the stack and do an interrupt or whatever.  The
>>> library doesn't have some special way to access the kernel.
>>>
>>> Michael Cohen
>>>
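A minimal sketch of the point above, assuming Linux on x86-64 with GCC or
Clang inline assembly.  The names raw_write and raw_write_demo are made up
for this example.  On x86-64 the kernel is entered with the syscall
instruction rather than the older int 0x80 interrupt, but the idea is the
same: load the registers and trap into the kernel, with no libc function
on the path.

```c
#include <assert.h>
#include <stddef.h>
#if !(defined(__linux__) && defined(__x86_64__))
#include <sys/syscall.h>
#include <unistd.h>
#endif

/* Write `len` bytes from `buf` to file descriptor `fd`.
 * On x86-64 Linux this executes the `syscall` instruction directly and
 * returns the kernel's result: bytes written, or a negative errno value. */
static long raw_write(int fd, const void *buf, size_t len)
{
#if defined(__linux__) && defined(__x86_64__)
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)                  /* result comes back in rax */
        : "a"(1L),                   /* rax = 1 is write on x86-64 Linux */
          "D"((long)fd),             /* rdi = first argument */
          "S"(buf),                  /* rsi = second argument */
          "d"(len)                   /* rdx = third argument */
        : "rcx", "r11", "memory");   /* clobbered by the instruction */
    return ret;
#else
    /* Fallback for other platforms.  This goes through libc's generic
     * syscall() wrapper, so it does not illustrate the point; it only
     * keeps the sketch buildable.  It returns -1 on error. */
    return syscall(SYS_write, fd, buf, len);
#endif
}

/* Usage check: writes "hello\n" to stdout.  assert() is libc, but it is
 * used only by this demo, not by raw_write itself. */
static void raw_write_demo(void)
{
    static const char msg[] = "hello\n";
    long n = raw_write(1, msg, sizeof msg - 1);
    assert(n == (long)(sizeof msg - 1));
}
```

Replacing or omitting libc at link time would not stop code like this,
because the call never goes through a library function.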
>>> Scott Lawrence wrote:
>>>> You're all missing the point.  I'm claiming that, properly
>>>> implemented, Modred should require no sandboxing outside of what is
>>>> necessary to implement its logic.
>>>>
>>>> So back to our good friends Alice, Bob, and Mallory.  Alice sends the
>>>> cluster (which means she directs it to the hub, but let's just
>>>> consider the cluster a big black box for now) some C source code.
>>>> This code does some strange stuff - lots of file i/o and memory
>>>> access.  What does the cluster do with this?
>>>>
>>>> It links the program with its own special libraries.  Even inline
>>>> assembly has to call functions to interface with the hard drive and
>>>> allocate memory and such. Malicious code that gets submitted to the
>>>> server will be sanitized in this fashion.  The only problem I see is
>>>> with illegal memory access - but I suspect this will be dealt with,
>>>> because the cluster has to analyze what data the client program
>>>> accesses anyway...
>>>>
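A minimal sketch of the "special libraries" idea described above, assuming
the cluster compiles the submitted C source itself.  The names modred_fopen
and MODRED_JOB_DIR, and the job directory path, are hypothetical; they are
not an existing Modred API.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical directory that a submitted program is allowed to touch. */
#define MODRED_JOB_DIR "/tmp/modred-job/"

/* Replacement for fopen: only paths inside the job directory are passed
 * on to the real fopen; everything else fails with EACCES. */
static FILE *modred_fopen(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);

    size_t prefix_len = strlen(MODRED_JOB_DIR);
    /* Crude check: require the prefix and reject any ".." in the path. */
    if (strncmp(path, MODRED_JOB_DIR, prefix_len) != 0 ||
        strstr(path, "..") != NULL) {
        errno = EACCES;
        return NULL;
    }
    return fopen(path, mode);    /* still the real fopen at this point */
}

/* From here on, every fopen(...) call in the submitted source is
 * rewritten by the preprocessor into a call to modred_fopen. */
#define fopen(path, mode) modred_fopen((path), (mode))
```

With this in effect, fopen("/etc/passwd", "r") in submitted code returns
NULL.  As the replies in this thread point out, a wrapper like this only
sees calls that go through the library; code that issues system calls
directly never reaches it.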
>>>> Now Bob wants to compile and link his program on his own computer.
>>>> Fine.  He uses a different (smaller, incidentally) set of libraries.
>>>> These libraries don't intercept every call of malloc and stuff - those
>>>> are run on his computer.  But if he wants to access cluster data, he
>>>> has to use special functions.  And he can't actually run code on the
>>>> cluster.
>>>>
>>>> Now what does Mallory do again?
>>>>
>>>>
>>>> On 12/28/09, Michael Cohen <gnurdux@xxxxxxxxx> wrote:
>>>>> Server-side I don't see an issue.
>>>>>
>>>>> > (java's, lua's,
>>>>> >  javascript's, .NET/mono, some other random thing)
>>>>>
>>>>> is basically what I already said.  There are other sandboxing systems
>>>>> that are designed to work on x86 native code, such as vx32 (I think I
>>>>> mentioned that also).  Many of these schemes (with the exception of
>>>>> vx32) have the advantage that they also automatically make the code
>>>>> cross-platform.  Even vx32 is supposedly portable to Windows, but
>>>>> nobody has done it yet and I have no idea if any of us have the
>>>>> expertise to.
>>>>>
>>>>> Frederic Koehler wrote:
>>>>>> As far as sandboxing goes, server-side you can presumably rely on the
>>>>>> operating system's sandboxes (per-user or perhaps some more elaborate
>>>>>> mechanism like FreeBSD's jails).
>>>>>>
>>>>>> But as soon as the cluster sends code out to clients, obviously there
>>>>>> is a big issue if we let them do whatever the hell they want.  Just
>>>>>> preventing assembly or anything like that simply doesn't work in
>>>>>> C/C++ (not to mention it would be surprisingly hard/irritating),
>>>>>> since the code could still execute the system calls (you could try
>>>>>> not linking against libc, too, but then you _really_ have no
>>>>>> portability :P).
>>>>>>
>>>>>> System-call controlling is possible, but is either pretty unportable
>>>>>> (lots of x86 assembly stuff) or slow-ish (virtual machines).
>>>>>>
>>>>>> That being said, if you completely separate client-sendable code from
>>>>>> server code, I think that allays a lot of the concerns.  Requiring
>>>>>> client-sendable code to be written for some safe VM (java's, lua's,
>>>>>> javascript's, .NET/mono, some other random thing) could avoid this.
>>>>>> In addition, client-sendable code would intentionally be written with
>>>>>> knowledge of the sensitivity of the data it handles (i.e. not written
>>>>>> at all if the data is important).
>>>>>>
>>>>>> On Mon, Dec 28, 2009 at 7:49 PM, Michael Cohen <gnurdux@xxxxxxxxx>
>>>>>> wrote:
>>>>>>
>>>>>>> I would still be happier if there were a sandbox, actually.  There are
>>>>>>> ways of getting around that sort of thing that are too complicated to
>>>>>>> prevent at the source level IMO.  For instance, you can use inline
>>>>>>> assembly.  So we block inline assembly.  That's all well and good, but
>>>>>>> now we've blocked people using legitimate assembly optimizations.
>>>>>>> Worse, what happens if they execute some shellcody stuff, allowing
>>>>>>> them to escape?  I don't really know how to block that at all.  On the
>>>>>>> other hand, a sandbox would not add much overhead since these tasks
>>>>>>> will most likely use lots of CPU time but few system calls or
>>>>>>> whatever.
>>>>>>>
>>>>>>>
>>>>>>> Michael Cohen
>>>>>>>
>>>>>>> Scott Lawrence wrote:
>>>>>>>
>>>>>>>> Ok, I'm going to build a prototype of my privacy model.  I'm not going
>>>>>>>> to implement the challenge-response stuff, I'll assume there's an
>>>>>>>> implementation of that and that it works.
>>>>>>>>
>>>>>>>> I think I've isolated the misunderstanding about the sandboxes.  You
>>>>>>>> don't submit binary code to the Modred cluster - you either submit
>>>>>>>> source, to be linked by the Modred cluster with the relevant
>>>>>>>> libraries, or you link the code yourself with the libraries.  The
>>>>>>>> libraries that you would link with merely copy the program over to the
>>>>>>>> cluster, where it can be executed in a manner deemed fit by the code
>>>>>>>> there.
>>>>>>>>
>>>>>>>> I suppose you could say that that is a sandbox. ;-)
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/28/09, Michael Cohen <gnurdux@xxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>>> If you read my email more carefully, you will see that I am not
>>>>>>>>> necessarily objecting to Scott's suggestion.  I say that it is not
>>>>>>>>> necessary, but that it would be the only thing necessary to allow
>>>>>>>>> more problem-specific privacy tasks to be used.  The need for a
>>>>>>>>> sandbox is pretty simple.  If we let untrusted users ask for tasks
>>>>>>>>> and upload code, then I don't want it running unsandboxed on my
>>>>>>>>> computer.  Otherwise, their code could steal my files, wipe my hard
>>>>>>>>> disk, install Windows or do other undesirable things.  If it is
>>>>>>>>> sandboxed, then arbitrary code can be executed safely, as long as we
>>>>>>>>> trust the sandbox.  Sandboxed environments are often also
>>>>>>>>> cross-platform, another plus, since they typically replace or
>>>>>>>>> intercept any kind of system call.
>>>>>>>>>
>>>>>>>>> Michael Cohen
>>>>>>>>>
>>>>>>>>> Scott Lawrence wrote:
>>>>>>>>>
>>>>>>>>>> Well, I'm glad someone expresses opinions I don't agree with...
>>>>>>>>>>
>>>>>>>>>> I think Mikey's objection to privacy concerns is that it's so
>>>>>>>>>> problem-specific, we can't reasonably expect to have a general
>>>>>>>>>> implementation.  But if the user specifies which parts of the data
>>>>>>>>>> are private, the Modred hub just has to be sure to divvy up tasks in
>>>>>>>>>> a way that gives those bits of information only to the trusted,
>>>>>>>>>> dedicated servers.
>>>>>>>>>>
>>>>>>>>>> For the purposes of clarity, I will be referring to dedicated servers
>>>>>>>>>> as simply "servers", and the central server as the "hub".
>>>>>>>>>>
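A minimal sketch of the rule described above, assuming each task carries a
flag saying whether it touches private data and each node is marked as
trusted or not.  The names struct node, struct task and assign_task are
hypothetical and not part of any existing Modred code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct node { bool trusted; };               /* trusted = dedicated server */
struct task { bool touches_private_data; };  /* set by the submitting user */

/* Pick a node for task `t`, scanning round-robin from *cursor.
 * Tasks that touch private data are only given to trusted nodes.
 * Returns the chosen node index, or -1 if no eligible node exists. */
static int assign_task(const struct task *t, const struct node *nodes,
                       size_t n_nodes, size_t *cursor)
{
    assert(t != NULL && nodes != NULL && cursor != NULL);

    for (size_t i = 0; i < n_nodes; i++) {
        size_t idx = (*cursor + i) % n_nodes;
        if (!t->touches_private_data || nodes[idx].trusted) {
            *cursor = (idx + 1) % n_nodes;   /* continue after this node */
            return (int)idx;
        }
    }
    return -1;   /* private task, but no trusted server is available */
}
```

With nodes { untrusted, trusted, untrusted } and the cursor at 0, a public
task goes to node 0 and the next private task goes to node 1.  This only
controls where data is sent; it does nothing about what the receiving node
does with the code it is asked to run.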
>>>>>>>>>> I don't see the need for a sandbox.  Could you present some specific
>>>>>>>>>> attacks that a sandbox would fix?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 12/28/09, Michael Cohen <gnurdux@xxxxxxxxx> wrote:
>>>>>>>>>>
>>>>>>>>>>> It seems to me that dealing with privacy concerns is an extremely
>>>>>>>>>>> problem-specific issue.  In any given case you need to work out how
>>>>>>>>>>> much you can give to people without letting private information
>>>>>>>>>>> leak, but the details vary greatly from problem to problem.  That
>>>>>>>>>>> isn't our business, and I don't think we should concern ourselves
>>>>>>>>>>> with it too much.  The way I see it there are two options:
>>>>>>>>>>>
>>>>>>>>>>> 1. design this for stuff without privacy concerns
>>>>>>>>>>>        I think this is both the easiest and the best option.  I
>>>>>>>>>>> don't really like the idea of a public, free service doing
>>>>>>>>>>> computations for an evil corporation anyway; if it's being done BY
>>>>>>>>>>> the public it should be done FOR the public.
>>>>>>>>>>>
>>>>>>>>>>> 2. add in a small amount of functionality designed to facilitate
>>>>>>>>>>> dealing with privacy concerns
>>>>>>>>>>>        At the level of this project, that would probably just be
>>>>>>>>>>> the controls on what data gets sent to what people.  There might be
>>>>>>>>>>> reasons for adding such controls anyway; some tasks could be
>>>>>>>>>>> designated for only "trusted" users.
>>>>>>>>>>>
>>>>>>>>>>> Either way I doubt that this will be a big issue.  I think maybe a
>>>>>>>>>>> bigger issue is how to run arbitrary code efficiently and securely.
>>>>>>>>>>>
>>>>>>>>>>> I see only a few solutions:
>>>>>>>>>>>
>>>>>>>>>>>        Don't allow arbitrary code, but only a defined set of tasks.
>>>>>>>>>>> Or, similarly, allow some "trusted" set of tasks, each separately
>>>>>>>>>>> ported to each platform (like BOINC).
>>>>>>>>>>>
>>>>>>>>>>>        Use Java.  This lets us easily sandbox it and is
>>>>>>>>>>> cross-platform, but sacrifices a bit on efficiency.  Also, Java can
>>>>>>>>>>> be annoying (although other JVM languages would also work in this
>>>>>>>>>>> situation).
>>>>>>>>>>>
>>>>>>>>>>>        There are ways of running cross-platform C/C++ code in a
>>>>>>>>>>> sandbox as well.  One possibility is to use LLVM, although the LLVM
>>>>>>>>>>> developers specifically say that LLVM is NOT designed to be used
>>>>>>>>>>> this way.  Another possibility is to use a sandboxed code system
>>>>>>>>>>> that works on multiple operating systems but only on x86.  This
>>>>>>>>>>> includes things like VX32, which is apparently portable to Windows,
>>>>>>>>>>> but hasn't been ported.  I don't know whether or not that sort of
>>>>>>>>>>> thing is within our abilities.  Another option might be Google
>>>>>>>>>>> Native Client; that is designed to be used in a web browser but I
>>>>>>>>>>> don't know how hard it would be to "rip out" the sandboxing/cross-OS
>>>>>>>>>>> x86 code stuff.
>>>>>>>>>>>
>>>>>>>>>>> Michael Cohen
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Mailing list: https://launchpad.net/~modred
>>>>>>>>>>> Post to     : modred@xxxxxxxxxxxxxxxxxxx
>>>>>>>>>>> Unsubscribe : https://launchpad.net/~modred
>>>>>>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>>>>>>
>>>>>>>>>>>
>>>>
>>>
>>
>>
>
>


-- 
Scott Lawrence

Webmaster
The Blair Robot Project
Montgomery Blair High School


