SecureSqueak

Last edited by Michael van der Gulik 12 years, 9 months ago

 

Project locations

 

This is the main "Web Site" for SecureSqueak.

Latest News on blogspot.com.

Project page on sourceforge.net.

Squeak code on squeaksource.com.

Subversion repository (for Namespaced code) on sourceforge.net.

Download images / releases.

Bug reporting / Feature requests on sourceforge.net

 

(yes, I'm making good use of the Internet :-) )

 

There is a downloadable image to demonstrate namespaces here. A video demoing this is here on Vimeo.com.

 

There is no mailing list; contact mikevdg at gulik.co.nz for information. If more than one person shows interest, I'll make a mailing list.

 

Project Description

 

SecureSqueak is going to be a minimal, secure fork of Squeak. The goal of SecureSqueak is to allow any untrusted, foreign code to run in your image in a manner that is secure and will not affect other objects in the image. This means:

 

  • Untrusted code has no access to plug-ins, devices, global variables or any other means of compromising system performance and security unless such privileges have been granted.
  • Untrusted code cannot consume excessive resources (CPU, memory, disk space, network bandwidth) such that other objects are denied these resources.
  • The system as a whole is as stable and reactive as possible.

 

Classes inside SecureSqueak will be organised using Namespaces and Packages. SecureSqueak will consist of the following Packages:

 

  • Kernel
  • Collections (a basic, minimal version of this package. The user may load a more complex Collections package later.)
  • Namespaces
  • Dominions (probably integrated into Kernel)
  • Networking
  • Files (may be removed by the user?)
  • Compiler (may be removed by the user?)
  • REPLServer (may be removed by the user?)
  • Debugger API that can be used by graphical or CLI debuggers.
  • Debugger UI, probably based on REPLServer.
  • Canvas (also called Subcanvas) for 2D graphics.
  • (this list is work in progress)

 

SecureSqueak is initially intended to be a kernel for running DPON on. DPON is a distributed object architecture that will allow for remote Package loading over a network. The ability to load Packages from a file will also be included. It is intended that other people can use SecureSqueak for other projects, such as web servers.

 

As a core system, SecureSqueak's UI will be REPLServer or perhaps a simple console like KernelImage has. The intention is that further packages are loaded to make the system more user friendly.

 

Versions

 

SecureSqueak-TODO list. See also the bug reporting system on SourceForge.net.

 

Items with "UGP:" in front of them are items for the Unnamed Grand Project rather than SecureSqueak. This is a convenient place to keep these items.

 

The main priority is to get a working system going first. Components will be skeletal until extra functionality is needed. "Status: done" means that the item has been completed enough for that particular version; it does not mean that the item is complete or stable. Work will continue on all components as extra functionality or bug fixes are needed.

 

Version 0.1 - Make the infrastructure:

  • Namespaces (Status: done).
  • Usable tools (Status: done).
  • A basic Canvas package (actually called "Subcanvas") with event handling (Status: mostly done).
  • Port DPON to Namespaces.
  • UGP: Make a simple SiteBrowser (Status: in progress).
  • Features:
    • Package loading is atomic.
    • Packages can have circular dependencies.
    • Namespaces!
    • You can view Sites on a remote test image.
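
One way the two loading features above (atomic loading and circular dependencies) can fit together is sketched below. This is only an illustration in Python, not SecureSqueak code: the names `load_packages` and `LoadError`, and the representation of a package as a pair of definitions and required names, are assumptions made for the example.

```python
class LoadError(Exception):
    """Raised when a batch of packages cannot be installed as a whole."""


def load_packages(live, packages):
    """Atomically install `packages` into the `live` namespace dict.

    `packages` maps a package name to (definitions, requires), where
    `definitions` is a dict of the names the package defines and
    `requires` is the set of names it needs (hypothetical format).
    """
    staged = dict(live)  # work on a copy; `live` is untouched until commit

    # Pass 1: stage every definition before checking any dependency, so
    # packages that depend on each other can be loaded in one batch.
    for name, (definitions, _requires) in packages.items():
        for key, value in definitions.items():
            if key in staged:
                raise LoadError(f"{name}: {key} is already defined")
            staged[key] = value

    # Pass 2: every requirement must resolve against the staged state.
    for name, (_definitions, requires) in packages.items():
        missing = sorted(requires - set(staged))
        if missing:
            raise LoadError(f"{name}: unresolved {missing}")

    # Commit: all packages become visible together, or not at all.
    live.update(staged)


live = {"Object": object}
load_packages(live, {
    "A": ({"Foo": 1}, {"Bar"}),   # A needs Bar, which B defines
    "B": ({"Bar": 2}, {"Foo"}),   # B needs Foo, which A defines
})
```

If any package in the batch fails, `live` is left exactly as it was, because nothing is written to it before the final update.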

Version 0.1 can be downloaded from http://sourceforge.net/projects/securesqueak/

Version 0.2 - Scaffolding:

Version 0.2 is going to concentrate on source management tools and making a read namespace.

<<< Currently here >>> Changelog is above this line; plan is below this line.

  • Refactor classes: remove source code and make Class instances "binary only" (WIP).
  • Refactor code management tools so that source is managed externally to classes (WIP).
  • Import and modify the NewCompiler: http://scg.unibe.ch/research/newcompiler/ or http://www.squeaksource.com/OpalCompiler.html; ask MD.
  • Make objects completely independent of the existing Squeak classes. Literals are new instances of true, false, nil, strings, etc. 

Version 0.3 - Make it secure:

  • Refactor Namespaces to be a proper secure API.
  • Refactor Kernel: put system stuff in private namespaces.
  • APIs for secure programming: private methods, testing the sender, etc.
  • Skeletons for restricted classes.
  • Exceptions are stored - Processes that throw Exceptions not of interest to the user are suspended and made available for a sys admin to review (requires dominions; exceptions are handled by dominions).
  • Perhaps import and refactor the debugger (?)
  • Kernel tests? 
  • Method, class, variable annotations?
  • ObjectInspecter
  • ReadOnlyArray, ReadOnlyString, ReadOnlyLargePositiveInteger, ReadOnlyLargeNegativeInteger.
  • Restricted Symbols, Characters.
  • Restricted Semaphores, Delays.
  • RestrictedBlocks
  • Restricted classes.
  • Restricted MessageCaptures.
  • Restricted Exceptions.
  • No privileged methods (typically primitives).
  • Refactor Kernel: make wrappers for SmallInteger; create Object subclass: DominionedObject.
  • Dominions API (Status: prototype kind of works, on back burner for now).
  • Unicode support

Version 0.4 - Make it stable:

  • Can debug forking processes - Modify the debugger to automatically open any forked processes in another debug window.
  • Dominions functionality.
  • Run SmallLint.
  • Lots of tests.
  • Make FacetiousVM - the VM that tries really hard to crash your concurrent code.
  • Make WAN functionality; only allow network connections from/to trusted hosts until the code is proven secure. 

 

Version 0.5 - Make it usable:

  • Management tools, possibly using REPLServer? Manage dominions, manage exceptions.
  • Command-line telnet client (Done: it's called REPLServer. Needs porting to Namespaces).
  • Add cryptographic APIs.
  • UGP: add user management, authentication.
  • UGP: add basic widget set. 

 

Version 0.6 - Make it more secure:

  • VM modifications to harden it.
  • Bytecode verifier.
  • Lots of testing.

 

Version 0.7 - Make it small:

  • Shuffle root-object pointers so the rest of Squeak gets GCed.

 

Version 0.8 - Make it fast:

  • Remove the SmallInteger wrapper (keeping it for possible ports to other VMs) and use a modified system SmallInteger.
  • Use primitives again.
  • etc.

Version 1.0:

  • UNCRASHABLE

 

 

Plan of Attack

 

At some stage after this:

 

  1. Write user interface tools (ReactiveUI).
  2. Write UI / Site development tool.
  3. Make example application (a TODO list manager?)
  4. Implement sound, 3-D APIs?
  5. Implement device access (disk, USB etc).

 

 

 

VM Modifications

 

Scheduling

 

Bootstrapping SecureSqueak

 

 

The standard Squeak release can be used to host SecureSqueak. If none of the SecureSqueak packages have any dependencies on the SystemDictionary bindings, then a minimal image can be made by removing keys from SystemDictionary.

 

Literals need to refer to their SecureSqueak equivalents. These could be:

  • true, false, nil.
  • SmallInteger, including -1, 0, 1, 2 which have special bytecodes. These will need to be done using a SmallInteger wrapper (yes, performance will die).
  • Numbers: Float, Fraction(?), Points(?), large positive and negative integers, etc?
  • Characters, Strings, Symbols.
  • Arrays made with #() notation.
  • Associations?
  • BlockContexts. MethodContexts aren't available to the code.
  • thisContext access -> return a wrapper maybe?
  • Message. If a user overrides >>doesNotUnderstand:, they can get a Message instance.
  • (what about CannotReturnSelector and MustBeBooleanSelector? MustBeBoolean)
  • Exceptions
  • (I believe these are safe: CompiledMethod, streams,

 

(is ContextPart>>blockCopy: safe?)

 

TODO: go through the Blue Book VM chapters looking for special objects and ways to get special objects, such as >>hash, >>asOop, >>nextInstance, etc.

 

The SecureSqueak compiler (separate from the Squeak compiler) must also avoid optimising the following selectors, which Squeak normally treats as optimised special selectors:

 
  + - < >
  <= >= = ~=
  * / \\ @
  bitShift: // bitAnd: bitOr:
  (at:) (at:put:) (size) (next)
  (nextPut:) (atEnd) == class
  blockCopy: value value: (do:)
  (new) (new:) (x) (y)

 

Making classes secure

Any application can make a class, and that class can't get up to mischief, provided that:

  • The class is well-formed: it has a valid superclass reference, valid format and method dictionary. In other words, a constructor is called that does not give outsiders access to invalid or partially constructed classes, and all setter methods are defensively written.
  • It is impossible to change the class of an object (changing an object's class is currently done using becomeForward:).
  • The only way to make a new CompiledMethod is as the result of compiling IR (a format in the OpalCompiler) from a trusted compiler. 
  • The API on a Class does not give access to CompiledMethods or class variables (perhaps locking these out after baking).

 

Untrusted code may not construct Class instances directly: the three instance variables in Class (superclass, method dictionary and format) can crash the VM if they are incorrect.

 

Instead, classes are made with these methods:

Class>>subclass:instanceVariables: (and friends)

 

Classes, once made, can be "baked" to prevent further modifications.  "Final" classes (non-subclassable classes) can be made by overriding these methods.
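
The conditions in this section can be sketched roughly as follows. This is an illustrative Python sketch, not SecureSqueak code: `SecureClass`, `FinalClass`, `BakedError` and `responds_to` are hypothetical names, and the sketch assumes the constructor is reachable only by trusted code, which Python itself does not enforce.

```python
class BakedError(Exception):
    """Raised when code tries to modify a class after it has been baked."""


class SecureClass:
    """Hypothetical stand-in for a class description."""

    def __init__(self, name, superclass, instance_variables):
        # Validate everything before storing any state, so no caller can
        # observe an invalid or partially constructed class.
        if superclass is not None and not isinstance(superclass, SecureClass):
            raise TypeError("superclass must be a SecureClass or None")
        names = tuple(instance_variables)
        if not all(isinstance(n, str) and n.isidentifier() for n in names):
            raise ValueError("instance variable names must be identifiers")
        if len(set(names)) != len(names):
            raise ValueError("duplicate instance variable name")
        self.name = name
        self._superclass = superclass
        self._instance_variables = names
        self._methods = {}
        self._baked = False

    def subclass(self, name, instance_variables=()):
        """Plays the role of Class>>subclass:instanceVariables:."""
        return SecureClass(name, self, instance_variables)

    def add_method(self, selector, method):
        if self._baked:
            raise BakedError(f"{self.name} is baked")
        self._methods[selector] = method

    def bake(self):
        """Prevent any further modification of this class."""
        self._baked = True

    def responds_to(self, selector):
        # Answers a yes/no question without handing out the method itself.
        cls = self
        while cls is not None:
            if selector in cls._methods:
                return True
            cls = cls._superclass
        return False


class FinalClass(SecureClass):
    """A non-subclassable class, made by overriding the subclass method."""

    def subclass(self, name, instance_variables=()):
        raise TypeError(f"{self.name} is final and cannot be subclassed")
```

For example, after `Base = SecureClass("Base", None, ["x"])` and `Base.bake()`, a call to `Base.add_method` raises `BakedError`, while `Base.subclass("Sub", ["y"])` still works because baking and finality are separate properties.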

 

Adding methods to classes

In SecureSqueak, untrusted code can be run in a secure sandbox: if the code does not have access to dangerous objects such as SystemDictionary, and bytecodes cannot do invalid things such as bad jumps, then code cannot damage an image. Instead of validating code, my approach is to take the IR (intermediate representation) and run the last step of compilation over it. When code is compiled, the second-to-last step is to make IR, which is an object graph of the instructions. The last stage of compilation just iterates over these, producing bytecodes. IR must consist of valid instructions by definition, and the compilation result is guaranteed by the IR translator to be valid bytecodes.

 

Possible ways of implementing this are:

  • Pass the IR graph directly to Class>>addMethod:. Class>>addMethod: will then defensively iterate over that graph using a defensively made IRTranslator, producing code.
  • The Compiler itself iterates over the IR graph using an IRTranslator given by the Class.

 

The IR graph might not need to be the officially sanctioned classes: provided the visitor is coded defensively, anybody can implement their own IR classes.
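
A minimal sketch of such a defensive visitor is shown below, in Python for illustration only. It is not the OpalCompiler IR or its translator: the opcode numbers, the `Translator` methods and the node classes are all hypothetical. The point is that the trusted translator checks every argument it is handed and resolves jumps itself, so the IR node classes can come from anywhere.

```python
class InvalidIR(Exception):
    """Raised when the IR asks for something the translator will not emit."""


class Translator:
    """Trusted last compilation step: turns visitor calls into bytecodes."""

    PUSH, SEND, JUMP, RETURN = range(4)  # hypothetical opcodes

    def __init__(self):
        self._code = []    # list of (opcode, operand)
        self._labels = {}  # label name -> instruction index
        self._jumps = []   # (instruction index, label name), patched later

    def push_constant(self, value):
        if not isinstance(value, (int, str, type(None))):
            raise InvalidIR("unsupported literal")
        self._code.append((self.PUSH, value))

    def send(self, selector, arg_count):
        if (not isinstance(selector, str)
                or not isinstance(arg_count, int) or arg_count < 0):
            raise InvalidIR("malformed send")
        self._code.append((self.SEND, (selector, arg_count)))

    def label(self, name):
        if not isinstance(name, str) or name in self._labels:
            raise InvalidIR("bad or duplicate label")
        self._labels[name] = len(self._code)

    def jump(self, name):
        self._jumps.append((len(self._code), name))
        self._code.append((self.JUMP, None))

    def return_top(self):
        self._code.append((self.RETURN, None))

    def finish(self):
        if not self._code or self._code[-1][0] != self.RETURN:
            raise InvalidIR("method must end with a return")
        for index, name in self._jumps:
            target = self._labels.get(name) if isinstance(name, str) else None
            if target is None or target >= len(self._code):
                raise InvalidIR("jump to an unknown or out-of-range label")
            self._code[index] = (self.JUMP, target)
        return tuple(self._code)  # a copy; later calls cannot alter it


def translate(ir_nodes):
    translator = Translator()
    for node in ir_nodes:
        node.accept(translator)  # nodes may be of any class
    return translator.finish()


# Untrusted, user-supplied IR node classes (hypothetical).
class Push:
    def __init__(self, value): self.value = value
    def accept(self, visitor): visitor.push_constant(self.value)

class Send:
    def __init__(self, selector, arg_count):
        self.selector, self.arg_count = selector, arg_count
    def accept(self, visitor): visitor.send(self.selector, self.arg_count)

class Jump:
    def __init__(self, name): self.name = name
    def accept(self, visitor): visitor.jump(self.name)

class Return:
    def accept(self, visitor): visitor.return_top()


code = translate([Push(3), Push(4), Send("+", 1), Return()])
```

A node can only influence the output through the translator's methods, so a jump to a label that was never defined is rejected in `finish` rather than becoming a bad bytecode.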

 

Is it necessary to hold on to the IR for debugging or other reasons?

  • No: the debugger should show proper source code. If absent, the debugger should either disassemble bytecodes, or show bytecodes.
  • Maybe: if another remote host asks for that code, we need to have it. Alternatively, a class should be considered a compilation result, and a class's replication algorithm should hold on to the redistributable code (perhaps as a byte stream for speed). 
  • Perhaps a profiler would find it useful? No - bytecodes are still better.
  • Perhaps on an architecture where there are no bytecodes and the IR is compiled directly to assembly or C, the IR would be useful. 

 

Defensive programming guide

 

DPON

 

DPON ("Distributed Persistent Object Network") is a distributed object framework that will run on top of SecureSqueak. It will provide remote object replication and remote class loading.

 

Unnamed Grand Project

SecureSqueak and DPON are part of the Unnamed Grand Project.

 

Localisation 

 

Kernel modifications

  • Remove a LOT of stuff! Too much to list.
  • Remove most >>asXXX methods, replace with >>as: aClass.
  • Add security methods (>>domain, Sender, etc).
  • Add namespace stuff.
  • Add >>acceptVisitor: to all classes.
  • Add Integer>>mm (?)
  • "10 seconds wait." (i.e. Duration>>wait).
  • A special device namespace? Or registrar?
  • Semaphores: add more functionality e.g. signalAll.
  • BlockContext secure methods to make sure code returns, etc.
  • BlockContext>>assert?
  • self privateMethod? Sender privateMethod? Sender mustBe: self?
  • Collection>>,, ? "1,,2,,3,,4,,5" -> {1. 2. 3. 4. 5}?
  • Case statements somehow? "(1->[obj hello],, 2->[obj world]) caseOf: 2"?

 

Try to keep any complex systems out of the kernel if possible. This includes:

 

  • Unicode. There's lots of complexity here. The problem is that Unicode might be used in literals, file I/O, copy/paste UI stuff, method source and keyboard input.
  • Time. Only include support for fetching time as a number of seconds since the epoch. Time is very complicated: calendars, timezones, daylight saving, internationalised names of everything, etc.
  • Formatting and localisation. Object>>asString should represent something Smalltalkish; other external mechanisms need to be used for localisation. Each user needs a preference which determines which locale they use.
  • Math operations. I don't know how this would be achieved as Float would be part of the kernel, but a lot of operations have seen attention lately. Perhaps provide a BasicFloat that should be subclassed?
  • Collections. A very basic set of collections is needed for the kernel.

 

How could Unicode be implemented as an external package? Perhaps allow method literals to be any object (potentially mutable)? Symbols and ReadOnlyStrings could be limited to 8-bit characters? I like the idea of having names (package, namespace, class, method, symbols) containing Unicode characters and perhaps even being able to be translated.

 

Using 8-bit characters (i.e. the standard Squeak character set, or ASCII, or only Unicode code points up to 255) simplifies the following: Character>>isNumber (there are many sets of numbers in Unicode), String>>toUpper (apparently this can change for the Latin alphabet depending on locale), String>>= (Unicode has many different characters that are equivalent), and Character>>< (character ordering changes per language).
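
Some of these complications can be observed in any Unicode-aware language. The short Python sketch below is only an illustration (Python's built-in string methods are not locale-aware, so it shows the length-changing upper-case of "ß" rather than a locale-dependent case); `is_8bit` is a hypothetical helper for the restricted character set discussed here.

```python
import unicodedata

# Many scripts have their own digits, so "is this a number?" is no longer
# a simple range check.
arabic_three = "\u0663"  # ARABIC-INDIC DIGIT THREE
digit_ok = arabic_three.isdigit() and int(arabic_three) == 3

# Upper-casing can change the length of a string.
upper_strasse = "stra\u00dfe".upper()

# The same visible character can have more than one encoding, so equality
# needs normalisation first.
composed = "\u00e9"     # e with acute accent as a single code point
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT
equal_raw = composed == decomposed
equal_normalised = unicodedata.normalize("NFC", decomposed) == composed

# Code point order is not dictionary order: German sorts "ä" near "a",
# but by code point it comes after "z".
by_code_point = sorted(["z", "\u00e4", "a"])


def is_8bit(text):
    """True if every character fits the 8-bit set proposed for the kernel."""
    return all(ord(ch) < 256 for ch in text)
```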

 

Image checkpointing

Until a better VM is made which supports intelligent checkpointing and automatic recovery, the Squeak VM can be hacked to do checkpointing:

  • Every 1 minute or so, the VM saves the image automatically. We make the assumption that saving the image is a fairly non-intrusive operation.
  • Each time the image is saved, old snapshots of the image are retained.

 

Multiple archived snapshots of the image could be retained to provide history. For example, multiple snapshots could be kept on disk which are:

  • The latest 1-minute snapshot (i.e. age of 0-1 minute)
  • Between 1 and 10 minutes old.
  • 10-60 minutes old.
  • 60 minutes - 1 day old.
  • 1 day - 1 week old.
  • 1 week - 3 months old.

 

This allows the user to go back to one of these images, up to 3 months ago (or so, depending on implementation). When a snapshot reaches its "expiry" date, it will wait until the snapshot above it (on the list above) also expires and then copy that old snapshot onto itself. The snapshot above it would do the same, such that the snapshots would all move down one place.
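
The rotation described above could look roughly like the following Python sketch. It is an illustration under stated assumptions, not an implementation for the Squeak VM: timestamps stand in for image files, "3 months" is taken as 90 days, and the names `MAX_AGE` and `checkpoint` are made up for the example.

```python
from typing import List, Optional

# Upper age bound of each slot in seconds, newest slot first. These mirror
# the list above; "3 months" is assumed to mean 90 days.
MAX_AGE = [60, 600, 3600, 86400, 7 * 86400, 90 * 86400]


def checkpoint(slots: List[Optional[float]], now: float) -> None:
    """Record a snapshot taken at `now`, shifting expired snapshots down.

    `slots[i]` holds the time at which the snapshot in slot i was taken,
    or None if the slot is empty.
    """
    # Start at the oldest slot so that each slot is vacated before the
    # slot above it needs to move down.
    for i in range(len(slots) - 1, 0, -1):
        older, newer = slots[i], slots[i - 1]
        older_expired = older is None or now - older >= MAX_AGE[i]
        newer_expired = newer is not None and now - newer >= MAX_AGE[i - 1]
        # An expired snapshot waits until the one above it has also
        # expired, then is replaced by it.
        if older_expired and newer_expired:
            slots[i] = newer
            slots[i - 1] = None
    slots[0] = now  # the newest slot is overwritten on every checkpoint


# Simulate one checkpoint per minute for ten minutes.
slots: List[Optional[float]] = [None] * len(MAX_AGE)
for minute in range(11):
    checkpoint(slots, minute * 60)
```

After this run the first three slots hold snapshots taken at 600, 540 and 0 seconds, i.e. one fresh snapshot, one that is a minute old and one that is ten minutes old; the remaining slots are still empty.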

 

Links and references

 

Squeak-E: http://www.erights.org/history/squeak-e.html

Islands: http://wiki.squeak.org/squeak/2074

Capabilities: http://wiki.squeak.org/squeak/uploads/2074/sandbox.html

Portability across platforms: http://www.seaside.st/community/conventions

http://home.cc.gatech.edu/tony/uploads/32/SharedSmalltalk.htm

 

Java security holes, mostly historical: http://www.securingjava.com/chapter-five/

Also Google for Java security issues which may provide inspiration for tests.

 

- Should classes be able to implement their own doesNotUnderstand: method?

 
