Namespaces

Page history last edited by Michael van der Gulik 14 years, 6 months ago

 

Namespaces for Squeak

 

Quickstart 

Go back to the SecureSqueak page and download that image.

 

NamespaceBrowser Tutorial

 

Introduction

 

As part of SecureSqueak, I'm going to implement hierarchical Namespaces for Squeak. These will resemble Java's packaging system closely.

Namespaces are required to make Squeak more secure:

 

  • Although classes are immutable, adding something with a bad hash implementation to the global SystemDictionary will break that SystemDictionary and damage your image.
  • Having access to the global SystemDictionary means having access to all classes and global objects in that SystemDictionary, essentially giving access to the full image. Access should be reduced to only what is needed for a module to function.
  • Having access to the global SystemDictionary also means that any devices and primitive methods (i.e. native code, interfaces to things outside Squeak) in that SystemDictionary are available to any loaded code that also has access to that SystemDictionary.

 

Packages are also described below and allow for the following:

 

  • Multiple versions of the same Package can be loaded into an image. Package dependencies are recorded for exact versions of packages, meaning that with a good package loading system, a Package will run in the same environment as it was written and tested in. Particular versions of packages can be loaded into an image automatically to satisfy dependencies.
  • Packages as objects in an image provide many useful abilities to programmers: changes ("diffs" or "changesets") between versions of packages can be managed as objects in an image. Packages can be serialized, meaning they can be filed out and in or sent over a network. Package versions can be integrated somehow with a version control system.
  • Packages give a good unit of packaging for loading remote code over a network or from a file.

 

 

Namespaces are used by the compiler to resolve literals in code to their values. Usually these literals are Classes, but they can also be global variables. Namespaces are used by developers to organise classes and global variables, and are also used when binding literals in remotely loaded code to local named objects.

 

Currently in Squeak, any literal in code that refers to a global variable could refer to:

 

  • A global variable in the "Smalltalk" SystemDictionary (usually classes).
  • Any shared pools that the class uses.
  • Any class instance variables in that class.

 

 

Namespaces are intended to replace global variables in the "Smalltalk" SystemDictionary and shared pools. Class instance variables are going to stay for the time being.

 

Namespaces

 

In most programming languages, a packaging, module or namespacing system is available for managing a large amount of code. This Namespace implementation for Squeak is an attempt to help the programmer manage a large amount of code as well.

 

Typically in a project in a legacy file-based programming language, a developer would organise the code he writes into a hierarchy of directories. In Squeak, no files are used and the code stays in the image (or rather, the attached "sources" file). This namespace approach allows the developer to organise his code into a hierarchy of classes.

 

This hierarchy is made of Namespaces. A Namespace is an object in Squeak that subclasses from Dictionary. You add associations from #Symbols to other objects in it. Those objects can be Classes, global variables or other Namespaces.

 

Say, for example, that you created a Namespace and gave it the name "Fruit".

 

ns := Namespace new name: #Fruit.

 

Then you could add some classes to it:

 

ns addClass: Orange.
ns addClass: Apple.

 

This is functionally equivalent to:

 

ns at: #Orange put: Orange.
ns at: #Apple put: Apple.

 

The addClass: method makes sure that the class's environment is bound to that namespace and that it is added to the namespace with the correct name.

 

Now, in your code, you could refer to "Fruit.Orange" and "Fruit.Apple", which would return those classes respectively. Note that you would need to use import lists to do this, but we'll cover those later.

 

If you then added a subnamespace:

subns := Namespace new name: #Berries.
ns addNamespace: subns.

 

And added a class to that:

subns addClass: Raspberry.

 

Then you could refer to "Fruit.Berries.Raspberry" in your code. "Fruit.Berries.Raspberry" is another way of saying "((Fruit at: #Berries) at: #Raspberry)" in your code, except that the name is looked up at compile time rather than runtime.
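
The nesting described above can be modelled outside Squeak. The following is a minimal Python sketch, not the SecureSqueak implementation: the class `Namespace`, its methods `add_class`, `add_namespace` and `resolve`, the `environment` attribute and the `root` namespace are all hypothetical names chosen for illustration. It only shows that resolving a dotted name amounts to a chain of dictionary lookups.

```python
class Namespace(dict):
    """Toy model: a dictionary mapping names to classes, values or other namespaces."""

    def __init__(self, name):
        super().__init__()
        self.name = name

    def add_class(self, cls):
        # Mirrors the role of addClass: bind the class to this namespace
        # and store it under its own name.
        cls.environment = self
        self[cls.__name__] = cls

    def add_namespace(self, child):
        self[child.name] = child

    def resolve(self, dotted):
        # "Berries.Raspberry" becomes self["Berries"]["Raspberry"].
        target = self
        for part in dotted.split("."):
            target = target[part]
        return target


class Orange:
    pass


class Raspberry:
    pass


fruit = Namespace("Fruit")
fruit.add_class(Orange)

berries = Namespace("Berries")
fruit.add_namespace(berries)
berries.add_class(Raspberry)

# Hypothetical enclosing namespace so that "Fruit" itself can be looked up.
root = Namespace("root")
root.add_namespace(fruit)

resolved = root.resolve("Fruit.Berries.Raspberry")
```

In this model the lookup happens every time `resolve` is called; the design described here performs the equivalent lookup once, when the method is compiled.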

 

 

Technicalities

 

A Namespace is a subclass of Dictionary, and maps names (Symbols) to objects. Objects in a Namespace can be:

 

  • Classes
  • sub-Namespaces.
  • Global named objects.

 

Namespaces are hierarchical. The children in a Namespace are stored in the parent Namespace like any other object or class.

 

When a literal is referred to in a method, that method's class's Namespace (the "local Namespace") is first searched for that literal. Sub-Namespaces are searched if a dotted notation (e.g. "Collection.Dictionaries.IdentityDictionary") is used. Namespaces are always relative, so if the local Namespace has a sub-namespace called "Collection" then that would be searched.

 

If the literal is not found in the local Namespace, then that Namespace's "import list" is searched. The "import list" is a list of other Namespaces and Packages which are searched, in order, for that literal. An import list is only ever consulted by searches originating from its own (local) Namespace; the import lists of imported Namespaces are not followed.

 

These are the only two places that are searched (i.e. the local Namespace and the import list). If the literal is not found after searching the local namespace and import list, then it is not found and an error is raised. This means that if a method uses fully qualified literal names (i.e. names starting from the root Namespace) then the root namespace with that name will need to be included in the local Namespace's import list.
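
The two-step search can be sketched as follows. This is a Python model under stated assumptions, not the actual compiler code: `Namespace`, `lookup`, `LiteralNotFound` and the example namespaces (`MyApp`, `Helper`, `Kernel`) are hypothetical, and classes are represented by placeholder strings.

```python
_MISSING = object()


class LiteralNotFound(Exception):
    """Raised when a name is in neither the local namespace nor its import list."""


class Namespace(dict):
    def __init__(self, name, imports=()):
        super().__init__()
        self.name = name
        self.imports = list(imports)

    def _walk(self, dotted):
        # Follow a dotted name through nested namespaces, relative to self.
        target = self
        for part in dotted.split("."):
            if not isinstance(target, Namespace) or part not in target:
                return _MISSING
            target = target[part]
        return target

    def lookup(self, dotted):
        # Step 1: the local namespace. Step 2: each import-list entry, in order.
        # The import lists of imported namespaces are deliberately not followed.
        for candidate in [self, *self.imports]:
            found = candidate._walk(dotted)
            if found is not _MISSING:
                return found
        raise LiteralNotFound(dotted)


kernel = Namespace("Kernel")
kernel["Object"] = "<class Object>"

dictionaries = Namespace("Dictionaries")
dictionaries["IdentityDictionary"] = "<class IdentityDictionary>"
collection = Namespace("Collection")
collection["Dictionaries"] = dictionaries

helper = Namespace("Helper", imports=[kernel])
local = Namespace("MyApp", imports=[helper])
local["Collection"] = collection
```

With this setup, `local.lookup("Collection.Dictionaries.IdentityDictionary")` succeeds through the local sub-namespace, while `local.lookup("Object")` raises `LiteralNotFound` until `kernel` is appended to `local.imports`, even though `helper` itself imports `kernel`.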

 

Packages

 

Developers would organise a related group of classes into a Namespace hierarchy with a common root. In this approach, that root is a special subclass of Namespace called Package. A Package stores a number of Namespace hierarchies inside it.

 

For example, imagine again the Fruit namespace we created earlier. Every namespace lives either in another namespace or in a Package, like this:

 

p := Package new name: #FruitPackage.
p addNamespace: ns. "Remember that ns was the Fruit namespace. "

 

We could also add more Namespaces to this package if we wanted.

 

In this namespace approach, Packages are not static entities that have been filed out to disk. They are live objects and the developer edits code directly in the package. Users create objects which are instances of classes stored in packages. The Package objects are meant to replace the Smalltalk SystemDictionary as a means of managing classes.

 

A tool called the PackageManager is provided for developers to manage Packages. A PackageManager presents the developer with a window on the screen displaying a list of the Packages that are available in that PackageManager. This user interface allows the developer to create new packages, manage dependencies between them, compare differences between packages and search packages for a particular class or global variable. There is no specific need to use a PackageManager; Packages can exist on their own provided that there is a reference to them in the local image, otherwise they will be garbage collected.

 

This is the PackageManager GUI with the context menu open on the Subcanvas package:

 

 

A Package can be exported to a file and imported into another image. Classes live only in Packages and are not added to the SystemDictionary.

 

The Package class is a subclass of the Namespace class. A Package is a real object that has information about itself - name, version, author, date etc. Different versions of the same Package are unique objects with different values for their "version".

It is possible to load two different versions of the same Package into an image and make instances of classes from either Package. Even though classes may have the same name, they are in different Packages and so code using those classes will use the class from the correct Package.
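
As a rough illustration of two versions coexisting, here is a small Python sketch. It is a model only: `Package`, `make_widget`, `lookup` and the `Widgets` package are hypothetical, and the import lists are plain Python lists.

```python
class Package(dict):
    """Toy package: a dictionary of names plus some identifying metadata."""

    def __init__(self, name, version):
        super().__init__()
        self.name = name
        self.version = version


def make_widget(label):
    # Builds a distinct class named "Widget" for each package version.
    return type("Widget", (), {"describe": lambda self: label})


widgets_v1 = Package("Widgets", 1)
widgets_v1["Widget"] = make_widget("old behaviour")

widgets_v2 = Package("Widgets", 2)
widgets_v2["Widget"] = make_widget("new behaviour")


def lookup(name, import_list):
    # Search the import list in order; the first match wins.
    for namespace in import_list:
        if name in namespace:
            return namespace[name]
    raise KeyError(name)


# Two pieces of client code, each importing a different version.
legacy_imports = [widgets_v1]
modern_imports = [widgets_v2]

legacy_widget = lookup("Widget", legacy_imports)()
modern_widget = lookup("Widget", modern_imports)()
```

Both classes are called `Widget`, yet each client gets the class from the package version named in its own import list.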

The structure of code in an image would then be as follows:

 

  • A singleton PackageManager stores a list of Packages. Multiple versions of the same Package may be present in this PackageManager.
  • Each Package forms the root of a hierarchy of related Namespaces.
  • Each Namespace then contains global variables and classes.

 

If a Package is removed from the PackageManager but references to that Package are still in the image (e.g. in an import list somewhere), then that Package remains in the image. Packages are regular objects in this regard.

 

PackageManagers are only of use to developers and for importing code. An end-user system would be completely operable without any PackageManagers. If the user has a reference to a particular object, then that object will have a reference to its class, that class would have a reference to its Namespace and that Namespace would have references to its Package and all other Packages it depends on. If that object is removed from the image and had the only reference to that class, then the Namespace, Package and dependent Packages would all be garbage collected.
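
The reachability chain above (object, class, Namespace, Package, dependent Packages) can be demonstrated with Python's garbage collector. This is only an analogy, assuming CPython; `Node`, `build_instance` and the package names are hypothetical and nothing here reflects how the Squeak VM collects garbage.

```python
import gc
import weakref


class Node:
    """Stand-in for a Namespace or Package: a named object with links to others."""

    def __init__(self, name, **links):
        self.name = name
        self.__dict__.update(links)


def build_instance():
    dependency = Node("KernelPackage")
    package = Node("FruitPackage", depends_on=[dependency])
    namespace = Node("Fruit", package=package)
    # The class refers to its namespace; an instance refers to its class.
    apple_class = type("Apple", (), {"namespace": namespace})
    return apple_class(), weakref.ref(package), weakref.ref(dependency)


apple, package_ref, dependency_ref = build_instance()

# While the instance is referenced, the whole chain stays alive.
reachable_before = package_ref() is not None and dependency_ref() is not None

del apple       # drop the only reference to the instance
gc.collect()    # classes sit in reference cycles, so force a collection

reachable_after = package_ref() is not None or dependency_ref() is not None
```

No package manager is involved at any point: the packages live exactly as long as something reachable still refers to them.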

 

In the current implementation, a singleton PackageManager is being used to manage Packages and to search for particular classes. At some stage, I plan to create the concept of a User in Squeak. Each User would then have a collection of his/her own objects, possibly including PackageManagers if that user is a developer.

 

(TODO:) When first loaded, Packages are read-only. When the user edits a Package, a new version is created. When the user is finished, the user "commits" the Package, making it read-only and perhaps synchronising it with some remote Package management system.

 

(TODO:) When a user compares differences between two packages (or an edited package against its original version), the differences should be an object that can be manipulated. Example operations include applying it to another package, making selections from it (thus making new "Diff" objects), filing it out to a changeset, etc. It would also be useful to have "Diff streams" that other users can "subscribe" to (cf: the discussions about "Supertool" on Squeak-dev).

 

Import Lists

 

Import lists are how values for names are found. Currently in Squeak, when the compiler is looking for a class or global variable name that you typed in, it would search for that name in the SystemDictionary called "Smalltalk". With this namespace implementation, this is no longer the case. Every namespace has what is called an import list. The compiler first searches the local namespace of that class, and then searches the local namespace's import list. These are the only two places the compiler searches; if the name is not found then the code will not compile.

 

In other words, to be able to refer to classes and global variables by name, you need to add the namespace that contains them to the import list. The import list is an OrderedCollection containing a number of Namespaces which are searched for names that you use in code. The NamespaceBrowser makes this job easier for you by automatically adding namespaces to the import list if it can find them using the PackageManager.

 

Import lists are how dependencies between packages are formed. The dependencies of a package are the collection of all import list entries in all namespaces that the package contains.
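
Computing a package's dependencies from its import lists could look like the following Python sketch. It is a model only: `Namespace`, `Package`, `add_namespace`, `dependencies` and the example packages are hypothetical, and each namespace is assumed to know which package it belongs to.

```python
class Namespace(dict):
    def __init__(self, name):
        super().__init__()
        self.name = name
        self.package = None   # the Package this namespace belongs to
        self.imports = []     # the import list

    def add_namespace(self, child):
        child.package = self.package
        self[child.name] = child
        return child


class Package(Namespace):
    def __init__(self, name):
        super().__init__(name)
        self.package = self   # a package is the root of its own hierarchy


def dependencies(package):
    """Collect the packages behind every import-list entry in the hierarchy."""
    found = []

    def visit(namespace):
        for imported in namespace.imports:
            owner = imported.package
            # Compare by identity; skip imports from within the same package.
            if owner is not package and not any(owner is p for p in found):
                found.append(owner)
        for value in namespace.values():
            if isinstance(value, Namespace):
                visit(value)

    visit(package)
    return found


kernel = Package("Kernel")
graphics = Package("Graphics")
shapes = graphics.add_namespace(Namespace("Shapes"))

app = Package("App")
ui = app.add_namespace(Namespace("UI"))
model = app.add_namespace(Namespace("Model"))
ui.imports = [kernel, shapes, model]
model.imports = [kernel]

dependency_names = [p.name for p in dependencies(app)]
```

Importing `Graphics.Shapes` makes the whole `Graphics` package a dependency, while the import of `Model` by `UI` is internal to `App` and is not counted.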

 

When a Package is exported to disk (currently in chunk format with a .st extension), each Namespace is written with its import list. Each import list item (which is of type Namespace or Package) is exported with the UUID of the package the imported Namespace comes from. A package matching that UUID (i.e. the same version) must be present in the image when the exported Package is loaded.
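
The UUID check at load time can be sketched in Python as follows. This is an assumption-laden model: the real export is a chunk-format .st file, whereas the `manifest` dictionary, `export_manifest`, `missing_dependencies` and the package names here are hypothetical.

```python
import uuid


class Package:
    def __init__(self, name, version):
        self.name = name
        self.version = version
        self.uuid = uuid.uuid4()   # every version gets its own UUID


def export_manifest(package, imported_packages):
    # Record the exact versions (by UUID) that the import lists point at.
    return {
        "name": package.name,
        "version": package.version,
        "uuid": str(package.uuid),
        "requires": [str(p.uuid) for p in imported_packages],
    }


def missing_dependencies(manifest, loaded_packages):
    # A dependency is satisfied only by a loaded package with the same UUID.
    present = {str(p.uuid) for p in loaded_packages}
    return [u for u in manifest["requires"] if u not in present]


collections_v10 = Package("CollectionsPackage", 10)
collections_v11 = Package("CollectionsPackage", 11)
app = Package("App", 1)

manifest = export_manifest(app, [collections_v10])

missing_with_wrong_version = missing_dependencies(manifest, [collections_v11])
missing_with_right_version = missing_dependencies(manifest, [collections_v10])
```

A package with the same name but a different version does not satisfy the dependency, because only the UUID is compared.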

 

This means that a loaded Package will import into the exact same environment it was developed and tested in, and should perform in exactly the same way as when it was tested. Some mechanism for automatically loading Package dependencies will eventually be developed to ease this process.

 

This is the "Namespaces Browser" open on the "Subcanvas" package. The import list is to the left of the code. Notice how the code refers to classes available in that import list using the dotted syntax.

 

 

Namespaces Example

 

For example, say we have a "CollectionsPackage-mvdg-10" package. Inside this package there is a "Collections" namespace. Inside the "Collections" namespace there is an "Ordered" namespace. Inside the "Ordered" namespace there is a "String" class (note: this example is made up). In other words, the CollectionsPackage-mvdg-10 package contains the following names:

 

  • Collections
  • Collections.Ordered
  • Collections.Ordered.String

 

Your code can then add the "CollectionsPackage-mvdg-10" package to its Namespace's import list and then use code such as "myString := Collections.Ordered.String new.". Alternatively, you could add the "CollectionsPackage-mvdg-10.Collections.Ordered" Namespace to your local namespace's import list and simply write "myString := String new.".

 

In your code, "Collections.Ordered.String" is a shorter way of writing "(Collections at: #Ordered) at: #String", except that the Compiler will bind this at compile time rather than runtime.

 

Namespaces have:

 

  • a fully-qualified name (e.g. Kernel.Object) and a local name (e.g. Object).
  • a reference to the Package in which this Namespace belongs.
  • an import list of other Namespaces and Packages to be searched when looking for a literal.

 

 

Overrides

 

I don't yet know how to implement method overrides using Namespaces. If this is going to be implemented, it must be secure.

 

The issues are:

 

  • Overrides may not have direct access to instance variables. This is a security concern: instance variables may contain information that is not intended to be available outside that class, and overrides are considered foreign untrusted code. This almost sounds like Traits.
  • Overrides must only be visible from within the Package/Namespace they are declared in. Other packages must not see any of the effects of the method override. In other words, overrides must not be able to interfere with the workings of other Packages at all.

For the time being, it is best to make a new version of any Package that you want overrides in and supply that version as well.

 

Security

 

One of the goals of this Namespaces design is to enable security. This involves preventing read and write access to namespaces that untrusted code wants to access.

 

How could this be implemented?

 

I could implement users/groups like Unix. Somehow.

 

Write-access to a Namespace would be done using a capability?

 

The remote code loader / linker could make the decision as to whether a namespace has access to another...? But what would that decision be based on?

 

"Private" namespaces may be accessible only from within their own package?

 

Special objects

 

The VM has a struct called foo in interp.c that contains an object reference to an Array called specialObjectsOop. This Array is the same as the Array returned from "Smalltalk specialObjectsArray" and contains references that the VM needs access to.

 

See them with categories and superclasses by doing:

 

Smalltalk specialObjectsArray do: [ :each |
    (each isKindOf: Class) ifTrue: [
        | nextSuperclass |
        Transcript show: each category.
        Transcript show: ':'.
        nextSuperclass := each.
        [nextSuperclass isNil] whileFalse: [
            Transcript show: nextSuperclass name.
            Transcript show: '->'.
            nextSuperclass := nextSuperclass superclass].
        Transcript cr]]

 

See also SystemDictionary>>recreateSpecialObjectsArray.

 

It contains the following classes. They are listed here with their category names -> class names - superclasses:

 

1: #'Graphics-Primitives'->Bitmap - ArrayedCollection - SequenceableCollection - Collection

2: #'Kernel-Numbers'->SmallInteger - Integer - Number - Magnitude - Object

3: #'Collections-Strings'->ByteString - ArrayedCollection - SequenceableCollection - Collection

4: #'Collections-Arrayed'->Array - ArrayedCollection - SequenceableCollection - Collection

5: #'Kernel-Numbers'->Float - Number - Magnitude - Object

6: #'Kernel-Methods'->MethodContext - ContextPart - InstructionStream - Object

7: #'Kernel-Methods'->BlockContext - ContextPart - InstructionStream - Object

8: #'Graphics-Primitives'->Point - Object

9: #'Kernel-Numbers'->LargePositiveInteger - Integer - Number - Magnitude - Object

10: #'Kernel-Methods'->Message - Object

11: #'Kernel-Methods'->CompiledMethod - ByteArray - ArrayedCollection - SequenceableCollection - Collection

12: #'Kernel-Processes'->Semaphore - LinkedList - SequenceableCollection - Collection

13: #'Collections-Strings'->Character - Magnitude - Object

14: #'Collections-Arrayed'->ByteArray - ArrayedCollection - SequenceableCollection - Collection

15: #'Kernel-Processes'->Process - Link - Object

16: #'Kernel-Methods'->PseudoContext - ProtoObject (!)

17: #'Kernel-Methods'->TranslatedMethod - ArrayedCollection - SequenceableCollection - Collection

18: #'Kernel-Numbers'->LargeNegativeInteger - LargePositiveInteger - Integer - Number - Magnitude - Object

 

These classes are dependencies by superclass:

Object - ProtoObject (of course)

ByteArray - ArrayedCollection - SequenceableCollection - Collection

Integer - Number - Magnitude

ContextPart - InstructionStream (Maybe could be refactored out? These contain significant code.)

LinkedList

Link

 

Also, some items in the special objects array are instances of these other classes:

UndefinedObject

False

True

Association

SystemDictionary

DisplayScreen

Semaphore

ByteSymbol

Array

Float

LargePositiveInteger

Point

ByteSymbol

MethodContext

BlockContext

 

These are some of the classes I think might be used directly by the VM or plugins:

Form, BitBlt

Socket

 

Instances of these classes could (potentially) be created by the VM; I'll have to look at the source code for the VM to find out what the potential problems could be. Because instances are made by the VM, only one version of each of these classes can be the "live" one in the system.

 

SmallInteger is a special type of object pointer (the low bit is set as a tag).

Message is created by the VM during an invocation to doesNotUnderstand:.

I think that most of these classes could be created by various primitive methods.

 

To put these classes in Namespaces, they will need to remain much as they are and be moved across. The original class would be moved into a Namespace (and out of the SystemDictionary). The behaviour of the class including its superclass hierarchy would have to remain the same. It is possible to have two hierarchies with two "Object" classes - the original with all the legacy cruft, and a new one for a cleaner class hierarchy. This means that while these classes retain the same behaviour and inheritance hierarchy for compatibility, newer Namespaced classes could subclass from a much cleaner Object hierarchy.

 

Interestingly, "Class" and "Behavior" are not in the list above. This may mean that the VM only looks at the variable locations and assumes that instvarN is a MethodDictionary (?) - meaning possibly that a second inheritance hierarchy of "Object" - "Behavior" - "ClassDescription" - "Class" would be possible.

 

There is also a "CompactClasses" array (element 29 of the special objects array), which holds classes that can be identified using a few bits in object headers. These are:

 

1: #'Kernel-Methods'->CompiledMethod

2: #'Kernel-Methods'->MethodProperties

3: #'Collections-Arrayed'->Array

4: #'Kernel-Methods'->PseudoContext

5: #'Kernel-Numbers'->LargePositiveInteger

6: #'Kernel-Numbers'->Float

7: #'Kernel-Methods'->MethodDictionary

8: #'Collections-Support'->Association

9: #'Graphics-Primitives'->Point

10: #'Graphics-Primitives'->Rectangle

11: #'Collections-Strings'->ByteString

12: #'Kernel-Methods'->TranslatedMethod

13: #'Kernel-Methods'->BlockContext

14: #'Kernel-Methods'->MethodContext

15: nil

16: #'Graphics-Primitives'->Bitmap

There are 31 possibilities; the last 16 are nil in my 3.10 image.

 

These will need to be dealt with on a case-by-case basis.

 

One idea is to tag all insecure or unnecessary methods on these classes and print out a warning when they are used. When the image becomes completely namespaced (and the SystemDictionary discarded), these methods are discarded.

 

These classes will need to be treated as special cases when initialising Namespaces in an image. These classes should be present in every image anyway, so "kernel" namespaces would need to be specially initialised with these classes. Named literals in these classes would be... left unchanged?? I guess named literals would also have been put in a Namespace and these classes would then still refer to the same literals.

 

Namespaces changelog

 

Comparison to other Namespace implementations:

 

VisualWorks

 

  • VisualWorks supports private namespaces. My implementation may support these in the future.
  • VisualWorks has a Smalltalk namespace. This will not exist.
  • VW sends a message to a Namespace to define a class. My approach sends a message to CodeBuilder. I (mikevdg) actually quite like their approach; I could be tempted to change.
  • In VW, imports are added on a class-by-class basis. In my approach, imports are added to the namespace.
  • VW does not include class variables in the class definition. My approach doesn't modify this from the original Squeak behaviour (yet).
  • In VW, all Namespaces have a common "Root" namespace. In my approach, there is no global root, but rather the root of a namespace hierarchy is called a Package, and there are many of these in an image. Namespaces in different Packages (in my approach) refer to each other using import lists.
  • As a result of that previous point, many Namespaces can define the same name, even if the name uses a dotted notation, e.g. many namespaces can define Kernel.Object. Code differentiates between these by correct use of the import list.
  • Both VW and my implementation use dotted notation (e.g. Graphics.SymbolicPaintConstants.ButtonHiLite).
  • My approach has no support for "Binding references", except for browser support for searching for a class during development.
  • My approach does not allow a Namespace to override methods in other Namespaces. This is because I couldn't work out how to do this securely.

 

Cincom

 

  • Cincom's implementation does not do nested namespaces.
  • (TODO)

 

References

 

http://wiki.squeak.org/squeak/727

 
