MinorFs

Rob Meijer

Issue #177, January 2009

The MinorFs user-space filesystems works with AppArmor to provide a flexible form of discretionary access control.

MinorFs is a set of cooperating user-space filesystems that work with AppArmor to provide a flexible form of discretionary access control that operates at the process level. This type of process-level authority restriction is roughly equivalent to that seen in object-oriented programming, providing least-authority restrictions by parameter passing without requiring the administrative overhead of policy controls seen in mechanisms like SELinux. Least authority also is known as least privilege or POLA (Principle Of Least Authority).

In Linux, access to filesystem data is managed by two different access-control mechanisms. First, there is the basic and familiar UNIX discretionary access-control system. The DoD document “Trusted Computer System Evaluation Criteria” (aka the “Orange Book”) defines discretionary access control as “a means of restricting access to objects based on the identity of subjects and/or groups to which they belong. The controls are discretionary in the sense that a subject with a certain access permission is capable of passing that permission (perhaps indirectly) on to any other subject (unless restrained by mandatory access control)”.

Linux also provides access control through the Linux Security Module (LSM) interface. LSM provides hooks for additional access-control mechanisms, such as mandatory access controls, while leaving the base UNIX discretionary access-control mechanisms untouched. The Orange Book defines mandatory access controls as “a means of restricting access to objects based on the sensitivity (as represented by a label) of the information contained in the objects and the formal authorization (i.e., clearance) of subjects to access information of such sensitivity”.

These two constructs are combined restrictively, which means if either one denies access, access is denied. Well known users of the LSM interface are Security-Enhanced Linux (SELinux), used in Debian and Red Hat, and AppArmor used in SUSE and Ubuntu.

Although the UNIX discretionary access control for filesystem access has remained at the same (simple user level) granularity for decades, mandatory access control has become more fine-grained (process level). This granularity, however, comes at relatively large administrative costs. SELinux, for example, is known among many administrators for the large amount of overhead that comes with maintaining profiles.

Object Orientation Provides the Model

When designing and writing object-oriented (OO) programs, avoiding global variables, using data hiding, passing references between objects and using established design patterns (like proxies and factories) are concepts we are used to and comfortable with, and most of us have come to appreciate the many advantages these techniques offer. What many of us fail to realize when working with these concepts, however, is the fact that part of what we are doing can be considered access control.

If we look at the OO paradigms from an access-control viewpoint, it is easy to see that the model used by OO programs is both discretionary and suitable for the highest granularity. Therefore, you could say that OO programs internally use an extremely fine-grained form of discretionary access control. We must note, however, that this form of access control is actually older than the whole concept of object-oriented programming. The access-control mechanism used implicitly by OO programmers is, in fact, to a large extent equivalent to the access-control mechanisms in use in so-called capability-based systems. Capabilities, often called keys, are an unforgeable authority token that can be passed between programs. In capability-based systems, having a capability gives you the right to use the referenced object within the boundaries specified by the rights associated with the capability. With capabilities, there is no need to check other access-control mechanisms (for example, ACLs); the capability itself contains all the necessary information.

So, why not use this same form of discretionary access control at a slightly coarser level of granularity for access to files and directories by processes? MinorFs aims to do just that, with a lot of help from AppArmor.

First, let's look at how classes, objects and member data, as used in OO design and programming, compare to programs, processes and filesystem data. There are clear indications that we could be dealing with the same set of abstractions at a different granularity level.

You could look at a program the same way you look at a class. A process is an instance of a program (the disk image), the same way that an object is an instance of a class. Most objects have state, in the same way that most processes have state. You could say the same abstractions are there both at the object level of granularity and at the process level of granularity.

Next, we need to map the persistent on-disk directory structures to the same OO model that we just used to model programs and processes. A couple hurdles need to be overcome to accomplish this. First, there is process persistence, which is to say that processes are “not” persistent, so how do they fit the model?

Second, there is pass by reference. If an object wants to share part of its private state with another object that it knows, the object can pass either a copy of or a reference to a part of its internal state. Processes, however, to a great extent are confined to passing copies, not references.

Process Persistence

Programs are persistent; directories and files are persistent, but processes are not. This mismatch makes it impossible to add any persistent on-disk data storage to a process identified by a process ID, because when the process ends, the process ID is no longer valid. The base solution to allowing the OO-like abstractions at the process level of granularity for persistent on-disk storage is to define processes as an incarnation of a so-called pseudo-persistent process. So now, the program still will be equivalent to the class; the pseudo-persistent process is the persistent equivalent to the object, and the on-disk persistent directories and files are equivalent to member data fields. Using this new concept of a pseudo-persistent process gives us the ability to lift the disk data access-control features of AppArmor to a granularity level beyond what is possible with mandatory access control—that is, to the granularity of the pseudo-persistent process, but we don't have the burden of central or human administration, without the administrative overhead that mandatory access control embodies.

Pass by Reference

Where objects in OO languages can pass by reference, most IPC on Linux does not allow pass by reference between processes. One insightful exception to this that early UNIX engineers made was creating the ability of passing file handles over UNIX sockets. You could say that file handles used like this are fully pass by reference. In capability systems, such a reference is called a protected capability or an object capability.

Currently, directory file handles cannot be used as protected capabilities. To overcome this problem, there is a concept from capability-system history that is quite useful. The concept is to use a sparse key string as representation of the reference. That is, we create a relatively long sparse key string that both designates a resource and authorizes access to the resource. This string is called a sparse capability or unprotected capability. This type of capability is somewhat inferior to the protected type of which the UNIX file handle is an example. When combined with protection by AppArmor, it still has many properties that make its usage roughly equivalent to the usage of references in object-oriented languages.

AppArmor

AppArmor is the purely permissive mandatory access-control system used in SUSE and Ubuntu Linux. MinorFs uses AppArmor as its foundation, and in this way, it extends AppArmor so it can be used in a discretionary, even capability-based manner. Although MinorFs might be used separately from AppArmor, its usability is relatively limited. The main reason for MinorFs' limited usability without AppArmor is that by default, processes can access data (like the environment variables or command-line arguments) of other processes by way of the /proc/$PID directories, which (according to MinorFs' philosophy) should be considered private to the process.

This means without AppArmor, processes will, in some cases, be able to steal each other's capabilities through the proc filesystem. Although AppArmor fixes the vulnerabilities posed by the default proc filesystem access rights, MinorFs extends AppArmor. The access-control mechanism provided by MinorFs extends the static least-privilege approach that AppArmor offers with a dynamic least-authority approach. That is, it adds abilities to delegate decomposed and/or attenuated permissions.

The prime property of capability-based security that AppArmor helps us enforce is that processes should not have access to what would be equivalent to global variables. The temp and home directories in UNIX systems in many ways can be considered global variables if we look at them at the process level of granularity.

The way an AppArmor profile works is that it defines a list of permissions that are available for a specific application. For convenience, AppArmor also provides the ability to include sets of permissions with a single include directive.

When designing a system that will use MinorFs, you always should design your separation of privileges setup first. Don't allow your application to become a monolith.

Using AppArmor and MinorFs, you can build privilege-separated applications according to OO or capability paradigms, but even smaller steps can be quite useful. On installation, MinorFs creates a hard link to /bin/bash named /bin/minorbash that has the following AppArmor profile:


#include <tunables/global>

/bin/minorbash {
   #include <abstractions/base>
   #include <abstractions/bash>
   #include <minorfs/systemreadonly>
   #include <minorfs/full>
}

This profile basically gives a large set of read-only permissions but no write permissions to the version of bash named minorbash and to all programs started by it. This means, you simply can run programs with diminished access rights by starting them from a shell script that uses minorbash instead of bash.

MinorFs

Now, for MinorFs itself. MinorFs currently consists of two user-space filesystems. These filesystems are relatively simple Perl scripts implemented using the FUSE Perl module. Each filesystem has its own distinct task. FUSE (Filesystem in USErspace) is a kernel module that allows nonprivileged users to create their own filesystems.

MinorCapFs

MinorCapFs is at the core of MinorFs. Some time ago, the Linux directory and file-access API was extended with a set of new calls—openat(), mkdirat() and so on—that take an additional first argument, a file descriptor, which specifies from where relative paths should be resolved (these calls are to be standardized in a future version of POSIX). Given the fact that file handles in Linux can be communicated between processes and used as capabilities, it seemed like a good idea to look at the new directory handle calls and create or extend an LSM module so that directory handles could be passed as directory capabilities. The main goal was to use a directory handle as a capability to a directory that wouldn't disclose anything about parent directories.

After discussing my ideas with the AppArmor people, it was concluded that I should try to do as much as possible in user space, so I started designing MinorCapFs. The goals of MinorCapFs are to allow (unattenuated) decomposition, delegation and composition of subgraphs. MinorFs defines a sparse capability for each directory tree subgraph.

In order for you or your program to decompose the directory graph, each file and directory is given an extended attribute named cap. This extended attribute holds the full MinorCapFs path containing the sparse capability for the directory subgraph. Using any form of interprocess communication at your disposal, this path can be shared with any process or even with other users on the same system. The receiving user or process can create a symbolic link in another directory subgraph—for example, in order to make the delegation permanent.

Figure 1 shows how you could use the attr command to fetch the cap attribute, and how this attribute can be used as a short strong path or sparse capability to a directory or file. Normally, you should not use the command line for this but instead do the same thing from your program code. The getxattr function can be used to do the same thing that the attr command does in the example above.

Figure 1. MinorCapFs Extended Attributes

Composition is almost as important as decomposition. Where the usage of extended attributes for decomposition may be strange and new, composition uses a construct that we probably are all much more comfortable with, the construct of using symbolic links. Next to decomposition, MinorCapFs provides the ability to create symbolic links in the same way that the filesystems we are used to do. Thus, MinorCapFs combines two basic functionalities for doing simple unattenuated decomposition of directory tree graphs and for doing composition of directory graphs from subgraphs.

You could say that MinorCapFs provides the simplest bare-level form of unattenuated capability-based access control. But, what holds the top-level capability? And, how are subgraphs delegated to individual processes? That's where a second filesystem comes in.

MinorViewFs

As MinorCapFs provides for tree graph decomposition and composition constructs, something has to pass sparse capabilities to processes in order for any process to become able to use MinorCapFs.

To see how we need to solve this, let's take a step back and look at the parallelisms we are trying to exploit. We are trying to make processes into a coarser-grained form of object that, just like objects in any OO language, have private data members. There are two ways to look at the process as such. First, there is the traditional view of nonpersistent processes where all state held by the process disappears when the system reboots or ends for any other reason. You could look at this form of delegation as a better alternative to the troublesome usage of temp directories. Temporary files, by default, would become private to the process until the process delegates them explicitly to other processes.

It is important to note that the temp provision of MinorViewFs is not a reference-counting garbage-collection system. Delegated subgraphs instantly will become invalid at the time the owning nonpersistent process dies.

MinorViewFs delegates subgraphs to individual processes by means of two symbolic links under /mnt/minorfs/priv (Figure 2). Each process reading these symbolic links will have a completely different set of subgraph sparse capabilities delegated to it. The second symbolic link /mnt/minorfs/priv/tmp points to the temporary subgraph described above.

Figure 2. MinorViewFs Links

Pseudo-Persistent Processes According to MinorViewFs

Although delegation of temporary subgraphs to processes is relatively simple, the concept of the same process being an incarnation of some pseudo-persistent process needs a bit more thought.

MinorViewFs looks at pseudo-persistent processes on a so-called n-th claim basis. What it basically boils down to is that if a program is instantiated while two earlier instantiated versions of the program already are running, the new process will claim the third slot. If the system is rebooted, you also will need to restart the first and second instantiation of the program.

Although appropriate for dæmon-like programs, this, indeed, may be inconvenient for programs like editors and other user-driven programs. To work around these problems, and also to work around the problem posed by scripts and Java programs all being instances of the same program, MinorViewFs uses some simple tricks to determine program, or more specifically, program-invocation-based identity.

So how does MinorViewFs determine a program-invocation identity? First, there is the process parent chain. The process parent chain, including both programs and libraries loaded by those programs, contributes to a unique identity for the invocation. If the parent chain is insufficient as an invocation identity, the system administrator could add a config file under /etc/minorfs/.

Here is an example of a config file for the E language interpreter:


<codefile path="/usr/local/e/e.jar" cmdline="true" slots="256">
    <env>DISPLAY</env>
</codefile>

The example config adds the command line to the identifying properties of the program invocation. So, using optional config files, MinorViewFs is able to create and re-create a uniquely identifying set of data that allows it to re-delegate a subgraph to a new incarnation of the same program.

The E language named above takes this concept one step further; it allows large subsystems within an E program to be taken together and be serialized and synchronized to disk storage automatically. What's more, the E language is an object-capability language; thus, combining AppArmor and MinorFs with the E language allows you to combine both least authority and private storage all the way down to the object level of granularity. Although E is a bit of an esoteric language, it is a mature and complete language that is worth considering when doing high-integrity projects.

When a process is started and accesses the /mnt/minorfs/priv/home symbolic link, this symbolic link will point to the same MinorCapFs subgraph as the previous time the program was invoked into the same slot.

Next to being useful to new programs designed with privilege separation and least authority in mind, MinorViewFs also can be used with legacy programs like the SSH client. This does, however, involve the usage of the admin tool 2rulethemall that helps the user bypass the basic process-based access-control mechanism with a per-user password. You can put your unprotected SSH private key in the SSH client's private persistent storage space. Again, no program not run by root other than MinorViewFs, SSH or 2rulethemall would be able to access the private key.

Conclusion

MinorFs brings an extreme (capability-based) form of discretionary access control to your AppArmorized Linux system. It uses a form of access control that embraces delegation as a beneficial thing for security. Although MinorFs still is being developed, and is incomplete, it already should provide a useful and intuitive way to create privilege-separated programs that use filesystem access. It provides a way to protect serialized data stored on disk for persistent processes, and a way to protect process private data. And, it's an alternative to the troublesome usage of temp directories.

Upcoming versions of MinorFs will include a third filesystem, MinorCtkrFs that will implement attenuation in a generic way based on the so-called Caretaker pattern. This MinorCtkrFs should add different kinds of read-only capabilities to files and directories, as well as revocable read/write and read-only capabilities.

Rob Meijer is a computer forensic and security software development professional from the Netherlands. He started his career as a UNIX system administrator, switching one decade ago to software development. In his spare time, he is working on several least-authority-related open-source projects, including MinorFs.