4/8/2018 Plash: tools for least privilege

Plash: tools for practical least privilege

Table of contents

How it works: virtualizing the file namespace
Mailing list
Related systems
Downloading and installing Plash
Installing Plash
Pre-built packages
Building Plash from source
Using Debian source packages
Building without using Debian scripts
Creating Debian packages from SVN
SVN repository
Download previous versions
Running GUI applications
Running Leafpad (a simple text editor)
Running Gnumeric
Running Inkscape
Running command line programs
Running gcc
Running rpm to build a package as a non-root user
Running servers
Running a webmail server
Using the powerbox from Gtk applications
Using the powerbox from XEmacs
The powerbox: a GUI for granting authority
Introduction to powerboxes
What is a file powerbox?
Why "least privilege" is important: an example
How do powerboxes work?
The history of powerboxes
How to run programs to use the powerbox
X Window System security
UI limitations
Reviewing and revoking
Nested use of pola-run
Secure handling of symlinks
Backup and temporary files
Integrations: powerbox for Gtk applications
How it works
Earlier version of the GtkFileChooserDialog replacement
Integrations: powerbox for Emacs/XEmacs
The powerbox API
pola-run: A command line tool for launching sandboxed programs
Environment variables
Plash's sandbox environment
Architecture overview
Symbolic links
Implementation
Remaining problems
Parent directories: the semantics of dot-dot using dir_stacks
Directory file descriptors
Why not do interception of system calls using, for example, ptrace?
How does Plash compare with chroot jails?
FAQs: frequently asked questions
pola-shell: A shell for interactive use
Differences from Bash: examples
Bourne shell features missing from pola-shell
Installation endowment
Enabling access to the X11 Window System
Job control
Shell scripts
Argument lists
Executable objects: a replacement for setuid executables
Applying POLA to argument files and other files
Invocations between programs
The process replacement behaviour
Discovering file descriptors
Garbage collection
Linux, job control, and TTY file descriptors
Job control
exec-object limitations
Shell limitations
Communication protocols
Protocol for messages with file descriptors
Object-capability protocol
Closing the connection
Initial state of a newly-created connection
Future extensions
RPC methods
fs_op object
Filesystem objects: files, directories and symlinks
Executable objects
conn_maker object
fs_op_maker object
Version 1.17 (2006-12-23)
Version 1.16
Version 1.11
Running the shell as root
Following symlinks
Documentation overhaul
Other changes
Version 1.10
Version 1.9
Other changes
Version 1.8
New build system
Syntax change
Enabling X11 access
Shell options
Support for directory file descriptors
Version 1.7
Version 1.6
Version 1.5
Region-based memory management
String handling
Object system
Reference counting
Encodings for marshalling
Documentation format: XXML, an XML surface syntax
Security vulnerabilities
connect() race condition
chmod() race condition
Running pola-shell as root
Aspects that need more testing
Might be problems in future
Problems running specific programs
GNU Emacs (resolved)
Konqueror (resolved)

Mark Seaborn

Plash: tools for practical least privilege

2007/02/26: There is now a wiki for Plash. Documentation from this site is being gradually moved there.

Plash is a system for sandboxing GNU/Linux programs. Plash's aim is to protect you from the programs you run
by letting you run them with the minimum authority and privileges they need to do their job -- this is the Principle
of Least Authority (POLA). Plash can run programs in a secure, restricted execution environment with access to a
limited subset of your files.

Plash can be used to run servers, command line tools, and applications with graphical user interfaces:

Applications with graphical interfaces: You can dynamically grant GUI applications access rights to
individual files that you want to open or edit. This happens transparently through the Open/Save file chooser
dialog box. Plash replaces Gtk's GtkFileChooserDialog so that the file chooser is implemented outside the
application in a separate process, as a trusted component. This file chooser is known as a powerbox,
because it delegates additional power to the application. See examples and screenshots.

Servers: You can run a network-accessible server with minimal access rights so that if it is compromised
(e.g. via a buffer overrun bug), the adversary cannot compromise the whole machine. Or you can set up an
HTTP or FTP server with a limited view of the filesystem to export handpicked files without having to rely
on the server's application-level access control mechanisms. See examples.

Command line tools: Using Plash, you can run tools with read-only access to their inputs and write access
to their outputs. Sandboxes are lightweight, so you can, for example, create a sandbox for running gcc to
compile a single file. See examples.

Plash virtualizes the file namespace, and provides per-process/per-sandbox namespaces. Plash grants access to
files by mapping them into otherwise empty namespaces. This allows for fine-grained control over dependencies:
You can link a program with specific versions of dynamic libraries by mapping individual files; or you can just
map the whole /usr directory into the program's namespace.
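
As a rough illustration of mapping files into an otherwise empty namespace, the sketch below models a per-sandbox namespace as a table from sandbox-visible paths to real paths. The Namespace class, its attach and resolve methods, and the library paths are hypothetical names invented for this illustration; they are not Plash's API. The purely lexical handling of ".." is also a simplification.

```python
import posixpath


def _normalise(path):
    """Make a path absolute and remove '.', '..' and duplicate slashes."""
    return posixpath.normpath("/" + path.lstrip("/"))


class Namespace:
    """Toy per-sandbox file namespace: starts empty, grows only by explicit grants."""

    def __init__(self):
        # Maps a sandbox-visible path to (real path, writable flag).
        self._mounts = {}

    def attach(self, sandbox_path, real_path, writable=False):
        """Map a real file or directory tree into the namespace."""
        self._mounts[_normalise(sandbox_path)] = (real_path, writable)

    def resolve(self, path, want_write=False):
        """Translate a sandbox path to a real path, or raise if it was never granted."""
        path = _normalise(path)
        probe = path
        while True:
            if probe in self._mounts:
                real, writable = self._mounts[probe]
                if want_write and not writable:
                    raise PermissionError(path)
                rest = path[len(probe):].lstrip("/")
                return posixpath.join(real, rest) if rest else real
            if probe == "/":
                raise FileNotFoundError(path)
            probe = posixpath.dirname(probe)  # try the parent directory next


ns = Namespace()
ns.attach("/usr", "/usr")                               # a whole directory tree
ns.attach("/lib/libfoo.so", "/opt/libs/libfoo-1.2.so")  # one specific library version
```

Anything that has not been attached simply does not exist from the sandboxed program's point of view, which is what makes the dependencies explicit.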

Plash provides two main interfaces for granting access rights to sandboxed processes:

The pola-run tool: This is a command line interface for launching programs to run inside a sandbox. Its
arguments let you grant the sandboxed program access to files and directories. pola-run can be used from
within a sandbox, allowing nested sandboxes.
The powerbox: This is a GUI that works transparently -- it adds a security role to a dialog box that normal
users already use for choosing files. Users therefore do not have to adjust much. However, applications or
their libraries must be changed to make requests via the powerbox component.

pola-shell is another way to launch sandboxed programs. It is a shell with syntax similar to the Bourne shell or
Bash. It lacks many scripting features so is intended for interactive use only.

How it works: virtualizing the file namespace

Plash's sandboxing mechanism works on unmodified Linux kernels (2.6, 2.4 and earlier) and can run normal Linux
executables, provided they are dynamically linked.

Sandboxed processes do not use the kernel's filename-based system calls, such as "open". Plash effectively
disables these system calls by putting the sandboxed process in a very minimal chroot() jail. (It also gives the
sandboxed process a unique, dynamically-allocated UID and GID.)

Instead, a sandboxed process accesses the filesystem by making remote procedure calls (RPCs) across a socket to
a server process. To open a file, the sandboxed process sends an "open" request containing the filename. The
server process can send a file descriptor for a file across the socket in response.

Plash dynamically links sandboxed programs with a modified version of GNU libc (glibc), which replaces the
filename-related calls (such as open()) so that they make RPCs across the socket instead of using the usual system
calls. (See the table of glibc calls to see which functions are affected.)

In most cases this does not seriously affect performance because the most frequently called system calls, such as
read() and write(), are not affected. Once a sandboxed program has opened a file and obtained a file descriptor for
it, there is no further mediation by the server process, and it can use the normal system calls on the file descriptor.

See the architecture section for more details.
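
The open-by-RPC idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the one-message "protocol", the function names serve_one_open and sandboxed_open, and the "/doc.txt" mapping are invented here and bear no relation to Plash's real wire format. It relies on socket.send_fds() and socket.recv_fds(), which need Python 3.9 or later on a Unix system, and it runs the "server" in a thread rather than a separate process.

```python
import os
import socket
import tempfile
import threading


def serve_one_open(conn, allowed):
    """Server side: answer a single "open" request on a connected Unix socket."""
    name = conn.recv(4096).decode()
    if name in allowed:
        fd = os.open(allowed[name], os.O_RDONLY)
        try:
            # The descriptor travels as SCM_RIGHTS ancillary data.
            socket.send_fds(conn, [b"ok"], [fd])
        finally:
            os.close(fd)  # the receiver now holds its own copy
    else:
        conn.sendall(b"denied")


def sandboxed_open(conn, name):
    """Client side: what a replaced open() might do instead of the system call."""
    conn.sendall(name.encode())
    status, fds, _flags, _addr = socket.recv_fds(conn, 4096, 1)
    if status != b"ok":
        raise PermissionError(name)
    return fds[0]


def demo():
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write("hello from the server's filesystem\n")
    client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    worker = threading.Thread(
        target=serve_one_open, args=(server, {"/doc.txt": f.name}))
    worker.start()
    fd = sandboxed_open(client, "/doc.txt")
    worker.join()
    # From here on, ordinary reads work on the descriptor with no further RPCs.
    with os.fdopen(fd) as handle:
        text = handle.read()
    client.close()
    server.close()
    os.unlink(f.name)
    return text


if __name__ == "__main__":
    print(demo())
```

The point to notice is that the server decides which real file, if any, stands behind the requested name; the client only ever receives an already-open descriptor.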

Mailing list
There is a mailing list for Plash, for announcements and general discussion. You can subscribe through the mailing
list page, or by e-mailing with "subscribe" in the subject line.

Another way to be notified of new releases is to subscribe to the project on its Freshmeat page.

Related systems
Plash is strongly influenced by systems that use capability-based security. Plash itself uses a capability
architecture, and the idea of a powerbox user interface comes from the same tradition.

There are two other existing systems that implement file powerboxes:

CapDesk, a desktop GUI which is based around the E programming language. E is implemented in Java,
and CapDesk uses a Java-based widget set to provide its GUI. Programs must be written in E specially to
run under CapDesk.
Polaris, a system for sandboxing Windows applications, such as Word and Internet Explorer. Polaris is
proprietary software. See the tech report on Polaris.

The powerbox concept appears to have first been proposed by Ka-Ping Yee and Miriam Walker in Interaction
Design for End User Security (December 2000).

Plash has been influenced by the EROS operating system, a research system which is now being developed under
the name CapROS. EROS is based on KeyKOS, which was proprietary. A successor to EROS is being designed,
called Coyotos. Like EROS, Coyotos will be free software.

These additions are planned for the future:

Python bindings for the Plash object system. This will be more flexible than pola-run for setting up
execution environments for programs. This would let you implement Plash objects in Python. For example,
you could create a directory object whose contents listing is chosen dynamically by Python code. You could
implement executable objects in Python.

Persistence: Implement a simple persistent store, allowing capabilities to be saved, such as capabilities for
file and directory objects and executable objects.

Having persistence means that an application's installation endowment (eg. its executable, libraries and
configuration files) can be saved as object references in the persistent store, rather than listed on the
command line in a script that invokes pola-run. When installing an application, it can be granted access to
necessary files and objects using a powerbox.

X Window System access control: Initially I plan to write a proxy for the X11 protocol which will isolate
an (unmodified) X client from other X clients, with the possible exception of allowing access to the

Later versions could add mechanisms to allow X clients to co-operate securely by sharing X resources; X
clients would probably need to be changed to use such mechanisms.

Ultimately it would be good to get this integrated into the X server, when we figure out what is required.

Plash is free software, distributed under the GNU Lesser General Public Licence.


Downloading and installing Plash

Installing Plash
Pre-built packages

Packages are available for three distributions on i386. To install using APT, add one of the following lines to your /etc/apt/sources.list:

For Debian Etch and Debian unstable/sid (using glibc 2.3.6):

deb ./

For Debian Sarge (using glibc 2.3.6; Python and Gtk support not included):

deb ./

For Ubuntu Edgy Eft (using glibc 2.4):

deb ./

And then do:

apt-get install plash

Building Plash from source

Using Debian source packages

Debian source packages can be downloaded using APT, by adding one of the lines above to sources.list along with the corresponding
deb-src line. For example:

deb ./
deb-src ./

Since Plash builds a modified version of glibc, it requires a copy of the glibc source. The glibc source tarball is not included in the
Plash source package, but in a separate binary package such as glibc-source-2.5. This and other build dependencies will be installed
by doing:

apt-get build-dep plash

Get the source package with:

apt-get source plash

and build with:

dpkg-buildpackage -rfakeroot -b -D

Building without using Debian scripts

Build glibc:

./ 2.5 unpack configure build build_extra

Plash can be built against glibc 2.5, 2.4 or 2.3.6. Replace "2.5" above with the glibc version you want. Note: glibc 2.4 dropped
support for Linux 2.4 and earlier, hence Plash's support for glibc 2.3.6.

The glibc source tarballs must be present. The script looks for them in the current directory and in /usr/src, and will suggest files to
download if it does not find them. You will probably need one of the following:

glibc-2.5.tar.bz2 (13Mb)
glibc-2.4.tar.bz2 (13Mb)
glibc-2.3.6.tar.bz2 (13Mb) and glibc-linuxthreads-2.3.6.tar.bz2 (~300k)

Building glibc is the most resource-intensive step. It uses about 100 Mb of disc space, and takes 13 minutes on an Athlon XP 3200
machine. Note that if you are getting a new revision of Plash by doing "svn update", glibc usually does not need to be re-built.

If you have checked out the source from Subversion (rather than downloading a tarball), the "configure" script will not be present.
You will need to install autoconf and run:


Build Plash itself:

./configure GLIBC_DIR=glibc-2.5-objs

(again, replace "2.5" with your desired glibc version).

To install Plash, run the following command as root:

./ /

Creating Debian packages from SVN

The Subversion repository contains multiple versions of the debian packaging scripts. By default, "debian" is a symlink pointing to
"debian-etch", but it can be changed to point to "debian-sarge" or "debian-edgy". The script "" can create source
packages for all three variants in one go.


You need Gtk >=2.8 to use the powerbox. Otherwise, use the --without-gtk configure option.
GNU Readline

SVN repository
Bleeding-edge versions of Plash are available from the Subversion (SVN) repository, which is hosted by

Checkout over SVN protocol (TCP 3690):

svn co svn:// plash

Checkout over http:

svn co plash

Plash source daily snapshot

Browse the source repository
More comprehensive information about the repository on

Download previous versions

Version 1.17 (23rd December 2006)
Files: Source: plash_1.17.orig.tar.gz; Browse SVN
Main changes: Add Python bindings for Plash object interface. Add cow_dir (layered/copy-on-write directories). Add -e
option to pola-run. Update to build with glibc 2.3.6, 2.4 and 2.5. Improved Debian packaging.

Version 1.16 (15th March 2006)
Files: Debian package: plash_1.16_i386.deb; RPM: plash-1.16-1.i386.rpm; Source: plash-1.16.tar.gz; Browse SVN
Main changes: Rewrite Powerbox for Gtk: now inherits from GtkDialog and so works with more applications. Overhauled
documentation and build/install process.

Version 1.15 (12th December 2005)
Files: Debian package: plash_1.15_i386.deb; RPM: plash-1.15-1.i386.rpm; Source: plash-1.15.tar.gz; Browse SVN
Main changes: Added Powerbox for Gtk. Fixes to allow Konqueror and Gnumeric to run.

Version 1.14 (9th November 2005)
Files: Debian package: plash_1.14_i386.deb; RPM: plash-1.14-1.i386.rpm; Source: plash-1.14.tar.gz; Browse SVN
Main changes: Added powerboxes.

Version 1.13 (6th October 2005)
Files: Debian package: plash_1.13_i386.deb; RPM: plash-1.13-1.i386.rpm; Source: plash-1.13.tar.gz; Browse SVN
Main changes: Much-improved build system.

Version 1.12 (19th September 2005)
Files: Debian package: plash_1.12_i386.deb; RPM: plash-1.12-1.i386.rpm; Source: plash-1.12.tar.gz; Browse SVN
Main changes: Initial version of pola-run. Fixed gc-uid-locks race conditions. Include etc. in packages.

Version 1.11 (13th August 2005)
Files: Debian package: plash_1.11_i386.deb; RPM: plash-1.11-1.i386.rpm; Source: plash-1.11.tar.gz;
Debian source package: plash_1.11.dsc, plash_1.11.tar.gz; Browse SVN
Main changes: Major new feature: Add plash-run-emacs tool. Made changes so that it's safer to run the shell as root.
Documentation has been improved, and converted to DocBook format.

Version 1.10 (20th July 2005)
Files: Debian package: plash_1.10_i386.deb; RPM: plash-1.10-1.i386.rpm; Source: plash-1.10.tar.gz; Browse SVN
Main changes: Implemented fchdir(). "rm -r", "install -d" and "mkdir -p" now work.

Version 1.9 (10th July 2005)
Files: Debian package: plash_1.9_i386.deb; RPM: plash-1.9-1.i386.rpm; Source: plash-1.9.tar.gz;
Debian source package: plash_1.9.dsc; Browse SVN
Main changes: Changed implementation of file namespace construction. Now possible to add/replace entries in existing
directories (without modifying the directory).

Version 1.8 (22nd May 2005)
Files: Debian package: plash_1.8_i386.deb; RPM: plash-1.8-1.i386.rpm; Source: plash-1.8.tar.gz;
Debian source package: plash_1.8.dsc, plash_1.8.tar.gz; Browse SVN
Main changes: Overhauled build system for modified glibc. Add option for granting access to the X11 Window System (off by
default). New mechanism for setting shell options. Added limited support for directory file descriptors, so that
XEmacs works.

Version 1.7 (1st May 2005)
Files: Debian package: plash_1.7_i386.deb; RPM: plash-1.7-1.i386.rpm; Source: plash-1.7.tar.gz; Browse SVN
Main changes: Major new feature: executable objects.

Version 1.6 (18th January 2005)
Files: Debian package: plash_1.6_i386.deb; RPM: plash-1.6-1.i386.rpm; Source: plash-1.6.tar.gz; Browse SVN
Main changes: New argument syntax: "PATHNAME = EXPR", allowing objects to be attached anywhere in the file namespace.

Version 1.5 (7th January 2005)
Files: Debian package: plash_1.5_i386.deb; RPM: plash-1.5-1.i386.rpm; Source: plash-1.5.tar.gz; Browse SVN
Main changes: Add recursive read-only directories. Add example "chroot" program: first tool to use object-capability
protocol.

Version 1.4 (4th January 2005)
Files: Debian package: plash_1.4_i386.deb; RPM: plash-1.4-1.i386.rpm; Source: plash-1.4.tar.gz; Browse SVN
Main changes: Implemented object-capability protocol. This is used as an additional layer in the communication between
client and server.

Version 1.3 (29th December 2004)
Files: Debian package: plash_1.3_i386.deb; RPM: plash-1.3-1.i386.rpm; Source: plash-1.3.tar.gz; Browse SVN
Main changes: Better security: runs processes under dynamically-allocated user IDs rather than the user "nobody". Add
globbing and file descriptor redirection to the shell. Implemented bind(), symlink(), utime(), rename() and

Version 1.2 (18th December 2004)
Files: Debian package: plash_1.2_i386.deb; RPM: plash-1.2-1.i386.rpm; Source: plash-1.2.tar.gz; Browse SVN
Main changes: Fixed open64() and Implemented pipes in the shell. Added "!!" syntax to shell. Added support for
"#!" scripts. Added options window for enabling logging.

Version 1.1 (10th December 2004)
Files: Debian package: plash_1.1_i386.deb; RPM: plash-1.1-1.i386.rpm; Source: plash-1.1.tar.gz; Browse SVN
Main changes: Added job control to shell.

Version 1.0 (7th December 2004)
Files: Debian package: plash_1.0_i386.deb; RPM: plash-1.0-1.i386.rpm; Source: plash-1.0.tar.gz; Browse SVN
Main changes: First version.



Running GUI applications

Running Leafpad (a simple text editor)
Leafpad is a simple text editor that uses Gtk. You can run it to use the powerbox with a shell script such as:

pola-run --prog /usr/bin/leafpad \
-B -fl /etc \
--env LD_PRELOAD=$PB_SO -f $PB_SO \
--x11 --powerbox --pet-name "Leafpad"

"-B" grants access to /usr, /bin and /lib. On my Debian system, it is necessary to include /etc/ in Leafpad's namespace,
otherwise it will not link (it won't find the X libraries in /usr/X11R6/lib). This is why we do "-fl /etc". Perhaps "-fl /etc/"
would do instead. Leafpad does not need any configuration files from your home directory. Since we can use the powerbox to grant
Leafpad access to files to be edited, we don't initially need to grant it access to anything from your home directory.

Running Gnumeric
Gnumeric is a spreadsheet application, written in C, which uses Gtk. Here is one way to run Gnumeric so that it uses the powerbox:


rm -rv tmp/gnumeric
mkdir -p tmp/gnumeric


pola-run \
--prog /usr/bin/gnumeric \
-B -fl /etc \
-tw /tmp tmp/gnumeric \
-tw $HOME tmp/gnumeric \
-fw /dev/urandom -fw /dev/log -f /var/lib/defoma \
--env LD_PRELOAD=$PB_SO -f $PB_SO \
--x11 --powerbox --pet-name "Gnumeric"

Gnumeric requires a number of configuration directories to exist (inside the user's home directory). If they don't exist, it tries to
create them, and exits if it can't. In this example, we substitute a temporary directory (tmp/gnumeric) for our real home directory,
using "-tw $HOME tmp/gnumeric". This ensures that Gnumeric runs cleanly, from scratch, without picking up existing configuration
files from previous runs (which, of course, may not be what you want). Gnumeric launches gconfd (a service which deals with
configuration files). Usually this process gets shared between GNOME applications. In this case, we want to isolate Gnumeric, so
that it has its own instance of gconfd. The sharing works by creating shared sockets inside /tmp, so we disable sharing by giving
Gnumeric its own private instance of /tmp (which we map to tmp/gnumeric).

Running Inkscape
Inkscape is a vector graphics application which uses Gtk. It is quite complex. It seems to be written in C, and it deals with complex
vector image file formats (e.g. SVG), so it may well have buffer overrun bugs. If you run Inkscape on an SVG file downloaded from
the Internet, it could be a malicious file that exploits a bug, so it's worth sandboxing Inkscape. Running Inkscape is similar to running

pola-run --prog /usr/bin/inkscape \
-B -fl /etc \
-f /proc \
--env LD_PRELOAD=$PB_SO -f $PB_SO \
--x11 --powerbox --pet-name "Inkscape"

One difference from before is "-f /proc". Inkscape reads "/proc/stat" and "/proc/self/stat" -- perhaps something to do with the garbage
collection library it uses -- and it exits if these are not available. So we grant access to "/proc"; however, this is not a great idea and it
should be reviewed because it might reveal sensitive information. Note that granting access to "/proc/self" will not actually work
under Plash, because the Linux kernel treats it specially: the information Linux returns from it depends on the PID of the process that
is asking. When running under Plash, a server process asks on behalf of the application process.
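
The PID-dependence of "/proc/self" is easy to observe. The following sketch (Linux only; the function names are made up for this illustration) reads the "/proc/self" symlink from the current process and then from a forked helper process, which plays the part of a server asking on another process's behalf. Each process sees its own PID, so a server can never obtain the application's view this way.

```python
import os


def proc_self_target():
    """Return the PID that /proc/self points to, or None where /proc is unavailable."""
    try:
        return os.readlink("/proc/self")
    except OSError:
        return None


def proc_self_as_seen_by_child():
    """Fork a helper process and report what /proc/self resolves to for it."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report our own view of /proc/self, then exit immediately.
        os.close(read_end)
        os.write(write_end, (proc_self_target() or "").encode())
        os._exit(0)
    os.close(write_end)
    with os.fdopen(read_end) as pipe:
        seen = pipe.read()
    os.waitpid(pid, 0)
    return pid, seen


if __name__ == "__main__":
    print("this process:", os.getpid(), "sees", proc_self_target())
    child_pid, child_view = proc_self_as_seen_by_child()
    print("helper process:", child_pid, "sees", child_view)
```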

Running command line programs

Running gcc
The following invocation:

gcc -c code.c -o code.o

can be changed to:

pola-run --prog /usr/bin/gcc \
-a=-c -fa code.c -a=-o -faw code.o \
-B -f .
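
If you wrap compiler calls like this often, the argument list can be generated. The helper below is a hypothetical convenience, not part of Plash. It uses only the options that appear in the example above, reading "-fa" as "grant this file and add it to the program's arguments" and "-faw" as the same with write access, which is how the example appears to use them.

```python
import shlex


def pola_run_gcc(source, output, prog="/usr/bin/gcc"):
    """Build the argument vector for the pola-run invocation shown above."""
    return [
        "pola-run", "--prog", prog,
        "-a=-c",          # plain argument passed through to gcc
        "-fa", source,    # input file: granted and added to gcc's arguments
        "-a=-o",
        "-faw", output,   # output file: granted with write access
        "-B",             # /usr, /bin and /lib
        "-f", ".",        # the current directory
    ]


argv = pola_run_gcc("code.c", "code.o")
print(shlex.join(argv))
```

On a machine with Plash installed, the list could be handed to subprocess.run(argv) as it stands.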

Running rpm to build a package as a non-root user

pola-run --prog /usr/bin/rpm -B -f /etc \
-a=-bb \
-tal /stuff/plash.spec ../plash.spec \
-f ~/projects/plash \
-fl ~/projects/plash/glibc \
-f ~/projects-source/plash \
-t /usr/src/rpm/SOURCES/plash-$PLASH_VERSION.tar.gz plash-$PLASH_VERSION.tar.gz \
-tw /usr/src/rpm/BUILD build \
-tw /usr/src/rpm/RPMS/i386 out \
-tw /var/tmp tmp

Running servers
Running a webmail server
pola-run \
--prog -f \
-f \
-fl ~/Mail \
-fw mail-db \
-fw /tmp \
-f /etc/protocols \
-f /etc/hosts \
-t /lib /debian/lib \
-t /usr/bin /debian/usr/bin \
-t /usr/share /debian/usr/share \
-t /usr/lib/perl5 /debian/usr/lib/perl5 \
-t /usr/lib/perl /debian/usr/lib/perl \
-f /usr/lib/plash \
-f ~/projects/sparkymail --log



Using the powerbox from Gtk applications

Saving a file from Gnumeric: Here, the powerbox is accessed from the "File => Save As" menu item.

Importing an image file into an existing document in Inkscape: Here, the powerbox is accessed through the "File => Import"
menu item. Note that you can choose to grant read-only access to the file: Inkscape doesn't need to write to the file if you're only
importing it.

One limitation of the powerbox implementation is that you can't preview images in the file chooser dialog.

The interesting thing about these screenshots is that they are not very interesting! They just show opening or saving a file via a familiar
file chooser dialog box.

The difference from usual is that the file chooser now grants the application the right to access the file in question. The file chooser is
asking the user to make a security decision, although the user does not need to be aware of that.

Once you have chosen a file using the powerbox, you don't need to confirm the application's right to access it further. So selecting "File
=> Save" will save a document without any annoying "Are you sure?" boxes popping up.

In these examples, the windows belong to two different processes. The application runs in a sandboxed process, which gets granted
access to the individual files that the user picks via the file chooser. The file chooser dialog is provided by the powerbox manager, which
runs in a separate process. The powerbox manager has access to all the user's files and can delegate selected files to the application.

There are three visible differences from normal file chooser dialogs:

The title bar displays the "pet name" of the application (that is, a name the user chooses to give the application beforehand), so
that the user can tell what entity they are granting authority to.
There is an option to grant read-only access.
There is a "reason" field which gives the application the chance to say why it wants the user to pick a file to grant to it.
Unmodified Gtk applications do not fill this out. Usually it's not necessary because the dialog is opened in response to the user
picking a menu item such as "File => Save As".

The powerbox manager happens to use Gtk to display the file chooser, but it could equally use any other widget set; it can be a different
widget set to the application, as the XEmacs example shows.

Using the powerbox from XEmacs

Saving a file from XEmacs: Here, the powerbox has been opened by typing C-x C-s (in a buffer that doesn't have a filename yet)
or by selecting the "File => Save As" menu item. Usually, typing C-x C-f (find-file) causes XEmacs to prompt for a filename
via the minibuffer (using Tab for filename completion). Plash's XEmacs/powerbox integration (written in elisp) replaces the
minibuffer-based prompt with a dialog box.


The powerbox: a GUI for granting authority

Introduction to powerboxes
What is a file powerbox?

A file powerbox is a kind of file chooser dialog box, and it works the same from the user's perspective. The difference is that as well
as telling the application which file to access, it dynamically grants the application the right to access the file.

This helps provide security: It means that the application does not have to be given access to all of the user's files by default. This is
an example of applying the principle of least privilege/authority: the aim is to give the program the authority it needs to do its job,
but no more.

Why "least privilege" is important: an example

Suppose you run Gnumeric to view a spreadsheet you downloaded from the Internet. Gnumeric might not be a malicious program,
but suppose it has a buffer overrun bug -- quite possible considering that it is written in C -- and the spreadsheet exploits that bug.

If Gnumeric runs with all your authority, the dodgy spreadsheet can read any of your files (such as "~/sensitive-file.txt", or your
private keys from "~/.ssh") and send them back to its creator.

But if Gnumeric runs with minimum authority, the malicious spreadsheet can't do anything except write to the file it was opened
from, and open a powerbox to request a file. The application cannot specify a default pathname for the powerbox to open, so for the
spreadsheet to get access to a sensitive file, the user would have to specifically choose that file. The malicious spreadsheet would find
it very hard to get access to ".ssh": why would the user choose ".ssh" if Gnumeric opened a powerbox out of the blue without a good reason?

How do powerboxes work?

In order for the powerbox file chooser to provide security, it cannot be implemented by the application and its libraries. It must be
implemented as a separate, trusted component, and it must run in its own protection domain.

The idea is that the file chooser has a trusted path to the user, so only the user can enter a filename into it. This allows the system to
distinguish between requests made by the user and requests made by the application.
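
The separation of roles can be modelled in a few lines. This is a toy sketch, not the powerbox API: the PowerboxManager class and its methods are invented for illustration, and the ask_user callback stands in for the real dialog box and its trusted path to the user. The important detail is that the request made on the application's behalf carries no pathname; only the user's choice can add a file to the granted set.

```python
class PowerboxManager:
    """Toy model of a trusted file chooser running outside the sandbox."""

    def __init__(self, pet_name, ask_user):
        self.pet_name = pet_name
        # ask_user stands in for the trusted path to the user (the dialog box).
        # It returns (path, read_only), or None if the user cancels.
        self._ask_user = ask_user
        self.granted = {}  # path -> "r" or "rw"

    def request_file(self, reason=""):
        """Called on behalf of the application. Note: no pathname parameter."""
        choice = self._ask_user(self.pet_name, reason)
        if choice is None:
            return None
        path, read_only = choice
        self.granted[path] = "r" if read_only else "rw"
        return path

    def may_access(self, path, write=False):
        """Check whether the application has been granted the given access."""
        mode = self.granted.get(path)
        return mode is not None and (not write or mode == "rw")


def user_picks_report(pet_name, reason):
    # Simulated user decision: open one spreadsheet, read-only.
    return ("/home/user/report.gnumeric", True)


pb = PowerboxManager("Gnumeric", user_picks_report)
pb.request_file(reason="Open spreadsheet")
```

After this, pb.may_access("/home/user/report.gnumeric") holds, while a request for write access to that file, or for any access to a file under "~/.ssh", does not.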

The history of powerboxes

Powerboxes have been implemented in a couple of other systems already:

CapDesk, an environment based around the E programming language, implemented in Java.

Polaris, a restricted environment for Windows that runs normal Windows programs.

How to run programs to use the powerbox

Here is an example of running the simple text editor Leafpad so that it uses the powerbox:

pola-run --prog /usr/bin/leafpad \
--env \
-B -fl /etc \
--x11 --powerbox --pet-name "Leafpad"

X Window System security

The biggest limitation is that the X Window System provides no security. The security of a powerbox relies on the powerbox having
a secure path to the user, so that the user can enter a filename into the powerbox but the application can't. However, under X, this isn't
true: one X client can spoof keypresses by sending keypress events to another client.

So the present system only raises the bar to a successful attack, rather than ruling an attack out.

UI limitations

Currently, the powerbox is the only way you can dynamically add files to the namespace of an application that has been launched
using pola-run. It would be useful to have a command line tool (similar to gnuclient, as used with Emacs) for granting an application
access rights to further files.

Reviewing and revoking

There is currently no way to review what authority has been granted to applications using the powerbox, and no way to revoke this authority.

Nested use of pola-run

Nested use of pola-run with the --powerbox option is not very useful at present: the nested instance of pola-run will provide its own
powerbox manager. At the moment, the powerbox_req_filename object is only able to attach a file into an existing file namespace
and return a filename. It needs to be extended so that it can return a reference to a file or slot object. The nested instance of pola-run
can then attach this reference into the namespace it created.

Secure handling of symlinks

As it stands, using Gtk's file chooser, the powerbox does not provide a secure UI for handling Unix symlinks. The user is not shown
symlink destinations at all, yet the powerbox manager will follow symlinks, even though the symlink destination could have been
written to the filesystem by an untrusted application.

A possible solution is to change the file chooser to display symlink information, and perhaps require the user to double click on a
symlink to follow it.
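
As a sketch of what "displaying symlink information" could mean, the helper below lists a directory without following links and pairs each entry with its link destination, if it has one. The function names and the "name -> target" row format are assumptions made for this example; nothing here is taken from the Gtk file chooser or from Plash.

```python
import os


def describe_entries(directory):
    """Return (name, link target or None) for each entry, without following links."""
    rows = []
    with os.scandir(directory) as entries:
        for entry in sorted(entries, key=lambda e: e.name):
            target = os.readlink(entry.path) if entry.is_symlink() else None
            rows.append((entry.name, target))
    return rows


def format_row(name, target):
    """Render one chooser row, making any symlink destination visible."""
    return name if target is None else f"{name} -> {target}"


if __name__ == "__main__":
    for name, target in describe_entries("."):
        print(format_row(name, target))
```

A chooser that showed rows in this form would at least let the user notice when an innocent-looking name points somewhere sensitive.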

Backup and temporary files

Backup files are not handled. Arguably, when the user chooses FILE, the powerbox should also grant access to FILE~ and #FILE#
(the latter is used by Emacs' auto-save feature).
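
For concreteness, here is how the two companion names could be derived from the chosen file. This is a hypothetical helper, not something the powerbox currently does, and it assumes the auto-save file lives in the same directory as the original.

```python
import posixpath


def companion_files(path):
    """Return the backup and auto-save names that editors may create for path."""
    directory, name = posixpath.split(path)
    return [
        path + "~",                                   # backup file: FILE~
        posixpath.join(directory, "#" + name + "#"),  # Emacs auto-save: #FILE#
    ]


print(companion_files("/home/user/notes.txt"))
```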


There is no persistence. An application's access rights are not saved across sessions.

Integrations: powerbox for Gtk applications

The powerbox/Gtk integration reimplements Gtk's GtkFileChooserDialog interface so that it opens a file powerbox for choosing files
or directories. This means that existing, unmodified Gtk applications can use the powerbox.

GtkFileChooserDialog is replaced using an LD_PRELOADed shared object (, which replaces the
gtk_file_chooser_dialog_*() functions. This shared object will work across different versions of Gtk. (Perhaps in the future, it could
be compiled into Gtk.)

The powerbox/Gtk integration does not change the older GtkFileSelection interface, which is deprecated and exposes too many
internal details to change.

How it works

The replacement GtkFileChooserDialog class inherits from GtkDialog, and hence from GtkWindow etc. However, it must prevent
GtkWindow from ever opening a window on the screen; this is done by the powerbox manager instead. GtkFileChooserDialog
achieves this by overriding the GtkWidget "map" method with code that does not pass the call on to GtkWindow. Instead, the "map"
method invokes the powerbox API.


Selecting multiple files in one go is not implemented yet.

The dialog's "extra widget" is not supported: it does not get displayed at all. This is a serious problem, as it means file type
selection widgets (as used by, e.g., Gimp) are not displayed. To fix this requires doing inter-client widget embedding. A
temporary fix would be to display the extra widget in a separate top-level window.

The preview widget is not supported. It does not get displayed.

Filters are not yet implemented. GtkFileChooserDialog does not pass file type filter parameters to the powerbox manager. This
is not a serious problem; it is just a convenience.

GtkDialog offers the application an unconstrained choice over the action buttons that appear in the dialog box, while the
powerbox always displays Open/Save and Cancel buttons. GtkFileChooserDialog must make a guess at which of GtkDialog's
actions to map the powerbox's Open/Save action to. If GtkDialog has more than one action besides Cancel, only one will be

Confirmation and warning dialog boxes:

Some applications implement file overwrite confirmation dialog boxes themselves. In this case, two confirmation boxes will
appear sequentially, because the powerbox also asks for confirmation.

Usually, when choosing a file results in the application opening a confirmation or warning box, the file chooser stays open, and
the user can then pick a different file. However, with powerbox/Gtk, the file chooser is closed when the user chooses a file.

The GtkFileChooserIface interface allows applications to interact with the file chooser widget while it is open -- for example,
to find out which file is currently selected. This will not work with the powerbox/Gtk integration, because the powerbox API is
a simple call-return interface, and it does not give the application any way to interact with the file chooser while it is open.

GtkFileChooserButton and GtkFileChooserWidget are not replaced.

Earlier version of the GtkFileChooserDialog replacement

My first attempt at a replacement GtkFileChooserDialog class did not inherit from the GtkDialog class, on the grounds that it did not
need to open a GtkDialog window itself. It inherited from GtkObject and nothing else.

This caused some problems, because applications expect GtkFileChooserDialog to inherit from GtkDialog (and so, indirectly, from
GtkWindow, GtkWidget, etc.) as documented. This did not cause applications to crash: the Gtk API functions just print a warning
and return if passed objects that don't belong to the expected class.

The old version is still included, as gtk-powerbox-noninherit.c.

Integrations: powerbox for Emacs/XEmacs

The powerbox.el Emacs Lisp module changes Emacs to use a powerbox. It replaces Emacs' read-file-name function, which usually
prompts for a filename using the minibuffer. This function is used when you type C-x C-f, or when you choose "Open" from the
"File" menu. The replacement read-file-name function opens a "File Open" powerbox instead.

To run XEmacs to use the powerbox, use:

pola-run --prog /usr/bin/xemacs -B -f /etc --cwd / \
  --powerbox --pet-name "XEmacs" --x11 \
  -a=-l -a=/usr/share/emacs/site-lisp/plash/powerbox.el

GNU Emacs 21 does not work under Plash. The CVS version (to be released as GNU Emacs 22) has reportedly been fixed.

The powerbox API

The powerbox manager is compiled into the pola-run program launcher. When the --powerbox option is used, pola-run will pass
the application being launched an object under the name powerbox_req_filename.

An application can invoke the object powerbox_req_filename to request a file from the user. In response, the powerbox manager
will open a file chooser. If the user selects a filename, the powerbox manager attaches the file (or file slot) into the application's file
namespace, and returns the filename to the application.

When the application invokes powerbox_req_filename, it can pass some arguments, such as:

Whether it wants a file or a directory.

Whether it is opening or saving a file.
A textual description of why it wants the file.
A start directory. The powerbox manager checks that this directory is already in the application's namespace, so that the
application can't confuse the user into granting it the wrong file.
The X window ID of the parent window. The powerbox manager uses this to mark the powerbox window as a child of its
parent, using the WM_TRANSIENT_FOR window property (a window manager hint).

There is a simpler helper program called powerbox-req which invokes the powerbox_req_filename capability. This is a command
line program which is used by the Emacs integration.

The --pet-name argument provides a name for the powerbox manager to put in the title bar of the powerbox, so the user can tell
which application the request comes from.

The powerbox manager uses Gtk's original GtkFileChooserDialog to provide a file chooser.

Mark Seaborn

pola-run: A command line tool for launching sandboxed programs

pola-run
--prog filename
[ -f[awls]... pathname
| -t[awls]... pathname pathname
| -a string
| --cwd dir
| --no-cwd
| --copy-cwd
]...
[-B] [--x11] [--net]
[--powerbox [--pet-name name]]

pola-run runs programs under the Plash environment. It starts a process with a subset of the caller process's authority. The caller can
grant the callee read-only or read-write access to specific files and directories, which can be mapped at any point in the callee's
filesystem namespace.

pola-run provides a programmatic interface. It is intended to be used from Bash shell scripts and other programs. Since it works via
the command line, it can be used from virtually any language, in contrast to a C API. pola-run is different from the Plash shell, which
does not provide a programmatic interface: Plash shell commands are intended to be written by hand, not generated automatically.

pola-run constructs two things to pass to the callee program: a list of argument strings (as received by main()), and a file namespace.
The -a option adds strings to the argument list, in order. The -f and -t options add files and directories to the file namespace. These
can be combined: -fa and -ta add an object to the file namespace and a corresponding filename to the argument list.

Unlike the Plash shell, pola-run grants access to no files by default. The -B option will grant access to a fairly safe but large
installation endowment, which includes /usr, /bin and /lib. (Currently this is different to the shell's default installation endowment,
which includes /etc as well.)

pola-run can be used to invoke executable objects, in which case the callee can have access to objects that the caller doesn't have
access to.

NB. The implementation is incomplete. Some things mentioned here aren't implemented yet.

For arguments that take one parameter, there are two forms to choose from: "--arg param" and "--arg=param". The first form may
be more convenient to generate in a C program. The latter may be more readable in hand-written shell scripts.
--prog filename

This gives the filename of the executable to be invoked. If it doesn't contain a slash, it will be looked up in PATH.

The filename is resolved in the callee's namespace, not the caller's namespace.

Note that if the executable is a "#!" script, access to the interpreter specified in the "#!" line will not be implicitly granted.
-f[awls]...[,option]... pathname

Grant access to a file or directory (given by pathname), and optionally add pathname to the argument list. This takes the
following single-letter options. The multi-character options must be separated by commas.

a
Add pathname to the argument list. Equivalent to adding the argument "-a pathname".

w
Grant read/write access to the slot specified by pathname. By default, without this option, read-only access is granted to
the file or directory specified by pathname.

l
Follow symbolic links (the "l" is for "foLLow"). If symbolic links are encountered when pola-run resolves pathname,
these links will be followed. When the symlink occurs at the last element in a pathname, this will cause pola-run to grant
access to both the symlink object and the destination object (or the slots containing them, when the w option is used). If
pathname resolves to a directory, this option does not affect the treatment of symlinks inside the directory.

s
Grant permission to create symbolic links. (Only relevant if w is used.)

objrw
Grant read/write access to the file or directory object, but not the slot, specified by pathname. This is useful for granting
access to writable devices such as /dev/null.

socket
Grant permission to connect to Unix domain sockets (as the w option does), but without granting write access to file and
directory objects.
-t[awls]...[,option]... dest-pathname source-pathname

Grant access to a file or directory (source-pathname), but attach it to a different pathname in the file namespace (dest-pathname).
Optionally add dest-pathname to the argument list. This takes the same options as -f.

Note that "-foptions pathname" is not exactly equivalent to "-toptions pathname pathname". -t will not introduce symlinks at
the directory components of dest-pathname, and it will fail if there are any symlinks in these locations.
-a string

Add string to the argument list.

--cwd dir
--no-cwd
--copy-cwd

These options set the current working directory (cwd) for the process.

--copy-cwd will use the calling process's cwd. This is the default.

--no-cwd unsets the cwd, so that using a pathname relative to the cwd will give an error.

--cwd sets the cwd to a directory given by a pathname.

In any case, if the directory's pathname does not exist in the namespace created for the process, the cwd will be left unset.

These options also affect how the pathnames in other arguments are interpreted. You can use multiple cwd arguments. An
argument pathname is resolved relative to the most recent one. The final cwd argument also sets the process's cwd.

-B

Grant access to a default installation endowment. This is equivalent to the options:

-fl /usr
-fl /bin
-fl /lib
-fl,objrw /dev/null
-fl,objrw /dev/tty


--x11

Grant access to the X11 Window System. This is currently equivalent to the options:

-fl,socket /tmp/.X11-unix/
-fl ~/.Xauthority

In the future, --x11 may work using an X11 proxy.


--net

This grants access to some files that are important for accessing the Internet. When Plash gains the ability to deny a process
access to the network, this option will gain the function of passing network access on to the callee.

Currently, this is equivalent to:

-fl /etc/resolv.conf
-fl /etc/hosts
-fl /etc/services


--powerbox

Grants the callee program an object "powerbox_req_filename" which can be used to open a file powerbox. A file powerbox is a
file chooser dialog box which can grant the callee program access to files. The powerbox will dynamically attach files or
directories into the callee program's file namespace.

Currently, it only makes sense to use this when the caller runs with the user's namespace, since the file chooser will display
directory contents for the caller's namespace. In the future, the --powerbox option will be able to pass on the caller's powerbox
request object rather than always creating a new one.
--pet-name name

This provides a name to use in the title bar of powerbox windows, so that the user can identify which application is making the
request.

In Bash:

gcc -c foo.c -o foo.o

In pola-shell:

gcc -c foo.c => -o foo.o + .

This would become:

pola-run -B --prog=gcc -a=-c -fa=foo.c -a=-o -faw=foo.o -f=.

Run Bash with the same filesystem namespace as the caller. This is useful for testing Plash:

pola-run -fw=/ --prog /bin/bash
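
Since pola-run is meant to be driven by other programs, the gcc example above can also be generated rather than typed. The following is a minimal Python sketch of that idea. The helper name pola_run_argv and the ("arg", ...) / ("file", ...) item format are invented for this illustration and are not part of Plash; only the -B, --prog, -a and -f options shown in the example are used, in their "--arg=param" form.

```python
import shlex

def pola_run_argv(prog, items, endowment=True):
    """Build an argument vector for pola-run (hypothetical helper).

    `items` is a list of tuples:
      ("arg", string)            -> -a=string
      ("file", flags, pathname)  -> -f<flags>=pathname, e.g. flags "aw"
    """
    argv = ["pola-run"]
    if endowment:
        argv.append("-B")  # default installation endowment
    argv.append("--prog=" + prog)
    for item in items:
        if item[0] == "arg":
            argv.append("-a=" + item[1])
        elif item[0] == "file":
            _, flags, pathname = item
            argv.append("-f" + flags + "=" + pathname)
        else:
            raise ValueError("unknown item kind: %r" % (item[0],))
    return argv

# The gcc invocation from the example above.
GCC_ITEMS = [
    ("arg", "-c"),
    ("file", "a", "foo.c"),    # readable, and added to the argument list
    ("arg", "-o"),
    ("file", "aw", "foo.o"),   # writable slot, and added to the argument list
    ("file", "", "."),         # readable, not added to the argument list
]

if __name__ == "__main__":
    print(shlex.join(pola_run_argv("gcc", GCC_ITEMS)))
```

The resulting list can be handed to a process-spawning call such as subprocess.run() on a machine where pola-run is installed; building a list rather than a single string avoids shell quoting problems in pathnames.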

Environment variables
pola-run sets the following environment variables:


LD_PRELOAD: pola-run must treat this specially to ensure that it is preserved across the invocation of run-as-anonymous (the fact that
this is a setuid executable usually causes LD_PRELOAD to be unset).

XAUTHORITY: pola-run looks at this when the --x11 option is used.


Plash's sandbox environment

Architecture overview
Plash limits the ability of a process to open files by running it in a chroot environment, under dynamically-allocated user IDs. The
chroot environment only contains one file, an executable to exec to start the program running in the process.

Rather than using the open() syscall to open files, the client process sends messages to a server process. One of the file descriptors
that the client is started with is a socket which is connected to the server. The environment variable PLASH_COMM_FD gives the
file descriptor number. The server can send the client open file descriptors across the socket in response to `open' requests (see

The server can handle multiple connections. If the client wishes to fork() off another process, it first asks the server to send it another
socket for a duplicate connection.
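
The exchange of `open' requests and file descriptors can be illustrated with a small Python sketch. This is not Plash's wire protocol: the message format (a bare pathname answered by "OK" or "ENOENT"), the function names and the dictionary used as a namespace are all assumptions made for the example. What it does show is the underlying mechanism, a descriptor sent as ancillary data over a Unix domain socket. It needs Python 3.9 or later for socket.send_fds() and socket.recv_fds().

```python
import os
import socket
import tempfile
import threading

def serve_one_open(conn, namespace):
    """Server side: answer one 'open' request received on `conn`.

    `namespace` maps the pathnames the client may use to real files
    (a stand-in for the server's file namespace).
    """
    requested = conn.recv(4096).decode()
    real_path = namespace.get(requested)
    if real_path is None:
        conn.sendall(b"ENOENT")
        return
    fd = os.open(real_path, os.O_RDONLY)
    try:
        # The descriptor travels as SCM_RIGHTS ancillary data.
        socket.send_fds(conn, [b"OK"], [fd])
    finally:
        os.close(fd)  # the client receives its own copy

def client_open(conn, pathname):
    """Client side: ask the server to open `pathname`, return the FD."""
    conn.sendall(pathname.encode())
    reply, fds, _flags, _addr = socket.recv_fds(conn, 1024, 1)
    if reply != b"OK" or not fds:
        raise FileNotFoundError(pathname)
    return fds[0]

def demo():
    """Open a file through the server and read it via the received FD."""
    server_end, client_end = socket.socketpair()
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write("hello from the server\n")
    namespace = {"/greeting": f.name}
    try:
        server = threading.Thread(target=serve_one_open,
                                  args=(server_end, namespace))
        server.start()
        fd = client_open(client_end, "/greeting")
        server.join()
        with os.fdopen(fd) as received:
            return received.read()
    finally:
        os.unlink(f.name)
        server_end.close()
        client_end.close()
```

Note that the client never learns the real pathname of the file: it only sees the name it asked for and the descriptor it got back.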

GNU libc is re-linked so that open() etc. send requests to the server rather than using the usual Unix system calls. The dynamic linker
(/lib/ or, equivalently, /lib/ is similarly re-linked. execve() is changed so that it always invokes the dynamic linker
directly, since the chroot environment does not contain the main executable and the kernel does not provide an fexecve() system call.
The dynamic linker is passed the executable via a file descriptor.

The file server uses its own filesystem object abstraction internally. Filesystem objects may be files, directories or symbolic links on
the underlying filesystem provided by the Unix kernel. They may also be implemented entirely in the server. The server has its own
functions for resolving pathnames and following symbolic links which do not use the kernel's facility for following symbolic links.

The shell starts up a new server process for each command the user enters. The shell and the file server are linked into the same
executable and the shell uses the same filesystem object abstraction. The shell simply uses fork() to start a new server.

User IDs are allocated by the setuid program run-as-anonymous. It picks IDs in the range 0x100000 to 0x200000 (configurable by
changing, and opens lock files in the lock directory /usr/lib/plash-chroot-jail/plash-uid-locks so that the same
UID is not allocated twice. The lock directory goes inside the chroot jail so that the sandboxed processes can also spawn processes
with reduced authority (though this is not done yet). Therefore `chroot-jail' needs to go on a writable filesystem, so you may need to
move it.

The setuid program gc-uid-locks will garbage collect and remove UID lock files for UIDs that are no longer in use. It works by
scanning the `/proc' filesystem to list currently-running processes and their UIDs. When the shell starts, it runs gc-uid-locks.
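
The idea of using lock files to avoid allocating a UID twice can be sketched as follows. This is an illustration under assumptions, not the code of run-as-anonymous or gc-uid-locks: the lock file naming (the decimal UID), the use of O_CREAT | O_EXCL, the half-open range and the function names are all invented here. The garbage collector takes the set of UIDs in use as a parameter, whereas the real tool obtains it by scanning `/proc'.

```python
import os
import tempfile

UID_MIN = 0x100000
UID_MAX = 0x200000  # treated as exclusive in this sketch

def allocate_uid(lock_dir, uid_min=UID_MIN, uid_max=UID_MAX):
    """Claim the first free UID by creating its lock file."""
    for uid in range(uid_min, uid_max):
        lock_path = os.path.join(lock_dir, str(uid))
        try:
            # O_CREAT | O_EXCL fails atomically if the file already exists,
            # so two allocators cannot both claim the same UID.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        except FileExistsError:
            continue
        os.close(fd)
        return uid
    raise RuntimeError("no free UIDs in range")

def gc_uid_locks(lock_dir, uids_in_use):
    """Remove lock files for UIDs that no running process is using."""
    removed = []
    for name in os.listdir(lock_dir):
        if name.isdigit() and int(name) not in uids_in_use:
            os.unlink(os.path.join(lock_dir, name))
            removed.append(int(name))
    return sorted(removed)

def demo():
    lock_dir = tempfile.mkdtemp()
    first = allocate_uid(lock_dir)
    second = allocate_uid(lock_dir)
    # Pretend that only `second` still has running processes.
    removed = gc_uid_locks(lock_dir, uids_in_use={second})
    reused = allocate_uid(lock_dir)
    return first, second, removed, reused
```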

glibc library calls and whether they are altered by Plash

Intercepted and reimplemented entirely:
    open, mkdir, symlink, unlink, rmdir, stat, lstat, readlink, rename, link, chmod, utimes, chdir, fchdir, getcwd,
    opendir/readdir/closedir, getuid/getgid

Intercepted but reimplemented using the original system call:
    fork -- duplicates the connection to the server first
    execve -- invokes execve syscall on dynamic linker directly
    connect, bind, getsockname -- changed for Unix domain sockets
    fstat -- changed for directory FDs
    close, dup2 -- changed to stop processes overwriting or closing the socket FD that is used to communicate with the server

Not intercepted:
    read, write, sendmsg, recvmsg, select, dup, kill, wait, getpid (and others)

Symbolic links
If we pass a directory as an argument to a program, it may contain symbolic links to anywhere. Since processes may now have
different namespaces, we have a choice of namespaces in which to resolve the destinations of the symbolic links. Do we resolve them
in the user's namespace, or the process's namespace?

If we resolve symlinks in the user's namespace, and we allow the process to create symlinks to arbitrary destinations, it could create a
symlink to `/' and thereby grant itself access to all of the user's filesystem. Instead, we could try to restrict the ability of a process to
create symlinks, so that it can only create symlinks to files and directories that it already has access to. But since symlinks are
interpreted relative to their position in the filesystem, which can change, it would be difficult to make this robust. Furthermore, the
problem of pre-existing symlinks remains. A user should be able to tell what files and directories they're granting access to based on
the command invocation. Granting access also to files and directories that are symlinked to, perhaps from deep inside a directory,
violates this, because there is little constraint on the destinations of symlinks.

Resolving symlinks in the process's namespace makes more sense. It follows the normal semantics of symlinks under Unix, which is
that symlinks are simply a convenience that *could* be implemented by the process itself rather than by the kernel.

Ultimately, the solution is to do away with symbolic links and replace them with object references.

If we are to implement these semantics, we must be careful not to use the kernel's ability to follow symlinks. There is not a
straightforward option for turning off following symlinks in the underlying filesystem. When we give a pathname such as `a/b/c' to
the kernel, if `a/b' is a symbolic link the kernel will always follow it, interpreting it in its namespace.

The approach used in the file server is to set the current working directory to each component of the pathname in turn. For each
component, do:

lstat() on the leaf name. If it's a symlink, do readlink() and interpret the link.

Otherwise, if it's a directory, do open(leaf, O_NOFOLLOW | O_DIRECTORY). If O_NOFOLLOW or O_DIRECTORY are
not supported, we can do fstat() to check that the object opened is the same as the one we lstat()'d (it may have changed
between the system calls).

Do fchdir() to set the current directory to the directory.

Obviously this requires more system calls than allowing the kernel to resolve symlinks.
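
The per-component walk described above can be sketched in Python, whose os module exposes the same system calls. This is an approximation rather than the file server's code. The function name resolve_dir and the rule that absolute symlink targets restart at a caller-supplied root are assumptions for the example. The sketch also rejects `..', always performs the fstat() comparison, and only resolves directories.

```python
import os
import stat
import tempfile

def _components(path):
    return [c for c in path.split("/") if c not in ("", ".")]

def resolve_dir(root, pathname, max_links=32):
    """Walk `pathname` below `root` one component at a time.

    Symlinks are read with readlink() and interpreted here, never by the
    kernel. Returns an open FD for the directory that was reached.
    """
    saved_cwd = os.open(".", os.O_RDONLY | os.O_DIRECTORY)
    try:
        os.chdir(root)  # `root` is supplied by the trusted caller
        pending = _components(pathname)
        links_seen = 0
        while pending:
            leaf = pending.pop(0)
            if leaf == "..":
                raise ValueError("'..' is not handled in this sketch")
            st = os.lstat(leaf)
            if stat.S_ISLNK(st.st_mode):
                links_seen += 1
                if links_seen > max_links:
                    raise OSError("too many levels of symbolic links")
                target = os.readlink(leaf)
                if target.startswith("/"):
                    os.chdir(root)  # absolute links restart at our root
                pending = _components(target) + pending
                continue
            if not stat.S_ISDIR(st.st_mode):
                raise NotADirectoryError(leaf)
            fd = os.open(leaf, os.O_RDONLY | os.O_NOFOLLOW | os.O_DIRECTORY)
            try:
                opened = os.fstat(fd)
                # Guard against the entry being swapped between the calls.
                if (opened.st_dev, opened.st_ino) != (st.st_dev, st.st_ino):
                    raise OSError("object changed between lstat() and open()")
                os.fchdir(fd)
            finally:
                os.close(fd)
        return os.open(".", os.O_RDONLY | os.O_DIRECTORY)
    finally:
        os.fchdir(saved_cwd)  # restore the caller's working directory
        os.close(saved_cwd)

def demo():
    base = os.path.realpath(tempfile.mkdtemp())
    os.makedirs(os.path.join(base, "a", "g", "h"))
    os.symlink("g/h", os.path.join(base, "a", "b"))  # relative link
    os.symlink("/", os.path.join(base, "escape"))    # absolute link
    results = {}
    for name in ("a/b", "escape/a"):
        fd = resolve_dir(base, name)
        st = os.fstat(fd)
        os.close(fd)
        results[name] = (st.st_dev, st.st_ino)
    return base, results
```

In the demo, the link to `/' leads back to the top of the temporary tree rather than to the real root directory, which is the behaviour wanted when symlinks are resolved in the process's namespace.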

Note that the server must never send the clients FDs for directories. A client could use a directory FD to break out of its chroot jail.

Remaining problems

The Unix kernel can be regarded as providing a set of capability registers (file descriptors) that can contain directory object
references, along with a special capability register (the current working directory) relative to which pathnames are resolved.
References can be copied from a normal register to the special register using fchdir(). References can be copied from the special
register to the normal registers using open(".").
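
The two register-copying operations can be tried directly from Python, which wraps the same calls. The function names below are invented for the illustration.

```python
import os
import tempfile

def cwd_to_fd():
    """Copy the special register (the cwd) into a normal register (an FD)."""
    return os.open(".", os.O_RDONLY | os.O_DIRECTORY)

def fd_to_cwd(fd):
    """Copy a normal register (a directory FD) into the special register."""
    os.fchdir(fd)

def demo():
    first = os.path.realpath(tempfile.mkdtemp())
    second = os.path.realpath(tempfile.mkdtemp())
    original = cwd_to_fd()
    try:
        os.chdir(first)
        saved = cwd_to_fd()   # a reference to `first`
        os.chdir(second)
        fd_to_cwd(saved)      # return to `first` without naming it
        os.close(saved)
        return os.getcwd() == first
    finally:
        fd_to_cwd(original)   # leave the caller's cwd as it was
        os.close(original)
```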

Unfortunately, this model falls down in two places:

Directories with `execute' but not `read' permission cannot be opened with open(). One can chdir() into them, but not fchdir()
into them.

Arguably, Unix should let you open() such directories but not read their contents using the resulting FD.

This could be worked around, but no workaround is implemented yet.

link() is unusual in that it takes two pathname arguments. It is difficult to use safely (without the kernel following symlinks).
We have no guarantee that the source file (or destination) is the one we intended to link. Any check will be vulnerable to race
conditions.

The same applies to rename().

Under Plash, link() and rename() are only implemented for the same-directory case.

Parent directories: the semantics of dot-dot using dir_stacks

A directory may have different parent directories in different namespaces. Furthermore, a directory may appear multiple times in the
same namespace, and so have multiple parents in that namespace. `..' does not fit well into a system based on object references.
However, it is widely used by Unix programs, so we have to support it.

Rather than using the `..' parent directory facility provided by the underlying filesystem, the file server interprets `..' itself.

The semantics is that the parent of a directory is the directory that it was reached through, after symlinks have been expanded.

This means that the filename resolver maintains a stack of directory object references, called a dir_stack. When resolving the
pathname `/a/b/..', it will first push the root directory onto the stack, then directory objects for `/a' and `/a/b', and then it will pop `/a/b'
off the stack, leaving `/a' at the top of the stack as the result.

If `/a/b' is a symlink to `g/h', however, the filename resolver does not push `/a/b' onto the stack (since `/a/b' is not a directory object).
It pushes `/a/g' and then `/a/g/h' onto the stack. Then, when it interprets `..' in the pathname, it pops `/a/g/h' off the stack to leave `/a/g'
(the result) at the top.
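
The two cases above can be modelled with an in-memory tree. The following Python sketch is a toy model of the semantics, not the file server's implementation: the Dir and Symlink classes, the resolve() function and the treatment of the root as its own parent are assumptions made for the example, and the tree contains only directories and symlinks.

```python
class Dir:
    def __init__(self, name, entries=None):
        self.name = name
        self.entries = dict(entries or {})

class Symlink:
    def __init__(self, target):
        self.target = target

def _components(path):
    return [c for c in path.split("/") if c not in ("", ".")]

def resolve(root, pathname, cwd=None, max_links=32):
    """Return the dir_stack (a list of Dir objects) reached by `pathname`."""
    if pathname.startswith("/") or cwd is None:
        stack = [root]
    else:
        stack = list(cwd)  # relative paths start from a copy of the cwd stack
    pending = _components(pathname)
    links_seen = 0
    while pending:
        leaf = pending.pop(0)
        if leaf == "..":
            if len(stack) > 1:  # the root is treated as its own parent
                stack.pop()
            continue
        obj = stack[-1].entries.get(leaf)
        if obj is None:
            raise FileNotFoundError(leaf)
        if isinstance(obj, Symlink):
            links_seen += 1
            if links_seen > max_links:
                raise OSError("too many levels of symbolic links")
            if obj.target.startswith("/"):
                stack = [root]
            # The link itself is never pushed; its target is expanded instead.
            pending = _components(obj.target) + pending
        else:
            stack.append(obj)
    return stack

def stack_path(stack):
    """Render a dir_stack as the path it was reached through."""
    return "/" + "/".join(d.name for d in stack[1:])

def make_tree(b_is_symlink):
    """Build /a/b and /a/g/h, with `b' either a directory or a link to g/h."""
    g = Dir("g", {"h": Dir("h")})
    b = Symlink("g/h") if b_is_symlink else Dir("b")
    return Dir("", {"a": Dir("a", {"b": b, "g": g})})

if __name__ == "__main__":
    print(stack_path(resolve(make_tree(False), "/a/b/..")))
    print(stack_path(resolve(make_tree(True), "/a/b/..")))
```

Because a relative lookup starts from a copy of the current directory's stack, entering a subdirectory and then resolving `..' returns exactly the stack that was there before.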

The server represents the current working directory as one of these directory stacks. One of the consequences of these semantics is
that if the current working directory is renamed or moved, the result of getcwd() will not reflect this.

This approach means that doing:

chdir("leafname");
chdir("..");

has no effect (provided that the first call succeeds). This contrasts with the usual Unix semantics, where the "leafname" directory
could be moved between the two calls, giving it a different parent directory. This is partly why programs like "rm" use fchdir() -- to
avoid this problem.

Directory file descriptors

Plash supports open() on directories. It supports the use of fchdir() and close() on the resulting directory file descriptor.
However, it doesn't support dup() on directory FDs, and execve() won't preserve them.

Directory file descriptors require special handling. Under Plash, when open() is called on a file, it will return a real, kernel-level file
descriptor for a file. The file server passes the client this file descriptor across a socket. But it's not safe to do this with kernel-level
directory file descriptors, because if the client obtained one of these it could use it to break out of its chroot jail (using the kernel-
level fchdir system call).

A complete solution would be to virtualize file descriptors fully, so that every libc call involving file descriptors is intercepted and
replaced. This would be a lot of work, because there are quite a few FD-related calls. It raises some tricky questions, such as what
bits of code use real kernel FDs and which use virtualised FDs. It might impact performance. And it's potentially dangerous: if the
changes to libc failed to replace one FD-related call, it could lead to the wrong file descriptors being used in some operation, because
in this case a virtual FD number would be treated as a real, kernel FD number. (There is no similar danger with virtualising the
system calls that use the file namespace, because the use of chroot() means that the process's kernel file namespace is almost
entirely empty.)

However, a complete solution is complete overkill. There are probably no programs that pass a directory file descriptor to select(),
and no programs that expect to keep a directory file descriptor across a call to execve() or in the child process after fork().

So I have adopted a partial solution to virtualising file descriptors. When open() needs to return a virtualized file descriptor -- in this
case, for a directory -- the server returns two parts to the client: it returns the real, kernel-level file descriptor that it gets from opening
/dev/null (a "dummy" file descriptor), and it returns a reference to a dir_stack object (representing the directory).

Plash's libc open() function returns the kernel-level /dev/null file descriptor to the client program, but it stores the dir_stack object
in a table maintained by libc. Plash's fchdir() function in libc consults this table; it can only work if there is an entry for the given
file descriptor number in the table.

Creating a "dummy" kernel-level file descriptor ensures that the file descriptor number stays allocated from the kernel's point of view.
It provides a FD that can be used in any context where an FD can be used, without -- as far as I know -- any harmful effects. The
client program will get a more appropriate error than EBADF if it passes the file descriptor to functions which aren't useful for
directory file descriptors, such as select() or write().
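
The table that libc keeps can be modelled in a few lines of Python. This is a simplified sketch: the class and method names are invented, the dir_stack is any Python object, and the table opens /dev/null itself, whereas in Plash the dummy descriptor and the dir_stack reference both come from the server.

```python
import os

class DirFdTable:
    """Client-side table mapping dummy FD numbers to dir_stack objects."""

    def __init__(self):
        self._dir_stacks = {}
        self.cwd = None  # the dir_stack currently acting as the cwd

    def open_dir(self, dir_stack):
        # A real descriptor for /dev/null keeps the FD number allocated
        # from the kernel's point of view.
        fd = os.open("/dev/null", os.O_RDONLY)
        self._dir_stacks[fd] = dir_stack
        return fd

    def fchdir(self, fd):
        # Only FDs with a table entry stand for directories.
        if fd not in self._dir_stacks:
            raise NotADirectoryError(fd)
        self.cwd = self._dir_stacks[fd]

    def close(self, fd):
        self._dir_stacks.pop(fd, None)
        os.close(fd)

def demo():
    table = DirFdTable()
    fd = table.open_dir(["/", "/a"])
    os.fstat(fd)  # the kernel sees an ordinary open descriptor
    table.fchdir(fd)
    table.close(fd)
    return table.cwd
```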

Why not do interception of system calls using, for example, ptrace?

Another way to do what Plash does is to intercept system calls.

One way to do this is to use the ptrace mechanism, which is available in standard versions of Linux. Using ptrace, all the syscalls a
process makes can be handled by another process. The problems with ptrace are security and performance. Firstly, fork() cannot be
handled securely with ptrace. Secondly, redirecting system calls with ptrace is slow, and it can't be done selectively: ptrace doesn't let
you redirect some syscalls (such as "open") while letting others through (such as "read"). (See David Wagner's Master's thesis,
"Janus: an approach for confinement of untrusted applications".)

systrace provides a mechanism that is similar to ptrace. It provides better performance, because it allows system calls to be
intercepted selectively. It allows race-free handling of fork(). However, it is not part of standard releases of Linux. Using it requires
recompiling your kernel and rebooting. Plash is intended to be immediately usable without recompiling your kernel. That said, it
would be useful to add systrace support to Plash in addition to its current approach.

Ostia provides a different mechanism for intercepting system calls. Rather than redirecting a system call to a second process, it will
bounce a system call back to the process that issued it. Then, much like in Plash, the process makes the request via a socket. This
approach is simpler than systrace. Unlike Plash, it doesn't require modifying libc. A separate library handles the syscalls that get
bounced back. Ostia is implemented by a Linux kernel module. Unfortunately, the code is not publicly available. (See "Ostia: A
Delegating Architecture for Secure System Call Interposition" by Tal Garfinkel, Ben Pfaff and Mendel Rosenblum, 2004.)

Plash could benefit by using syscall interception. Using chroot and UIDs, Plash is able to control a process's ability to access the
filesystem and interfere with other processes. However, Plash does not prevent a process from connecting to or listening on network
sockets. This could be done if there was a way for Plash to prevent a process from doing connect() and bind() system calls.

How does Plash compare with chroot jails?

Plash provides functionality similar to chroot(). The Linux kernel's chroot() system call can be used to run a program in a different
file namespace (i.e. root directory). chroot jailing is a well-known technique, though not used very frequently due to its limitations.

The facilities for creating new namespaces for use with chroot are limited. You can only put individual files into the chroot
environment by copying or hard linking them. It's not possible to grant read-write access to individual directory entries. Though you
can't hard link directories, you can put directories into a chroot environment using "mount --bind", but this can't be used to grant only
read-only access to a directory.

chroot environments are heavyweight. It is not practical to create one for every invocation of a program. To do so, you would have to
delete the copied files and directories, and remove any mount point entries, when the process you started had finished. If a program
starts child processes, it's hard to tell when this is. As a result, chroot environments are usually static.

Furthermore, the chroot() call is only available to the root user. (This is a consequence of the way chroot() interacts with setuid

Plash implements its security using a chroot environment, but this is largely just an implementation detail. Plash uses chroot() to take
authority away from a process, but it uses file descriptor passing to give limited authority back to the process.

Plash moves the interpretation of filenames so that it is done in user space. It allows directories to be implemented in user space. This
allows the creation of file namespaces to be more flexible. Files, directories and directory entries (slots) can be mapped anywhere in a
directory tree. Since the directory tree for a file namespace is stored in a server process, tidying up is simple: the server process exits
when no clients are connected to it.


FAQs: frequently asked questions

If Plash relies on replacing libc, doesn't this mean that processes can get around the access restrictions by making system calls
directly?

No. Plash takes away a process's authority by putting it in a chroot() jail and running it under a freshly-allocated user ID. This
stops all of the filename-related system calls from doing anything much.

The modified glibc is not used for taking away authority. It is only used for giving authority back via a different channel. glibc
will communicate with a server process via a socket; this is how the filename-related Unix calls such as open() are
implemented. The server can send the process file descriptors via the socket.

If a sandboxed program bypasses glibc, it will only be able to see the contents of the chroot jail. If you link a sandboxed
program with the regular glibc, it probably won't work.

Why don't you intercept libc calls using an LD_PRELOADed library rather than using a replacement

Plash needs to be able to intercept all calls to functions such as open(). Using an LD_PRELOADed library can only replace
open() as seen from outside of libc. It is not able to replace libc's internal uses of open(). These include:

fopen() calls open()

calls to open() to read configuration files such as /etc/hosts and /etc/resolv.conf
calls to open() to read locale data

Other tools, such as fakeroot, fakechroot and cowdancer, use LD_PRELOADed libraries which replace fopen() as well as
open(), but they are not able to handle the other cases.

It used to be that you could intercept libc's internal calls by defining __open and __libc_open in an LD_PRELOADed
library. But newer versions of glibc resolve these symbols at static linking time now, so you can't. This was changed for
efficiency, so that there are fewer relocations to do at dynamic link time, and so that the calls don't have to go through a jump
table. There are also some cases in libc where the "open" syscall is inlined, such as when using open_not_cancel (a macro).

More importantly, Plash needs to replace the dynamic linker so that it doesn't use the "open" syscall, and you
can't do that with LD_PRELOAD.

Why don't you use Linux's ptrace() syscall to intercept system calls instead?

Firstly, performance: ptrace() is slow, because it intercepts all system calls, and the monitor process can only read the traced
process's address space one word at a time. In contrast, Plash does not need to intercept frequently-used calls such as read()
and write() at all.

Secondly, ptrace() can only be used to allow or block a traced process's system calls. This leads to TOCTTOU (time-of-check
to time-of-use) race conditions when checking whether to allow operations using filenames, particularly when symlinks are
involved. ptrace() on its own does not let us virtualize the file namespace, as Plash does.
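
The race described above can be illustrated with a small, self-contained sketch. It is not how ptrace() or Plash works internally; it only models a monitor that checks a symlink's target and then lets the open go ahead, with an "attacker" step running in the gap. The function names and file names are made up for the illustration.

```python
import os
import tempfile


def check_then_open(link_path, allowed_target, swap):
    """Check where link_path points, then open it; `swap` runs in between."""
    # Time of check: the monitor resolves the symlink and approves the request.
    approved = os.path.realpath(link_path) == os.path.realpath(allowed_target)
    # Window between check and use: another process may retarget the symlink.
    swap()
    # Time of use: the kernel resolves the (possibly changed) symlink again.
    with open(link_path) as f:
        return approved, f.read()


def demo():
    """Set up two files and a symlink, then swap the symlink mid-check."""
    with tempfile.TemporaryDirectory() as d:
        allowed = os.path.join(d, "allowed.txt")
        secret = os.path.join(d, "secret.txt")
        link = os.path.join(d, "link")
        with open(allowed, "w") as f:
            f.write("public")
        with open(secret, "w") as f:
            f.write("private")
        os.symlink(allowed, link)

        def swap():
            # The attacker repoints the link after the check has passed.
            os.remove(link)
            os.symlink(secret, link)

        return check_then_open(link, allowed, swap)
```

Here the check approves the request, yet the data read comes from the file that was never approved. Passing an already-opened file descriptor instead of a filename avoids the second lookup.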

Systrace addresses some of the problems of ptrace(), but it is not included in mainline versions of the Linux kernel.

It used to be that there was a race condition in which a newly forked process would not initially be traced, which meant that
ptrace() was not secure for sandboxing programs that need to use fork(). I believe this has now been fixed.


pola-shell: A shell for interactive use

Differences from Bash: examples

The syntax of pola-shell is similar to Unix shells such as the Bourne shell or Bash. Here are some examples of command invocations
using pola-shell:

ls .

Arguments that were implicit before must now be made explicit. With the Bourne shell or Bash you can write `ls' to list the current
directory's contents. With pola-shell you must add `.' to grant access to the current directory.

gcc -c foo.c => -o foo.o

Files are passed to the program as read-only by default. Adding the `=>' operator to a command invocation allows you to grant write
access to a file. Files that appear to the right of `=>' are passed to the program with write access.

Directories to the left of `=>' will be passed as recursive (or transitive) read-only: files and directories that they contain will also be read-only.

make + => .

If you want to grant access to a file or directory without passing the filename as an argument, you can use the `+' operator. Files that
appear to the right of a `+' are attached to the namespace of the process being run, but the filename is not included in the argument list.

The `=>' operator binds more tightly than `+'.

echo "Hello, world!"

The shell distinguishes between filename arguments and plain string arguments so that it can tell which files to grant access to.
Arguments beginning with a hyphen (`-') are interpreted as plain strings, but otherwise you must quote arguments to prevent them
from being interpreted as filenames.

tar -cvzf { => foo.tar.gz } dir1

If you want to put a read-write file before a file that should only be read-only in the argument list, you can limit the scope of the `=>'
operator by enclosing arguments in curly brackets { ... }.

xclock + ~/.Xauthority => /tmp/.X11-unix

You can run X Windows programs if you give them access to ~/.Xauthority, which contains a password generated by the X server,
and /tmp/.X11-unix, which contains the socket for connecting to the X server. Programs must be given write access to a socket in
order to connect to it.

grep 'pattern' file | less

Pipes work as in conventional shells.


If you want to execute a command in the conventional way, without running the process with a virtualised filesystem, in a chroot jail,
etc., you can prefix it with "!!". This can be applied to individual command invocations in a pipeline. The syntax for command
invocations is the same whether "!!" is used or not, but when it is used, files listed after the "+" operator are ignored.

cd directory

Changing directory works as before.

Bourne shell features missing from pola-shell

The following features are provided in the Bourne shell and Bash but not in pola-shell.

Environment variables. pola-shell doesn't provide any way to set environment variables, and it doesn't perform any substitution
of environment variables in arguments.
Backtick substitution.
Loops and conditionals. "if" and "while" and "for ... in" are not provided. Shell functions are not provided.
The "&&" and "||" operators.
Here-documents (ie. redirecting input using something like "<<EOF").

Installation endowment
A program's installation endowment is the set of files, directories and other objects that it needs and should have access to regardless
of the parameters you give to the program. It consists of libraries, configuration files, other executables -- and for interpreted
programs, source files. These are files that are in a sense "part of" the program.

On Unix, pola-shell can't tell exactly what the installation endowment of a program is or should be. Unix does not have this
information, because programs are usually given access to everything the user can access.

So, pola-shell has a default installation endowment. It grants read-only access to the directories /bin, /lib, /usr and /etc, and also
read-write access to the device files /dev/null and /dev/tty.

The default installation endowment is not configurable yet. However, you can change it on a per-program basis by using executable objects.

Enabling access to the X11 Window System

The shell has an option for automatically granting programs access to the X11 Window System. When this is switched on, a
command such as:

xpdf foo.pdf

is equivalent to:

xpdf foo.pdf + ~/.Xauthority => /tmp/.X11-unix

This option is switched off by default because X11 is not secure! X servers provide no isolation between the clients connected to
them. One client can read keyboard input that goes to other clients, grab the screen's contents, and feed events to other clients,
including keypress events. So potentially, an X client can gain the full authority of the user.

The solution to this will be to write a proxy, through which an X client will connect to the X server, which will prevent it from
interfering with other X clients.

How to switch on this option (short version):

Either: From the shell, enter:

plash-opts /x=options 'enable_x11' 'on'

Or: To enable it for all shell sessions, you can create a file "~/.plashrc" containing this (note the semicolon):

plash-opts /x=options 'enable_x11' 'on';

and start pola-shell with the command:

pola-shell --rcfile ~/.plashrc

(In order to make it as predictable as possible, pola-shell doesn't read any options files by default, so you have to specify options files explicitly.)

Job control

As with other shells, you can start a job in the background by putting "&" at the end of the command. Or, having run a job in the
foreground, you can suspend it by pressing Control-Z.

The command "fg <N>" puts a job in the foreground. <N> is a job number; it is not prefixed with a "%", unlike in Bash.

The command "bg <N>" resumes a job in the background.

There is no command for listing the currently active jobs yet.

Shell scripts
pola-shell has only rudimentary support for shell scripts. You can get the shell to run a script file on startup with the --rcfile option.
The "source file" command will run file as a script.

Each command in the script file must be terminated with a semicolon ";".

pola-shell doesn't accept any leading space in the script. This is a bug.

There is no error handling: if a command exits with a non-zero return code, it doesn't stop the script.

By default, the shell does not read any scripts on startup.

--rcfile file

Executes the given script on startup. Does not switch off interactive mode.

-c command

Execute the given command, and then exit. Disables interactive mode.

Argument lists
arglist1 => arglist2

By default, files and directories are passed as read-only. The "=>" operator lets you pass files and directories with read-write
access. Objects to the right of "=>" are passed as read-write slots, so the object doesn't have to exist in advance.
arglist1 + arglist2

Files and directories that appear to the right of the "+" operator are not included in the argument list (the one used in execve()),
but they are attached into the file namespace of the process.

Arguments that are not filenames should be quoted, unless they begin with '-'.

pathname=expr

You can attach objects to arbitrary points in the file namespace. Here, expr typically evaluates to a file, directory, or executable
object. This will include pathname in the argument list.
{ arglist }

You can limit the scope of "+" or "=>" using curly brackets.
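
As a rough illustration of the rules above, the following sketch classifies an already-tokenised argument list. It is one reading of the documented behaviour, not pola-shell's actual parser: it ignores { } scoping, pathname=expr and IO redirection, and the function name and the "ro"/"rw" labels are invented for the example.

```python
def classify_arguments(tokens):
    """Split pre-tokenised arguments into (argv, grants).

    argv is the list that would be passed to execve(); grants maps each
    filename to "ro" (read-only) or "rw" (read-write slot).
    """
    argv = []
    grants = {}
    in_argv = True    # False after "+": grant access but keep out of argv
    writable = False  # True after "=>": grant read-write access
    for tok in tokens:
        if tok == "+":
            # "=>" binds more tightly than "+", so "+" starts a fresh group.
            in_argv = False
            writable = False
        elif tok == "=>":
            writable = True
        elif tok.startswith("-"):
            # Arguments starting with a hyphen are plain strings.
            if in_argv:
                argv.append(tok)
        elif len(tok) >= 2 and tok[0] == tok[-1] and tok[0] in "'\"":
            # Quoted arguments are plain strings; drop the quotes.
            if in_argv:
                argv.append(tok[1:-1])
        else:
            # Anything else is a filename: grant access to it.
            grants[tok] = "rw" if writable else "ro"
            if in_argv:
                argv.append(tok)
    return argv, grants
```

For example, the arguments of `gcc -c foo.c => -o foo.o` yield all four words in argv, with foo.c granted read-only and foo.o granted read-write, while the arguments of `make + => .` yield an empty argv and a read-write grant for the current directory.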

IO redirection. You can change the file descriptors that are passed to the process.

Commands

cd pathname

Sets the current directory.

fg job-number

Puts the given job in the foreground. (Job numbers are not prefixed with `%', unlike in Bash.)
bg job-number

Puts the given job in the background.

def var = expr

Binds the object reference returned by the expression to a variable.


var

Returns the object reference that is bound to the variable.

F pathname

Returns the file or directory object at the given path. Will follow symbolic links.

mkfs args...

This expression returns a fabricated directory object containing the files listed in args. The object resides in a server process
started by the shell.

args is processed in the same way as argument lists to commands, so read-only access will be given for files that are listed
unless "=>" is used, and objects can be attached at points in the directory tree using path=expr.

capcmd command args...

This built-in expression is similar to a normal command invocation, except that it expects the resulting process to return an
object reference as a result. The shell passes the process a return continuation argument (return_cont; see the PLASH_CAPS
environment variable), which the process invokes with the result.

This expression doesn't wait for the process to exit: the process will typically act as a server and stay running in the
background to handle invocations of the object that it returned.

If the process drops the return continuation without invoking it (which will happen if it exits without passing the reference on),
the expression results in an error.


Executable objects: a replacement for setuid executables


Plash extends the concept of executables -- which are anything that can be invoked via Unix's execve() call -- so that in addition to
executable data files, you can have executable objects. In this case, execve() works by invoking the object via a method call.
Executable objects can be attached to the filesystem tree and unmodified Unix programs can call them. Executable objects can be
constructed from Unix programs as well.

The executable objects feature allows for fine-grained control over how processes are constituted, in particular their file namespaces.

This is similar to chroot() environments under Linux. chroot() also allows a process's root directory (its file namespace) to be
changed. It can be used to run different Linux distributions on the same machine, change the libraries a program dynamically links
with, etc. However, Linux has only limited, heavyweight mechanisms for creating file namespaces. Plash's mechanisms are
lightweight, flexible, and not restricted to the superuser, and Plash can treat the files that a program receives as arguments separately
from its library files and configuration files.

Applying POLA to argument files and other files

We can divide the files that a process uses into two sets, Arg and Env. Arg is the set of files that are passed as parameters to the
program. Env consists of the remaining files: libraries, configuration files, files that would be installed by a package manager -- files
that the program is in some sense "linked with". Plash has always provided control over the Arg set, applying the Principle of Least
Authority (POLA) to it. However, Plash has a default setting for the contents of the Env set. The use of executable objects lets you
change that default on a per-program basis and apply POLA to all the files a program accesses.

By default, Plash maps the system's "/usr", "/lib", "/bin" and "/etc" directories (as read-only) into the file namespace of processes that
it starts, along with "/dev/null" and "/dev/tty" (as read-write) -- this is the default Env set. Any other files or directories are mapped
into the file namespace if and only if they are listed on the command line -- this is the Arg set.

In this default mode of operation, POLA is applied to files in the user's home directory, but not to system files. The programs you use
do not have to be declared in advance. This way, Plash can be used almost as a drop-in replacement for non-POLA shells like Bash.
You can run Unix programs using command lines that are not too different from their equivalents under Bash, because Plash's default
Env set covers most of the program's actual Env set.

We can do a bit better if we are prepared to declare a program before using it, in order to provide some information about the
program that is not provided in a Unix installation. Plash lets you create an executable object and bind it to a variable, specifying the
Env portion of the program's file namespace. Given this control, you can include files that are not in Plash's default (such as
configuration files in your home directory) or leave out files that are in Plash's default -- this helps get back the convenience of a non-
POLA shell such as Bash while providing better security. Perhaps more importantly, you can control not only whether a filename is
mapped in the namespace, but what file it maps to -- this provides something you couldn't do before.

It is possible to install two Linux distributions on one computer, and run one inside the other in a chroot() environment. However, the
interoperability between these two sets of programs is very limited. Linux doesn't normally provide a fine-grained mechanism for
granting a program access to files outside its chroot() environment, and the mechanisms for creating chroot() environments are
limited: you can hard link files (but not directories, and not across partitions), and you can use "mount --bind" on directories (but not
individual files). Furthermore, the chroot() call is only available to the superuser. It's difficult enough to use this for a couple of
installed distributions; to do it on a per-program basis is totally impractical.

In contrast, Plash provides lightweight mechanisms for creating file namespaces (which are simply directories, although they do not
have to be stored on a Linux filesystem). Executable objects can be self-contained and provide their own execution environment,
which allows for better interoperability between programs: a process can invoke an executable object which uses a different file
namespace (root directory) to the caller for files in its Env set, yet the executable object can receive its Arg set from the caller.

Invocations between programs

This document mainly focuses on applying POLA when the user invokes an executable using the shell. It doesn't give much attention
to the cases in which one program invokes an executable using execve(): in this case, we desire that the caller apply POLA and not
pass too much authority on to the callee, and we desire that the callee not be confusable. If the caller doesn't apply POLA and the
callee is confusable -- which will be true if they are unmodified Unix programs -- and if the two have Env sets that clash -- that is, the
same filename maps to different files in each -- then we have some basic workability problems, not just security problems.

I hope to discuss these problems, and some solutions, in a forthcoming document.

I'll look at creating an executable object for the Unix command line program `oggenc', which encodes WAV files as Ogg Vorbis files.
(Ogg Vorbis is like the MP3 format, but a bit smaller and free of patent problems.) To invoke `oggenc' with Plash you might do:

oggenc input_file.wav => -o output_file.ogg


In this case, the resulting process's file namespace will contain:

/usr/bin/oggenc (read-only)
/usr, /lib, /bin, /etc (read-only)
/dev/null (read-write)
under the pathname of the current working directory: input_file.wav (read-only), output_file.ogg (read-write slot)
/dev/tty (read-write)
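
The namespace listed above can be modelled as a mapping from pathnames to access modes. The sketch below is only an illustration of how the default Env set and the Arg set combine; the function name, the mode labels and the example working directory are assumptions, and it does not add the resolved path of the command itself.

```python
import posixpath

# Default Env set, as described in the text.
DEFAULT_ENV = {
    "/usr": "ro",
    "/lib": "ro",
    "/bin": "ro",
    "/etc": "ro",
    "/dev/null": "rw",
    "/dev/tty": "rw",
}


def build_namespace(cwd, read_only=(), read_write=(), env=DEFAULT_ENV):
    """Combine an Env set with Arg files named relative to cwd."""
    namespace = dict(env)
    for name in read_only:
        namespace[posixpath.join(cwd, name)] = "ro"
    for name in read_write:
        # Files to the right of "=>" are slots: they need not exist yet.
        namespace[posixpath.join(cwd, name)] = "rw-slot"
    return namespace


# The oggenc invocation above, run from a hypothetical /home/user/music:
ns = build_namespace(
    "/home/user/music",
    read_only=["input_file.wav"],
    read_write=["output_file.ogg"],
)
```

Passing a smaller `env` mapping corresponds to what an executable object does when it supplies its own Env set.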

However, it happens to be that `oggenc' doesn't need to access "/etc" or all of "/usr". We could define an executable object for running
`oggenc' that gives the program an execution environment containing less:

def my_oggenc =
capcmd exec-object '/usr/bin/oggenc'
/x=(mkfs /usr/bin/oggenc /usr/lib /lib)

[This needs to be entered on one line when using the shell interactively. Alternatively, you can put it in a file and load it with "source file".]

This will create an executable object and bind it to the variable "my_oggenc". To invoke the object, we use the same syntax as before:

my_oggenc input_file.wav => -o output_file.ogg


In this case, the resulting process's file namespace will contain:

/usr/bin/oggenc (read-only)
/usr/lib, /lib (read-only)
under the pathname of the current working directory: input_file.wav (read-only), output_file.ogg (read-write slot)
/dev/tty (read-write) [actually, not included in current version]

In the first invocation, "oggenc" is treated as a filename and searched for in PATH; in the second, "my_oggenc" is recognised by the
shell as a bound variable. The shell doesn't start a new process in this case, it just invokes the executable object that "my_oggenc"
is bound to. The shell creates a namespace from the arguments, which it passes to "my_oggenc", but it doesn't include "/usr", "/lib",
"/bin" and "/etc" as before -- the "my_oggenc" object is expected to provide the files it needs itself.

Suppose we don't want to install "oggenc" and the libraries it uses in our system's "/usr" directory. Maybe we don't have access to
that directory, because we don't have root access. Maybe we have older versions of those libraries in "/usr" which some other
program uses, and we don't want to risk messing that program up by upgrading its libraries. Maybe we just want to organise our files
differently from usual. Perhaps we are running RedHat, but a Debian distribution is installed under "/debian", and we want to use
Debian's version of `oggenc'.

def my_oggenc =
  capcmd exec-object '/usr/bin/oggenc'
    /x=(mkfs
         /usr/bin/oggenc=(F /debian/usr/bin/oggenc)
         /usr/lib=(F /debian/usr/lib)
         /lib=(F /debian/lib))

[NB. This requires that Plash is installed in the Debian distribution as well, so that Plash's modified libc will still be taken from
/usr/lib/plash/lib rather than /lib.]

These declarations still give `oggenc' a lot of files it doesn't need. We could give a tighter definition that lists exactly those files that
`oggenc' needs in its execution environment. `oggenc' is fairly simple: it doesn't use a huge number of dynamically-linked libraries,
and it doesn't need any configuration files.

Under Linux, we can find out the dynamic libraries that an executable file uses with the "ldd" command:

bash$ ldd /usr/bin/oggenc
 => /usr/lib/ (0x40028000)
 => /usr/lib/ (0x4009c000)
 => /lib/i686/ (0x400bb000)
 => /usr/lib/ (0x400dd000)
 => /lib/i686/ (0x42000000)
/lib/ => /lib/ (0x40000000)

[Run this under Plash using "!!ldd /usr/bin/oggenc".]

Given this information, we can make a new definition:

def my_oggenc =
capcmd exec-object '/usr/bin/oggenc'

[Future work will be to provide tools to help with constructing a definition like this.]

("/lib/" is the dynamic linker and doesn't need to be included.)

Suppose we want another program to be able to invoke `my_oggenc'. We can attach the object into a filesystem with a syntax like this:

bash + /my-bin/oggenc=my_oggenc

[NB. I don't use `/bin/oggenc=my_oggenc' because it's not yet possible to attach objects inside other attached directories, such as
`/bin/oggenc' inside `/bin', which is attached implicitly.]

This runs Bash with the pathname `/my-bin/oggenc' mapped to `my_oggenc'. You can then run `my_oggenc' from inside Bash. This
is a good way in general to test out the file namespaces that Plash creates.

The process replacement behaviour
Normally, execve() replaces the current process. Method calls don't and can't have that behaviour: the callee does not even have to
start a new process.

The modified libc is responsible for emulating the process replacement behaviour. execve() (and the other functions in the `exec'
family which use it) will test whether the filename it is given resolves to an executable object or a regular file. This test uses the
"Exep" method. Note that this is different to the shell: the shell chooses its behaviour according to whether the command name is a
bound variable or not.

If execve() is given an executable object, it invokes it (passing the root directory, file descriptors, etc.). When the method call returns,
this means the new process has exited; it gives the exit code. libc's execve() waits for the method call to return, and then exits, using
the same exit code.
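
The control flow described above can be sketched as follows. This is a simplified model rather than Plash's libc code: `ExecutableObject` and `emulated_execve` are hypothetical names, the isinstance() check stands in for the "Exep" test, and `real_execve` stands in for the ordinary process-replacing call.

```python
import sys


class ExecutableObject:
    """Hypothetical stand-in for an object that is invoked by method call."""

    def __init__(self, func):
        self._func = func

    def invoke(self, argv):
        # Returns only when the "new process" has exited, giving its exit code.
        return self._func(argv)


def emulated_execve(target, argv, real_execve):
    """Emulate process replacement for executable objects."""
    if isinstance(target, ExecutableObject):
        # Invoke the object, wait for the call to return, then exit with the
        # same exit code, so the caller appears to have been replaced.
        exit_code = target.invoke(argv)
        sys.exit(exit_code)
    # Regular file: fall through to the ordinary execve() behaviour.
    real_execve(target, argv)
```

The calling process stays alive for the whole invocation, which is the source of the drawbacks discussed next.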

Plash does not modify libc's wait() and waitpid().

This is slightly unsatisfactory in several respects (here P is the process that called execve() and X is the executable object it invoked):

It doesn't let P return the correct wait() status code to its parent when the process created by X dies with an unhandled signal
(such as SIGSEGV).

It doesn't let P notify its parent when the new process is stopped (by SIGSTOP or when the user presses Ctrl-Z).

kill() doesn't work as expected: it sends a signal to the process that is waiting, not the process it spawned.

There is an extra process hanging around, filling up the process table and taking up memory (and holding onto open file
descriptors -- though this could be fixed) but not doing much else.

The solution to this would be to modify wait() and waitpid(). This would not be too bad because they can only be used on child
processes. Modifying kill() as well would be trickier and less desirable, because it involves a global namespace of process IDs, and
we would like to avoid global namespaces.

Discovering file descriptors

libc's execve() finds out which file descriptor indexes a process has open simply by trying to dup() each index in turn, up to a high
index number. If your program uses FDs with big FD numbers (eg. >1000), this may cause problems. Although the Linux `proc'
filesystem can be used to find out what file descriptors a process has open, this is not available in the chroot() environment
that Plash uses to run programs, and there's no way to use it securely.
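
The probing technique can be sketched in a few lines. This is an illustration of the idea, not Plash's C implementation; the function name and the default limit of 1024 are assumptions.

```python
import errno
import os


def find_open_fds(limit=1024):
    """Return the open file descriptor numbers below `limit`."""
    open_fds = []
    for fd in range(limit):
        try:
            # dup() fails with EBADF if fd is not an open descriptor.
            copy = os.dup(fd)
        except OSError as e:
            if e.errno == errno.EBADF:
                continue
            raise
        # The duplicate was only a probe; close it straight away.
        os.close(copy)
        open_fds.append(fd)
    return open_fds
```

Any descriptor numbered at or above `limit` is missed, which is the problem with big FD numbers noted above.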

Garbage collection
exec-object will exit when the reference to the object it provides is dropped, and it has no more processes to handle.

Linux, job control, and TTY file descriptors
File descriptors for TTYs under Unix do not behave like capabilities in the sense that the kernel takes a process's "process group"
into account when the process does IO on a TTY file descriptor. This is part of the Unix job control mechanism. A process will be
stopped (with SIGTTIN) if it tries to read from a TTY when it is not part of the TTY's current process group.

I don't think this is a good design. So far, however, it has not been a problem because the processes started by `exec-object' can simply set
their process group ID to the one specified in the "exec" invocation. That lets them read input from the terminal.

However, processes also have a "session ID". Typically, the processes running under a given terminal window run in their own
distinct session. A process cannot set its process group ID to a process group that belongs to a different session. So if an exec-object
instance E, started from one terminal window W1, is invoked by a process in another terminal window W2, E won't be able to start a
process P that can read input from the user in W2, even if P has the appropriate TTY file descriptor. This may be a problem in the future.

I can see two ways around this:

Just arrange for all the relevant processes to be running under the same session ID. This would only work if we're not using
existing terminal emulators (xterm, gnome-terminal, etc.). It might not work at all.

Virtualise IO on file descriptors to use method calls on objects instead. There would be a lot of libc functions to modify in
order to do this properly, but this has other uses.

Job control
You can start a process via the shell using an object invocation, and you can stop the process by pressing Ctrl-Z, but the shell is not
informed that the process has been stopped, so the shell will not return control to the user and display a prompt.

This needs to be fixed. It is a deficiency in the specification of the "Exeo" method call.

exec-object limitations
exec-object doesn't provide any control over the arguments and environment variables it passes to the processes it starts.

exec-object doesn't start its child processes with a different UID, so the child process could kill it, ptrace() it, etc. (exec-object should
use "run-as-anonymous" like the shell does.)

Shell limitations
The shell does not provide a mechanism for sharing object references with other instances of the shell, with other users, or across the network.

The shell does not allow for recursive definitions using "def".

The shell only supports "capcmd CMD ARGS..." where CMD is an executable file, but not where CMD is an executable object, and
it doesn't support running CMD in the standard Unix way (as the `!!' syntax does).

A "capcmd !! CMD ARGS..." expression would allow the use of existing setuid executables from programs running under Plash.

A "capcmd VAR ARGS..." expression would make it possible to have a single process provide multiple executable objects, ie:

def factory_maker = capcmd factory-maker-maker

def echo = capcmd factory_maker '/bin/echo' ...
def ls = capcmd factory_maker '/bin/ls' ...


Communication protocols

Protocol for messages with file descriptors

Implemented by comms.c.

The first protocol is used to send messages over a socket. It simply divides the stream into messages. Each message may contain data
and file descriptors.

Each message comprises:

int32: "MSG!"
int32: size of data payload in bytes (not necessarily word-aligned)
int32: number of file descriptors
data payload, padded to word (4 byte) boundary
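
The framing above can be sketched as follows. This covers only the byte layout of the data part; the file descriptors themselves travel as ancillary data and are not shown. Treating "MSG!" as four ASCII bytes and using native byte order for the two integers are assumptions, as are the function names.

```python
import struct

MAGIC = b"MSG!"
HEADER_SIZE = 12  # magic + payload size + FD count, 4 bytes each


def pad4(n):
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


def pack_message(payload, n_fds=0):
    """Build one framed message around `payload`."""
    header = MAGIC + struct.pack("=ii", len(payload), n_fds)
    padding = b"\0" * (pad4(len(payload)) - len(payload))
    return header + payload + padding


def unpack_message(buf):
    """Split one message off the front of `buf`.

    Returns (payload, n_fds, rest), or None if buf does not yet hold a
    complete message.
    """
    if len(buf) < HEADER_SIZE:
        return None
    if buf[:4] != MAGIC:
        raise ValueError("bad message header")
    size, n_fds = struct.unpack_from("=ii", buf, 4)
    end = HEADER_SIZE + pad4(size)
    if len(buf) < end:
        return None
    return buf[HEADER_SIZE:HEADER_SIZE + size], n_fds, buf[end:]
```

Because the size field records the unpadded length, the receiver can strip the padding without knowing anything about the payload's contents.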

See the man pages sendmsg(2), recvmsg(2) and cmsg(3) for details about how file descriptors are sent across sockets.

Object-capability protocol
Implemented by cap-protocol.c.

This is layered on top of the message protocol. It allows references to an arbitrary number of objects to be exported across a connection.

Objects can be invoked, passing data, file descriptors and references to other objects as arguments. Object references can be dropped,
allowing the storage associated with the reference to be reclaimed; the storage associated with the object itself can also potentially be reclaimed.

There are two endpoints to a connection. Each may export object references to the other. The protocol is symmetric -- it doesn't
distinguish between client and server. For the sake of explanation, however, let us call the endpoints A and B. Everything below
would still hold true if A and B were swapped.

At any given point, A is exporting a set of object references to B. Each reference has an integer ID. These are typically interpreted as
indexes into an array (the `export table'), so that when A receives an invocation request from B, it can look up the object in the array
and invoke it.

The set of object references that A exports to B may be extended with new references by messages that A sends to B (but not by
messages B sends to A). These object references may be removed by messages that B sends to A (but not by messages A sends to B).

Messages in the protocol contain object IDs, which contain two parts. The lower 8 bits are a `namespace ID'. This indicates whether
the reference is to an object exported by A or by B, and whether a newly-exported reference is single-use. The rest of the object ID is
the reference ID (an index into an export table).
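
The split between namespace ID and reference ID amounts to simple bit packing, sketched below. The numeric values given to the three namespaces are placeholders chosen for the example, not the values used by cap-protocol.c.

```python
NAMESPACE_BITS = 8
NAMESPACE_MASK = (1 << NAMESPACE_BITS) - 1

# Placeholder values: the real constants are not given in this text.
RECEIVER = 0
SENDER = 1
SENDER_SINGLE_USE = 2


def make_object_id(namespace, ref_id):
    """Pack a namespace ID (low 8 bits) and a reference ID (the rest)."""
    if not 0 <= namespace <= NAMESPACE_MASK:
        raise ValueError("namespace ID must fit in 8 bits")
    return (ref_id << NAMESPACE_BITS) | namespace


def split_object_id(object_id):
    """Return (namespace, ref_id) for a packed object ID."""
    return object_id & NAMESPACE_MASK, object_id >> NAMESPACE_BITS
```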

The possible namespace IDs are RECEIVER, SENDER and SENDER_SINGLE_USE.


The messages that A may send to B are:

"Invk" cap/int no_cap_args/int cap_args data + FDs

Invokes an object X. X is denoted by the object ID `cap'. X must be an object that B exports to A, so `cap' may only use the
RECEIVER namespace. (Since A is sending, B is the receiver.)

`cap_args' is an array of object IDs of length `no_cap_args'. These denote objects to be passed as arguments to X. These object
IDs may use any of the three namespace IDs:

For the RECEIVER namespace, this refers to an object that B exports to A.

For the SENDER namespace, this indicates that A has added a new reference to the set of objects it exports to B. From
this point on, B may send A messages referring to this ID (except that B will refer to the object with the RECEIVER
namespace instead of SENDER).

The SENDER_SINGLE_USE namespace works the same as SENDER, except that it indicates to B that the reference is
single use. B may only invoke this object once. Once B invokes the object, the reference becomes invalid. (However, B
may pass the object as an argument without this restriction.)

The message may include file descriptors to pass as arguments to X.

When B receives this message, it invokes X with the specified arguments. If `cap' is a reference that is exported as single-use,
B removes the reference from its export table.

"Drop" cap/int

Drops a reference. `cap' is an object ID that B exports to A, so `cap' may only use the RECEIVER namespace.

When B receives this message, it removes the reference `cap' from its export table. B may also delete the object X that `cap'
denotes if there are no other references to X.
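
The receiving side of "Invk" and "Drop" can be modelled with a small export table. This is a sketch of the bookkeeping only, with objects represented as Python callables; the class and method names are invented, and raising an exception stands in for treating a message as a protocol violation.

```python
class ProtocolViolation(Exception):
    """Raised when a message refers to an ID that is not exported."""


class ExportTable:
    def __init__(self):
        self._entries = {}  # ref_id -> (object, single_use)

    def export(self, obj, single_use=False):
        """Add a reference and return its ID, re-using the lowest free ID."""
        ref_id = 0
        while ref_id in self._entries:
            ref_id += 1
        self._entries[ref_id] = (obj, single_use)
        return ref_id

    def handle_invoke(self, ref_id, *args):
        """Handle an "Invk" message naming ref_id."""
        if ref_id not in self._entries:
            raise ProtocolViolation("invoke of unexported ID %d" % ref_id)
        obj, single_use = self._entries[ref_id]
        if single_use:
            # A single-use reference becomes invalid once invoked.
            del self._entries[ref_id]
        return obj(*args)

    def handle_drop(self, ref_id):
        """Handle a "Drop" message naming ref_id."""
        if ref_id not in self._entries:
            raise ProtocolViolation("drop of unexported ID %d" % ref_id)
        del self._entries[ref_id]
```

Removing a single-use entry at invocation time is what lets the exporter hand the same ID out again without waiting for a "Drop".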

Closing the connection

Violations: If either end receives a message that is illegal, such as messages that contain illegal object IDs, it may choose to terminate
the connection. This would mean closing the file descriptor for the socket. Assuming there are no other copies of this file descriptor
in the system (in this or other processes), the other end will get an error when it tries to read from its socket, and also regard the
connection as broken. Having closed the connection, an endpoint is free to delete its export table, and possibly free the objects it
contained references to.

In general, endpoints are free to close the connection anyway, if they want to.

When no references are exported from B to A or A to B, it is conventional to close the connection, because it is of no use: no
messages can legally be sent on it. Conventionally, A will close the connection rather than sending a "Drop" message for the last
reference that B exports to A, when A exports nothing to B.

Initial state of a newly-created connection
A and B start off holding socket file descriptors connected to each other (typically created by socketpair()). A exports M references to
B; these are given IDs 0 to M-1. B exports N references to A; these are given IDs 0 to N-1.

The numbers M and N must be made known to both A and B by some means outside of the protocol, just as the file descriptors are
obtained by some means outside of the protocol.

If A and B have differing views about what M and N are, one will probably send messages that the other sees as a protocol violation,
and the latter may close the connection.

Of course, M and N and the file descriptors can be sent in invocations using the protocol. See the conn_maker object.

Also see the PLASH_CAPS environment variable.


With most invocations, you want to receive a result (even if it's just an indication of success or failure). In these cases, an object X is
invoked with a message starting with "Call". The first object argument is a `return continuation', C. When it has finished, X invokes
C with arguments containing the results.

What happens if C is never invoked? This might happen if a connection is broken. C will get freed in this case, perhaps as a result of
a "Drop" message, and this can be used to indicate to the caller that the call failed.

What happens if C is invoked more than once? C should simply ignore any invocations after the first one.
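
A minimal Python sketch of a return continuation with this behaviour is given below. The class and method names are assumptions made for illustration and do not come from the Plash source.

```python
class ReturnContinuation:
    """Hypothetical return continuation C for a "Call" invocation."""

    def __init__(self):
        self.done = False
        self.failed = False
        self.result = None

    def invoke(self, *results):
        # Only the first invocation counts; later ones are ignored.
        if self.done:
            return
        self.done = True
        self.result = results

    def drop(self):
        # C was freed (for example because the connection broke) without
        # ever being invoked: tell the caller that the call failed.
        if not self.done:
            self.done = True
            self.failed = True


c = ReturnContinuation()
c.invoke("Okay", 42)
c.invoke("Fail", 13)  # ignored: C has already been invoked

dropped = ReturnContinuation()
dropped.drop()
```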

A return continuation is typically exported as a single-use capability. This is not so much to stop it being invoked more than once
(because subsequent invocations can easily be ignored), but more to prevent the build-up of exported references:

When A repeatedly calls B, B might fail to drop the references to the return continuations that A passes it after invoking them.
This would cause A's export table to fill up with useless references. A could not legally re-use the IDs for these references
according to the protocol. However, if A passes the return continuations to B as single-use references, B cannot legally use
their IDs after invoking them, so A can re-use the IDs and free up space in its export table. (If B does invoke an already-
invoked single-use reference, it is violating the protocol and A might close the connection as a result.)

However, this was not the immediate motivation for adding single-use references to the protocol. More importantly, two pieces of
code, one of which is the dynamic linker, both need to make calls to objects in order to open files, etc. So they both need to
pass return continuations, and allocate IDs for them. If the return continuations' IDs are invalidated after each call, the two
can allocate the IDs without regard to each other. It is much simpler when they don't need to co-ordinate but can
still share the same connection. Each return continuation can be exported with the same ID; these are the only objects exported
from this end of the connection.

The same issue arises when passing control to a new process image using "exec".

Without single-use references, one piece of code might make a call and receive an "Invk" message as a result (but not wait for any
further messages). Then another might make a call, then listen for a result and receive a "Drop" message for a reference it never
exported. The second would treat this as a protocol violation and shut down the connection. With single-use references, the "Drop"
message is unnecessary, because it is implied.
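
The ID re-use that single-use references allow can be sketched as follows. This is a simplified model, not Plash's implementation; `ExportTable` and `ProtocolViolation` are hypothetical names.

```python
class ProtocolViolation(Exception):
    pass


class ExportTable:
    """Hypothetical export table that recycles IDs of single-use references."""

    def __init__(self):
        self.entries = {}    # ID -> (object, single_use flag)
        self.free_ids = []   # IDs that may legally be re-used
        self.next_id = 0

    def export(self, obj, single_use=False):
        if self.free_ids:
            ref_id = self.free_ids.pop()
        else:
            ref_id = self.next_id
            self.next_id += 1
        self.entries[ref_id] = (obj, single_use)
        return ref_id

    def invoke(self, ref_id):
        if ref_id not in self.entries:
            raise ProtocolViolation("no such reference: %d" % ref_id)
        obj, single_use = self.entries[ref_id]
        if single_use:
            # Invoking a single-use reference implies a "Drop",
            # so the ID becomes free immediately.
            del self.entries[ref_id]
            self.free_ids.append(ref_id)
        return obj


table = ExportTable()
ids_used = []
for call_number in range(3):
    cont_id = table.export("return continuation", single_use=True)
    ids_used.append(cont_id)
    table.invoke(cont_id)   # the peer returns a result
```

In this model every return continuation ends up with ID 0, so the table never grows, whichever piece of code on this end of the connection makes the call.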

Future extensions
The protocol does not provide a facility for message pipelining, ie. letting A invoke the result of a call to B before the call returns
(saving the time of a round trip).

Such a facility involves letting A's messages add entries to B's export tables. A would be able to choose IDs for references that B
exports. It would no longer be the case that B allocates all the IDs that it exports.


These environment variables are used to set up the connection and objects for standard services, like access to the filesystem.

PLASH_COMM_FD contains the number of a file descriptor for a connection to a server.

PLASH_CAPS says how many objects are exported by the server over the connection, and what they are. It is a semicolon-separated list
of names for services. The index of a service name in the list is the object ID for the service.

For example, "fs_op;conn_maker;;;something_else" says that conn_maker has object ID 1 and something_else has object ID 4.
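
As a sketch of how a client might interpret this variable, the following Python function maps service names to object IDs. The function name `parse_plash_caps` is an assumption for illustration; only the list format comes from the description above.

```python
def parse_plash_caps(value):
    """Map each service name in a PLASH_CAPS string to its object ID.

    Empty entries still occupy an ID, so they are skipped in the result
    but still counted when numbering.
    """
    return {name: index
            for index, name in enumerate(value.split(";"))
            if name}


caps = parse_plash_caps("fs_op;conn_maker;;;something_else")
```

A real client would read the string from the environment, for example with `os.environ.get("PLASH_CAPS", "")`.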

Standard services are:

return_cont (this is passed by the "capcmd" expression)

Mark Seaborn

RPC methods

fs_op object
This object implements all the standard Unix filesystem calls that operate on pathnames: open(), mkdir(), unlink() and so on. You can
construct one of these objects given a root directory.

This object has one piece of state: the current working directory (cwd). This is allowed to be unset, in which case any operation that
is relative to the cwd will return an error.


The request is given before "=>"; possible replies come after.

"+ FD" indicates that a message includes a file descriptor argument.
"+ foo/obj" indicates that a message includes an object reference.


// duplicate the connection -- called before the fork() syscall

// (now obsolete; will be removed)
"RFrk" + FD

=> "Okay" + fs_op/obj

"Gdir" pathname
=> "Okay" + dir/obj
Resolves `pathname' to get a directory, and returns the directory object.

=> "Okay" + dir/obj
Same as <<"Gdir" "/">>.

"Gobj" pathname
=> "Okay" + obj
Resolves `pathname' to get any object; will follow symlinks.

// open() call
"Open" flags/int mode/int filename
"ROpn" + FD
"RDfd" + FD + dir_stack/obj // This is returned when open() is used on a directory.
// FD is for /dev/null, and the object is a dir_stack.
"Fail" errno/int

// stat() and lstat() calls

"Stat" nofollow/int pathname
"RSta" stat
"Fail" errno/int

// readlink() call
"Rdlk" pathname
"RRdl" string
"Fail" errno/int

// chdir() call
"Chdr" pathname
"Fail" errno/int

// fchdir() call: takes a dir_stack object as returned by open()

"Fchd" + dir_stack/obj
"Fail" errno/int

// getcwd() call
"RCwd" pathname
"Fail" errno/int

// list contents of directories: opendir() + readdir() + closedir()

"Dlst" pathname
// same as `struct dirent' format:
"RDls" (inode/int type/int name_size/int name)*
"Fail" errno/int

// access() call
"Accs" mode/int pathname
"Fail" errno/int

// mkdir()
"Mkdr" mode/int pathname
"Fail" errno/int

// chmod() call
"Chmd" mode/int pathname
"Fail" errno/int

// utime()/utimes()/lutimes() calls
"Utim" nofollow/int
atime_sec/int atime_usec/int
mtime_sec/int mtime_usec/int
"Fail" errno/int

// rename() call
"Renm" newpath-length/int newpath oldpath
"Fail" errno/int

// link() call
"Link" newpath-length/int newpath oldpath
"Fail" errno/int

// symlink() call
"Syml" newpath-length/int newpath oldpath
"Fail" errno/int

// unlink() call
"Unlk" pathname
"Fail" errno/int

// rmdir() call
"Rmdr" pathname
"Fail" errno/int

// connect() on Unix domain sockets

"Fcon" pathname + FD
"Fail" errno/int

// bind() on Unix domain sockets

"Fbnd" pathname + FD
"Fail" errno/int

// part of execve() call

// Arguments are:
// * command pathname (in "cmd" and "cmd-len" below)
// * a list of string arguments (in "ref" and "data")
// The RExe result tells the client what it should pass to the exec syscall.
// The RExo result returns an executable object which the client must invoke
// with full arguments, including the root directory.
"Exec" cmd-len/int cmd ref/int data
"RExe" cmd-len/int cmd argc/int (arg-len/int arg)* + FD
"RExo" + CAP
"Fail" errno/int


stat = dev ino mode nlink uid gid rdev size blksize blocks atime mtime ctime
(all ints)

The page "Filesystem objects: files, directories and symlinks" contains the definitions for these methods.

Executable objects
Executable objects are like files. They respond to the "fsobj_stat" method. You generally can't open them with the "file_open"
method -- this will give "Permission denied". They support two methods besides the usual file methods.

// Test whether this is an executable object.

// Executables that are just files will not respond to this.
=> "Okay"

"Exeo" ref/int data

=> "Okay" return_code/int
The data is an array of pairs:
* ("Argv", x): x is an array of strings representing argv
* ("Env.", x): x is an array of strings representing the environment
(usually each string is of the form "X=y")
* ("Fds.", x): x is an array of (i, FD)
* ("Root", obj): obj is the root directory
* ("Cwd.", string): pathname of current working directory
(this can be omitted, in which case process will have no defined cwd)
* ("Pgid", int): process group ID to set for the new process
(this is optional, but reading from the console won't work without
setting it, and neither will Ctrl-C or Ctrl-Z)
The invocation returns when the process started has exited. It returns
the exit code that `wait' returns for the process.
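
The argument array for "Exeo" could be assembled along the following lines. This is a hedged Python sketch: the helper name `make_exec_args` is hypothetical, the wire encoding of the pairs is not shown, and treating "Root" as always present is an assumption.

```python
def make_exec_args(argv, env, fds, root, cwd=None, pgid=None):
    """Build the array of (tag, value) pairs listed above.

    fds maps a file descriptor number in the new process to the FD
    that should be installed there.
    """
    args = [
        ("Argv", list(argv)),
        ("Env.", ["%s=%s" % (name, value) for name, value in env.items()]),
        ("Fds.", sorted(fds.items())),
        ("Root", root),
    ]
    # "Cwd." and "Pgid" are optional and are simply left out when unset.
    if cwd is not None:
        args.append(("Cwd.", cwd))
    if pgid is not None:
        args.append(("Pgid", pgid))
    return args


exec_args = make_exec_args(
    argv=["ls", "-l"],
    env={"PATH": "/bin"},
    fds={0: 10, 1: 11},
    root="root directory object",
)
exec_args_with_cwd = make_exec_args(
    ["ls"], {}, {}, "root directory object", cwd="/home/fred")
```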

conn_maker object
This has one method:

"Mkco" M/int + (N objects)

=> "Okay" + FD + (M objects)

This creates a new connection on which the N objects are exported. It returns "Okay" and a file descriptor for the new connection.
The new connection also imports M objects. The method call returns these M objects.

So far this is only used with M = 0.

fs_op_maker object
This has one method:

"Mkfs" + root_dir/obj
=> "Okay" + fs_op/obj

This creates an fs_op object (see above) with root_dir as the root directory. The current working directory is initially unset; you can
set it with the "Chdr" (chdir) method.

News


Version 1.17 (2006-12-23)

New facilities:

Python bindings for the Plash object interface. A Python implementation of pola-run is available in the source package.
cow_dir: Provides layered directories, with which reads come from one directory tree while writes go to another. Eventually
this will provide a copy-on-write directory facility (currently it does not allow writing to files that are in the read layer at all).
This is usable via the Python bindings but not through pola-run (see python/examples/


Now looks up executable names in PATH, unless --no-search-path (a new option) is given.
Security bug fix: Ensure that "-t" grants read-only access when "w" flag is not given. (Previously, "-t" would always grant read-
write access, ignoring the "w" flag.)

New option: "-e". This gives a way to specify the executable name and its arguments without prefixing each one with "--prog"
and "-a". It is more in line with the interfaces of other Unix commands that invoke executables, such as "chroot" and "xterm".

Usage: -e <executable-name> <arg>...

Equivalent to: --prog <executable-name> -a <arg1> -a <arg2> ...
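
The equivalence can be sketched as a small rewriting function in Python. This is an illustration only, not pola-run's actual argument parser; `expand_e_option` is a hypothetical name, and the sketch assumes that "-e" never appears as the value of an earlier option.

```python
def expand_e_option(args):
    """Rewrite '-e prog arg...' into the '--prog prog -a arg ...' form."""
    if "-e" not in args:
        return list(args)
    index = args.index("-e")
    rest = args[index + 1:]
    if not rest:
        raise ValueError("-e requires an executable name")
    expanded = ["--prog", rest[0]]
    for arg in rest[1:]:
        # Everything after the executable name becomes a program argument.
        expanded += ["-a", arg]
    return list(args[:index]) + expanded


expanded = expand_e_option(
    ["--log-file", "out.log", "-e", "gcc", "-c", "foo.c"])
```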

"-e" swallows the remaining arguments, so it must appear last in the argument list.

Improvements to logging facilities. New option: "--log-file <file>", sends log output to a file. The logging format has been
changed slightly: two characters summarise whether the operation was a read or a write, and whether it succeeded or failed.


Now uses glibc 2.3.6 for Debian (rather than glibc 2.3.5).
Supports building glibc 2.4 and 2.5. glibc 2.4 is used for the Ubuntu package. This is only partial support -- the new *at()
functions are not properly implemented yet.

glibc functions:

lchmod() now implemented.

Add partial implementation of chown()/lchown(): succeeds when no owner/group change is requested.
getsockopt() is now intercepted so that UID/GIDs can be faked for the SO_PEERCRED case.
Fixed getcwd() to pass glibc's io/tst-getcwd test case. getcwd(NULL, size) when size>0 now returns an error if size is not large
enough.'s close() function is now intercepted correctly.


Will now set the current working directory where possible instead of leaving it undefined.


Branches of the packaging scripts are included for building under Debian sarge (excluding Python and Gtk support), and
Ubuntu edgy (using glibc 2.4 instead of 2.3.6).
The Plash source package no longer includes the glibc source. Instead, glibc source tarballs are in a separate binary package
which puts them under /usr/src. Plash Build-Depends on the glibc source package.

Version 1.16
The powerbox/Gtk integration code has been rewritten so that the replacement GtkFileChooserDialog class inherits from GtkDialog
(and hence from GtkWindow, GtkWidget, etc.). This works much better than the previous approach. It works with more Gtk

Version 1.11

The major new feature in this version is the "plash-run-emacs" program. This lets you start an XEmacs process and then grant it
access to individual files and directories, as you need to edit them.

You can start XEmacs from the Plash shell with the following commands:

plash-opts /x=options 'enable_x11' 'on'

def edit_file = capcmd plash-run-emacs

Then edit the file "foo.txt" with:

edit_file => foo.txt &

This works like gnuserv (in fact, it calls some of gnuserv's Elisp code). It grants access to foo.txt to plash-run-emacs, which adds it to
XEmacs' file namespace. Then it asks XEmacs to open a window to edit the file.

"edit_file" is a shell variable which is bound to an executable object. I have introduced two tools for exporting Plash object
references to other instances of the shell. In the shell where you bound the "edit_file" variable, do:

plash-socket-publish => /tmp/emacs /x=edit_file

Then you can use the following command in another instance of the shell to make the object available there:

def edit_file = capcmd plash-socket-connect => /tmp/emacs

(You can use plash's "--rcfile" switch to execute this on startup.)

This only works with XEmacs at present, not GNU Emacs. GNU Emacs has problems running under the Plash environment. It
doesn't like being started using "/lib/ /usr/bin/emacs": it fails with a "Memory exhausted" error. This needs more

This functionality is fairly awkward. One major improvement will be to implement a "powerbox". XEmacs would be able to request
a "File Open" dialogue box, through which the user would grant it access to files.

Running the shell as root

It's now safer to run the Plash shell as root.

Before, the default installation endowment included "/dev/null" and "/dev/tty" as read/write slots. A malicious program could delete
or replace "/dev/null" and "/dev/tty" if the shell had that authority. Now they are attached as files, not slots. Programs are not passed
on the authority to delete them or create objects in their place.

However, it's not yet completely safe to set the "enable_x11" option when running the shell as root. In this case, the shell grants read-
write access to the "/tmp/.X11-unix" directory.

Following symlinks

Suppose "link.txt" is a symbolic link to "file.txt". If you run the command:

cat link.txt

then the shell follows the symlink and includes both "link.txt" and "file.txt", as read-only, in cat's file namespace.

Previously, if you did "cat => link.txt", the shell would grant read/write/create access to the "link.txt" slot, but it would not follow the
symlink. This part was not fully implemented. Now it is: The shell will follow the symlink and grant read/write/create access to the
"file.txt" slot.

This was necessary for making the "edit_file" command follow symlinks.

However, I've come to realise that having the shell follow symlinks is more dangerous than I originally thought. A command that is
run multiple times with the same arguments, and granted read/write/create access to a slot, could get the shell to give it write access
to the root directory. The shell's security would be rendered useless.

Part of this problem is inherent in symbolic links: they store a string, not an object reference.

We could fix this by providing alternatives to symlinks, such as hard links that work with directories and across partitions. An ideal
solution would involve persistence. But this would be difficult and complex to do under Unix. It would be hard to integrate with
existing filesystems.

Furthermore, it doesn't address the problem of how to deal with symlinks that exist on your system already. A simpler but not ideal
solution would be for the shell to indicate when an argument is a symlink, and to provide a quick way of replacing or augmenting it
with the object it points to. This at least provides some form of review. For this to be usable, the shell will have to use GUI features.

Documentation overhaul
I have mostly converted the documentation to DocBook format, including the man pages (which were in POD format before). I
couldn't face writing XML by hand, so I have created an alternative surface syntax for XML.

However, the documentation still needs work.

Other changes

Fixed a bug when using the shell's non-interactive mode (its "-c" option).

Version 1.10
New in this version is an implementation of fchdir().

There are a number of programs that need fchdir(), including "rm -r", "install" and "mkdir -p".

fchdir() sets the process's current directory given a file descriptor for a directory.

Usually, under Plash, the open() function will return a real, kernel-level file descriptor for a file. The file server passes the client this
file descriptor across a socket. But it's not safe to do this with kernel-level directory file descriptors, because if the client obtained one
of these it could use it to break out of its chroot jail (using the kernel-level fchdir system call).

So, for directories, the file server's open() method returns a dir_stack object, which is implemented by the file server rather than by
the kernel. Under Plash, libc's open() function returns a kernel-level file descriptor for the device /dev/null (a "dummy" file
descriptor), but it stores the dir_stack object in a table maintained by libc. Plash's fchdir() function in libc consults this table; it can
only work if there is an entry for the given file descriptor number in the table.

Creating a "dummy" kernel-level file descriptor ensures that the file descriptor number stays allocated from the kernel's point of view,
and it ensures that passing the file descriptor to functions such as select() or write(), which aren't useful for directory file descriptors,
gives an appropriate error rather than EBADF.

Plash's dir_stack objects are a bit different from its directory objects. Under Plash, a directory object doesn't know what its parent
directory is -- multiple directories can contain the same object. This property is important because processes have their own private
namespaces. Plash implements the ".." component of filenames using dir_stacks. A dir_stack is a list of directory objects
corresponding to the components of a directory pathname. For example, the dir_stack for the pathname "/a/b" would contain the
directory object for "/a/b" at the head, then the directory for "/a", then the root directory. It also contains the names "b" and "a"; this is
used to implement getcwd().
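
A dir_stack can be modelled in Python as a linked list of (directory, name) entries. This is a simplified sketch: plain dictionaries stand in for directory objects, and `DirStack`, `walk` and `getcwd` are names chosen for this example rather than taken from the Plash source.

```python
from typing import NamedTuple, Optional


class DirStack(NamedTuple):
    dir_obj: dict                        # directory at the head of the stack
    name: Optional[str] = None           # name it was reached by (None for root)
    parent: Optional["DirStack"] = None  # rest of the stack


def walk(stack, component):
    """Follow one pathname component from the given dir_stack."""
    if component in ("", "."):
        return stack
    if component == "..":
        # ".." pops the stack; the root is its own parent.
        return stack.parent if stack.parent is not None else stack
    return DirStack(stack.dir_obj[component], component, stack)


def getcwd(stack):
    names = []
    while stack.parent is not None:
        names.append(stack.name)
        stack = stack.parent
    return "/" + "/".join(reversed(names))


# The same directory object appears at both /a/b and /c.
shared = {}
root = DirStack({"a": {"b": shared}, "c": shared})
via_a = walk(walk(root, "a"), "b")
via_c = walk(root, "c")
```

Because the parent is recorded in the stack rather than in the directory object, ".." from `via_a` leads to "/a" while ".." from `via_c` leads to "/", even though both stacks have the same directory at their head.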

This approach means that doing:


has no effect (provided that the first call succeeds). This contrasts with the usual Unix semantics, where the "leafname" directory
could be moved between the two calls, giving it a different parent directory. This is partly why programs like "rm" use fchdir() -- to
avoid this problem.

Note that dup(), dup2() and fcntl()/F_DUPFD will not copy directory file descriptors properly under Plash; only the kernel-level part
is copied because Plash does not intercept these calls. Similarly, directory file descriptors will not be preserved across execve() calls.
This is unlikely to be a problem in practice. It could be fixed if necessary.

Version 1.9
In this version, I have changed the implementation of how file namespaces are constructed for processes.

When the shell processes a command, it constructs a tree representing the filesystem namespace, indicating where to attach
filesystem objects (files, directories and symlinks) in the namespace. For example, the command:

some-command /home/fred/my-file.txt /bin/new-prog=EXPR

would produce a tree like the following:

* etc: ATTACH
* usr: ATTACH
* lib: ATTACH
* bin: ATTACH
  * new-prog: ATTACH
* home
  * fred
    * my-file.txt: ATTACH

Each node in the tree is a "struct node".

At the paths "/usr", "/lib", etc. are attached real directory objects that correspond to directories on a Linux filesystem.

The tree nodes for "/" and "/home", however, do not correspond to any directory on a Linux filesystem. The shell traverses this tree,
and for these tree nodes, it creates "fabricated" directory objects that are implemented by a server process. This is implemented in
build-fs.c. Fabricated directories are implemented in filesys-obj-fab.c.

In the old version of the code, the information in a "struct node" was copied to create a fabricated directory.

Also, when it reached a node that had an object attached to it, the code would not look at any other nodes attached below the node.
So, in the example above, "/bin/new-prog" would be ignored because a directory was attached at "/bin". "/bin/new-prog" would not
be visible in the filesystem. The code did not have a way of combining the "new-prog" entry with the contents of "/bin".

In the new version of the code, the information in a "struct node" is not copied. There is a new kind of fabricated directory object (see
build-fs-dynamic.c) which has a pointer to a "struct node". This means that the tree nodes can be modified, and the changes will be
visible to processes using this directory structure.

Furthermore, the new code allows objects to be attached below directories that are attached to the tree. The new fabricated directory
objects can combine directory listings so that "/bin/new-prog" will be visible in the example above (as well as other entries in the
directory attached at the path "/bin"). This is similar to union directories, but the semantics are slightly different.
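
The combining of directory listings can be illustrated with a small Python model. This is a sketch, not the code in build-fs-dynamic.c: `Node`, `attach` and `list_dir` are hypothetical names, and a dictionary stands in for a real directory object.

```python
class Node:
    """Hypothetical stand-in for "struct node"."""

    def __init__(self):
        self.attached = None   # object attached at this path, if any
        self.children = {}     # name -> Node for entries attached below


def attach(root, path, obj):
    node = root
    for name in path.strip("/").split("/"):
        node = node.children.setdefault(name, Node())
    node.attached = obj


def list_dir(node):
    """List a fabricated directory by reading the tree node directly."""
    names = set(node.children)
    if isinstance(node.attached, dict):
        # Combine with the entries of the directory attached here.
        names |= set(node.attached)
    return sorted(names)


root = Node()
attach(root, "/bin", {"ls": "file", "cat": "file"})
attach(root, "/bin/new-prog", "EXPR")
attach(root, "/home/fred/my-file.txt", "file")
bin_listing = list_dir(root.children["bin"])

# The tree is not copied, so later changes are visible immediately.
attach(root, "/tmp", {})
root_listing = list_dir(root)
```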

This change means that two things are immediately possible:

When you run a program you can grant it read-only access to a directory, but read-write-create access to an object inside that
directory. (This means that the caveat mentioned in the note for version 1.5 no longer applies.)

For example, previously this command would not work as expected:

gcc -c foo.c => -o foo.o + .

It would fail to grant write access to foo.o. Now it does so.

Using the "PATH=EXPR" syntax, you can add entries to or replace entries in directories, without changing the directory,
including those that are implicitly included in a process's file namespace, such as "/bin" and "/usr". (This means that the caveat
in the note for version 1.6 no longer applies.)

This change is an important step for a couple of features that are planned for the future:

Implementing a "powerbox" for GUI programs. The user could, over the lifetime of a process, grant it access to files in
addition to the ones it was granted when it was created. These files will be attached into the filesystem by modifying the "struct
At the moment, Plash doesn't grant access to "/tmp" by default. But it could grant every program access to its own private
temporary directory, mapped into the file namespace as "/tmp". Below this we'll need to attach "/tmp/.X11-unix/X0" -- the
socket that X11 clients use to connect to the X server.

This facility is similar to mount points in Linux and Plan 9 (mount points are system-wide in Linux but per-process in Plan 9).
However, it has slightly different semantics. In Linux, mount points are implemented on the basis of the identity of directories, using
their inode numbers; one directory is redirected to another directory. In Plash, attaching objects works on the basis of pathnames, not
directories' identities.

In both Linux and Plan 9, a directory must exist before you can mount another directory at that point to replace it. This is not the case
in Plash. When you attach an object at "/bin", it adds a "bin" entry to the root directory. When you then attach an object at "/bin/foo",
the directory at "/bin" will be unioned with a "foo" entry. Mount points are limited to directories, while Plash allows you to attach
files, symlinks and other objects too.

Other changes
In libc, the functions set{,e,re,res}{u,g}id() have been made into no-ops, which always return indicating success.

This is to deal with programs such as mkisofs and GNU make which make pointless calls to setreuid32(). mkisofs and make call
setreuid32() with the current UID. Ordinarily this should succeed and do nothing. But Plash's libc fakes the current UID: it has
getuid() return the shell's UID (stored in the environment variable PLASH_FAKE_UID) rather than the process's UID. Without the
change, mkisofs and make's calls to setreuid32() would fail and they would exit.

The reason for faking the UID was to get gnuclient to work -- gnuserv uses the user's UID in the filename of the socket it creates in
/tmp. But maybe this was not worth it. Either way, UID-related functions in libc aren't useful under Plash and can be turned into no-
ops. Ideally, they should be logged.
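
The interaction between the faked UID and the no-op functions can be sketched in Python as follows. This models the behaviour described above rather than Plash's libc code; the fallback to the real UID when PLASH_FAKE_UID is unset is an assumption.

```python
import os


def getuid(environ=os.environ):
    """Return the faked UID if PLASH_FAKE_UID is set, else the real one."""
    fake = environ.get("PLASH_FAKE_UID")
    return int(fake) if fake is not None else os.getuid()


def setreuid(ruid, euid):
    """No-op replacement: always reports success."""
    return 0


# A program that "resets" its UID to the current one now succeeds,
# even though the UID it was told about is not the process's real UID.
env = {"PLASH_FAKE_UID": "1000"}
uid = getuid(env)
status = setreuid(uid, uid)
```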

Version 1.8
New build system

You can now build glibc for Plash automatically. (Previously, building glibc involved manually intervening in the build process.)

Syntax change

I have swapped the precedences of the "+" and "=>" argument list operators in the shell. "=>" now binds more tightly than "+". This
means that:

command a => b + c => d

means the same as:

command { a => b } + { c => d }

Enabling X11 access

The shell now has an option for automatically granting programs access to the X11 Window System. When this is switched on, a
command such as:

xpdf foo.pdf

is equivalent to:

xpdf foo.pdf + ~/.Xauthority => /tmp/.X11-unix

This option is switched off by default because X11 is not secure! X servers provide no isolation between the clients connected to
them. One client can read keyboard input that goes to other clients, grab the screen's contents, and feed events to other clients,
including keypress events. So potentially, an X client can gain the full authority of the user.

The solution to this will be to write a proxy, through which an X client will connect to the X server, which will prevent it from
interfering with other X clients.

How to switch on this option (short version):

Either: From the shell, enter:

plash-opts /x=options 'enable_x11' 'on'

Or: To enable it for all shell sessions, you can create a file "~/.plashrc" containing this (note the semicolon):

plash-opts /x=options 'enable_x11' 'on';

and start the Plash shell with the command:

plash --rcfile ~/.plashrc

(In order to make it as predictable as possible, Plash doesn't read any options files by default, so you have to specify options files

Shell options

I have removed the "opts" command from the shell, which used to open an options window using Gtk. There is now an external
program which does the same thing, which you can run from the shell (so the shell is no longer linked with Gtk). You can run this
program with the command:

plash-opts-gtk /x=options

The shell creates an object -- which it binds to the "options" variable -- for setting and getting options.

Support for directory file descriptors

Plash now has partial support for using open() on directories. XEmacs can now run under Plash. XEmacs will just open() a directory
and then close() the file descriptor it got, and this is all Plash supports at the moment.

A complete solution would involve virtualising file descriptors, so that every libc call involving file descriptors is intercepted and
replaced. This would be a lot of work, because there are quite a few FD-related calls. It raises some tricky questions, such as what
bits of code use real kernel FDs and which use virtualised FDs. It might impact performance. And it's potentially dangerous: if the
changes to libc failed to replace one FD-related call, it could lead to the wrong file descriptors being used in some operation, because
in this case a virtual FD number would be treated as a real, kernel FD number. (There is no similar danger with virtualising the
system calls that use the file namespace, because the use of chroot() means that the process's kernel file namespace is almost entirely

However, a complete solution is complete overkill. There are probably no programs that pass a directory file descriptor to select(),
and no programs that expect to keep a directory file descriptor across a call to execve() or in the child process after fork().

So I will be using a partial but safe solution: When Plash's libc needs to return a directory file descriptor to the main program, it does
open("/dev/null") and returns a real file descriptor. This has the effect of allocating a FD number (so the kernel can't re-use the slot),
and it provides a FD that can be used in any context where an FD can be used, without any harmful effects -- at least, as far as I

If a program uses fchdir() or getdents() on the resulting FD, it will just get an error in the current version of Plash. If I want to
implement these calls in the future, it will just be a matter of having open() record in a table any directory FDs that it returns; fchdir()
and getdents() can do a lookup in this table. dup(), dup2(), close() and fcntl() (with F_DUPFD) will have to be changed to keep this
table up-to-date. Maybe execve() should give a warning if there are FDs that won't be kept in the new process. Frequently-called
functions like read() and write() will not have to be changed.

Version 1.7
This version adds a major new feature, executable objects. See NOTES.exec.

Version 1.6
The shell now lets you start processes with existing files and directories attached to arbitrary points in the filesystem tree. For example:

gcc -c /arg/foo.c=(F bar.c) => -o out.o

The directory `/arg' does not need to exist in the real filesystem. It will be created in the fabricated filesystem that `gcc' receives.

The general form of this new kind of argument is "PATHNAME = EXPR", where the pathname may be relative to the root directory
or the current directory. At present, the only kind of expression is "F PATHNAME", which returns the file or directory object at that
pathname (following symlinks if necessary).

The command also receives the pathname being assigned to ("/arg/foo.c" in the example) as an argv argument, unless the argument
occurs to the right of a "+" operator. For example, you can give a process a different /tmp directory using:

blah + /tmp=(F ~/new-tmp)

The difference between writing

blah a/b/c


and

blah a/b/c=(F a/b/c)

is that if any of the components of the path `a/b/c' are symbolic links, in the first case the constructed filesystem will include those
symbolic links and the objects they point to, whereas in the second case, `a', `a/b' and `a/b/c' will appear as directories and files.

The `=' argument syntax does not force the object being attached to be read-only, even if the argument appears to the left of `=>'. A
future extension will be to let you write "(read_only(F file))" as an expression.

This only lets you attach existing files. A future extension will be to let you write "path $= (S file)", where the "S" expression returns
a slot object, and "$=" attaches a slot to the filesystem. (Slots represent a location in which a file, directory or symlink may be created
or deleted.)

One caveat is that if you do

blah + /a/b=EXPR1 /a=EXPR2

the binding for `/a/b' does not appear; it is overridden by `/a'. The directories `/bin', `/usr', `/etc' and `/lib' are implicitly attached to the
filesystem that is constructed, so this means you can't yet attach new objects within these directories.

Version 1.5
Recursive read-only objects are now implemented, and the shell will pass objects as read-only by default. There is one caveat to this.
If you enter a command like this:

blah a => a/b

then `blah' will get read-only access to `a' but it won't get writable access to `a/b'. Fixing this requires a new kind of proxy object
which I'll implement in a later version.

It's now possible for a process to use the object-capability protocol that I introduced in the previous version to create a restricted
environment to run a child process in. As an example, there's a "chroot" program. It basically asks the server to return a reference to
the directory it wants to chroot into, given a pathname for it. Then it creates a new fs_op object (which resides in the server process)
for handling filesystem requests, using that directory as the root, and replaces its existing fs_op object with that one.

Normally, use of "chroot" is restricted to root under Unix, because it's dangerous in the presence of setuid executables. (You can hard
link a setuid executable into your chroot directory and replace the libraries it uses with your own code.) But Plash doesn't provide
setuid executables, so it's safe. Another mechanism will be provided instead of setuid.

Internals


Region-based memory management

See region.h

A lot of the memory management in Plash is done using regions. Regions work like this:

1. You create a region. This allocates a medium-sized block (eg. 1k) initially.
2. You allocate blocks from the region.
3. You free the whole region. This frees all the blocks that were allocated from it.

Regions don't provide a way to free individual blocks in the region. This means that allocation is fast, because there's no
fragmentation within a region. Deallocation is fast too, because all the blocks are freed in one step.

The main reason for using regions in Plash is convenience. Deallocation becomes much less of a burden, and is easier to get right, so
the chances of memory leaks are reduced. If a complex structure is region-allocated, you don't need to traverse the structure to free
each node individually.

Using regions works well when allocation and deallocation follow the structure of the function call tree: when a structure is not used
outside of the function that allocates it or the parent of the function that allocates it. When this is not the case, and the amounts of
storage allocated are large, regions are not so good.

Plash uses reference counting for storage management when regions are not appropriate.

It is possible to attach explicit finalisers to a region, so that when the region is freed, other resources are freed too, eg. file
descriptors can be closed.
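
The three steps above, plus finalisers, can be sketched as a small bump allocator. This is only an illustration: the ex_ names, the chunk layout and the 1k chunk size are assumptions made for the example and are not the API declared in region.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define EX_CHUNK_SIZE 1024  /* assumed "medium-sized" initial block */

struct ex_chunk { struct ex_chunk *next; unsigned char *data; size_t used, size; };
struct ex_finaliser { struct ex_finaliser *next; void (*fn)(void *); void *arg; };
struct ex_region { struct ex_chunk *chunks; struct ex_finaliser *finalisers; };

static struct ex_chunk *ex_chunk_make(size_t size) {
  struct ex_chunk *c = malloc(sizeof *c);
  if (!c) return NULL;
  c->data = malloc(size);
  if (!c->data) { free(c); return NULL; }
  c->next = NULL; c->used = 0; c->size = size;
  return c;
}

/* Step 1: create a region with one initial chunk. */
struct ex_region *ex_region_make(void) {
  struct ex_region *r = malloc(sizeof *r);
  if (!r) return NULL;
  r->chunks = ex_chunk_make(EX_CHUNK_SIZE);
  if (!r->chunks) { free(r); return NULL; }
  r->finalisers = NULL;
  return r;
}

/* Step 2: allocate by bumping an offset; there is no per-block free. */
void *ex_region_alloc(struct ex_region *r, size_t size) {
  const size_t align = _Alignof(max_align_t);
  assert(r != NULL);
  if (size > SIZE_MAX - align) return NULL;
  size = (size + align - 1) / align * align;  /* keep later blocks aligned */
  struct ex_chunk *c = r->chunks;
  if (c->size - c->used < size) {             /* head chunk is full: add one */
    struct ex_chunk *fresh = ex_chunk_make(size > EX_CHUNK_SIZE ? size : EX_CHUNK_SIZE);
    if (!fresh) return NULL;
    fresh->next = c; r->chunks = fresh; c = fresh;
  }
  void *p = c->data + c->used;
  c->used += size;
  return p;
}

/* Register fn(arg) to run when the region is freed. */
int ex_region_add_finaliser(struct ex_region *r, void (*fn)(void *), void *arg) {
  struct ex_finaliser *f = ex_region_alloc(r, sizeof *f);
  if (!f) return -1;
  f->fn = fn; f->arg = arg; f->next = r->finalisers; r->finalisers = f;
  return 0;
}

/* Step 3: run finalisers, then release every chunk in one pass. */
void ex_region_free(struct ex_region *r) {
  for (struct ex_finaliser *f = r->finalisers; f; f = f->next) f->fn(f->arg);
  for (struct ex_chunk *c = r->chunks; c;) {
    struct ex_chunk *next = c->next;
    free(c->data); free(c);
    c = next;
  }
  free(r);
}

/* Example finaliser: increments the int that arg points to. */
void ex_bump_counter(void *arg) { ++*(int *)arg; }
```

A caller would typically create the region at the top of a function, allocate freely, and call ex_region_free() once on the way out, which is why this style suits allocation that follows the call tree.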

String handling
See region.h

Plash has some functions for handling strings of bytes, which are used for text and binary data. There are facilities for constructing
strings, and decomposing strings.

The seqf_t type ("flat sequence") is a struct which represents a string stored contiguously in memory.

The seqt_t type ("tree sequence") represents a string that need not be stored contiguously. It is a tree structure containing strings to be
concatenated together.

You can concatenate seqt_ts using the functions cat2, cat3, cat4, etc. These all use a region to allocate nodes.

The flatten function turns a seqt_t into a seqf_t.
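
To make the flat/tree distinction concrete, here is a minimal sketch. The struct layouts and the ex_ function names are assumptions for illustration, and the nodes are allocated with malloc() and never freed, whereas Plash's cat2 and friends allocate from a region.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for seqf_t: bytes stored contiguously. */
struct ex_seqf { const char *data; size_t size; };

/* Hypothetical stand-in for seqt_t: a leaf, or two subtrees to concatenate. */
struct ex_seqt {
  int is_leaf;
  size_t size;                         /* total bytes in this subtree */
  struct ex_seqf leaf;                 /* valid when is_leaf */
  const struct ex_seqt *left, *right;  /* valid when !is_leaf */
};

struct ex_seqf ex_str(const char *s) {
  struct ex_seqf f = { s, strlen(s) };
  return f;
}

const struct ex_seqt *ex_leaf(struct ex_seqf s) {
  struct ex_seqt *t = malloc(sizeof *t);
  if (!t) abort();
  t->is_leaf = 1; t->size = s.size; t->leaf = s;
  t->left = t->right = NULL;
  return t;
}

/* Concatenation is O(1): it only builds a node, no bytes are copied. */
const struct ex_seqt *ex_cat2(const struct ex_seqt *a, const struct ex_seqt *b) {
  assert(a != NULL && b != NULL);
  struct ex_seqt *t = malloc(sizeof *t);
  if (!t) abort();
  t->is_leaf = 0; t->size = a->size + b->size;
  t->leaf.data = NULL; t->leaf.size = 0;
  t->left = a; t->right = b;
  return t;
}

/* Copy the leaves left to right; returns the position after the last byte. */
static char *ex_copy(const struct ex_seqt *t, char *out) {
  if (t->is_leaf) {
    memcpy(out, t->leaf.data, t->leaf.size);
    return out + t->leaf.size;
  }
  return ex_copy(t->right, ex_copy(t->left, out));
}

/* Flattening is the one place where the bytes are actually copied. */
struct ex_seqf ex_flatten(const struct ex_seqt *t) {
  assert(t != NULL);
  char *buf = malloc(t->size + 1);
  if (!buf) abort();
  *ex_copy(t, buf) = '\0';  /* terminator for convenience; not counted in size */
  struct ex_seqf f = { buf, t->size };
  return f;
}
```

The design point is that building a message from many pieces costs one small node per concatenation, and the single copy happens only when a contiguous buffer is needed.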

Object system
See filesysobj.h

Originally the object system was just used for file, directory and symlink objects, but I extended it so that references to objects can be
exported to other processes using Plash's object-capability protocol. There are other kinds of object now.

An object reference has type "struct filesys_obj *". "cap_t" is an alias for this.

There are a number of methods defined. Every object supports all the methods. That is, it is valid to call a method on any object, even
if the method isn't relevant to that object. If the method isn't relevant, it will return an error code (for those methods that can return
error codes).

Every object contains a pointer to a vtable, which contains function pointers implementing the methods. This design is simple, but it
means all the vtables need to be recompiled when we add new methods. Since the vtables are sparse (ie. most methods aren't relevant
to a given object), we have a program, "", which generates the C code for constructing the vtables.
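
As a rough sketch of this design, the following shows two object kinds sharing one vtable type, with irrelevant methods filled in by stubs that return an error. The method names, the stub functions and the choice of -ENOSYS are assumptions for the example; they are not taken from filesysobj.h.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct ex_obj;

/* Every object kind fills in every slot, relevant or not. */
struct ex_vtable {
  int (*read_link)(struct ex_obj *obj, const char **dest);
  int (*count_entries)(struct ex_obj *obj, int *count);
};

struct ex_obj { const struct ex_vtable *vtable; };

/* Stubs used for methods that are not relevant to an object kind. */
static int ex_read_link_unsupported(struct ex_obj *obj, const char **dest) {
  (void)obj; (void)dest;
  return -ENOSYS;
}
static int ex_count_unsupported(struct ex_obj *obj, int *count) {
  (void)obj; (void)count;
  return -ENOSYS;
}

/* A symlink object: only read_link is meaningful. */
struct ex_symlink { struct ex_obj base; const char *dest; };
static int ex_symlink_read_link(struct ex_obj *obj, const char **dest) {
  *dest = ((struct ex_symlink *)obj)->dest;
  return 0;
}
const struct ex_vtable ex_symlink_vtable = { ex_symlink_read_link, ex_count_unsupported };

/* A directory object: only count_entries is meaningful. */
struct ex_dir { struct ex_obj base; int entries; };
static int ex_dir_count(struct ex_obj *obj, int *count) {
  *count = ((struct ex_dir *)obj)->entries;
  return 0;
}
const struct ex_vtable ex_dir_vtable = { ex_read_link_unsupported, ex_dir_count };

/* Callers dispatch through the vtable without knowing the object kind. */
int ex_read_link(struct ex_obj *obj, const char **dest) {
  assert(obj != NULL && obj->vtable != NULL);
  return obj->vtable->read_link(obj, dest);
}
int ex_count_entries(struct ex_obj *obj, int *count) {
  assert(obj != NULL && obj->vtable != NULL);
  return obj->vtable->count_entries(obj, count);
}
```

Adding a third method here would mean touching both vtable initialisers, which is the maintenance cost that generating the vtables is meant to remove.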

Reference counting
All objects have a reference count. To increment the count, use "inc_ref(obj)". To decrement the count, use "filesys_obj_free(obj)"
(this will free the object when the count hits zero). Some references are owning references -- you are supposed to free them. Some
references are non-owning -- they are "borrowed" from the caller.

The choice of whether a function argument is an owning or a non-owning reference is based on a trade-off between convenience and
minimising the lifetime of objects. You have to look at the comments for a function to see whether its arguments are passed as
owning or non-owning.
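
The owning/borrowed distinction can be illustrated with a small sketch. The types and function names below are hypothetical stand-ins, not Plash's inc_ref() and filesys_obj_free(); the "destroyed" flag exists only so the example can observe when the object is released.

```c
#include <assert.h>
#include <stdlib.h>

struct ex_counted {
  int refcount;
  int *destroyed;  /* set to 1 when the object is finally released */
};

/* Returns an owning reference: the caller must eventually free it. */
struct ex_counted *ex_counted_new(int *destroyed) {
  struct ex_counted *obj = malloc(sizeof *obj);
  if (!obj) return NULL;
  obj->refcount = 1;
  obj->destroyed = destroyed;
  return obj;
}

void ex_inc_ref(struct ex_counted *obj) {
  assert(obj != NULL && obj->refcount > 0);
  obj->refcount++;
}

/* Drops one reference; releases the object when the count hits zero. */
void ex_counted_free(struct ex_counted *obj) {
  assert(obj != NULL && obj->refcount > 0);
  if (--obj->refcount == 0) {
    if (obj->destroyed) *obj->destroyed = 1;
    free(obj);
  }
}

/* Argument is BORROWED: the callee uses it but does not free it. */
int ex_peek_refcount(const struct ex_counted *borrowed) {
  return borrowed->refcount;
}

struct ex_holder { struct ex_counted *obj; };

/* Argument is OWNING: the holder takes over the caller's reference
   and frees whatever it held before. */
void ex_holder_set(struct ex_holder *h, struct ex_counted *owned) {
  if (h->obj) ex_counted_free(h->obj);
  h->obj = owned;
}
```

A caller that wants to keep using an object after handing it to ex_holder_set() calls ex_inc_ref() first, so that each party holds its own reference.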

In order to make a method call across a connection using the object-capability protocol, the method's arguments must be marshalled --
converted into a string and an array of objects/FDs, and then converted back again at the other end.

Marshalled arguments are represented by "struct cap_args".

The most general methods are "cap_invoke" and "cap_call", which only use marshalled arguments. "cap_invoke" is send-only and
asynchronous; it returns immediately and does not itself get a reply. "cap_call" is synchronous and returns a result.

The other methods call, or are called by, "cap_call".

For remote objects, other methods marshal their arguments and call "cap_call", which in turn calls "cap_invoke".

When local objects receive a remote request, "cap_invoke" handles this request and calls "cap_call", which unmarshals the
arguments and calls the relevant method.

The reason the other methods exist is that they are more efficient to use for calls within a process, and they are more convenient.

For some methods, marshalling is not implemented, so these methods can't be used remotely.
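
The relationship between the send-only and the call-return primitive can be sketched as follows: a synchronous call is an asynchronous invoke that carries a return continuation object. All names and the message layout are assumptions for illustration; in particular, this sketch passes a single int where Plash passes marshalled "struct cap_args".

```c
#include <assert.h>
#include <stddef.h>

struct ex_cap;

/* Hypothetical message: one int payload plus an optional reply capability. */
struct ex_msg { int value; struct ex_cap *reply_to; };

struct ex_cap {
  void (*invoke)(struct ex_cap *self, struct ex_msg msg);
  void *state;
};

/* Send-only: delivers the message and returns no result. */
void ex_invoke(struct ex_cap *target, struct ex_msg msg) {
  assert(target != NULL && target->invoke != NULL);
  target->invoke(target, msg);
}

/* Return continuation: records the reply when it is invoked. */
struct ex_reply { int done; int value; };
static void ex_reply_invoke(struct ex_cap *self, struct ex_msg msg) {
  struct ex_reply *r = self->state;
  r->value = msg.value;
  r->done = 1;
}

/* Synchronous call built on invoke. With a remote target, a real
   implementation would process incoming messages here until the
   continuation has been invoked; in this sketch the target replies
   before ex_invoke() returns. */
int ex_call(struct ex_cap *target, int value, int *result) {
  struct ex_reply reply = { 0, 0 };
  struct ex_cap cont = { ex_reply_invoke, &reply };
  struct ex_msg msg = { value, &cont };
  ex_invoke(target, msg);
  if (!reply.done) return -1;
  *result = reply.value;
  return 0;
}

/* Example target: replies with twice the value it was sent. */
static void ex_doubler_invoke(struct ex_cap *self, struct ex_msg msg) {
  (void)self;
  if (msg.reply_to) {
    struct ex_msg out = { msg.value * 2, NULL };
    ex_invoke(msg.reply_to, out);
  }
}
struct ex_cap ex_doubler = { ex_doubler_invoke, NULL };
```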

Encodings for marshalling

In some cases, marshalling code is written by hand. This is how the methods for fs_op work at present.

The program generates marshalling code for other methods. It uses compact descriptions of a method's arguments,
such as "mode/int leaf/string".
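
For a description such as "mode/int leaf/string", hand-written marshalling code might look like the sketch below. The wire format used here (4-byte little-endian integers, length-prefixed strings) and every ex_ name are assumptions chosen for the example; they are not Plash's actual encoding, and the array of objects/FDs that accompanies the string is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct ex_buf {
  unsigned char data[256];
  size_t len;  /* bytes written so far */
  size_t pos;  /* read cursor */
};

/* Write a 32-bit int as 4 bytes, least significant first. */
static int ex_put_int(struct ex_buf *b, int32_t v) {
  uint32_t u;
  if (sizeof b->data - b->len < 4) return -1;
  memcpy(&u, &v, sizeof u);
  for (int i = 0; i < 4; i++) b->data[b->len++] = (unsigned char)(u >> (8 * i));
  return 0;
}

static int ex_get_int(struct ex_buf *b, int32_t *v) {
  uint32_t u = 0;
  if (b->len - b->pos < 4) return -1;
  for (int i = 0; i < 4; i++) u |= (uint32_t)b->data[b->pos++] << (8 * i);
  memcpy(v, &u, sizeof u);
  return 0;
}

/* Write a string as a length followed by its bytes. */
static int ex_put_string(struct ex_buf *b, const char *s, size_t n) {
  size_t room = sizeof b->data - b->len;
  if (room < 4 || n > room - 4) return -1;
  ex_put_int(b, (int32_t)n);
  memcpy(b->data + b->len, s, n);
  b->len += n;
  return 0;
}

/* Returns a pointer into the buffer; the bytes are not copied. */
static int ex_get_string(struct ex_buf *b, const unsigned char **s, size_t *n) {
  int32_t len;
  if (ex_get_int(b, &len) < 0) return -1;
  if (len < 0 || (size_t)len > b->len - b->pos) return -1;
  *s = b->data + b->pos;
  *n = (size_t)len;
  b->pos += (size_t)len;
  return 0;
}

/* Marshal arguments described by "mode/int leaf/string". */
int ex_marshal_mode_leaf(struct ex_buf *b, int32_t mode, const char *leaf) {
  assert(b != NULL && leaf != NULL);
  b->len = b->pos = 0;
  if (ex_put_int(b, mode) < 0) return -1;
  return ex_put_string(b, leaf, strlen(leaf));
}

/* Unmarshal; fails on truncated input or trailing bytes. */
int ex_unmarshal_mode_leaf(struct ex_buf *b, int32_t *mode,
                           const unsigned char **leaf, size_t *leaf_len) {
  assert(b != NULL);
  b->pos = 0;
  if (ex_get_int(b, mode) < 0) return -1;
  if (ex_get_string(b, leaf, leaf_len) < 0) return -1;
  return b->pos == b->len ? 0 : -1;
}
```

Writing integers byte by byte with a fixed width is one way to keep such an encoding independent of the host's word size and byte order.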

Documentation format: XXML, an XML surface syntax

I was writing the documentation in DocBook format, but I have switched to using HTML tags, with some Perl code to generate a
contents page. In both cases, I have been using an alternative surface syntax for HTML/XML. I'm calling the new surface syntax
XXML. (Let's say it stands for Extra Extra Medium Large.)

XXML is a bit more convenient than the usual XML surface syntax:

You don't have to write the tag name in the closing tag. Curly brackets are used for grouping.
In some cases, you don't have to write closing tags at all. A tag can apply to the rest of the line. (No DTD is required to
determine the tag nesting.)
When writing XML, it's easy to forget that '<' and '&' are special characters that have to be escaped. '\' and '{' and '}' are
perhaps a bit more unusual. Furthermore, XXML provides the ">>" syntax in which you don't have to quote special characters
at all.
It's easier to do search and replace on the name of a tag.

Here's an example:

XXML {\tag attr={value}: body}

XML <tag attr="value">body</tag>

The curly brackets are a grouping construct which is independent from the tag. They introduce blocks. When a tag's attributes are
terminated with a colon, the tag works greedily: it applies to all of the text up to the end of the block.

XXML {\tag1: \tag2: body}

XML <tag1><tag2>body</tag2></tag1>

If you end a tag's attributes with a hyphen "-" rather than a colon ":", the tag consumes up to the end of the line. This is useful because
no closing bracket is required.

XXML \li- list item 1
     \li- list item 2
     \li- list item 3

XML <li>list item 1</li>
    <li>list item 2</li>
    <li>list item 3</li>

Ending a tag with a tilde "~" will consume the following paragraph, ie. up to the next empty line.

\li~ blah
\li~ paragraph


Those last examples use the HTML tags "ul" and "li". DocBook has longer names for the same thing, "itemizedlist" and "listitem",
and it also requires you to use "para" tags in the "listitem"s.

XXML \listitem\para- list item 1
     \listitem\para- list item 2
     \listitem\para- list item 3

XML <listitem><para>list item 1</para></listitem>
    <listitem><para>list item 2</para></listitem>
    <listitem><para>list item 3</para></listitem>

Note that

\listitem\para- text

is an abbreviation for

\listitem- \para- text
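
To show how little machinery the hyphen form needs, here is a sketch of a converter for a single line. It is not the XXML processor: the function name is hypothetical, it handles only alphanumeric tag names without attributes, and it does not process escapes or quote '<' and '&' in the body.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define EX_MAX_TAGS 8

/* Append one formatted piece to out, tracking how many bytes are used. */
static int ex_append(char *out, size_t size, size_t *used,
                     const char *fmt, size_t len, const char *text) {
  int n = snprintf(out + *used, size - *used, fmt, (int)len, text);
  if (n < 0 || (size_t)n >= size - *used) return -1;
  *used += (size_t)n;
  return 0;
}

/* Convert "\a\b- text" or "\a- \b- text" into "<a><b>text</b></a>".
   Returns 0 on success, -1 if the line has another form or out is too small. */
int ex_xxml_line_to_xml(const char *line, char *out, size_t size) {
  const char *names[EX_MAX_TAGS];
  size_t lens[EX_MAX_TAGS];
  size_t count = 0, used = 0;
  const char *p = line;

  assert(line != NULL && out != NULL);
  if (size == 0) return -1;
  out[0] = '\0';

  for (;;) {
    if (*p != '\\' || count == EX_MAX_TAGS) return -1;
    const char *start = ++p;
    while (isalnum((unsigned char)*p)) p++;
    if (p == start) return -1;
    names[count] = start;
    lens[count] = (size_t)(p - start);
    count++;
    if (*p == '\\') continue;   /* "\a\b-": the next tag follows directly */
    if (*p != '-') return -1;
    p++;
    while (*p == ' ') p++;
    if (*p != '\\') break;      /* the body text starts here */
  }

  /* Open tags in order, copy the body up to the end of the line,
     then close the tags in reverse order. */
  for (size_t i = 0; i < count; i++)
    if (ex_append(out, size, &used, "<%.*s>", lens[i], names[i]) < 0) return -1;
  if (ex_append(out, size, &used, "%.*s", strcspn(p, "\n"), p) < 0) return -1;
  for (size_t i = count; i-- > 0;)
    if (ex_append(out, size, &used, "</%.*s>", lens[i], names[i]) < 0) return -1;
  return 0;
}
```

Both spellings of the abbreviation shown above go through the same loop, so they produce identical output.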

XXML provides some help in writing paragraphs, so you don't have to write "p" or "para" tags explicitly. The "ps" tag will treat
empty lines in the input as paragraph breaks. It is expanded out by a postprocessor. This is similar to how you write paragraph breaks
in TeX.

A paragraph.
Another paragraph.

<para>A paragraph.</para>
<para>Another paragraph.</para>

A tag without a body can be terminated with a semicolon ";".

XXML \hr size=4;

XML <hr size=4>

The ">>" syntax is a way to write literal data without having to remember to quote backslashes or curly brackets.

XXML >>int main()
     >> printf("Hello world!\n");
     >> return 0;

XML <pre>int main()
    printf("Hello world!\n");
    return 0;</pre>

There is no special escape sequence for writing XML entities. A tag whose name begins with "E" is converted into an entity. It
should not have attributes or a body. (Note that tag names are case sensitive.)

XXML \Eamp;

XML &amp;

You can specify the literal characters '\', '{' and '}' by prefixing them with another backslash.

4/8/2018 Plash: Issues



Security vulnerabilities
connect() race condition
Problem: connect() on Unix domain sockets follows symlinks, and there is no way to switch this off.

filesys-obj-real.c calls lstat() to determine whether a directory entry is a symlink. If not, real_file_socket_connect() can be used; this
will call connect(). Between lstat() and connect(), another process might replace the socket with a symlink.

Applicability: An adversary, A, in one sandbox cannot exploit this on its own; it requires a conspirator, B. A and B can conspire so
that A gets access to an arbitrary socket S that is in A's server's namespace but not in A's namespace. A and B must have write access
to some common directory. B does not have to have access to socket S (it only needs to know its pathname in A's namespace). This
exploit can only occur if B is not in the same sandbox as A.

Exploit: see tests/socket-symlink-race.

Note: bind() does not follow symlinks. It behaves like open() with O_CREAT|O_EXCL. (There may be some other Unix variants
where this is not true, perhaps including old versions of Linux.)

Possible solutions:

Hardlink socket into a temporary directory before calling connect().

The server has a private directory in which no server will create symlinks on behalf of a client. The server's
real_file_socket_connect() method hardlinks the socket file into the private directory, checks that the object that got hardlinked
is not a symlink, and then calls connect() on it.
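
A minimal sketch of this hardlink-then-connect idea follows. It is not the code in filesys-obj-real.c: the function name, the "link-<pid>" leaf name and the use of linkat() are assumptions for the example, and it checks that the linked object is a socket, which is slightly stricter than checking that it is not a symlink. It also assumes that private_dir is on the same device as the socket and is writable only by the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/un.h>
#include <unistd.h>

/* Connect to the Unix domain socket at socket_path without following a
   symlink that may have been swapped in. Returns a connected FD or -1. */
int ex_connect_via_hardlink(const char *socket_path, const char *private_dir) {
  struct sockaddr_un addr;
  struct stat st;
  int fd = -1;

  assert(socket_path != NULL && private_dir != NULL);
  memset(&addr, 0, sizeof addr);
  addr.sun_family = AF_UNIX;
  int n = snprintf(addr.sun_path, sizeof addr.sun_path, "%s/link-%ld",
                   private_dir, (long)getpid());
  if (n < 0 || (size_t)n >= sizeof addr.sun_path) return -1;

  /* With flags == 0, linkat() does not follow a symlink at socket_path;
     it links the symlink itself (or fails on some filesystems). */
  if (linkat(AT_FDCWD, socket_path, AT_FDCWD, addr.sun_path, 0) < 0) return -1;

  /* Nobody else can replace the name inside private_dir, so the object
     checked by lstat() is the same object that connect() will reach. */
  if (lstat(addr.sun_path, &st) == 0 && S_ISSOCK(st.st_mode)) {
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd >= 0 && connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
      close(fd);
      fd = -1;
    }
  }
  unlink(addr.sun_path);
  return fd;
}
```

The race is closed because the check and the connect() both operate on a name that only the server controls, rather than on the original pathname.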

Problem: this doesn't work across devices.

We could have a list of directories into which we try to create hard links. These are tried in turn. The user/administrator can set
these up to cover all the filesystems.

Plash would have to create a directory per user ID inside these directories (as in /tmp) -- incidentally, this means the sticky bit
isn't necessary, because processes with other user IDs can't delete the directories while they're in use. What should the default
be? "/tmp/plash/" This would be enough for opening X11 sockets. How should this be configured? Via an environment
variable? We would want to unset this so that programs run under Plash don't see it. Via a configuration file in /etc?
/etc/plash/hardlink-dirs could contain a list of directories, each on a separate line.

We could try creating a hard link in the same directory, but that doesn't work if the server doesn't have write access to the
directory. (But then, if the server doesn't have write access *and other servers don't either* the symlink attack cannot be carried
out.) Doing this in /tmp/.X11-unix creates a file that is owned by root in a directory owned by root (with the sticky bit set),
which then cannot be deleted. (The sticky bit means you can unlink an object only if you own it, or you own the directory.)

It turns out that link() doesn't work between directories that are on the same device but have been mounted using "mount --bind".

I thought it might be possible to open() the domain socket and then do connect() on /proc/self/fd/N (which would effectively operate
on the inode rather than the file descriptor). However, open() does not work on domain sockets.

We could have a setuid tool for doing connect() that does the following:

creates a subdirectory in the socket's directory, making it inaccessible to all except root. (The socket's directory would be the
cwd: the caller would pass its leaf name.)
hard links the socket into this private directory.
checks that the inode number is as expected.
does connect()
deletes the copied domain socket, deletes the directory


It might be possible to misuse the setuid tool to create extra directories inside directories that the caller does not usually have
access to. This can be partly addressed by checking that the file is a domain socket: then this only applies to directories that
contain domain sockets.

Bigger problem: the subdirectory could be renamed and replaced with a symlink. This can be solved: chdir() in the directory,
do stat(".") and check that it is root-owned and not world readable. Then do link(): the source path can be given as
"../<leafname>": this is not itself reliable, but we check that the correct source object was linked anyway.

Another possibility is to have a lock that is shared between Plash servers, to ensure that no server creates a symlink while another is
in the process of connecting to a domain socket.

* The lock would have to be per-user, rather than system-wide, otherwise one user could deny service to another by holding the lock
  indefinitely. This means the symlink race could be exploited (only) by conspiring programs running under different users' Plash
  environments, with write or connect access to a common directory.
* This doesn't protect against symlinks created by programs not running under Plash.
* The lock would need to be held around rename() calls, because a symlink can be put in place using rename().
* Could use a flock() lock, stored under /tmp/plash-<uid>.

Hard linking won't work on read-only filesystems, but that's okay, because you can't create domain sockets on those in the first place.

See <>: SSH authentication agent follows symlinks via a UNIX domain socket

chmod() race condition

There is a similar symlink race condition in using chmod(), which follows symlinks.

glibc exports an lchmod() function, but it isn't implemented (it always returns ENOSYS).

We do the equivalent of lchmod() by opening the file and doing fchmod(). Problem: open() fails if read permissions aren't set
(even if the user owns the file). So this call can't enable read permissions if they are not currently enabled.
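
The open-then-fchmod() approach can be sketched like this. The function name is hypothetical, and the use of O_NOFOLLOW, O_NONBLOCK and O_NOCTTY is an assumption about how one would write it rather than a description of Plash's code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Change the mode of path without following a final symlink.
   Fails with ELOOP if path is a symlink. Fails if the caller cannot
   open the file for reading, which is the limitation described above. */
int ex_lchmod(const char *path, mode_t mode) {
  assert(path != NULL);
  /* O_NONBLOCK and O_NOCTTY avoid blocking on FIFOs and acquiring a
     controlling terminal when path names a special file. */
  int fd = open(path, O_RDONLY | O_NOFOLLOW | O_NONBLOCK | O_NOCTTY);
  if (fd < 0) return -1;
  int rc = fchmod(fd, mode);
  int saved_errno = errno;  /* close() must not clobber fchmod()'s errno */
  close(fd);
  errno = saved_errno;
  return rc;
}
```

Because the mode change is applied to the open file descriptor, replacing the pathname with a symlink after the open() has no effect on which file is changed.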

For now, this probably isn't a serious problem. I expect that chmod() will mostly be used for setting the executable bit. Note
that this problem makes it possible to unset a permission and then not be able to change it back (from within Plash).

Note that root *can* open() a file even if no read permissions are enabled for it. So we *could* create a setuid root tool to
implement lchmod() that opens a file (using O_NOFOLLOW) and then switches back to the original user identity before doing
fchmod().
Note that fchmod() checks the calling process's user identity and looks at the owner of the FD's file inode.

A more serious problem: you can't open() a socket, so you can't do fchmod() on a socket.

This is causing Konqueror to fail. It runs kdeinit, which does bind() on /tmp/ksocket-mrs/kdeinit__0, and then tries to chmod()
the socket.

(This is dubious practice, because there's a race: if the socket is created with excessive permissions, it will be accessible for a
while. However, it is created in a non-world-readable directory. This looks like a just-in-case measure.)

utimes() is similar to chmod(): glibc exports an lutimes() function, but it isn't implemented (it always returns ENOSYS). We use
open() and futimes(). futimes() uses /proc/self/fd/N.

Running pola-shell as root

When X11 access is enabled, /tmp/.X11-unix is mapped as a writable slot. It should be a writable object in a read-only slot.

stat64() doesn't work properly

The server processes are included in the same job as the client processes. The server has the same process group ID, and the shell
will wait for it. This is convenient (for printing the exit status), but wrong. If the user presses Ctrl-C, and the client handles
SIGINT and survives, the server will still be killed, leaving the client mostly useless.

libc's object-based execve() ignores the close-on-exec flag

Shell: build-fs.c: If you have the command "cmd foo", and `foo' is a symlink, the symlink will be followed and the shell will also
grant access to the destination of the link. If you have the command "cmd => foo", the symlink is not followed. This is inconsistent.
Actually, I have realised that following the symlink is not good from a security point of view.

Aspects that need more testing

libc thread safety

Might be problems in future

Re-entrancy: run_server_step() is called while waiting for a reply on a return continuation object. It will handle incoming requests --
these should be queued instead. I don't think this actually causes any bugs, since there are no TOCTTOU problems in the code.
(There aren't really any invariants that are broken during a method call.)

No resource accountability (not really a bug)

Make sure that messages are encoded and decoded properly on 64-bit and other-endian machines. Currently I assume sizeof(int) ==

Sending on a socket is never queued. This could lead to DoS of servers. It could potentially lead to deadlocks, if both ends of a
connection send at the same time (this doesn't happen at the moment because all connections are client-server and call-return).

There may be cases where libc calls should preserve errno but don't.

Behaviour that might need changing:

build-fs.c attaches copies of symlinks into processes' file namespaces, so the process won't see them change when they change in the
real filesystem. This may not be expected. Actually, symlinks are immutable and the inode would change if you replaced one.

Problems running specific programs

GNU Emacs (resolved)
When run under Plash, GNU Emacs 21 prints the following and exits:

emacs: Memory exhausted--use M-x save-some-buffers RET

The fault lies with GNU Emacs; it has been fixed in CVS (not yet released as GNU Emacs 22).

The problem also occurs if you do:

/lib/ /usr/bin/emacs

which is what Plash does internally.

The problem is that the use of address space changes when you invoke directly: the brk() syscall changes where it
allocates memory from. brk() starts allocating from after the BSS (zero-initialised) segment of the executable that was invoked by
exec(). For normal executables this is after 0x08000000. But gets loaded at 0x80000000, so brk() follows from
somewhere after that, regardless of what executable subsequently loads.

Emacs allocates memory using malloc(), which uses brk(), and so it gets an address with one of the top 4 bits set, which it can't

I would guess that Emacs' use of the top 4 bits hasn't changed but rather Emacs 22 uses mmap() to allocate memory rather than

This issue is also mentioned in:

Konqueror (resolved)
Qt: 3.3.3 KDE: 3.3.2 Konqueror: 3.3.2

Konqueror has a problem starting up seemingly related to fam (a File Alteration Monitor). If it connects to fam's TCP port but then
fails to connect to the Unix domain socket that fam creates in response, Konqueror fails (actually, kded fails).

Solution: disable the fam daemon.

Running a subprocess from XEmacs would give:

sendmsg: Bad file descriptor

recvmsg: Bad file descriptor
[2622] cap-protocol: [fd 5] to-server: connection error, errno 9 (Bad file descriptor)
Can't exec program /bin/sh
Process shell exited abnormally with code 1

XEmacs was closing the file descriptor that Plash uses.

This is fixed: Plash's libc will refuse to close that file descriptor.
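
The shape of such a guard might be as follows. This is a sketch only: the function names are hypothetical, and returning -1 with errno set to EBADF is an assumption about how the refusal is reported, not a statement of what Plash's libc does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Descriptor used for talking to the server; -1 means none is reserved. */
static int ex_reserved_fd = -1;

void ex_reserve_fd(int fd) {
  assert(fcntl(fd, F_GETFD) != -1);  /* must be an open descriptor */
  ex_reserved_fd = fd;
}

/* Wrapper for close(): pretends the reserved descriptor is not open, so
   a program that closes "all" descriptors cannot cut the connection. */
int ex_close(int fd) {
  if (fd >= 0 && fd == ex_reserved_fd) {
    errno = EBADF;
    return -1;
  }
  return close(fd);
}
```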

4/8/2018 Plash: Copyright



Plash is Copyright (C) 2004, 2005, 2006 Mark Seaborn

Plash is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as
published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.

Plash is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more
details.

You should have received a copy of the GNU Lesser General Public License along with Plash; if not, write to the Free Software
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
