A Pipe Namespace in the Portal Filesystem
The portal filesystem allows a daemon running as a userland program to pass descriptors to processes that open files belonging to its namespace. It has been part of the *BSD operating systems since 4.4 BSD. I recently added a pipe namespace to its FreeBSD implementation. This allows us to perform scatter gather operations without using temporary files, create non-linear pipelines, and implement file views using symbolic links.
The idea for the pipe namespace came from brainstorming by two colleagues. Vassilios Karakoidas suggested, some months ago, that it would be cool to be able to create file views, similar to SQL views. He proposed implementing them as a filesystem in a kernel module. Later on, when we discussed this idea with John Ioannidis, John proposed looking at the portal filesystem. I can't say that my first encounter with its manual page left me enlightened, but after locating and reading the corresponding excellent Usenix paper, I understood how it worked, and knew this was the way to go.
The portal filesystem capitalizes on the ability of modern Unix kernels to pass file descriptors between processes. When a process tries to open a file within the portal namespace, the filesystem passes the name to a daemon (in practice, a leftover instance of its mount process). The daemon performs any operations it requires in user space, and then passes back a file descriptor to the process that opened the file. Thus, for operations that can be done in user space and expressed in the end as a file descriptor, no kernel hacking is required.
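The descriptor hand-off described above can be tried out in isolation. The following is only a minimal sketch in Python (3.9 or later), not the portal daemon's code: it assumes a Unix-domain socket pair standing in for the channel between daemon and client, and the function names send_descriptor and receive_descriptor are made up for illustration.

```python
import os
import socket


def send_descriptor(channel: socket.socket, fd: int) -> None:
    """Daemon side: hand an already-open descriptor to the peer."""
    # The descriptor travels as SCM_RIGHTS ancillary data; the one-byte
    # payload is there only because some ordinary data must accompany it.
    socket.send_fds(channel, [b"\0"], [fd])


def receive_descriptor(channel: socket.socket) -> int:
    """Client side: receive one descriptor and return it."""
    _data, fds, _flags, _addr = socket.recv_fds(channel, 1, 1)
    if not fds:
        raise OSError("no descriptor received")
    return fds[0]


if __name__ == "__main__":
    # Hypothetical stand-in for the daemon/client channel.
    daemon_end, client_end = socket.socketpair()
    # The "daemon" prepares a descriptor in user space (here, a pipe).
    read_end, write_end = os.pipe()
    os.write(write_end, b"hello from the daemon\n")
    os.close(write_end)
    send_descriptor(daemon_end, read_end)
    os.close(read_end)  # the receiver gets its own reference
    with os.fdopen(receive_descriptor(client_end), "rb") as stream:
        print(stream.read().decode(), end="")
```

The receiver ends up with an ordinary descriptor and cannot tell how it was produced, which is why anything the daemon can set up in user space can be offered as a file.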
At the time I laid my hands on it, the filesystem
supported the following namespaces (all examples assume that the
portal filesystem is mounted in /p).
- The fs namespace, which opens the named file, starting back at the root directory.
  As an example, the file /p/fs/etc/motd would be an alias for the file /etc/motd.
- The tcp namespace, which takes a slash-separated hostname and
  port and creates an open TCP/IP connection.
  Thus, the following command will display the greeting of the FreeBSD
  SMTP server.
    head -1 /p/tcp/mx1.freebsd.org/smtp
- The tcplisten namespace, which takes a slash-separated hostname and port and
  creates a TCP/IP socket bound to the given hostname-port pair.
  As an example, the following command implements a (single-threaded) echo server.
    while :; do (exec 3<>/p/tcplisten/ANY/echo && cat -u <&3 >&3); done
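To make the tcp naming convention concrete, here is a rough sketch in Python of how a name such as mx1.freebsd.org/smtp could be turned into a connected socket. It is an illustration under assumptions, not the daemon's code: the helpers parse_tcp_name and open_tcp are hypothetical, and service names are looked up in the local services database.

```python
import socket


def parse_tcp_name(name: str) -> tuple[str, int]:
    """Split 'host/port' into a (host, port number) pair."""
    host, sep, service = name.partition("/")
    if not sep or not host or not service:
        raise ValueError(f"expected host/port, got {name!r}")
    # The port may be numeric or a service name such as "smtp".
    if service.isdigit():
        port = int(service)
    else:
        port = socket.getservbyname(service, "tcp")
    return host, port


def open_tcp(name: str) -> socket.socket:
    """Return a socket connected to the host and port named by `name`."""
    return socket.create_connection(parse_tcp_name(name))
```

With the portal filesystem the result would be handed to the opening process as a plain descriptor, so tools like head need no networking code of their own.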
The pipe namespace I implemented executes the named command, starting back at the root directory. The command's arguments can be provided after the command's name, by separating them with spaces or tabs. Files opened for reading in the pipe namespace will receive their input from the command's standard output; files opened for writing will send the data of write operations to the command's standard input. Errors are logged through the system logging facility.
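The naming rules in the previous paragraph can be summarized in a short Python sketch. This is an approximation for illustration only, not the FreeBSD implementation: parse_pipe_name and open_pipe are hypothetical names, and the real daemon passes a descriptor back to the opener rather than returning a process object.

```python
import re
import subprocess


def parse_pipe_name(name: str) -> list[str]:
    """Turn 'usr/bin/fetch -o - URL' into ['/usr/bin/fetch', '-o', '-', 'URL']."""
    # Arguments are separated by spaces or tabs; no quoting is assumed.
    words = [w for w in re.split(r"[ \t]+", name) if w]
    if not words:
        raise ValueError("empty command")
    # The command is resolved starting back at the root directory.
    words[0] = "/" + words[0].lstrip("/")
    return words


def open_pipe(name: str, mode: str = "r"):
    """Run the named command; return (process, stream) for the chosen mode."""
    argv = parse_pipe_name(name)
    if mode == "r":
        # Reading: the data comes from the command's standard output.
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
        return proc, proc.stdout
    if mode == "w":
        # Writing: the data goes to the command's standard input.
        proc = subprocess.Popen(argv, stdin=subprocess.PIPE)
        return proc, proc.stdin
    raise ValueError("mode must be 'r' or 'w'")
```

Because the whole command line lives in the file name, an argument cannot itself contain a space or a tab in this scheme.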
Here are some interesting uses of the pipe namespace. Note that none of the cases involve the use of temporary files.
- Gather data from two sources.
  The following example will
  verify that two remote files are identical:
    diff -q '/p/pipe/usr/bin/fetch -o - ftp://ftp1.freebsd.org/pub/FreeBSD/README.TXT' \
        '/p/pipe/usr/bin/fetch -o - ftp://ftp2.freebsd.org/pub/FreeBSD/README.TXT'
- Scatter data to two sinks.
  The following example will record a remote CD ISO image and calculate
  its checksum:
    fetch -o - ftp://ftp5.freebsd.org/.../disc.iso | tee '/p/pipe/usr/local/bin/cdrecord -' | md5
- Create file views.
  The following example will
  create an XML view of the password file:
    ln -s '/p/pipe/usr/local/bin/passwd2xml /etc/passwd' \
        /etc/passwd.xml
  This example requires the following passwd2xml script:
    #!/usr/bin/awk -F: -f
    BEGIN {
        print "<?xml version=\"1.0\" ?>"
        print "<passwd>"
    }
    {
        print "<entry>"
        print "\t<uname>" $1 "</uname>"
        print "\t<password>" $2 "</password>"
        print "\t<uid>" $3 "</uid>"
        print "\t<gid>" $4 "</gid>"
        print "\t<name>" $5 "</name>"
        print "\t<home_dir>" $6 "</home_dir>"
        print "\t<shell>" $7 "</shell>"
        print "</entry>"
    }
    END {
        print "</passwd>"
    }
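For readers without a portal filesystem at hand, the scatter case from the list above (one stream feeding both a command and a checksum) can be approximated in a few lines of Python. This is a sketch of the data flow only, under the assumption that the sink is an arbitrary command reading its standard input; the function name scatter is hypothetical.

```python
import hashlib
import subprocess
from typing import BinaryIO


def scatter(source: BinaryIO, sink_argv: list[str],
            chunk_size: int = 65536) -> tuple[str, int]:
    """Copy `source` to a command's stdin while computing its MD5 digest.

    Returns the hex digest and the command's exit status.
    """
    digest = hashlib.md5()
    sink = subprocess.Popen(sink_argv, stdin=subprocess.PIPE)
    try:
        # Each chunk is read once and sent to both sinks, so no
        # temporary file is needed.
        while chunk := source.read(chunk_size):
            sink.stdin.write(chunk)
            digest.update(chunk)
    finally:
        sink.stdin.close()
    return digest.hexdigest(), sink.wait()
```

The pipe namespace achieves the same effect from the shell, because tee only needs a file name to write to.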