Something I’ve been working on for a while now is mounting select filesystems from within unprivileged (i.e. user namespace) containers. Though I’m still working on getting this into upstream Linux, support for mounting fuse and ext4 was recently merged as an opt-in feature in the xenial kernel. The instructions below will help you get started if you’d like to give this a try.
But first a word of warning: this feature is experimental. As far as I know it should be stable, but there are known security concerns, specifically with mounting ext2/3/4 volumes1. It should only be enabled in trusted environments where potentially malicious users do not have shell access to your system.
In order to use this feature, you will need a 4.4.0-6.21 or later kernel in Ubuntu xenial. To follow these instructions you will also need to have lxd installed (help for this can be found here).
The first thing you need to do is flip the module parameters to enable user namespace mounts for fuse and/or ext4.
A lxd container running under the default profile will not have the device nodes needed for mounting (/dev/fuse for fuse and some block device for ext4, e.g. /dev/loop0) and will not be permitted to mount by AppArmor. We need to create a lxd profile which will include the device nodes and run the container without AppArmor confinement. This will be used on top of the lxd default profile.
Now create a new container using this profile.
Testing it out
Let’s try fuse first. Launch a shell in your new container:
Inside the container we can create an ext2 filesystem and mount it using fuseext2, adding the ‘-o force’ option to make the mount read/write.
You can now browse and modify the filesystem. To unmount run:
For ext4 we’ll create the filesystem outside of the container and set this as the backing store for /dev/loop0. If loop0 is already in use you can pick another loop device, in which case you’ll also need to modify the nsmout lxd profile to make that device available in the container.
Back in the host run:
Then back in the container run:
This filesystem can be unmounted in the usual way.
The known security concern is that a malicious user could provide a crafted filesystem to exploit bugs in the in-kernel filesystem parsing code, or the user could change the backing store at runtime. Either of these could lead to kernel crashes or worse. ↩