User space file system comments

Posted on July 9, 2011 by Tommy McGuire
Labels: linux, unix
This is a comment I made on the Gluster blog, about a post on one of Linus Torvalds' incendiary comments on user-space filesystems.

Way back in the dim mists of time, when microkernels were hot, Linus was having flame wars with AST, and the HURD actually sounded like it might someday be usable, I worked on the IBM microkernel project writing filesystem benchmarks. The idea was to build OS/2 (!), AIX, and other OS personalities on top of Mach, as multiple user-space servers.

There were a lot of things wrong with the implementation (getpid() required a trip through Mach to an OS server), but the one continual problem was filesystem performance. At one point, reading a page off disk was CPU bound. The issue wasn’t really context switches, which can themselves be made arbitrarily fast, but carrying pages of data along with the context switch. The original copy-on-write turned out to be a pessimization: a very well-written memcpy took a little less time to copy a page than the VM system took to set up the page share and copy-on-write. Then, if you eventually had to copy the page anyway, you paid for both.
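To make the copy-on-write trade-off above concrete, here is a minimal cost-model sketch. It is not taken from the Mach work; the function names, parameters, and the nanosecond figures in the usage lines are all hypothetical and chosen only to show the shape of the argument. The point is that once the share setup costs at least as much as a plain copy, copy-on-write cannot win, no matter how rarely the page is later written.

```python
def eager_copy_cost(copy_ns: float) -> float:
    """Cost of simply copying the page up front (the memcpy path)."""
    return copy_ns


def cow_cost(setup_ns: float, fault_ns: float, copy_ns: float,
             p_write: float) -> float:
    """Expected cost of sharing the page copy-on-write.

    setup_ns: time to set up the page share and write protection
    fault_ns: time to take and handle the write fault
    copy_ns:  time to copy the page once a write forces it
    p_write:  probability that the page is eventually written
    """
    if not 0.0 <= p_write <= 1.0:
        raise ValueError("p_write must be between 0 and 1")
    return setup_ns + p_write * (fault_ns + copy_ns)


def cow_wins(setup_ns: float, fault_ns: float, copy_ns: float,
             p_write: float) -> bool:
    """True when copy-on-write is cheaper in expectation than copying."""
    return cow_cost(setup_ns, fault_ns, copy_ns, p_write) < eager_copy_cost(copy_ns)


if __name__ == "__main__":
    # Hypothetical numbers: setup slightly slower than the copy itself,
    # so sharing loses even if the page is never written.
    print(cow_wins(setup_ns=1100, fault_ns=500, copy_ns=1000, p_write=0.0))
    # Hypothetical numbers: cheap setup and rare writes, so sharing pays off.
    print(cow_wins(setup_ns=100, fault_ns=500, copy_ns=1000, p_write=0.1))
```

With real measurements plugged in for the parameters, the same comparison shows how cheap the setup has to be before sharing pages across protection domains is worth it.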

I never figured out what was going on with the virtual memory system, but I’ve seen the same issue over and over. The MkLinux paper at the (1st and only, AFAIK) FSF conference is one of my favorite performance papers ever. The table showing performance comparisons looks good, as long as you don’t notice that the top third is Dhrystone results, which measure CPU-bound work; all they show is that MkLinux didn’t actually slow the processor down. The Scout OS, a good research microkernel OS, reported excellent performance, but didn’t separate user-space processes into different memory domains. The numbers reported by the security-enhanced follow-on, whose name I can’t remember but which did use multiple memory domains, were much worse. If you read between the lines of that paper comparing it to Linux, it would still have been a small multiple slower than Linux even if it had used only 2 protection domains instead of 3 for filesystem accesses.

I’m as big a fan of user-level filesystems as anyone. They provide a neat and protected way of doing things that would otherwise not be as clean. On the other hand, to say “user space advantages far outweigh kernel space advantages” as a general rule is a pretty thoroughly blinkered view, too.

Oh, and if you want the references to the MkLinux, Scout, and whatever-that-other-one-was papers, I’ll try to round them up.