Wednesday, December 17, 2014

Haskell's biggest flaw

And then a great, deep voice, like as that of James Earl Jones, spake unto me, saying,

The greatest flaw of Haskell is the inability to define the identity function in a point-free style.

Monday, December 1, 2014

Quote o' the day: uh....

From bestpcinfos on Hacker News:

Don't forget to put the tarp over the ham basin before departing for olfactory shoes.

I've got no idea what it means, but it does sound like good advice.

Saturday, November 22, 2014

Link o' the day: How I start—Haskell

Chris Allen has a very good walk-through for starting Haskell programmers (and for projects using cabal): How I Start: Haskell.

(Hey, wha'da'ya' want? It's NaNoWriMo and I'm way behind. And busy.)

Thursday, October 30, 2014

Turing machines vs. other models of computation

Every once in a while, I hear someone complain about the fascination with Turing Machines, in comparison with other models of computation such as lambda calculus, general recursive functions (well, ok, maybe not that one), etc.

The following is Philip Wadler's answer, from "Propositions as Types", a paper that is just as good everywhere else:

Turing’s most significant difference from Church was not in logic or mathematics but in philosophy. Whereas Church merely presented the definition of λ-definability and baldly claimed that it corresponded to effective calculability, Turing undertook an analysis of the capabilities of a “computer”—at this time, the term referred to a human performing a computation assisted by paper and pencil. Turing argued that the number of symbols must be finite (for if infinite, some symbols would be arbitrarily close to each other and undistinguishable), that the number of states of mind must be finite (for the same reason), and that the number of symbols under consideration at one moment must be bounded (“We cannot tell at a glance whether 9999999999999999 and 999999999999999 are the same”). Later, Gandy [16] would point out that Turing’s argument amounts to a theorem asserting that any computation a human with paper and pencil can perform can also be performed by a Turing Machine. It was Turing’s argument that finally convinced Gödel; since λ-definability, recursive functions, and Turing machines had been proved equivalent, he now accepted that all three defined “effectively calculable”.

In other words, the Turing Machine is modeled on what a human being can do with paper and pencil, which makes the argument that it encodes effective calculation stronger. As Gödel's reluctance shows, it is not clear from the definition alone that lambda calculus does encode effective calculation.

Further, the Turing Machine formalism has a number of immediately obvious extensions, such as multiple tapes, that are easily proven to be equivalent to the original machine. The resulting, admittedly loose, argument that nothing more powerful lies above it further strengthens the case for the Turing Machine.
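
The ingredients in Wadler's summary (a finite alphabet, a finite set of states, and one symbol examined at a time) are easy to see in a small simulator. The following is a minimal Python sketch, not anything taken from the paper; the names run and INCREMENT and the rule-table encoding are made up for illustration. It runs a single-tape machine that adds one to a binary number.

```python
from collections import defaultdict

BLANK = "_"


def run(rules, tape, state="start", halt="halt", max_steps=10_000):
    """Run a single-tape Turing machine and return the non-blank tape contents.

    rules maps (state, symbol) to (symbol_to_write, move, next_state),
    where move is "L", "R", or "N" (stay put). This encoding is an
    assumption of the sketch, not a standard.
    """
    # The tape is unbounded in both directions; unwritten cells read as BLANK.
    cells = defaultdict(lambda: BLANK, enumerate(tape))
    head = 0
    steps = 0
    while state != halt:
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        # Only the single symbol under the head is ever examined.
        write, move, state = rules[(state, cells[head])]
        cells[head] = write
        head += {"L": -1, "R": 1, "N": 0}[move]
        steps += 1
    used = [i for i, symbol in cells.items() if symbol != BLANK]
    if not used:
        return ""
    return "".join(cells[i] for i in range(min(used), max(used) + 1))


# Binary increment: scan right to the end of the number, then carry leftward.
INCREMENT = {
    ("start", "0"): ("0", "R", "start"),
    ("start", "1"): ("1", "R", "start"),
    ("start", BLANK): (BLANK, "L", "carry"),
    ("carry", "1"): ("0", "L", "carry"),
    ("carry", "0"): ("1", "N", "halt"),
    ("carry", BLANK): ("1", "N", "halt"),
}
```

With these rules, run(INCREMENT, "1011") returns "1100". The whole machine is a six-entry table over three symbols and three states, which is the kind of finiteness Turing's argument relies on.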

Saturday, September 27, 2014

On shellshock

The 'net is currently focused on Shellshock, leading to interesting discussions of the responsibility for the problem. Recently, someone posted Not a bash bug, which was linked on Hacker News. The argument there is that:

I would argue that the bash security concern is not a bug. It is clearly a feature. Admittedly, a misguided and misimplemented feature, but still a feature....

It is an old precept for security on unix systems, that environment variables shall be controlled by the parent processes, and an even older and more general precept that input data shall be validated.

That was my initial opinion of the bug, as well. The parent processes are in control of the environment and should be validating input.

On the other hand, after thinking about it, I decided that this is at best a misfeature of Bash, for a number of reasons.

  • It is incredibly undocumented. I've been a Unix guy for over 25 years, and I've been using Bash for most of that time. (Sorry, David Korn.) I've used Bash a lot. But I've never heard of this thing.

  • It violates some ill-defined, personal, un-thought-about assumptions about environment variables. An environment variable with executable code? That's as terrifying as LD_LIBRARY_PATH, and that one is at least very well known. Probably one reason I missed this feature is that it is something I would never consider using.

  • In my opinion, it's almost impossible to secure this on the parent process's side. Sure, the parent can look for magic Bash strings, but.... This isn't just Apache, it's potentially every other network-accessible program that calls a shell, and that is a very common thing to do in Unix.

  • Finally, consider some of the special behavior of execlp and execvp:

    If the header of a file isn't recognized (the attempted execve(2) failed with the error ENOEXEC), these functions will execute the shell (/bin/sh) with the path of the file as its first argument. (If this attempt fails, no further searching is done.)

    You could end up starting a shell without knowing.
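
To make the "look for magic Bash strings" option concrete, here is a minimal Python sketch of what a parent process would have to do before starting any child. It relies on one fact about the vulnerable versions of Bash: a value beginning with the four characters "() {" was treated as a function definition to import. The names scrub_environment and run_scrubbed are hypothetical, and this is a heuristic for illustration, not a fix.

```python
import os
import subprocess

# Bash versions vulnerable to Shellshock treated any environment value that
# began with these four characters as a function definition to import.
FUNCTION_PREFIX = "() {"


def scrub_environment(env):
    """Return a copy of env without values that look like exported Bash functions."""
    return {
        name: value
        for name, value in env.items()
        if not value.startswith(FUNCTION_PREFIX)
    }


def run_scrubbed(argv):
    """Hypothetical helper: run argv with a scrubbed copy of our own environment."""
    return subprocess.run(argv, env=scrub_environment(os.environ), check=False)
```

For example, scrub_environment({"PATH": "/usr/bin", "HTTP_USER_AGENT": "() { :; }; echo pwned"}) keeps only PATH. The catch is the one described above: the same filtering has to be repeated in every program that might start a shell, directly or indirectly.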

One comment on HN is interesting:

The original author of bash (a friend of mine, which is why I have this context) has been being interviewed by various newspapers today regarding shellshock, and finds the idea that he might have anticipated the number of ways people integrated bash into various systems (such as with Apache allowing remote control over environment variables when running software in security domains designed to protect against unknown malicious users) quite humorous. Apparently, it has been an uphill battle to explain that this was all coded so long ago that even by the time he had already passed the project on to a new developer (after having maintained it for quite a while himself) the World Wide Web still wasn't a thing, and only maybe gopher (maybe) had been deployed: that this was even before the Morris worm happened...

I understand the impossibility of anticipating "the number of ways people integrated bash into various systems", but the idea of installing a facility for executing back-channel code was sketchy even at the time. Further, why is the feature still there? We stopped using rsh and telnet long ago, right?