Saturday, August 30, 2014

Link o' the day: Maciej Cegłowski is my new waifu

I'm behind the times, I know, but I just found Dabblers and Blowhards by Maciej Cegłowski and I find it an acceptable piece of text. This essay, a response to Paul Graham's "Hackers and Painters", is the source of many unique quotes such as, "In his essays he tends to flit from metaphor to metaphor like a butterfly, never pausing long enough for a suspicious reader to catch up with his chloroform jar," or his definition of:

  • Painters apply colored goo to cloth using animal hairs tied to a stick.

which you may object to, until you realize it exactly matches his description of:

  • Computer programmers cause a machine to perform a sequence of transformations on electronically stored data.

Some thumbs up.

Saturday, August 23, 2014

More wisdom from mcguire

Names have been changed to protect the identities of those who may not wish to admit they know the Great Sage.

(06:53:46 PM) Mittens: Have you fixed everything yet?!?!
(06:54:00 PM) mcguire: I have fixed some things.
(06:54:20 PM) mcguire: Some things cannot be fixed and some things have not been fixed yet. Some things may never get fixed.
(06:54:30 PM) mcguire: Other things may be fixed, then broken again.
(06:54:35 PM) Mittens: Some things are idiots.
(06:55:23 PM) Mittens: Some things are big idiots.
(06:55:28 PM) mcguire: Finally, there are those things which, like the great blue heron, are neither fixed nor broken, but must merely be understood. Or failing that, just accepted.
(06:55:37 PM) Mittens: Or shot.

Saturday, August 9, 2014

Letterpress cheating in Rust 0.11.0, part 2

I have finally completed upgrading all of the assorted toy Rust programs, my ports of Jeff Knupp's Creating and Optimizing a Letterpress Cheating Program in Python, to 0.11. I also re-executed the notoriously crappy benchmarks.

These programs look for all of the words that can be made from a given set of letters, based on the system dictionary. The argument was "asdwtribnowplfglewhqagnbe", which produces 7440 results from my dictionary with a possible 33,554,406 combinations made from those letters.
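For readers who have not seen the earlier posts, here is a minimal Python sketch of the general idea, not the code of any of the benchmarked programs. It assumes the usual anagram trick of keying each dictionary word by its sorted letters; the function names (`load_anagram_index`, `find_words`, `count_subsets`) and the two-letter minimum are my assumptions. The minimum is at least consistent with the figure above: with 25 letters, the number of subsets of two or more letters is 2^25 - 25 - 1 = 33,554,406.

```python
from itertools import combinations


def load_anagram_index(words):
    """Map each sorted-letter key to the set of words spelled with those letters."""
    index = {}
    for word in words:
        word = word.lower()
        key = "".join(sorted(word))
        index.setdefault(key, set()).add(word)
    return index


def find_words(letters, index, min_len=2):
    """Return every indexed word that can be built from a subset of `letters`."""
    pool = sorted(letters.lower())
    found = set()
    for size in range(min_len, len(pool) + 1):
        # combinations() of a sorted pool yields tuples that are already
        # sorted, so each one can be used directly as a lookup key.
        for combo in combinations(pool, size):
            found |= index.get("".join(combo), set())
    return found


def count_subsets(n):
    """Subsets of n letter positions with at least two members (assumed minimum)."""
    return 2 ** n - n - 1
```

Called as `find_words("tab", load_anagram_index(["bat", "tab", "at", "cat"]))`, this returns the three words that fit in the letters given. On the 25-letter argument above, a pure-Python loop like this one visits every one of the 33 million combinations, which is why the choice of lookup structure matters so much in the timings.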

Duration in seconds, by the Rust release current at the time of each run:

Language  Program                         Rust 0.6  Rust 0.7  Rust 0.8  Rust 0.9 Rust 0.11
Python    alternatives/presser_one.py         49.1      48.6      47.8      39.0      59.4
Nimrod    alternatives/nimrod_anagrams           -         -         -      12.3      18.0
Python    alternatives/presser_two.py         12.8      12.6      12.3      11.6      17.2
Rust      anagrams-hashmap-wide                9.3      15.4      12.1      19.6      15.7
Rust      anagrams-vectors-wide               11.8      13.1      12.2      16.8      12.4
Rust      anagrams-vectors                     8.0       8.2      11.9       8.1      11.0
Rust      anagrams-hashmap                     6.0      35.5       7.2       7.0       9.3
C         alternatives/anagrams-vectors        8.0       5.8       5.8       6.0       9.6
Python    alternatives/presser_three.py        6.0       6.3       6.0       5.8       8.3
Rust      anagrams-vectors-tasks              27.1      13.8       4.2       4.6       7.7
Rust      anagrams-djbhash-tasks                 -         -         -       6.2       5.5
Rust      anagrams-hashmap-mmap                4.8      10.6       7.3       6.3       2.9
Rust      anagrams-djbhashmap                    -         -         -       2.8       2.5
C         alternatives/anagrams-hash           0.9       1.0       1.0       0.9       1.4

The programming languages and versions for this run are:

  • Python: Python 2.7.6, with Python 2.7.3 and 2.7.5 for previous versions.
  • C: gcc 4.8.2, with 4.6.3 and 4.8.1 for the prior runs, all with -O3.
  • Nimrod: Nimrod 0.9.4 this time, 0.9.2 last, compiled with -d:release.
  • Rust: Rust 0.11.0, compiled with -O.

The various versions of the programs take slightly different approaches. Those with hashmap use a hashtable to store the anagram dictionary while those with vectors use a sorted array and binary search to look up anagrams. Those with djbhash use an alternative hashtable implementation, based on the DJB hash algorithm and Python's dictionary implementation. The mmap version, as well as both of the C versions, load the dictionary via mmap rather than reading it in. All of the programs are single threaded, except for the wide and tasks versions. The wide versions split the dictionary into segments and have each thread search all of the possible combinations in its reduced dictionary. The tasks versions give each task a copy of the full dictionary, and the master process distributes the combinations to the tasks round-robin. The parameters of each were tuned a while back and have not been adjusted.
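As a rough illustration of those differences, here is a small Python sketch. It is not the Rust code being benchmarked, and all of the function names are invented for the example. The hash shown is the commonly cited djb2 form (start at 5381, multiply by 33, add each byte); which variant the djbhash programs actually use is an assumption on my part.

```python
from bisect import bisect_left


def key_of(word):
    """Anagram key: the word's letters in sorted order."""
    return "".join(sorted(word))


def build_hashmap(words):
    """'hashmap' flavor: key -> list of words, looked up by hashing."""
    index = {}
    for word in words:
        index.setdefault(key_of(word), []).append(word)
    return index


def build_sorted_vector(words):
    """'vectors' flavor: (key, word) pairs kept in sorted order."""
    return sorted((key_of(word), word) for word in words)


def lookup_vector(vector, key):
    """Binary search for the first entry with `key`, then scan its run."""
    i = bisect_left(vector, (key, ""))
    matches = []
    while i < len(vector) and vector[i][0] == key:
        matches.append(vector[i][1])
        i += 1
    return matches


def djb_hash(key):
    """djb2 string hash, truncated to 32 bits (assumed variant)."""
    h = 5381
    for byte in key.encode("utf-8"):
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h


def split_wide(vector, n_workers):
    """'wide' flavor: each worker gets a contiguous slice of the dictionary
    and tries every combination against it. May return fewer slices than
    workers when the dictionary is very small."""
    chunk = max(1, -(-len(vector) // n_workers))  # ceiling division
    return [vector[i:i + chunk] for i in range(0, len(vector), chunk)]


def split_tasks(combos, n_workers):
    """'tasks' flavor: every worker has the full dictionary and the
    combinations are dealt out round-robin."""
    queues = [[] for _ in range(n_workers)]
    for i, combo in enumerate(combos):
        queues[i % n_workers].append(combo)
    return queues
```

The trade-off between the last two is where the work is duplicated: the wide split makes every worker generate every combination but search a smaller dictionary, while the tasks split generates each combination once and searches the whole dictionary for it.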

Friday, August 8, 2014

Type-safe C?

I'm rather proud of an answer to Robert Harper's discussion of C typing that I wrote as a comment to a post, Six Points about Type Safety.

The post includes the footnote:

Dr. Robert Harper describes such a type safe analysis of C in a comment here

and while I largely agree with the six points, I disagree with Harper (and don't feel the need to carry the disagreement to wherever Harper posted his comment).

He says, in part, "For example, C is perfectly type safe. It’s semantics is a mapping from 2^64 64-bit words to 2^64 64-bit words. It should be perfectly possible to call rnd(), cast the result as a word pointer, write to it, and read it back to get the same value. Unix never implemented the C dynamics properly, so we get absurdities like 'Bus Error' that literally have no meaning whatsoever in terms of C code."

I don't believe this to be the case for two reasons, philosophical and definitional.

In the first place, if Unix, etc., "never implemented the C dynamics properly", we are very definitely into the discussion of the status of the concept of "unicorns", given that no actual unicorn exists. As a result, I feel perfectly free to assert anything without any real worry about contradiction---what is he going to do, declare me a heretic? Further, philosophically---my previously existing, unreasoned prejudices---I find his stance silly.

In the second, he is wrong about the definition of C. Its semantics, for any reasonable definition of the "semantics of C", are not a mapping from any number of any sized words to other words. The C standard, which is not formal but which is C in a real sense, pretty clearly says that the operation he describes is either undefined or implementation defined or otherwise similarly verboten. In which case, Unix does implement C dynamics properly. He and many others may not include SIGSEGV in their mental model of C's semantics, but that does not mean that either he or they are right.

Those are both significant problems, although the first is the worse. In what sense can one talk about the semantics of a language if no implementation of the language follows those semantics (even assuming those are the semantics of the definition of the language)? I do know that such is not a useful thing to do.

[Update] ...and the post on Hacker News for the Six Points article has been declared dead. Boy, do I love HN.

Wednesday, July 23, 2014

Quote o' the day: Ouchie!

From a discussion of the Hemingway app on Hacker News, jaxn writes,

Could someone create a gmail plugin to filter my outgoing emails through this. Maybe it would help to make my emails clearer and more persuasive.

To which fl0wenol replies,

No, because instead of discerning if you're a dullard by the quality of your writing at a glance, I'll instead have to stumble through mechanically de-voiced inanity. At least without Hemmingway your stream-of-consciousness as committed to typed form can be examined and admired as a unique reflection of your particular damage.

And suddenly I've gone from frothing at Hemingway, "utilize", and "gift" (as a verb, for no particular reason) to looking for some pain-relieving ointment.

And it's nice to know that I'm not the only one who wants to apply an extra "m" to Papa H.