On Complete Gibberish

Posted on August 15, 2017 by Tommy M. McGuire

Or, Programming language syntax that I don’t like

We interrupt our regularly-scheduled nonsense (Wait! Did someone make up a schedule? Nobody told me! – mcguire) for a brief screed on programming language syntax.

Syntax vs. semantics. Image stolen from Gary Larson, The Far Side, by the Linguistic Society of America.

By and large, I am pretty flexible regarding programming language syntax. I’m not one of those people who hate Lisp because of all of the silly parentheses or who won’t use Python or Haskell due to “significant whitespace”, and I don’t even have strong feelings about semicolons. Lisp’s fully-parenthesized nature adequately expresses the structure of the code (and eliminates any ambiguity!) and frankly is not all that hard to deal with. If you have trouble matching a parenthesis, you’re probably doing something wrong. In Python, indentation also describes the structure of the code, and does so without the redundancy of braces or what-not.

If you really, truly, don’t like significant whitespace, use Fortran:

There is a useful lesson to be learned from the failure of one of the earliest planetary probes launched by NASA. The cause of the failure was eventually traced to a statement in its control software similar to this:
    DO 15 I = 1.100
when what should have been written was:
    DO 15 I = 1,100
but somehow a dot had replaced the comma. Because Fortran ignores spaces, this was seen by the compiler as:
    DO15I = 1.100
which is a perfectly valid assignment to a variable called DO15I and not at all what was intended.
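
To make the mechanism in that quote concrete, here is a rough Python sketch of a toy statement classifier that, like fixed-form Fortran, throws away every blank before deciding what it is looking at. The classify function and both regular expressions are hypothetical and wildly simplified; no real compiler works this way, and only the blanks-are-ignored rule comes from the quote.

```python
import re

# Hypothetical, much-simplified patterns for one fixed-form Fortran statement.
# A DO loop needs a label, a loop variable, and a comma between the bounds.
DO_LOOP = re.compile(r"^DO(\d+)([A-Z][A-Z0-9]*)=([^,]+),(.+)$")
ASSIGNMENT = re.compile(r"^([A-Z][A-Z0-9]*)=(.+)$")


def classify(statement: str) -> tuple:
    """Remove all blanks, then decide: DO loop, assignment, or unknown."""
    squeezed = statement.replace(" ", "").upper()
    loop = DO_LOOP.match(squeezed)
    if loop:
        label, var, start, stop = loop.groups()
        return ("do-loop", label, var, start, stop)
    assign = ASSIGNMENT.match(squeezed)
    if assign:
        name, value = assign.groups()
        return ("assignment", name, value)
    return ("unknown", squeezed)


print(classify("DO 15 I = 1,100"))  # ('do-loop', '15', 'I', '1', '100')
print(classify("DO 15 I = 1.100"))  # ('assignment', 'DO15I', '1.100')
```

With the comma, the statement can only be a loop. With the dot, the loop pattern has nothing to match on, and once the blanks are gone DO15I is a perfectly good variable name.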

And I just don’t get the hatred for semicolons. Especially when it leads to stark, raving insanity, like JavaScript’s semicolon-insertion rules. Just tell me whether they’re supposed to separate statements or terminate them (and terminating is a little easier to use); other than that I don’t care.

But there are three programming languages whose syntax rubs me the wrong way. (Light your flamethrowers now!)

ATS

The first is a language that, by all rights, I should love. (And I kinda do. I just don’t use it enough.) ATS is a brilliant language. It starts from a loosely ML-like base and adds a strong, very powerful type system that includes linear and dependent types; its forte is garbage-collection-less systems programming. (When I started playing with ATS1 (ATS/Anairiats), its tutorial and standard library had a wonderful staged approach, where the language could be used with a garbage collector and simplified library to appear to be ML, while introducing dependent typing and other features incrementally. I’m not sure if the current ATS2 (ATS/Postiats) follows suit, but even if it doesn’t, ATS is more than worth learning. Plus, it’s got some great names: I just love saying “Anairiats”.)

On the other hand…

ATS’ ML basis becomes something of a liability when combined with its take on dependent and linear types. Lemma functions and so forth manipulate a more-or-less invisible proof state, but must be sequenced to preserve their meaning, along with the functions using the proof state. The results often look something like:

fun ... = let
  val x = ...
  val (prf | y) = ... // Yeah, prf is proofy stuff, y is a value.
  val () = ...
  val z = ...
  val () = ...
in
  ()
end

Couple that with ATS’ function templates (for polymorphism), with parameters looking like {n:nat} and weird constructions like t@ype (for a non-machine-word-sized type) and you have a language that I find infelicitous for its common use and horrendously complex.

I admit, I haven’t used ATS enough to do more than play a bit. It looks fantastic, and seems like it would fit exactly what I want out of a language. But…I just can’t.

Ada

Ada is another language that I ought to like. Fundamentally, it is a safer C, with a bunch of extra, if nice, features. Sure, before C++ became C++, Ada was the definition of a large language, but it was always intended to be a safe, usable systems language. And who doesn’t want that? No one, right?

The one bad part is that Ada is case-insensitive: the statement If Content_Free(Data) is the same as if content_free(data), which is the same as iF ConTenT_fReE(DATa). Case-obliviousness in itself is not a bad thing, although it throws away information that could otherwise be used. But the problem is the conventions that have grown up around Ada capitalization.

 while I < J and then Values(I) <= Values(L) loop
    pragma Loop_Invariant (L <= I and I <= J and J <= U);
    pragma Loop_Invariant (for all K in L .. I => Values(K) <= Values(L));
    I := I + 1;
 end loop;
 pragma assert (for all K in L .. I - 1 => Values(K) <= Values(L));

 while I < J and then Values(L) < Values(J) loop
    pragma Loop_Invariant (L <= I and I <= J and J <= U);
    pragma Loop_Invariant (for all K in J .. U => Values(L) < Values(K));
    J := J - 1;
 end loop;
 pragma assert (for all K in J + 1 .. U => Values(L) < Values(K));

 if I < J then
    pragma assert (Values(I) > Values(L) and Values(L) >= Values(J));
    Swap(Values, I, J, Original, Witness);
    pragma assert (Values(I) <= Values(L) and then Values(L) < Values(J));
 end if;

Apparently, keywords (while, and, and then, loop) are in lowercase. All of the user-defined identifiers are (a) capitalized, (b) in camel-case, and (c) separated by underscores. I feel like I’m typing in German and the language is all, “Sprichst du Deutsch?” and then it’s all “Animals vill be bred und Schlaughtered!” and then every time I hit enter, I’m all, “Yeee-haw!” and riding a bomb out of a bomber’s bay and there are blinding flashes of light and I start worrying about all of our precious bodily fluids. (Grain alcohol and distilled water!)

It’s just too stressful, you see.

(This is all about syntax, I told you. I didn’t even mention that it’s apparently impossible to find documentation for the Ada standard library online. In fact, the only documentation I’ve found is the Texinfo version of the Ada Reference Manual, ISO/IEC 8652:2012(E). And looking for the standard library doccies in there…ew. But I’m not complaining about that here.)

Nim

I did, in fact, play with Nim for a bit. Python-ish syntax, semi-optional semi-real-time garbage collection, yadda, yadda, all sorts of neat features. Then there’s this, from the Nim Manual:

Identifier equality

Two identifiers are considered equal if the following algorithm returns true:
proc sameIdentifier(a, b: string): bool =
  a[0] == b[0] and
      a.replace(re"_|–", "").toLower == b.replace(re"_|–", "").toLower
That means only the first letters are compared in a case sensitive manner. Other letters are compared case insensitively and underscores are ignored.

This rather unorthodox way to do identifier comparisons is called partial case insensitivity and has some advantages over the conventional case sensitivity:

It allows programmers to mostly use their own preferred spelling style, be it humpStyle, snake_style or dash–style and libraries written by different programmers cannot use incompatible conventions. A Nim-aware editor or IDE can show the identifiers as preferred. Another advantage is that it frees the programmer from remembering the exact spelling of an identifier. The exception with respect to the first letter allows common code like var foo: Foo to be parsed unambiguously.

Historically, Nim was a fully style-insensitive language. This meant that it was not case-sensitive and underscores were ignored and there was not even a distinction between foo and Foo.

So. Yeah.

I really wonder if anyone is actually taking advantage of the extra flexibility. I don’t wonder about the readability of the result, outside of a sufficiently advanced Nim-aware editor. Personally, I think I’d rather contaminate my precious bodily fluids.
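
For anyone who would rather poke at the rule than read it, here is a rough Python transliteration of the sameIdentifier algorithm quoted above. The names same_identifier and SEPARATORS are mine, not Nim’s, and the sketch assumes both identifiers are non-empty; it mirrors the quoted snippet, not whatever the Nim compiler actually does internally.

```python
import re

# The separators the quoted rule strips: underscore and en-dash (U+2013).
SEPARATORS = re.compile("[_\u2013]")


def same_identifier(a: str, b: str) -> bool:
    """First character must match exactly; the rest is compared with
    separators removed and case ignored. Assumes non-empty names."""
    def normalize(name: str) -> str:
        return SEPARATORS.sub("", name).lower()

    return a[0] == b[0] and normalize(a) == normalize(b)


print(same_identifier("fooBar", "foo_bar"))      # True
print(same_identifier("fooBar", "f_O_o_B_A_R"))  # True
print(same_identifier("foo", "Foo"))             # False
```

So fooBar, foo_bar, and f_O_o_B_A_R are all the same name, while foo and Foo are not.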

Conclusion

There you have it: the only three languages whose syntax I don’t care for, and the only reasons why. Excess flexibility, unfamiliar community conventions, and, most importantly, unsuitability are the only issues I have.

Well, ok. Except for COBOL.

           ADD 1 TO X GIVING Y.
      *  Alcoholic yet?

Where was my grain alcohol?

Aug 16 2017: Added link to source of Fortran quote.