Curious. Are there examples of programming languages that allow spaces in identifiers? Obviously, it would need to be designed for that.
I'm against the case-sensitive nature of some programming languages, and file systems for that matter. While it makes sense in a computing context (it's faster), you end up inventing a new mode of writing just for the computer.
(Fortran was probably case-insensitive because upper-case letters were used first, and when lower-case letters were introduced, care was taken to neatly map them into the character set bit-wise, so one could simply clear the sixth binary digit to turn a lower-case letter into its upper-case equivalent, then continue the string comparison.)
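To make the bit trick concrete, here is a minimal sketch assuming plain ASCII, where the upper- and lower-case letters differ only in the bit with value 0x20 (the sixth bit counting from one). The function names are made up for illustration.

```python
def ascii_upper(ch: str) -> str:
    """Upper-case one ASCII letter by clearing bit 0x20; leave anything else alone."""
    if "a" <= ch <= "z":
        return chr(ord(ch) & ~0x20)
    return ch


def same_identifier(a: str, b: str) -> bool:
    """Case-insensitive comparison for ASCII-only identifiers."""
    return len(a) == len(b) and all(
        ascii_upper(x) == ascii_upper(y) for x, y in zip(a, b)
    )


print(same_identifier("MidiPort", "MIDIPORT"))  # True
print(same_identifier("MidiPort", "MidiPart"))  # False
```

The range check matters: clearing that bit on a digit such as "1" would turn it into a control character, so only letters are touched.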
Of course, with Unicode, you need a more complicated check, or you could just ignore Unicode in the language itself for identifiers, yet allow it in literals, data types, etc.
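As a rough illustration of the "more complicated check", here is a sketch using Python's standard library. Normalizing with NFKC and then calling `str.casefold()` is one plausible approach, not necessarily what any particular language specifies for its identifiers; `fold_identifier` is a hypothetical helper name.

```python
import unicodedata


def fold_identifier(name: str) -> str:
    """Normalize, then case-fold, so equivalent spellings compare equal."""
    return unicodedata.normalize("NFKC", name).casefold()


# Case folding maps the German sharp s to "ss", so these two match.
print(fold_identifier("Straße") == fold_identifier("STRASSE"))  # True

# A simple upper/lower round trip does not get you back where you started.
print("Straße".upper().lower() == "straße")  # False
```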
> Curious. Are there examples of programming languages that allow spaces in identifiers? Obviously, it would need to be designed for that.
At least Tcl allows spaces in identifiers, but to 'use' such identifiers one does have to add a bit of extra 'sugar' to prevent the code parser from interpreting the spaces as token separators:
Example of spaces in a variable name:
    $ rlwrap tclsh
    % set "var name with spaces" "contents of the variable"
    contents of the variable
    % puts ${var name with spaces}
    contents of the variable
    % set "var name with spaces"
    contents of the variable
Example of spaces in a procedure (function) name (the first line defines the procedure):
    % proc {my space proc} {string} {puts "'my space proc' called with string='$string'"}
    % {my space proc} "hello how are you"
    'my space proc' called with string='hello how are you'
    % "my space proc" "the quick brown fox"
    'my space proc' called with string='the quick brown fox'
    % set pn "my space proc"
    my space proc
    % $pn "this that and the other"
    'my space proc' called with string='this that and the other'
Algol 68 allows spaces in identifiers. That means one has to use one of the "stropping" techniques to distinguish keywords from identifiers: case stropping (IF p THEN foo ELSE bar FI), quote stropping à la the old IBM Algol F Algol 60 compiler ('if' p 'then' foo 'else' bar 'fi'), and at least one other I don't remember offhand.
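To show why stropping makes spaces in identifiers workable, here is a toy tokenizer sketch in Python. It is not real Algol 68 lexing: the keyword set is limited to the four words in the example above, identifiers are assumed to be runs of lower-case words, and the words of an identifier are simply joined with single spaces.

```python
import re

# Assumption: only the keywords from the example; a real language has many more.
KEYWORDS = {"IF", "THEN", "ELSE", "FI"}


def tokenize(source: str) -> list[tuple[str, str]]:
    """Split case-stropped source into keyword and identifier tokens."""
    tokens = []
    pending = []  # lower-case words collected for the current identifier

    def flush():
        if pending:
            tokens.append(("ident", " ".join(pending)))
            pending.clear()

    for word in source.split():
        if word in KEYWORDS:
            flush()  # an upper-case keyword ends any identifier in progress
            tokens.append(("keyword", word))
        elif re.fullmatch(r"[a-z][a-z0-9]*", word):
            pending.append(word)
        else:
            raise ValueError(f"unexpected token: {word!r}")
    flush()
    return tokens


print(tokenize("IF input ready THEN read next line ELSE wait FI"))
```

Because keywords are recognizable by case alone, "read next line" can be treated as a single name without any quoting.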
Case-sensitivity makes sense because it enforces uniformity of code. Case insensitivity only matters when you want to write identifiers differently at different locations. The only place where I see this could be useful is when you use a library that uses a different convention.
IMO the better way to solve this is to set a convention for your programming language and enforce it with the compiler (at least with warnings).
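A minimal sketch of what such a warning pass could look like, in Python. The two conventions here (snake_case variables, CamelCase types with acronyms written as ordinary words) and the sample names are assumptions picked for illustration, not rules of any real compiler.

```python
import re

# Hypothetical project conventions, keyed by the kind of thing being named.
CONVENTIONS = {
    "variable": re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*"),  # snake_case
    "type": re.compile(r"([A-Z][a-z0-9]+)+"),  # CamelCase, acronyms as words
}


def check_name(kind: str, name: str):
    """Return a warning string if the name breaks the convention, else None."""
    if CONVENTIONS[kind].fullmatch(name):
        return None
    return f"warning: {kind} name {name!r} does not follow the project convention"


for kind, name in [("type", "MidiPort"), ("type", "MIDIPort"), ("variable", "midiPort")]:
    print(kind, name, "->", check_name(kind, name))
```

Only the first of the three sample names passes; the other two produce warnings rather than errors, so the code would still compile.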
As you say, it's about conventions, but with a case-Sensitive system you are more likely to HAVE TO enforce naming conventions, because there's a distinction now. Otherwise you'd write "MidiPort"/"MIDIPort"/"midiPort" or what have you.
Keep in mind we wouldn't need to care about enforcing case in a style guide if case didn't matter, because there would be no distinctions, and you'd be more inclined to write it the natural way; no camelCase needed to settle whether "MIDI Port" should be written midiPort, MidiPort, MIDIport, etc.
Case-sensitivity only creates unnecessary dissonance, and leads to clever uses of that system, adding even more choice; and as we know from The Matrix, the problem is choice. ;)
So if we keep it closer to how we would normally read and write words, I think there would be less dissonance about that aspect of programming, or naming files for that matter.
I approve of case-sensitivity where an initial Capital letter indicates that Something is Publicly accessible and it really doesn't matter what happens after that. Hence, we could have:
In contrast to this, I feel it makes sense to have lowercase mean that something is private. Hence, no camelCase:
    x y z variable longer-variable-name
I've never been that comfortable about appending numerals to the end of identifiers to disambiguate them as I feel that this is a sign that they ideally ought to be subscripted and implemented as arrays. I much prefer hyphens to underscores but would ideally like to use individual words separated by spaces. This can only work if you have an IDE that hides all the underscores (which are incredibly ugly and serve no useful purpose in printed material these days) as you input them and outputs NBSPs instead and then uses similarly suppressed prefix sigils to style your raw input text into an output which conforms to traditional Mathematical notation. Hence, we could have:
    /foo_bar + /bar_qux
become:
    foo bar + bar qux
Similarly, the following is not a problem if you take advantage of the syntax rule that requires at least one space either side of an operator. Hence, we could have:
    /foo_bar / /bar-qux
become:
    foo bar / bar-qux
i.e. the / sign isn't echoed when you initially type it as it is expecting a letter, but when the IDE receives whitespace it belatedly echoes it as the operator symbol as it is now sure that it isn't a suppressed sigil.
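Here is a rough Python sketch of that display rule, applied to a whole line at once rather than keystroke by keystroke as an IDE would. It assumes tokens are separated by single spaces, that "/" followed by a letter is the suppressed sigil, and that a "/" standing alone is the division operator; `render` is a made-up name.

```python
NBSP = "\u00a0"  # non-breaking space shown in place of each underscore


def render(raw: str) -> str:
    """Turn raw sigil-prefixed input into the text the IDE would display."""
    shown = []
    for token in raw.split(" "):
        if len(token) > 1 and token[0] == "/" and token[1].isalpha():
            # Sigil-prefixed identifier: hide the sigil, show underscores as NBSPs.
            shown.append(token[1:].replace("_", NBSP))
        else:
            # Operators (including a lone "/") pass through unchanged.
            shown.append(token)
    return " ".join(shown)


print(render("/foo_bar + /bar_qux"))
print(render("/foo_bar / /bar-qux"))
```

The spaces inside the displayed identifiers are NBSPs, so they look like ordinary spaces but can still be told apart from the spaces around operators.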
> Case-sensitivity makes sense because it enforces uniformity of code.
I don't see that case-sensitivity helps to achieve uniformity of code that much. Factors like code structure, common design patterns, and source code formatting are more important. The approach to the structure and design of an application or library is something that each individual development group decides for themselves. Source code formatting can (and should) be enforced by formatting tools.
Having used a case-insensitive language for a while (Object Pascal), I find that developers tend to follow the case convention of a given software project anyway, and if they don't, the case typos aren't an issue. They don't make the code harder to understand, and it all compiles.
Reading Erlang is really nice partially because all variables are capitalized (enforced by the compiler). You know immediately which parts of the code are what.
It actually drives me a little nuts that I can't do the same thing in Elixir (compiler enforced lowercase) because so much of the code looks the same.
Similarly, R (which is basically a Lisp with C-like syntax) allows spaces in identifiers, though you'll have to construct usages of such identifiers using quote() or backticks.
Yes, I alluded to that by way of the sixth bit, I just had no reference to whether it was something they chose to do, or had no other choice at the time.
VHDL allows arbitrary text in its extended identifiers. The feature was added for easier interop with other tools that have less restrictive rules than VHDL's normal identifiers.