
The C section is so wrong it's unbelievable, which is why I created the Pastebin below. There were plenty of good languages before C, not just asm. C wasn't designed so much as tweaked from BCPL and B to work on the PDP-11. C wasn't even meant to be portable.

http://pastebin.com/UAQaWuWG

It helps to understand what C actually is; then the rest (like ditching it) becomes more obvious. P-code was probably the first portable assembly language with good results. Or LISP from the functional side.



Yeah, here is some additional info:

This is the typical C fanboyism that sells C as the AD of systems programming.

Here, for those who don't want to go through the article, are some well-known systems programming languages that existed before C had any relevance outside AT&T.

- Algol on B5000 (1961)

- PL/I on System/360 (1964)

- PL/M on CP/M and AS/400 firmware (1972)

- Mesa on Xerox PARC Alto (1970)

- XPL on System/360 (1971)

- HAL/S on AP-101 (1970)

- JOVIAL on 465L (1970)

- CPL on Titan and Atlas (1963)

- BCPL on IBM 7094 (1970)

But let's pretend C (1972) was the first one.

Interestingly enough, just like they did with Go, C dropped most of the features used by the other systems programming languages of the day.


Appreciate the list. I'll incorporate this into my links somehow in the future. You should leave off CPL in the future, though, as it was designed but never implemented. The inability of EDSAC to compile it was the very reason BCPL and later C existed.


Most of those languages are as close to dead and buried as can be, except maybe on an academic's or aging hobbyist's bookshelf, or perhaps running on a ghost machine in a forgotten room at a large corporation or government office. The majority of people who programmed in those languages in their heydays have long since retired.

Sure, there were languages before C that did Important Things, but C works as a starting point of discussion because most programmers today have at least heard of it, even if they never use it.

Aside from NASA or JPL, who would even start a new project in any of those languages, because of one important paradigm?


Read my link... just the numbered list if you're in a hurry, because it takes only a minute or two. Any discussion of C vs other languages should be able to explain why C is the way it is and why a replacement should be different. If C wins, it should be because it's the best design for systems programming. Any information about C should also be accurate.

As pjmlp said, the article rewrites history by falsely claiming C was the first HLL and that it was designed for portability. Neither claim is true. Additionally, competing languages (even on the PDP-11!) had better maintainability, safety/security, and ability to code in the large. Many were portable, with a few performing as well. BCPL was whatever could compile on a 1960's computer: nothing more. C was BCPL w/ structs and byte-orientation.

So, should we start with one of the better languages that resemble good languages of today to define attributes for systems programming? Or should we start with a semi-language that had no design, tons of problems, and all so it could compile on a 1960's machine? I think the former is the obvious choice. That those supporting the latter lie about C's origin, design, competition at the time, and so on to push it instead is... more disturbing.

Hence, me countering it readily.

"Aside from NASA or JPL, who would even start a new project in any of those languages, because of one important paradigm?"

There were plenty of languages from that time still available. The best for modern 3GL users were Wirth's line: industrial Pascals, Modula-2, Modula-3 (especially), and the Oberons. All were used to write OS's on minimal hardware with more readability and safety than C. Ada was rougher but safer and did the job too. LISP- and ML-derived languages only got better and better over time, with Racket & OCaml being the best today. I've even seen OCaml used with an 8-bit runtime (!). There were also macro-assemblers that focused on semi-HLL's for ultra-efficiency and optimization. LLVM comes to mind. ;)

This stuff didn't fade into obscurity, nor was it only for obscure platforms. Good versions stayed around in industrial form for decades with little adoption while people tried building on something that wasn't designed for all its uses. It was designed just to compile on an EDSAC in the 1960's.


You need to re-read the OP's linked article.

It's not an academic paper or even anything like.

It's. Notes. From. A. Presentation.

It's not the entire history of compiler languages and the lineages and branchings thereof which make Game of Thrones look like Jane and Dick.

If you're going to spend four sentences speaking about a core language, that is still in active use, that the audience will have heard of then C is a far better goto language than any of the ones listed above.

> All were used to write OS's on minimal hardware with more readability and safety than C.

Awesome. That decade abended years ago. They are no longer germane to a discussion at a conference in 2015, unless you're deep diving in language lineage.

Which the author clearly wasn't.

Edit: my bad. It was seven sentences.


That would all be fine and dandy if the presenter hadn't lied to the audience:

"Before C there was assembly language...."

That is clearly misinformation given to the audience.

Just because it wasn't a scientific paper doesn't make it less wrong.

The presenter has clearly led everyone in the audience who doesn't know better to believe that C was the first high-level systems programming language.

Had he said that C was the systems programming language that won the hearts of systems programmers, we wouldn't be discussing this.

This misinformation has been spreading since the mid-'90s.


""Before C there was assembly language" is technically correct, unless you know something I'm unaware of.

That there were other languages between those two isn't important in the context of creating an example in one presentation, and it is for sure not factually wrong.

Just so we're absolutely crystal-clear here, how, exactly, would you have phrased the paragraph about C, without wandering off into the endless maze of computer language lineages, but keeping the same general intent, in the context of the remainder of the presentation?


It is factually wrong because there were options for programming without assembly before C existed.

One such example is the Burroughs B5000 system with Algol in 1961. This is just one example of many, available to anyone who cares to do a little research.

I would have phrased it like this:

"Before C there were other systems programming languages, but for various reasons they lost to C and now it is widespread through the industry as we know it."

Followed by his note on how all computers in the room have their OSes coded in C.


Okay, but would phrasing it that way add ANYTHING to the point he's trying to make? Would it help explain his point? (And it's not factually wrong, you're just deliberately misreading it so you can flog the dead horse about the ancient lost civilizations of programming languages before C.)


It wouldn't fool his audience into thinking that C was the very first systems programming language to exist.

However, telling the truth is worthless, a waste of time, and doesn't contribute to the learning of younger audiences.

Please excuse me, I have some books to burn.


"C certainly wasn’t the first high level language, but it was one of the first to take portability seriously"

That's totally wrong. The C philosophy and structure were almost 100% due to BCPL with minor changes. BCPL was portable by accident due to its simplicity. They just got rid of every good feature of a systems language to the point that the remainder, barely a language, could be put on just about anything.

Thompson tweaked BCPL into B to make it work on his terrible PDP-7. That wouldn't work on a PDP-11, so Ritchie tweaked it further into C. That couldn't even handle the UNIX alpha, so he added structs. That's C 1.0. A year later, people started porting it because it was barely a language & could fit in their hardware. All that is in my link about its history, drawn from documents and the people who made it.

http://pastebin.com/UAQaWuWG

"Before C there was assembly language"

Also wrong, and contradicting his own claim about HLL's before C. As pjmlp listed, there were numerous systems languages in production before C was designed. It's not like we made a huge leap of faith into structured HLL's with C. We actually cut out the maintenance, reliability, readability, integration, etc. capabilities of existing languages (especially CPL) to iterate into C. And, later on with good hardware, people started adding many of those back because C was garbage that led to all kinds of problems.

We can certainly accept a need to learn the C language and tooling for working with existing OS or app code. That's called the legacy system effect. However, we should never push false justifications for it that detract from superior language design that would've enabled more robust, maintainable, productive development. We can accept both the what and the why of C at the same time.

As a side effect, more people might explore features of C alternatives to build the next, best systems language. Wirth has been doing that for years. Ada still gets updated, with one version able to prove the absence of many runtime errors at compile time. Clojure is showing LISP's advantages, and LISP was once used for OS's w/ incredible flexibility + robustness. So on and so forth. Gotta counter the C disinformation and misinformation so enough people know why they should do the opposite, so some [more] eventually will.

Note: There's also the value of removing inaccuracies in historical writings that have negative effects. Mere editing and revision. I hear that's important to some people, too.


I think it's funny you disregard historical accuracy in a presentation about the legacy of a language where the author looks "at historical examples to set the stage" with an image of "the History of Programming Languages." You're really grasping at straws trying to argue actual truth or historical evidence aren't relevant in such a post. Worse, any reader coming across such things would assume the author did some research on it.

Then, from that point, almost everything the author says about C's history, purpose, and effect was wrong. The only correct thing is that everything in the room at the time was [probably] written in C. Which is actually meaningless given it's a historical accident as far as the language itself goes.

So C's legacy is that most mainstream OS's are written in C. The reason? Thompson and Ritchie used C for UNIX, plus shared it with lots of people with poor hardware. That's it. It wasn't designed, it wasn't intentionally portable, it wasn't better than anything then or now... nothing about C itself justified it sticking around. Purveying such myths makes people think it's technically superior and that we should keep investing in it and similar languages for system use. Very damaging myth. So, I counter it everywhere I see it.

"If you're going to spend four sentences speaking about a core language, that is still in active use, that the audience will have heard of then C is a far better goto language than any of the ones listed above."

I can bring up C without lying about its history, portability, and so on. The author was probably misled by others rather than doing it intentionally, though. Hence the need for these comments.

"That decade abended years ago. They are no longer germane to a discussion at a conference in 2015, unless you're deep diving in language lineage."

Many are still around, in commercial use, and some are still updated. Still better than C at robust systems programming. I agree the author doesn't really need to go into them, nor did I ask for that in my original comment. Just that any statements in a post about history and legacy be accurate. This is why I didn't gripe about "C++ also codified the ideas of zero-cost abstractions," which existed before C++. Yet C++ brought them to widespread attention. Probably the author's intent, and so no gripe from me.

Didn't you consider the fact that I left the rest of the write-up alone in my comment? That there might be a reason I had laser-like focus on the inaccuracies in the C entry rather than attacking the presentation as a whole? I get what the article is about, and that's exactly why it should maintain historical accuracy more than most.


No, that is called rewriting history.

Maybe we should totally ignore what the Greeks, Romans and Egyptians among other civilizations brought to mankind, because we weren't born in those days.


That says something about human nature: do we care about the genes or the face that carries them?


Thanks for the illuminating link.

Nevertheless, I unfortunately do not see an easy solution in the near future that avoids C/C++ much as I would love to ditch them. Let us leave aside the question of legacy code and focus on future systems projects.

General issue: C and C++ have an advantage in the sense that we understand their many problems well due to their long history, and they have formal specs that are far more carefully written than most others, where documentation is spotty and formal specs are often lacking.

Example: I have not studied Rust closely, but it seems to me that there are sections that essentially must be placed under "unsafe" for performance or low level access. The oft-repeated claim is that it minimizes the surface, since it localizes the unsafe sections. I have not seen any hard, formal guarantees in either the spec or implementation as to what is precisely meant by this. I am very interested to know how I can reason that the unsafe section has zero side effects outside of it. Such a statement in general must be false, since the unsafe block clearly communicates with the rest of the program, else it is dead code. And if the interface is sandboxed (an imperfect solution), then it implies some kind of performance penalty across the barrier, implying additional cognitive load for a developer since he/she has to reason about the possible performance impacts for such things, among others like array bounds checks.

Lastly, if the guarantee regarding the extent of the "unsafe" block is as complicated as I suspect it to be, the load is increased further. Of course, the load comment is really relevant to Rust vs C and not Rust vs C++.

Specifics:

1. Go - garbage collector, large runtime/binary sizes, portability issues (AIX: https://groups.google.com/forum/#!topic/golang-nuts/IBi9wqn_...).

2. Rust - too new, general points above.

3. D - lacks the same energy as the above projects, so likely fares worse than the above. Furthermore, due to its closer ties to C++, its rather incremental nature resulted in a significant loss when C++11/C++14 came around.

4. Others - too many to list; focused on the above due to their frequency on Hacker News.


Focusing on Rust "unsafe" as some sort of way to ding its safety drives me crazy.

"unsafe" blocks in Rust is designed to be just like the compiler backend. We could have built things like Vec::new and vector indexing into the code generator, like Go or Java did. Then we wouldn't need the "unsafe" construct in the language (except to interface with kernel32.dll and/or syscalls). But that would make our lives harder for absolutely no extra safety gain. It's precisely the same thing as implementing things in the compiler backend, but it's safer than doing that, because we get to write code in a stronger type system than that of raw LLVM.

More succinctly: What makes the compiler backend safe while unsafe blocks are not? Can you name one thing that makes the compiler backend safer than code written in unsafe blocks?


I did not "ding" it, but asked for an honest, transparent response if available. It can only help the Rust community by making such things clear. I focused on it as it is what I got probing Rust since after all there can be "no free lunch".

The compiler backend may be viewed as a smaller component that needs to be trusted. General code lies on top of it, and there is a big difference due to code volume.


The quantity of unsafe code in the standard library represents far, far less code than that of the compiler. Furthermore, unsafe blocks don't turn off the type system or anything; they only let you do four extra specific actions (possibly only 3 someday). The correctness of the compiler can be leveraged to assist in determining the correctness of the standard library, and living in a typical library structure makes the code easier to understand and audit than if it were entangled with compiler logic.
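For reference, a rough sketch of the extra operations an `unsafe` block unlocks; everything else (type checking, borrow checking, etc.) still applies inside it:

  static mut COUNTER: u32 = 0;

  unsafe fn danger() {}

  fn main() {
      let x = 5;
      let p = &x as *const i32; // making a raw pointer is safe...
      unsafe {
          let _v = *p;          // ...dereferencing one is not
          danger();             // calling an unsafe function
          COUNTER += 1;         // touching a mutable static
      }
      // Implementing an unsafe trait is the remaining case.
  }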


I don't understand what "there is a big difference due to code volume" means.


I was referring to all usage of Rust by clients across the world versus a single compiler and reference standard library. Many of these clients will need "unsafe" blocks, and the combined length of these "unsafe" blocks will exceed that of the standard library and compiler assuming Rust adoption is high.

This is what concerns me: I wished Rust's "unsafe" blocks could have been exclusively confined to some things in the reference language compiler and standard library. Unfortunately, it seems like many reasonable systems applications still need access to unsafe blocks for reasons of performance, low level access, etc., and such needs are not uncommon given the target systems audience.


  > I wished Rust's "unsafe" blocks could have been 
  > exclusively confined to some things in the reference 
  > language compiler and standard library.
This sadly isn't possible in a language at the systems level. If you don't provide an escape hatch then users will just use the C FFI as an escape hatch, which results in far more potential unsafety than Rust's `unsafe` blocks.
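To illustrate (a hedged sketch, not any particular project's code): calling into C via the FFI has to go through `unsafe` anyway, so removing the escape hatch would only push the same unsafety somewhere less visible.

  use std::os::raw::{c_char, c_int};

  extern "C" {
      // The Rust compiler cannot check what the C side does...
      fn puts(s: *const c_char) -> c_int;
  }

  fn main() {
      let msg = b"hello from C\0";
      unsafe {
          // ...so every call through the FFI must sit in an unsafe block.
          puts(msg.as_ptr() as *const c_char);
      }
  }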


> I can reason that the unsafe section has zero side effects outside of it

There is basically a core set of invariants that `unsafe` code must hold, and you can reason about it in those terms.

http://doc.rust-lang.org/nightly/nomicon/ talks a lot about this.


"General issue: C and C++ have an advantage in the sense that we understand their many problems well due to their long history, and they have formal specs that are far more carefully written than most others, where documentation is spotty and formal specs are often lacking. "

See, even that is misleading. The C spec has a ridiculous amount of undefined behavior and dark corners that can mess up developers. C++ probably does too, more than I know. Their specs are actually so hard to formalize that people got Master's degrees, etc. for pulling off part of that after decades of people trying. Whereas many LISP's, ML's, and Wirth languages (e.g. the Oberon/Modula lines) were described quite succinctly, often including code. The C & C++ specs are so horrible you can't even be sure what your code will do even if you write it to spec!?

"Example: I have not studied Rust closely, but it seems to me that there are sections that essentially must be placed under "unsafe" for performance or low level access."

Others are addressing the point about Rust, which I'm ignorant about. What I can say is the generic rule on this issue. Most safe-by-default languages wrap unsafe behavior behind function calls which still have type/interface checks. If that code behaves correctly, you get to leverage the type system of the safer language to make sure it works with the rest correctly. If it behaves incorrectly, one of two things happens: local damage where an incorrect result is obtained, or an application crash or hack. You wouldn't be using the unsafe code unless you thought it was necessary. So, doing safe + a little unsafe adds no risk vs going all unsafe in C, etc. Yet it counters many risks. So it's a good tradeoff even if unsafe code can have unpredictable effects on the rest.
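In Rust terms (borrowing from the other replies here), a minimal sketch of that generic rule with hypothetical names, nothing from a real library: the unsafe detail is buried in one small, audited function, and the rest of the program only ever touches a typed, safe interface.

  pub struct Buffer {
      data: Vec<u8>,
  }

  impl Buffer {
      pub fn new(len: usize) -> Buffer {
          Buffer { data: vec![0; len] }
      }

      // Safe, typed interface: callers can't hand in a bad pointer or length.
      pub fn fill(&mut self, byte: u8) {
          let ptr = self.data.as_mut_ptr();
          let len = self.data.len();
          unsafe {
              // The unsafe part is confined here; if it's wrong, you debug
              // this one function instead of every caller.
              std::ptr::write_bytes(ptr, byte, len);
          }
      }
  }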

Note that one can attempt to further mitigate the risk without sandboxing by writing that unsafe code in a way amenable to static analysis, or through testing of what it does. One can use something like Frama-C or SPARK Ada to model that one part to make it as bulletproof as possible. Only then include that algorithm in the otherwise safe program. And again wrap it with interface safety to counter even more risk.

C and C++ will always have the advantage of sheer momentum. However, there are always alternatives available that lack the problems above. They don't have to have GC, portability, and size issues, as Ada and Wirth's languages always showed. I mean, Wirth's Pascal/P was safer than C, efficient (although not maximally so), and more portable given that its backend was a simple stack machine anyone could implement. It was ported to 70+ architectures that differed a lot. The ports of the Oberon compiler and OS to each new piece of hardware took 1-2 undergrads under 2 years each time due to good design, modularity, and simplicity. What kind of effort have the UNIX's and C compilers taken to get onto each new architecture? ;)

Anyway, I agree that if you want max contributions by existing programmers, C is likely the best choice for OS development. Or at least for the kernel of something like the JX Operating System or VerveOS, where the rest is done safely. If you want the best results, then avoiding C and using safer alternatives is best. You'll get so much more done w/ higher robustness in the time you save debugging. :)


I agree with the point regarding C++'s spec, but don't fully buy the point about C's spec. C's spec is surprisingly succinct. C code written to spec is not harder to think about than any other language I have worked with. I have had as much trouble getting my Python/MATLAB/Julia code to dance the way I want it to as I have with C. What is different is that if you get it wrong, all hell breaks loose in terms of security issues.

Regarding the wrappers around unsafe behavior, in addition to some of your ideas, a lot can be done even in C; see e.g. netstrings http://cr.yp.to/proto/netstrings.txt and other safer interfaces. It requires thought, but such thought needs to be devoted when designing other languages as well. Unfortunately, the C/C++ standards committees rarely accept such slower, safer extensions, forcing clients down the dark road of third-party libraries and the endless choices available there, some of which are horrible and actually worse than the stdlib. I believe a lot of the problem is that there is heavy disagreement as to what interface is best; see e.g. strlcpy and its adoption. Getting a large committee on board with solving something is a monumental problem, even if all acknowledge that the current situation is terrible :).
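As a quick illustration of why the netstring format is harder to misuse than NUL-terminated strings (sketched in Rust here only for brevity; the format itself is language-agnostic):

  // Encode a byte string as a netstring, e.g. b"hello" -> "5:hello,".
  fn to_netstring(payload: &[u8]) -> Vec<u8> {
      let mut out = format!("{}:", payload.len()).into_bytes();
      out.extend_from_slice(payload);
      out.push(b',');
      out
  }

  fn main() {
      let ns = to_netstring(b"hello world!");
      // A reader consumes the explicit length first, so embedded NULs or a
      // missing terminator can't silently truncate or overrun the data.
      assert_eq!(ns, b"12:hello world!,".to_vec());
  }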

"Best results" is a very loaded term, and I tend to avoid it due to the large number of dimensions to it, the most common being the classic performance and security axes. Nevertheless, we mostly agree on the key points, with some differences in the details.

TL;DR: I have not found something representing a Pareto improvement over C. All improvements trade off some aspects, and it is thus sometimes not clear that there is a better alternative to C. My stronger claim is that the above is true even if one ignores legacy issues.


Very much appreciated Olve Maudal's talk about C's origins. Thanks a lot!


Very welcome and good to see the info getting well-received. Can't get rid of support for this monstrosity (C) until people see exactly what and why it is.


I don't react as strongly (was that the main motivation behind your research?), although C's weaknesses really do allow for too many nasty issues in code we rely on. What astonishes me is how something which seems a crude plagiarism became the mainstream of low-level and fast for so long. Even UNIX ... I sense a strong pragmatism about market, ecosystem, network effects and the like. I left the video thinking Unix/C was the WordPress/PHP of its domain and era....


It was a partial motivation. Many have stars in their eyes, seeing C as the best-designed systems language, to the point that modifications aren't considered for bare-metal. One motivation is countering that with the cold, hard truth about its alleged design and real history. Other people already see its issues but might want to learn more or enjoy the full context. This reinforces the move away from such features where possible, along with providing evidence to deliver in future conversations supporting that. Either benefit is a fine result to me.

"What astonishes me is how something which seems a crude plagiarism became the mainstream of low-level and fast for so long. Even UNIX ..."

Richard Gabriel explained that well in his Worse is Better essays. Hence why you see me mention it a lot. It has much to do with ease of participation, economics, and group dynamics. Things like C, UNIX, and early OSS made it easy to get started and gain a critical momentum. Past that, it's basically all momentum and its effects. One usually can't counter momentum so much as create an alternate momentum or divert flows off the existing one. Hence, co-development of radical approaches like the SAFE or JX OS's plus legacy-supporting approaches like the Nizza Security Architecture or Cambridge's CHERI processor & CheriBSD. Yet the full momentum of UNIX and C remains as more gets piled onto them despite the cost.

https://www.dreamsongs.com/WorseIsBetter.html

Note: Ignore his emotional rants tied to specific tech to focus on the effects of the New Jersey approach on market take-up. Truthfully, I think he discovered the time-to-market effect of technology and a method of achieving it before it became a common thing people wrote about. Then again, I haven't studied the history of that enough to be sure.



