This is an excellent summary, Steve. It also points out one of the challenges of 'systems' languages, which is the requirement for 'unsafe.'
One of the first things I did when I started working on Oak (which became Java) was to see about writing an OS in it. The idea being that if you could write an OS in a 'safe' language then you could have a more reliable OS. But we were unable to write it completely in Oak/Java, and that led to some interesting discussions about what minimum set of 'unsafe' actions might be required by a systems language.
Sadly we did not get to explore that very much, although I did pass it on as a possible thesis topic to some interns who came through Sun at the time. I'd be interested in your experience with what actions require 'unsafe' and if you have seen a canonical set that might point toward a process to get to a 'safe' OS.
Some of my closest friends in college specialized in operating systems, and we (mostly them) worked on http://xomb.org , an exokernel in D. I'd hope that today we'd choose Rust instead.
As she says "This is typical of a lot of Rust code I’m writing – I need to write a lot of unsafe code."
I'll have to give this 'minimum set' idea some thought. I think that safety will be more useful for things like kernel modules than in the kernel itself, though I'm not sure why I think that, exactly. Hmmmmm...
Exactly, and of course this leads me to wondering about computer architectures that "require" unsafe things in their kernels, versus things like Multics where the OS and hardware co-operated at some level to make things safe. Lots of interesting questions.
For reference, the following actions are what are enabled within `unsafe` blocks in Rust:
1. Dereferencing raw pointers (a.k.a. "unsafe" pointers). Note that's just dereferencing: there's nothing inherently unsafe about just creating and passing around the pointers themselves.
2. Calling a Rust function that has been marked with the entirely-optional `unsafe` keyword.
3. Calling an external function via the C FFI, all of which are automatically considered unsafe.
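A quick sketch of the first two (my own example, in current syntax): creating the raw pointer needs no `unsafe`, while the dereference and the call to an `unsafe fn` do. The `double` function here is hypothetical, just to have something marked `unsafe` to call.

// A hypothetical `unsafe fn`; the `unsafe` marker is the author opting in.
unsafe fn double(x: i32) -> i32 {
    x * 2
}

fn main() {
    let x = 5;
    let p: *const i32 = &x;        // creating the raw pointer: safe
    let y = unsafe { *p };         // dereferencing it: needs unsafe
    let z = unsafe { double(y) };  // calling an `unsafe fn`: needs unsafe
    println!("{} {}", y, z);       // prints "5 10"
}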
>Note that's just dereferencing: there's nothing inherently unsafe about just creating and passing around the pointers themselves.
Sure there is. Passing around multiple pointers can result in a double free and then the dangerous dereference can happen. I'm guessing the developer must be careful when writing unsafe blocks to make sure this doesn't happen.
- dereferencing
- arithmetic (implemented as ptr.offset(number) in the stdlib), since it is undefined behaviour for LLVM to move a pointer to point outside of the object that it came from originally (even without dereferencing)
Everything else is OK. (Although the only other thing possible is passing it around as a "black box" value.)
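A small sketch of those two operations (my own example; `offset` is still an unsafe method on raw pointers in current Rust):

fn main() {
    let arr = [10, 20, 30];
    let p = arr.as_ptr();                 // obtaining the pointer: safe
    let second = unsafe { *p.offset(1) }; // arithmetic + dereference: unsafe
    println!("{}", second);               // prints 20
}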
I'm not 100% sure exactly what you mean by 'built-ins', but Rust is almost entirely written in Rust. I don't think there's anything major implemented via C FFI in the compiler itself.
I remember hearing about Singularity a few years back - it sounded promising. Looking at the linked page, there doesn't seem to be anything published or presented since 2009. Anyone know what's going on with Singularity these days?
Sun eventually did JavaOS, though I'm not sure how much of its code was Java.
IBM did Jikes RVM (old name: Jalapeno) which was mostly, if not purely Java. They handled the bootstrapping problem with some clever ahead-of-time compilation + meta-programming. Even their GC is implemented in Java.
I think we can be a little bit more charitable towards C++. Modern compilers will let you know if you try to do something as obviously incorrect as returning a pointer to a stack variable.
$ cat > foo.cpp <<EOF
> int *dangling(void)
> {
> int i = 1234;
> return &i;
> }
> EOF
$ clang++ -Werror -c foo.cpp
foo.cpp:4:13: error: address of stack memory associated with local variable 'i'
returned [-Werror,-Wreturn-stack-address]
return &i;
^
1 error generated.
Thank you! Maybe I should explicitly show a new/free example instead, or does that end up having a similar warning?
I haven't written serious C++ in years, so I have some blind spots. Others on the Rust team have done quite a bit, so they tend to pick up my slack in exactly this manner.
I wasn't trying to undermine your argument. Rust is solving real problems. Use after free and double free are still issues in the C world. In modern C++ we are (hopefully) using smart pointers (std::unique_ptr, std::shared_ptr) to manage heap-allocated object lifetimes.
> In modern C++ we are (hopefully) using smart pointers (std::unique_ptr, std::shared_ptr) to manage heap-allocated object lifetimes.
Those aren't safe. There are many ways to cause use-after-free with unique_ptr: for example, placing a uniquely-owned object in a vector and clearing the vector in a method call on that object.
There are still lots of user-space applications written in C/C++ nowadays that could easily be rewritten in Go or any other safer language with a native-code compiler, without noticeable performance loss for the problem being solved.
The remaining issue for many of those is "need a better GC". Apps like games could be brought in with a near-pauseless GC, which currently seems to exist only in Java land.
(Yes you can work around the current GC's but this drives away some developers)
Sometimes these applications are written in C for the extreme portability rather than performance. I have a couple of open source projects in mind that support a huge selection of OSes and hardware configurations that would not be possible with some of the other current higher-level languages, but almost every platform supports C to some degree. I don't know how well Go supports weird configurations like MMU-less uClinux installs, but the GC wouldn't be what is breaking it.
I don't really understand the concepts of low-level languages, but I've had the perception that if I wrote a program in Go using the standard library and compiled it with go build, it would run on all imaginable platforms. Now, as far as I understand it, your comment suggests that I'm wrong.
Let's suppose that some corporation downloaded the binaries of my project and used them on an embedded system, would it run?
It depends on what "embedded" means to you. I've talked to some people who call a CPE Linux/Atom box embedded, and I know some people who don't consider anything with an OS embedded. I won't even touch that argument. :)
But I used MMU-less uClinux as an example because it breaks a lot of assumptions that are made in a lot of code. No dynamic linking (which, if I remember, Go doesn't support by default, or does it even support it at all? I haven't written any in a year or two), no protected memory, and no real fork (vfork instead) crosses a lot of code off the list of what you can run. I don't know if Go can support it, but I know that the current official toolchains don't. There may be a feature Go has that prevents it from running on a platform like this, but it won't be GC. There are some GC'd languages that will run on this platform.
C is great in that it makes very few assumptions, and if written carefully it can be extremely portable. The base language doesn't even assume there is an OS present. I've only run into one platform that had very little C support. It was a small 8-bit uC that had a very strange execution stack that made C, as everyone uses it, hard to implement efficiently.
In addition to what stusmall's said about "it depends what you mean by embedded":
Go compiles to native code, not bytecode. Your post makes it seem like you might be missing this fact.
Go has good cross-compile support so from say a Windows box I can compile a Go program using just the standard library that will run on Windows, or I can compile one that runs on MacOS, or I can compile one that runs on Linux, and I can even compile one that runs on a different processor arch (like I can be running Windows/x86 and compile a binary for Linux/ARM, for example).
But(!) you have to compile a separate binary for each of those platforms, you can't compile just one single executable that runs on all of those systems. The output of the Go compiler is native machine language code for a specific architecture and OS.
eg. I want to compile an http server and run it on either a Windows/x64 box or a chumby Linux device and I'm currently on a Linux/x86 system:
cd mygoprogram
export GOARCH=arm
export GOARM=5
export GOOS=linux
go build
I now have an executable in the current working directory named "mygoprogram" that will run on a Linux system with an ARMv5 processor (say, an old chumby device)
export GOARCH=amd64
export GOOS=windows
go build
I now have an executable in the current working directory named mygoprogram.exe that will run on a Windows/x64 system, but it is completely separate from the binary that will run on the Linux/ARM device despite being built from the same source code.
mygoprogram.exe will not run on the Linux system, and vice versa. Go is like C/C++ in this regard as opposed to say Java where a common bytecode format will make the compiled code still platform independent.
The main issue with that is that although there are lots of GC-enabled languages to choose from, not all of them have mainstream AOT compilers available.
Shipping a VM with the product is not always possible/desirable.
Even with that constraint, OCaml's 15 years old, so's SML/MLTon, D's 12 years old (D2 is 6 years old), Eiffel's nearly 30 (and moved under ECMA in 2005), Common Lisp can be compiled to native via CMUCL, SBCL or CCL and Scheme via Chicken Scheme or Chez Scheme.
And these are off the top of my head, I'm sure there are others.
This isn't so much an introduction to Rust as it is an introduction to Rust's concurrency model.
The example of returning a reference to an automatic variable isn't super compelling, since every competent C/C++ programmer knows not to do it. That bug does pop up every once in a while, but almost always in the context of a function that returns a reference to one of many different possible variables depending on some condition in the function.
Does Rust really call its threads "green threads"? Green threads have a weird reputation.
Copy like "this allows you to, well, read and write the data" could be tightened up; it's an attempt at conversational style that doesn't add much. "That doesn't seem too hard, right?" is another example of the same thing.
How much of Rust concurrency is this covering? How much of its memory model? Does the whole concept of Rust concurrency and memory protection boil down to "the language provides an 'unsafe', and then people write libraries to do things with it"?
> This isn't so much an introduction to Rust as it is an introduction to Rust's concurrency model.
Ownership is really central to Rust. It's central to both memory management and concurrency: to work with Rust you need to understand it.
> The example of returning a reference to an automatic variable isn't super compelling, since every competent C/C++ programmer knows not to do it.
That's just a simple example. The same logic also prevents iterator invalidation and use-after-free, which are things that do occur in the real world and lead to security vulnerabilities.
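As a rough illustration of the iterator-invalidation case (my toy example, not from the post), the borrow rules reject mutating a vector while an iterator still borrows it:

fn main() {
    let mut v = vec![1, 2, 3];
    for x in v.iter() {
        println!("{}", x);
        // v.push(4); // rejected at compile time: cannot borrow `v` as mutable
        //            // while the iterator holds an immutable borrow of it
    }
    v.push(4); // fine here: the iterator's borrow has ended
}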
> Does Rust really call its threads "green threads"? Green threads have a weird reputation.
They're M:N threads, multiplexed among multiple hardware threads, like Go or Erlang.
> Does the whole concept of Rust concurrency and memory protection boil down to "the language provides an 'unsafe', and then people write libraries to do things with it"?
Sort of, but I think that's an uncharitable way to say it. The trick is that the language allows these unsafely-implemented primitives to be safely used from safe code. As long as the unsafe code is correct, the safe code is guaranteed to be safe. From a trust point of view this is really no different from building the features into the compiler: either you trust the compiler (which, as it's a compiler, is unsafe) or you trust the unsafe portions of the standard library. But it's way easier, and more flexible, to hack on libraries than to hack things into the compiler—as you get to write code, not code to generate code.
Furthermore, the safe part of the language is so powerful that you rarely ever need "unsafe": you can do practically everything you might need to do, including shared memory with locks, in the safe language, without GC. Only if you really need to squeeze out the last amount of performance, or if you need to interface with C libraries, do you need "unsafe".
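To make "unsafe internals behind a safe interface" concrete, here's a toy of my own (nothing like the real Arc internals): callers can't misuse the function, because the invariant is checked before the unsafe block runs.

fn first_byte(bytes: &[u8]) -> Option<u8> {
    if bytes.is_empty() {
        return None;
    }
    // Safe because we just verified the slice is non-empty.
    unsafe { Some(*bytes.as_ptr()) }
}

fn main() {
    assert_eq!(first_byte(b"hi"), Some(b'h'));
    assert_eq!(first_byte(b""), None);
}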
I don't mean to be uncharitable; it seems like a legitimate design. I'm just wondering if there are other magic pointer types or type system features that also protect memory. To me, fewer mechanisms is better. On the other hand, building libraries to abstract unsafe pointers has been somewhat discredited by C++ and shared_ptr.
There are no more magic pointer types, though; lifetimes are just annotations on references (&).
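For example (my own sketch, in current syntax), a lifetime is just a named parameter on ordinary reference types:

// The returned reference is declared to live no longer than the slice it
// was borrowed from; 'a is the only annotation involved.
fn first<'a>(v: &'a [i32]) -> &'a i32 {
    &v[0]
}

fn main() {
    let nums = vec![1, 2, 3];
    println!("{}", first(&nums)); // prints 1
}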
> On the other hand, building libraries to abstract unsafe pointers has been somewhat discredited by C++ and shared_ptr.
I think we do a lot better than C++. First of all, we believe our system is safe, unlike shared_ptr (proof is in the works). Second, shared_ptr isn't very fast, due to some design decisions like requiring atomic reference counting (which is an order of magnitude slower) and not being intrusive (requiring 2x the allocations). We also allow shared_ptr to be converted into references, allowing the programmer to eliminate a lot of reference count traffic. Most importantly, though, reference counting is something you only use if you actually need multiple references and you don't have one specific place to free the object in: in other words, you don't use reference counting in Rust much more than you'd use reference counting in C. Typical malloc/free patterns are handled with unique pointers.
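A tiny sketch of that split (my example, in current syntax): Box for the typical unique-ownership case, Rc only when multiple owners are genuinely needed.

use std::rc::Rc;

fn main() {
    // Unique ownership: freed when `report` goes out of scope, no count at all.
    let report = Box::new(String::from("hello"));
    println!("{}", report);

    // Shared ownership: the reference count is only paid when sharing is needed.
    let shared = Rc::new(String::from("config"));
    let another_owner = Rc::clone(&shared);
    println!("{} owners", Rc::strong_count(&another_owner)); // prints "2 owners"
}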
"Second, shared_ptr isn't very fast, due to some design decisions like ... not being intrusive (requiring 2x the allocations)."
Not quite. You can use std::make_shared to allocate the object and the ref count in one allocation. You get improved locality of reference as an added bonus.
> That's just a simple example. The same logic also prevents iterator invalidation and use-after-free, which are things that do occur in the real world and lead to security vulnerabilities.
Maybe I should mention some of the more complicated examples explicitly, then?
Salespeople qualify leads by determining if you're ready to buy or not. If you're not, they stop wasting time on you. The general idea for a quick introduction is to qualify your lead. So this isn't an "introduction to Rust's syntax"; it's "an introduction to why you should (or should not) care about Rust."
> since every competent C/C++ programmer knows not to do it.
Everyone knows, yet programs still segfault. The point is that the language helps you be competent. Static analysis is very useful.
> Does Rust really call its threads "green threads"? Green threads have a weird reputation.
I agree. Rust has N:M mapped threads by default, but recently added 1:1 as well.
> it's an attempt at conversational style that doesn't add much.
I happen to write like I talk, it gets good and bad reviews. A more neutral style would be more appropriate if/when this gets pulled into Rust itself, thanks, that's a great point.
> Does the whole concept of Rust concurrency and memory protection boil down to "the language provides an 'unsafe', and then people write libraries to do things with it"?
I think this is an unfair characterization, but I gave it to you, so that's a criticism of me, not you. I tried to point out that unsafe exists for exceptional cases only: it's not something that you need unless you're doing something dangerous for a specific reason. I personally have only ever written unsafe when wrapping something with FFI.
Introductions are hard because you never know how much depth to go into; maybe I should go into these bits a little more in depth.
> I happen to write like I talk, it gets good and bad reviews.
There's nothing wrong with writing casually for a web page that is meant to be inviting to newcomers.
I've ignored Rust links on HN until the subject of your link got me to take a look. The style was fine and you did a good job in at least showing a few snippets of code while advocating the benefits of the language.
Contrary to tptacek's criticisms, I wasn't looking for a dry or comprehensive manual - just something to read in a couple of minutes that would give me an indicator of whether or not I should look further into the language.
Oh, don't get me wrong, I'm sold on memory protection as a type system feature. I'm just suggesting that the example you're using might make it sound less valuable, because returning stack variable references isn't the most common kind of error made by C programmers; when you do that, more often than not your program doesn't work at all.
Probably, with a slightly tricky ownership issue leading to use after free (a well known source of exploits http://cwe.mitre.org/data/definitions/416.html) e.g. allocate to the heap, pass to a function which deallocates it (assuming that it has ownership) and use it after the call, e.g.
Which compiles without warnings using Clang (unless -Weverything is used, and even then the warnings are not related to use-after-free), works "correctly" at -O0 and -O1 (prints "3" twice), then breaks starting at -O2 (prints "3" then "0"). (Note: it always prints "3" twice with GCC 4.8, showing how much fun these things are.)
meanwhile the equivalent
fn main() {
    let v = ~3;
    destroyer(v);
    println!("{}", *v)
}

fn destroyer(val: ~int) {
    println!("{}", *val)
}
refuses to compile and explains why:
test.rs:4:20: 4:21 error: use of moved value: `v`
test.rs:4 println!("{}", *v)
^
note: in expansion of format_args!
<std-macros>:224:8: 224:50 note: expansion site
<std-macros>:223:4: 225:6 note: in expansion of format!
<std-macros>:241:45: 241:63 note: expansion site
<std-macros>:240:4: 242:5 note: in expansion of println!
test.rs:4:4: 5:1 note: expansion site
test.rs:3:14: 3:15 note: `v` moved here because it has type `~int`, which is non-copyable (perhaps you meant to use clone()?)
test.rs:3 destroyer(v);
^
error: aborting due to previous error
which can be fixed either by explicitly cloning the value, or by altering the sub-function to not consider it owns the pointer (or by removing the `println!` call in `main()`, thus transferring the ownership of the pointer to the sub-function safely, of course)
Analogous example in C++, using unique_ptr (which Rust's owned pointer models); note that std::make_unique is C++14:
#include <stdio.h>
#include <memory>
#include <utility>

void destroyer(std::unique_ptr<int> x)
{
    printf("%d\n", *x);
    // x is deallocated here
}

int main()
{
    auto x = std::make_unique<int>(3);
    // destroyer(x); ERROR: ownership must be transferred explicitly
    destroyer(std::move(x)); // Override with an explicit move
    printf("%d\n", *x); // Guaranteed to segfault
    return 0;
}
This is somewhat more helpful, but sadly the compiler cannot monitor the state of x at compile time, like Rust. The upside is that this bug is easy to catch since the invalid access is guaranteed to try to access a null pointer, and will crash instead of giving garbled results.
> The upside is that this bug is easy to catch since the invalid access is guaranteed to try to access a null pointer, and will crash instead of giving garbled results.
GCC and LLVM optimize based on the assumption that null pointers are never dereferenced, so this is actually undefined behavior, no? Anything can happen.
You're right that it's UB, good point. It's hard to think what a sane compiler would do in this case besides going through with the dereference, though.
A pointer dereference allows the compiler to assume that a pointer is non-NULL, so it can then perform "invalid" optimisations (like dead-code removal); this post contains a very small example: http://blog.llvm.org/2011/05/what-every-c-programmer-should-...
For all intents and purposes no. unique_ptr is nothing more than a pointer container that disables copying but allows moving. Therefore only one unique_ptr should be owning a pointer at any given time (unless, like everything in C++, you go around it).
The downside of picking subtler, more intricate examples is that you waste your reader's time trying to understand the subtleties of the example, which isn't teaching them anything about Rust.
Another option is to say something like, "For the sake of brevity, this is a very simple and arguably obvious violation of safety. In practice, there are many subtle and hard-to-diagnose sources of unsafety in C++, even when you use safer abstractions like shared_ptr." This allows you to avoid getting sidetracked and losing your reader, while heading off skepticism of readers with more knowledge about C++.
One thing I did when trying to come up with a "What features of language X help safety" article for ATS was to work through a simple example from unsafe, to safer, to safe. This is one such article I did:
A conversational style is fine, but those particular sentences actually get in the way. The style comes from the overall tone of the writing, not just sentences which talk to the reader directly.
> I happen to write like I talk, it gets good and bad reviews.
I found it unusually clear, which I liked a lot. There was a bit at the end to do with unsafe that was a bit difficult to follow, but a sentence like 'but wait, RWArc is just a library, how come it was able to do the thing that I said the compiler wouldn't let you do?' would have cleared it up just fine.
> but a sentence like 'but wait, RWArc is just a library, how come it was able to do the thing that I said the compiler wouldn't let you do?' would have cleared it up just fine.
Hmmm. Isn't that what this says?
> So, the Rust language does not allow for shared mutable state, yet I just showed you some code that has it. How’s this possible? The answer: unsafe.
Well for me at least, you're introducing the language so it seems like RWArc is part of that. I have no idea as to whether or not there are any little cheats put into things like RWArc that are not available to other code, so emphasizing that RWArc is the kind of thing I could have written myself would have helped me understand.
It's that jump from being introduced to RWArc to considering how it might have been implemented that wasn't obvious to me.
Thanks for doing this, I've got much more appetite for reading about rust than can easily be sated, so it's very gratifying to have someone explain the key concepts so clearly and well.
>I happen to write like I talk, it gets good and bad reviews. A more neutral style would be more appropriate if/when this gets pulled into Rust itself, thanks, that's a great point
I think the style's alright. I also like the tone of the tutorial on the Rust website, don't know if you were involved with that too.
>Thanks for the great feedback. :)
FWIW, I found the feedback to be a little too condescending for what is a volunteer effort to help people understand a new language.
tptacek, I've been meaning to ask this question to someone with some extensive security experience: Is there a compelling story for security researchers and engineers for low-level languages with an emphasis on memory safety (like Rust or Cyclone)? From my admittedly limited perspective, it seems like it could eliminate a lot of mistakes that lead to insecure software, but then again, I don't know how common memory-flaw exploits are.
> From my admittedly limited perspective, it seems like it could eliminate a lot of mistakes that lead to insecure software, but then again, I don't know how common memory-flaw exploits are.
We have done measurements on this for Firefox code. 100% of the security vulnerabilities for Web Audio were memory safety flaws.
I forget the exact number, but it was at least 20. And I believe they concluded that, yes, Rust would have caught them. I'll need to ask pcwalton to be sure though.
They have always been hard to come by because not everyone wants to spend 10 years banging their heads against the wall of memory management errors unnecessarily.
When I started coding, GC-enabled languages only existed on very expensive workstations in research places like Xerox PARC.
Somehow I have this memory, maybe false, that in those days developers had better skills than most of the younger developers I've met in the last five years.
Sure, because it was expected that certain features would take an exponentially longer time to develop, because many features had not been created back then and the market was not as competitive.
If I'm asked to make an image gallery for a GUI you can bet 99% of people will go for an existing solution out of a matter of productivity, and the reality is that even if I was interested in making a gallery of my own or acquiring a deep understanding of how to make a proper piece of code that solves that problem, it is extremely unlikely I could come up with something better in the span of time I've been allocated to solve it.
Similarly, there's no reason to deal with memory management errors unless you absolutely cannot avoid it. None.
This sounds like vinyl DJs complaining that kids nowadays have DJ software that does beat detection and automatic loops n' shit. Sure, but the production value of your average mix has gone up tremendously now that you don't have to spend 30% of your time beat matching and instead you can now add samples and synths.
For the record, I can program in Assembly and I can DJ with vinyl, but it's been years since I've had any need to resort to either. At least the vinyl has some aesthetic, subjective vintage value.
> At least the vinyl has some aesthetic, subjective vintage value.
For some of us so does working with assembly and other low-level quirks and domains. :) The challenges and problems, tools and solutions are very different and IMO much more interesting than "getting shit done for real life value". To each their own I guess.
I like this tutorial because it dives straight into the most unique/unfamiliar parts of Rust (ownership/references) and gets them out of the way. It's a "learn the hard way"-style tutorial, and I think that's the best approach. Once you learn how ownership and borrowing work, along with ARCs and concurrency, everything else is really simple and just naturally falls out.
Agreed. I'd love to see an even more in depth document that takes a wide range of ownership/allocation patterns that are common in C and C++, shows Rust equivalents, and analyses why Rust can or cannot prove that they are safe (i.e. whether they require unsafe blocks or not). I don't have an intuitive sense yet for the boundaries of what Rust can automatically prove safe. How much C and C++ could be directly translated into safe Rust and how much would need to be reworked or put in unsafe blocks?
I like this a lot, and think it's the best intro to Rust yet. The thing that concerns me a bit is that it presents the special cases in concurrency without impressing some of the most important points. Primarily, the channel example presents the send as copying, which in this case it is, but one of the main advantages of Rust's channels and owned types is that message passing of heap-allocated types does not need to copy. It probably doesn't stress hard enough that Rust tasks do not share memory before saying, 'oh, but really you can share memory if you need to', though I see that the Arc and RWArc examples are good ways to introduce the concept of using unsafe code to provide safe abstractions.
The focus on C++ as point of comparison is understandable given Mozilla's background, but in Internet land most systems software runs on the JVM, and is written in Java, or increasingly, Scala (see LinkedIn and Twitter, for example).
The issues of memory layout and the like come up here, and unlike Rust the JVM doesn't give much control over this aspect. See Martin Thompson's blog for an example of someone very concerned with issues of performance on the JVM (http://mechanical-sympathy.blogspot.co.uk/). I believe Rust could see a lot of adoption within this community as a "better" Scala -- a modern high-level language that allows dropping down to bit-twiddling when performance is an issue. It needs higher-kinded types before it will work for me, but I hear that is on the road-map.
BTW, I've read a few Rust tutorials and they all fail for me in the same way: too much waffle and not enough getting down to the details. I understand the difference between stack allocation, reference counting, and GC, I get why shared mutable state is a bad idea, etc. What I want is a short document laying out the knobs Rust provides (mutable vs immutable, ownership, allocation) and how I can twiddle said knobs.
It's probably more accurate to say that in the Enterprise most software is running on the JVM. But there are certain (large) internet companies that have significant system software running on the JVM (Google, Twitter).
Funnily, I see it as being closer to Scala than to either C++ or Java. I think it all comes down to background.
The official tutorial isn't very good on issues of memory management. The first mention of managed references (I assume that means GCed) is in an example in section 11. Nowhere does it actually explain what managed means (and if I missed it, I blame the tutorial for not making it explicit enough!)
Well, that's the thing: while Rust occupies the same space as C++, it doesn't just take its inspiration from there. It also takes it from functional languages like ML or Haskell (the compiler was originally written in OCaml).
I think the emphasis on "unsafe" isn't helpful. As far as I can tell, the only thing that "unsafe" is enabling is that Arc and RWArc are written in Rust rather than in C in the runtime (the way they'd be in Go, or Erlang, or Haskell). The things that make Rust able to do what it does are ownership and tasks and lifetimes and affine types -- all the things the post covers before talking about "unsafe".
Also, it gives the impression that there's something fundamentally unsafe about all of this, whereas the whole point is that these abstractions are _safe_ to use.
I think `unsafe` is important to talk about. It's the escape hatch that lets you subvert the type system and do things that the compiler cannot statically reason about. The ability to implement Arc in pure Rust code is very important. But I do agree that it could have been presented perhaps in a different way.
Agreed. My experiments with rust have been uphill because I've been fighting with lifetimes, ownership and borrowing.
For the most part I failed to see that I _needed_ unsafe code in some situations. Instead I was trying (and failing) to annotate my code to ridiculous levels with lifetimes. It was really frustrating.
I don't think the current docs do a great job of putting unsafe in a suitable perspective. It's somewhat downplayed IMO.
Still, I've learned now and it's been pretty pleasant after that.
FWIW I've been doing c++ for maybe 18 years, writing device drivers, game engines, compiler development. I thought rust was made for me but it's been tough, much more so than any other language except maybe SML!
> Also, it gives the impression that there's something fundamentally unsafe about all of this, whereas the whole point is that these abstractions are _safe_ to use.
Right, this is my point. I should find a way to make it a bit more clear.
I wrote it this way because the systems people I talk to are skeptical at times that a compiler knows best. After all, there's a reason you want that low-level control in the first place, right? The ability to escape things when you have to relaxes people.
> I wrote it this way because the systems people I talk to are skeptical at times that a compiler knows best.
Well, the best way to combat that would be demonstrate that cool things can be done safely in Rust, but that probably requires a lot more than fits in your introduction.
I think you could allay that fear by addressing it directly, rather than saying that Rust's secret sauce is unsafety.
Hm. I thought that talking about building ARC and discussing its implementation was something cool that directly addresses it. What would you like to see?
I think talking about how ARC is implemented in Rust is cool. I would just pitch it differently.
I think what I don't like about your current description is that it makes it seem like unsafety is what allows you to _have_ Arc in the language. Instead, unsafe blocks are what allow you to _implement_ Arc inside the language.
I'd start the footnote this way:
---------------
A footnote: Implementing Arc
So, the Rust language doesn't let us use shared mutable state in dangerous ways, but what happens if we really need to get down and dirty? For example, what if we wanted to _implement_ `Arc` ourselves?
In fact, `Arc` and `RWArc` are both implemented in Rust. Inside their implementations, they use locks, low-level memory operations, and everything else you might see in a C++ program. However, anytime we use these features, we have to wrap them in an `unsafe` block.
...
---------------
I hope that conveys the different emphasis that I'm talking about.
I think that's a necessary approach. The first time I encountered Rust, in a treatment that covered unique and managed but not unsafe/raw pointers, my impression was: ‘You promised GC was optional, but I can't even write a DAG’.
A little OT...but what's with Svbtle's apparent default styling of links? There's no indication that any particular word or sentence contains a link, which basically makes those links invisible to readers. Or do lots of people read web articles by randomly hovering the mouse around the text?
But relevant to the OP...I generally try to save useful tutorials like this on my pinboard, which often doesn't pick up the meta-description text. So I double-click to copy the first paragraph and paste it into pinboard...except in the OP, I kept on clicking on text that was hiding links underneath.
It's a strange UI decision, and one that seems to discourage the use of outbound links...if you can't see the links, then what is the purpose of them? For spiders?
How can I have a pointer to something that may or may not be present? Do I have to have additional booleans for such uses? Isn't that a waste?
How can I effectively build complex data structures like graphs, tries etc then?
What's neat is that if you stuff a pointer inside an `Option`, then not only is it guaranteed to be memory-safe but it also compiles down to a plain old nullable pointer at runtime, so there's no extra overhead while still retaining safety.
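A quick way to see that (my own check, in current syntax): the optional pointer is exactly pointer-sized, so `None` is represented by the null value rather than an extra flag.

use std::mem::size_of;

fn main() {
    // Option<Box<i32>> occupies exactly one pointer; None is the null bit pattern.
    assert_eq!(size_of::<Option<Box<i32>>>(), size_of::<Box<i32>>());
    println!("{} bytes either way", size_of::<Box<i32>>());
}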
So it doesn't have "null" but it has "None." On the linked page:
// Remove the contained string, destroying the Option
let unwrapped_msg = match msg {
    Some(m) => m,
    None => ~"default message"
};
Now why is there that ";" at the end? Before that there is a construct without it:
// Take a reference to the contained string
match msg {
    Some(ref m) => println!("{}", *m),
    None => ()
}
And did we have to take the "reference" to print the value of m?
And one more note: the linked page doesn't explain that Some actually introduces the "Option" type. It writes about Option but the code uses just "Some."
let msg = Some(~"howdy");
Some as a "keyword" seems to have two different semantic purposes, depending on whether it's in the "match" or not. I don't see that explained either.
Most of this is out of the scope of this tutorial, but is in the comprehensive one.
match is an expression, but let is a statement. In the first example, the match expression is used inside of the let statement, in order to produce what is assigned.
I'm not 100% sure if you _must_ take that reference, but given that it's a pointer, that makes sense. In the previous version, it's simply returning a value, but println! needs the contents, not the pointer itself.
You use the Option type. Option can either be None or Some(T). I think Rust optimizes this to a (non-)null pointer, so there is very little, if any, overhead compared to the equivalent C++ code.
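A compact sketch of both points in current syntax (my example; the thread's snippets use the older ~str form): `Some` is a constructor when building the value and a pattern when matching on it, and the `;` is there because the whole `match` expression feeds a `let` statement.

fn main() {
    // Constructor position: builds an Option<String>.
    let msg: Option<String> = Some(String::from("howdy"));

    // Pattern position: destructures it. The match is an expression whose
    // value is assigned by the let statement, hence the trailing `;`.
    let unwrapped_msg = match msg {
        Some(m) => m,
        None => String::from("default message"),
    };

    println!("{}", unwrapped_msg); // prints "howdy"
}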
I personally dislike the style of tutorial that has lots of 'we' and 'lets' in it.
I suppose part of that comes from the tendency for such tutorials to provide revelations instead of motivators. For example, in this tutorial there is 'look at this C++ code because I said to', and then two sentences later it explains that the C++ code ends up with a garbage value.
But this is probably very much a point of style and I'm sure lots of people think my view is stupid.
I tend to be very collectively focused, so I do tend to write this way. Thanks for the feedback; the style may not be appropriate for an official tutorial.
> the tendency for such tutorials to provide revelations instead of motivators.
I'm going to have to think about this, that's very interesting. I would like to say that my revelations provide motivation, but that may be wishful thinking...
Do you think there's a way to demonstrate these concepts in a way without 'the reveal'? It seems to me that comparison will always feel a bit reveal-y, as demonstrating some kind of difference is inherent in comparison.
Thanks for the tutorial! Rust seems a bit too complex to me. Like a C++ on steroids that wants to do and be everything. Nothing wrong with that, but not my cup of tea. I'd rather stick to C if I need tight memory management; it is way simpler and straightforward. And if I need concurrency, I'll stick to Golang (or Erlang). Really, it's such a pleasure to read some Golang after reading this 30 minutes of Rust. Anyway, just my opinion.
> Rust seems a bit too complex to me. Like a C++ on steroids that wants to do and be everything.
No, that's explicitly not a goal. The goal is to enable zero-cost abstractions and memory safety. Everything here is in service of that goal.
> I'd rather stick to C if I need tight memory management; it is way simpler and straightforward.
The problem is that the "simple and straightforward" model of C leads to a lot of very not-simple-and-straightforward time in front of Valgrind to get the program to work, or worse, to fix security vulnerabilities.
I would argue that rust is both simpler and more consistent than either C or C++. No preprocessor, no pre/post increment, a real type system (instead of templates)—there's just not the mountain of undefined parts of the language that tends to be a large problem. Scala, in spite of being on a VM, is far closer to C++ IMHO in terms of attitude.
I actually find golang harder to read without parametric types (sans built-in slices and maps).
Rust most definitely isn't "like a C++ on steroids" from a complexity standpoint. It tries (and — I think — mostly succeeds) to be a significantly simpler and more coherent language.
It does make things which are implicit in C or C++ (e.g. ownership) explicit. That's a good thing, you need to know your ownership in C or C++, the language just doesn't help you much.
> I'd rather stick to C if I need tight memory management; it is way simpler and straightforward.
Well, if you don't want your C to segfault, you have to understand these concepts anyway, they're just implicit to the language. C's memory management may be simple, but it's surely not easy.
I think the amount of syntax you've shown is fine. Since there's C++ and Rust code that both do the same thing, it allows readers familiar with C++ to infer what the Rust syntax means. This is a much quicker process than reading "Here's how you write an if clause ..." prose.
When it comes to showing new languages to experienced programmers, I prefer showing code, and explaining what it accomplishes. Experienced programmers will start building up a Bayesian model of the syntax without being explicitly told.
I actually liked the amount of detail in explaining the syntax. For the most part, the code samples are very readable and understandable as they are (and I haven't written any Rust).
I think explaining a lot of the syntax would have made this more of a "how to Rust" article, rather than a "why to Rust" article. I really appreciated the "why to Rust" tone of this article, and it provides a great explanation of why the language is different from C / C++ and why I should care.
I've only written a little C and C++, about six years ago, but enough to understand the syntax well enough to understand C and non-templated C++ code. The introduction was at just the right level for me, and I didn't have any problems with the syntax.
I think it's well written, and it makes me interested in learning more about Rust. The one thing I'd like to have seen addressed is type inference.
> But wait, how is that possible? We can’t both allow and disallow mutable state. What gives?
I had to re-read the above a few times. And still don't get it. Steve, what do you mean with it? Are you talking about how are Arc/RWArc implemented? Or is it something else?
The previous sentence is "We gain the efficiency of shared mutable state, while retaining the safety of disallowing shared mutable state."
I also wasn't sure about the part you quoted, but I was trying to explain that it's not that Rust _doesn't_ allow shared mutable state, it's that while the language doesn't, you can use unsafe to build safe abstractions, so in practice, it does. Hmmm.
I still don't get it... And the previous sentence is now confusing for me too :)
First you showed two examples: Arc (to share immutable data) and RWArc (to share mutable data with enforced mutexes around closures). Then you talked about `unsafe`. Seems easy, one needs a "backdoor" to implement RWArc in Rust (at first I thought it was implemented in C/C++).
But (quoted) sentences between Arc/RWArc part and `unsafe` part don't really connect them, at least to me.
Ha! Bummer, maybe I will just need to re-write this paragraph.
A RWArc is shared mutable state: you can have two references to the Arc in two different tasks. Yet I said that Rust throws a compiler error for shared mutable state.
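As a sketch of what that looks like in practice (my own example, using current std types rather than the RWArc this post describes): two handles to one lock-protected value, touched from two tasks.

use std::sync::{Arc, RwLock};
use std::thread;

fn main() {
    let shared = Arc::new(RwLock::new(0));
    let handle = Arc::clone(&shared);

    let t = thread::spawn(move || {
        *handle.write().unwrap() += 1; // mutate through one reference...
    });
    t.join().unwrap();

    println!("{}", *shared.read().unwrap()); // ...observe it through the other: prints 1
}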
> (at first I thought it was implemented in C/C++).
There's very little C++ in Rust anymore. :)
> Seems easy, one needs a "backdoor" to implement RWArc in Rust
Yup, this is exactly the point with those two sentences. Maybe I should just straight-up remove them.
I think we've now got none, except for LLVM in the compiler, which means any binaries built using the stdlib don't need any C++ libraries (to be clear, libraries without the stdlib have never needed any).
I found it a bit worrying, as when I read that I had to go back and re-read the code a few times to make sure that there wasn't an unsafe block in the code, or perhaps that the use of RWArc implied unsafe, and would then turn that whole function into an automatically unsafe block.
Nevertheless, very cool. I really liked the tutorial, I'm getting quite excited by Rust.
The last time I touched C code was my sophomore year in college, so maybe 12 years ago? As a result, the last time I had to deal with pointers and such was back then, as well.
I'm primarily a web-dev. Ruby, PHP, and Javascript are the languages I'm most familiar with at the moment.
Are there any Rust for Dummies-style tutorials floating around? As simple as this introduction is, it was still over my head...
Do you see Rust as a language that people will write the business domain type applications in or as a supplement to languages like Ruby, Python, etc. to write those "gotta have performance here" parts of the application?
The second ever (that we know of) use of Rust in production is by Tilde, who are using Rust embedded in Ruby for Skylight to help ensure that memory usage stays down and to get more consistent performance. Garbage collection while doing performance monitoring is bad news.
I think that it's possible that Rust might eventually be useful as an application level language. We'll see how it shakes out. We've been in love with scripting languages for the past few years, but now their drawbacks are becoming more apparent. Look at all the Rubyists and Pythonistas flocking to Go, for example...
I'm not the person you asked, but I see Rust as making a play in these areas:
- embedded systems
- games
- high performance + high correctness environments
Personally, I don't see it as a web-app or line-of-business app development language. But it will allow you to create the components that the web apps and LoB apps call into. And I imagine if some people really like Rust for those purposes, they will start building the libraries and code infrastructure to make it easy, and away we go.
Is Rust borrowing any kind of code from what is used for Objective-C's ARC technology, with respect to detecting the lifetime of a variable and automatically freeing the resource? Is it a commonly known algorithm?
I had the impression that Rust and ARC were doing a bit of static code analysis to detect when a "reference count decrease/increase" could be performed (thus the "automatic" part in ARC), and inserting code at compile time. Which seems a bit more complicated than simply decreasing the counts dynamically when a pointing object is destroyed. But now that I come to think of it, things look a little blurry to me on that part.
I find it annoying to the point of being offensive. Actions being activated on mouse-hover are terrible; I did not intend to give "kudos" but was curious what might be linked under it, and suddenly an action is recorded. Now I can't consider any webpage safe and have to watch where my mouse goes; that's wrong.
You're not letting anything escape. You're copying `nums` and then mutating your copy. And this only works because `nums` can be implicitly copied like this. If you changed it from `[int, ..3]` to `~[int]` you'd get a compiler error (as `~[T]` cannot be implicitly copied).
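Roughly, in current syntax (where `[int, ..3]` is written `[i32; 3]` and the growable `~[int]` corresponds to `Vec<i32>`):

fn main() {
    let nums = [1, 2, 3];          // [i32; 3] is Copy
    let mut copied = nums;         // this copies rather than moves
    copied[0] = 99;
    println!("{:?} {:?}", nums, copied); // nums is untouched: [1, 2, 3] [99, 2, 3]

    let v = vec![1, 2, 3];         // Vec<i32> is not Copy
    let mut moved = v;             // this moves `v`...
    moved[0] = 99;
    println!("{:?}", moved);
    // println!("{:?}", v);        // ...so this would be "use of moved value: `v`"
}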