A 30 minute introduction to Rust (steveklabnik.com)
328 points by steveklabnik on Jan 13, 2014 | 156 comments


This is an excellent summary Steve, it also points out one of the challenges of 'System' languages, which is the requirement for 'unsafe.'

One of the first things I did when I started working on Oak (which became Java) was to see about writing an OS in it. The idea being that if you could write an OS in a 'safe' language then you could have a more reliable OS. But we were unable to write it completely in Oak/Java, and that led to some interesting discussions about what the minimum set of 'unsafe' actions required by a systems language might be.

Sadly we did not get to explore that very much, although I did pass it on as a possible thesis topic to some interns who came through Sun at the time. I'd be interested in your experience with what actions require 'unsafe' and if you have seen a canonical set that might point toward a process to get to a 'safe' OS.


Glad you liked it, Chuck.

Some of my closest friends in college specialized in operating systems, and we (mostly them) worked on http://xomb.org , an exokernel in D. I'd hope that today we'd choose Rust instead.

Julia Evans has been writing a _fantastic_ series about a kernel in Rust: http://jvns.ca/blog/categories/kernel/

As she says "This is typical of a lot of Rust code I’m writing – I need to write a lot of unsafe code."

I'll have to give this 'minimum set' idea some thought. I think that safety will be more useful for things like kernel modules than in the kernel itself, though I'm not sure why I think that, exactly. Hmmmmm...


  > I need to write a lot of unsafe code.
To contrast with application-level code, pcwalton has measured that less than 1% of Servo's code is contained within `unsafe` blocks.


Exactly, and of course this leads me to wondering about computer architectures that "require" unsafe things in their kernels, versus things like Multics where the OS and hardware co-operated at some level to make things safe. Lots of interesting questions.


One possibility might be to add hardware support for fine-grained capability security - http://www.cl.cam.ac.uk/research/security/ctsrd/


Would that help? Wouldn't the use of a safe language make hardware-level memory protection less important, not more?


For reference, the following actions are what are enabled within `unsafe` blocks in Rust:

1. Dereferencing raw pointers (a.k.a. "unsafe" pointers). Note that's just dereferencing: there's nothing inherently unsafe about just creating and passing around the pointers themselves.

2. Calling a Rust function that has been marked with the entirely-optional `unsafe` keyword.

3. Calling an external function via the C FFI, all of which are automatically considered unsafe.
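A minimal sketch of the first two points, written against current stable Rust rather than the 2014 compiler discussed in this thread. The function names are made up for illustration; the FFI case is left out to keep the example self-contained.

```rust
// Creating a raw pointer is safe; only reading through it needs `unsafe`.
fn read_through_raw(value: &i32) -> i32 {
    // Safe: converting a reference into a raw pointer.
    let p: *const i32 = value;
    // Unsafe: dereferencing. Sound here because `p` came from a live reference.
    unsafe { *p }
}

// A function marked `unsafe` can only be called from inside an `unsafe` block.
// The caller promises that `p` points to a valid, writable i32.
unsafe fn add_one_unchecked(p: *mut i32) {
    unsafe {
        *p += 1;
    }
}

fn main() {
    let x = 41;
    println!("{}", read_through_raw(&x));

    let mut y = 41;
    // `&mut y` coerces to `*mut i32`; the call itself must sit in `unsafe`.
    unsafe { add_one_unchecked(&mut y) };
    println!("{}", y);
}
```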


>Note that's just dereferencing: there's nothing inherently unsafe about just creating and passing around the pointers themselves.

Sure there is. Passing around multiple pointers can result in a double free and then the dangerous dereference can happen. I'm guessing the developer must be careful when writing unsafe blocks to make sure this doesn't happen.


Unsafe behaviours with raw pointers:

  - dereferencing
  - arithmetic (implemented as ptr.offset(number) in the stdlib),
    since it is undefined behaviour for LLVM to move a pointer to
    point outside* of the object that it came from originally
    (even without dereferencing)
Everything else is OK. (Although the only other thing possible is passing it around as "black box" value.)

*One byte past the end is ok.
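To make the "one past the end" rule concrete, here is a rough sketch in current Rust, where the arithmetic method is `add` (an unsafe function on raw pointers). `sum_via_pointers` is a hypothetical helper, not anything from the standard library.

```rust
// Sums a slice by walking raw pointers from the start to one past the end.
fn sum_via_pointers(items: &[u32]) -> u32 {
    let start = items.as_ptr();
    // A pointer one past the end may be computed and compared,
    // but it must never be dereferenced.
    let end = unsafe { start.add(items.len()) };

    let mut p = start;
    let mut total = 0;
    while p != end {
        unsafe {
            total += *p; // `p` is strictly inside the slice here
            p = p.add(1); // moves at most to `end`, which is still allowed
        }
    }
    total
}

fn main() {
    println!("{}", sum_via_pointers(&[1, 2, 3]));
}
```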


But they are safe until you decide to do that dereferencing; otherwise they're just uint_ptrs really.


Don't points 2 & 3 together imply that anything that uses built-ins, which I assume are implemented with C FFI, is `unsafe`?


No, “unsafe” means “evaluate this unsafe block as though it were safe”. This enables Modula-style safe interfaces for unsafe implementations.


Just to clarify, using an `unsafe` block in a function does NOT mean that you have to tag that function as unsafe.
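A small sketch of that idea in current Rust: the function below contains an `unsafe` block but is itself an ordinary safe function, because it checks the precondition before the unchecked access. `first` is an illustrative name only.

```rust
/// Returns the first element, or `None` for an empty slice.
/// Safe to call, even though the body uses `unsafe`.
fn first(items: &[i32]) -> Option<i32> {
    if items.is_empty() {
        return None;
    }
    // The emptiness check above guarantees that index 0 is in bounds,
    // which is exactly what `get_unchecked` requires of its caller.
    Some(unsafe { *items.get_unchecked(0) })
}

fn main() {
    println!("{:?}", first(&[7, 8]));
    println!("{:?}", first(&[]));
}
```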


I'm not 100% sure exactly what you mean by 'built-ins', but Rust is almost entirely written in Rust. I don't think there's anything major implemented via C FFI in the compiler itself.


You'll probably be interested in the Singularity project at Microsoft Research: http://research.microsoft.com/en-us/projects/singularity/

Much of the kernel is implemented in managed code.


I remember hearing about Singularity a few years back - it sounded promising. Looking at the linked page, there doesn't seem to be anything published or presented since 2009. Anyone know what's going on with Singularity these days?


Sun eventually did JavaOS, though I'm not sure how much of its code was Java.

IBM did Jikes RVM (old name: Jalapeno) which was mostly, if not purely Java. They handled the bootstrapping problem with some clever ahead-of-time compilation + meta-programming. Even their GC is implemented in Java.


Jikes RVM was an absolutely amazing project. Superb work.

googles

Jikes RVM is an absolutely amazing project. Superb work. And I'm very glad to see it's still under somewhat active development!

http://jikesrvm.org/


I think we can be a little bit more charitable towards C++. Modern compilers will let you know if you try to do something as obviously incorrect as returning a pointer to a stack variable.

    $ cat > foo.cpp <<EOF
    > int *dangling(void)
    > {
    >     int i = 1234;
    >     return &i;
    > }
    > EOF
    
    $ clang++ -Werror -c foo.cpp
    foo.cpp:4:13: error: address of stack memory associated with local variable 'i'
          returned [-Werror,-Wreturn-stack-address]
        return &i;
                ^
    1 error generated.


Thank you! Maybe I should explicitly show a new/free example instead, or does that end up having a similar warning?

I haven't written serious C++ in years, so I have some blind spots. Others on the Rust team have done quite a bit, so they tend to pick up my slack in exactly this manner.


I wasn't trying to undermine your argument. Rust is solving real problems. Use after free and double free are still issues in the C world. In modern C++ we are (hopefully) using smart pointers (std::unique_ptr, std::shared_ptr) to manage heap-allocated object lifetimes.


> In modern C++ we are (hopefully) using smart pointers (std::unique_ptr, std::shared_ptr) to manage heap-allocated object lifetimes.

Those aren't safe. There are many ways to cause use-after-free with unique_ptr: for example, placing a uniquely-owned object in a vector and clearing the vector in a method call on that object.


What's wrong with this use-case? The object will be destroyed and no pointer to it will exist after the clear.


True, and hopefully the likes of D, Rust and Go will improve the situation.

In the meantime, we can take advantage of modern C++ safe constructs instead of continuing to use C, since new language adoption always takes time.


Since Go is mandatorily GC'd, I don't think it's relevant to the issue of improving on C/C++ for their existing use cases.


There are still lots of user space applications written in C/C++ nowadays that could easily be rewritten in Go or any other safer language with native code compilers, without noticeable performance loss for the problem being solved.


The remaining issue for many of those is "need better GC". Apps like games could be brought in with near-pauseless GC that currently seems to exist only in Java land.

(Yes, you can work around the current GCs, but this drives away some developers.)


Sure but if a GC'd language is acceptable for the application, the problem "was solved" years ago, the tooling not being the issue anymore.


Sometimes these applications are written in C for the extreme portability rather than performance. I have a couple of open source projects in mind that support a huge selection of OSes and hardware configurations that would not be possible with some of the other current higher-level languages, but almost every platform supports C to some degree. I don't know how well Go supports weird configurations like MMU-less uClinux installs, but the GC wouldn't be what is breaking it.


I don't really understand the concepts of low-level languages, but I've had the perception that if I wrote a program in Go using the standard library and compiled it with go build, it would run on all imaginable platforms. Now, as far as I understand, your comment suggests that I'm wrong.

Let's suppose that some corporation would download the binaries of my project and use them on an embedded system; would it run?


It depends on how "embedded" embedded means to you. I've talked to some people who call a CPE Linux/Atom box embedded and I know some people who don't consider anything with an OS embedded. I won't even touch that argument. :)

But I used MMU-less uClinux as an example because it breaks a lot of assumptions that are made in a lot of code. No dynamic linking (which, if I remember, Go doesn't support by default, or does it even support it at all? I haven't written any in a year or two), no protected memory, and no real fork (vfork instead) crosses a lot of code off the list of what you can run. I don't know if Go can support it, but I know that the current official toolchains don't. There may be a feature Go has that prevents it from running on a platform like this, but it won't be GC. There are some GC'd languages that will run on this platform.

C is great in that it makes very few assumptions and, if written carefully, can be extremely portable. The base language doesn't even assume there is an OS present. I've only run into one platform that had very little C support. It was a small 8-bit uC that had a very strange execution stack that made C as everyone uses it hard to implement efficiently.


In addition to what stusmall's said about "it depends what you mean by embedded":

Go compiles to native code, not bytecode. Your post makes it seem like you might be missing this fact.

Go has good cross-compile support so from say a Windows box I can compile a Go program using just the standard library that will run on Windows, or I can compile one that runs on MacOS, or I can compile one that runs on Linux, and I can even compile one that runs on a different processor arch (like I can be running Windows/x86 and compile a binary for Linux/ARM, for example).

But(!) you have to compile a separate binary for each of those platforms, you can't compile just one single executable that runs on all of those systems. The output of the Go compiler is native machine language code for a specific architecture and OS.

E.g., I want to compile an HTTP server and run it on either a Windows/x64 box or a chumby Linux device, and I'm currently on a Linux/x86 system:

    cd mygoprogram
    export GOARCH=arm
    export GOARM=5
    export GOOS=linux
    go build

I now have an executable in the current working directory named "mygoprogram" that will run on a Linux system with an ARMv5 processor (say, an old chumby device)

    export GOARCH=amd64
    export GOOS=windows
    go build

I now have an executable in the current working directory named mygoprogram.exe that will run on a Windows/x64 system, but it is completely separate from the binary that will run on the Linux/ARM device despite being built from the same source code.

mygoprogram.exe will not run on the Linux system, and vice versa. Go is like C/C++ in this regard as opposed to say Java where a common bytecode format will make the compiled code still platform independent.


Depends on the implementation.

Don't mix languages with implementations.

Currently there are two native compiler toolchains for Go, which only support a given set of OSs and computer architectures.

Additionally, some people have written Go interpreters with another set of supported targets.

So to answer your question, it would run on the embedded system if :

1 - the hardware could cope with Go runtime requirements

2 - it would be a supported target for one available native code toolchain


The main issue with that, is that although there are lots of GC enabled languages to choose from, not all of them have mainstream AOT compilers available.

Shipping a VM with the product is not always possible/desirable.


Even with that constraint, OCaml's 15 years old, so's SML/MLton, D's 12 years old (D2 is 6 years old), Eiffel's nearly 30 (and moved under ECMA in 2005), Common Lisp can be compiled to native via CMUCL, SBCL or CCL and Scheme via Chicken Scheme or Chez Scheme.

And these are off the top of my head, I'm sure there are others.


I know all those languages and played with them one time or the other.

I am also active on D forums.

However, mainstream developers aren't aware of, or don't even care about, those languages. Which leaves us in the current state of affairs in the industry.


This isn't so much an introduction to Rust as it is an introduction to Rust's concurrency model.

The example of returning a reference to an automatic variable isn't super compelling, since every competent C/C++ programmer knows not to do it. That bug does pop up every once in a while, but almost always in the context of a function that returns a reference to one of many different possible variables depending on some condition in the function.

Does Rust really call its threads "green threads"? Green threads have a weird reputation.

Copy like "this allows you to, well, read and write the data" could be tightened up; it's an attempt at conversational style that doesn't add much. "That doesn't seem too hard, right?" is another example of the same thing.

How much of Rust concurrency is this covering? How much of its memory model? Does the whole concept of Rust concurrency and memory protection boil down to "the language provides an 'unsafe', and then people write libraries to do things with it"?


> This isn't so much an introduction to Rust as it is an introduction to Rust's concurrency model.

Ownership is really central to Rust. It's central to both memory management and concurrency: to work with Rust you need to understand it.

> The example of returning a reference to an automatic variable isn't super compelling, since every competent C/C++ programmer knows not to do it.

That's just a simple example. The same logic also prevents iterator invalidation and use-after-free, which are things that do occur in the real world and lead to security vulnerabilities.
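As a rough illustration of the iterator invalidation case, in current Rust syntax: the commented-out loop is the kind of code the borrow checker rejects, and the lines after it show one accepted alternative. `double_up` is a made-up name for this sketch.

```rust
// Appends a copy of every element to the end of the same vector.
fn double_up(values: &mut Vec<i32>) {
    // Rejected at compile time: `push` may reallocate the buffer while
    // the iterator still borrows it (the classic C++ invalidation bug).
    //
    // for v in values.iter() {
    //     values.push(*v);
    // }

    // Accepted: finish reading first, then mutate.
    let copies: Vec<i32> = values.iter().copied().collect();
    values.extend(copies);
}

fn main() {
    let mut v = vec![1, 2];
    double_up(&mut v);
    println!("{:?}", v);
}
```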

> Does Rust really call its threads "green threads"? Green threads have a weird reputation.

They're M:N threads, multiplexed among multiple hardware threads, like Go or Erlang.

> Does the whole concept of Rust concurrency and memory protection boil down to "the language provides an 'unsafe', and then people write libraries to do things with it"?

Sort of, but I think that's an uncharitable way to say it. The trick is that the language allows these unsafely-implemented primitives to be safely used from safe code. As long as the unsafe code is correct, the safe code is guaranteed to be safe. From a trust point of view this is really no different from building the features into the compiler: either you trust the compiler (which, as it's a compiler, is unsafe) or you trust the unsafe portions of the standard library. But it's way easier, and more flexible, to hack on libraries than to hack things into the compiler—as you get to write code, not code to generate code.

Furthermore, the safe part of the language is so powerful that you rarely ever need "unsafe": you can do practically everything you might need to do, including shared memory with locks, in the safe language, without GC. Only if you really need to squeeze out the last amount of performance, or if you need to interface with C libraries, do you need "unsafe".
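For a sense of what "shared memory with locks, in the safe language" can look like, here is a sketch using today's standard library types (`Arc` and `Mutex`), which postdate the API discussed in this thread. `parallel_count` is a hypothetical helper; note that there is no `unsafe` anywhere in it.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Spawns `workers` threads that each increment a shared counter
// `per_worker` times, then returns the final count.
fn parallel_count(workers: usize, per_worker: u32) -> u32 {
    let counter = Arc::new(Mutex::new(0u32));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Each thread gets its own reference-counted handle.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_worker {
                    // The data is only reachable through the lock.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", parallel_count(4, 1000));
}
```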


I don't mean to be uncharitable; it seems like a legitimate design. I'm just wondering if there are other magic pointer types or type system features that also protect memory. To me, fewer mechanisms is better. On the other hand, building libraries to abstract unsafe pointers has been somewhat discredited by C++ and shared_ptr.


> I'm just wondering if there are other magic pointer types or type system features that also protect memory.

Oh, I see. In that case yes, there are also explicit lifetimes, which allow you to squeeze out more expressiveness to cover most of C++'s use cases for pointers/references: http://static.rust-lang.org/doc/master/guide-lifetimes.html

There are no more magic pointer types, though; lifetimes are just annotations on references (&).
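A tiny sketch of what such an annotation looks like in current Rust. The `'a` below is only a name tying the returned reference to the inputs; it does not change how the references are represented. `longest` is an illustrative function, not part of any library.

```rust
// The result borrows from `a` or `b`, so it may not outlive either of them.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() {
        a
    } else {
        b
    }
}

fn main() {
    let first = String::from("rust");
    let second = String::from("go");
    println!("{}", longest(&first, &second));
}
```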

> On the other hand, building libraries to abstract unsafe pointers has been somewhat discredited by C++ and shared_ptr.

I think we do a lot better than C++. First of all, we believe our system is safe, unlike shared_ptr (proof is in the works). Second, shared_ptr isn't very fast, due to some design decisions like requiring atomic reference counting (which is an order of magnitude slower) and not being intrusive (requiring 2x the allocations). We also allow shared_ptr to be converted into references, allowing the programmer to eliminate a lot of reference count traffic. Most importantly, though, reference counting is something you only use if you actually need multiple references and you don't have one specific place to free the object in: in other words, you don't use reference counting in Rust much more than you'd use reference counting in C. Typical malloc/free patterns are handled with unique pointers.
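To sketch the points about unique pointers and borrowing from a reference-counted pointer, here is an example in current Rust, where the unique pointer is spelled `Box` and the non-atomic reference-counted pointer is `Rc`. These are today's names, not necessarily the ones in use when this was written, and `demo` and `text_len` are hypothetical.

```rust
use std::rc::Rc;

// Takes a plain reference: no ownership transfer, no count traffic.
fn text_len(text: &str) -> usize {
    text.len()
}

fn demo() -> (usize, usize) {
    // Unique ownership: freed when `unique` goes out of scope.
    let unique: Box<String> = Box::new(String::from("owned"));

    // Shared ownership with a non-atomic reference count.
    let shared: Rc<String> = Rc::new(String::from("shared"));
    let second = Rc::clone(&shared); // count goes from 1 to 2

    // Both pointer kinds can be lent out as ordinary references,
    // which leaves the reference count untouched.
    let len = text_len(&shared) + text_len(&unique);

    (len, Rc::strong_count(&second))
}

fn main() {
    println!("{:?}", demo());
}
```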


"Second, shared_ptr isn't very fast, due to some design decisions like ... not being intrusive (requiring 2x the allocations)."

Not quite. You can use std::make_shared to allocate the object and the ref count in one allocation. You get improved locality of reference as an added bonus.


And it uses (or may use) a lock-free implementation on many platforms.


An atomic increment is still significantly more expensive than a normal one.


> proof is in the works

That's very exciting!


> That's just a simple example. The same logic also prevents iterator invalidation and use-after-free, which are things that do occur in the real world and lead to security vulnerabilities.

Maybe I should mention some of the more complicated examples explicitly, then?


I wouldn't want it to get too in-depth. Maybe just mention that this is a simple example but Rust prevents many other types of related errors.


Salespeople qualify leads by determining if you're ready to buy or not. If you're not, they stop wasting time on you. The general idea for a quick introduction is to qualify your lead. So this isn't an "introduction to Rust's syntax", it's "an introduction to why you should (or should not) care about Rust."

> since every competent C/C++ programmer knows not to do it.

Everyone knows, yet programs still segfault. The point is that the language helps you be competent. Static analysis is very useful.

> Does Rust really call its threads "green threads"? Green threads have a weird reputation.

I agree. Rust has N:M mapped threads by default, but recently added 1:1 as well.

> it's an attempt at conversational style that doesn't add much.

I happen to write like I talk, it gets good and bad reviews. A more neutral style would be more appropriate if/when this gets pulled into Rust itself, thanks, that's a great point.

> Does the whole concept of Rust concurrency and memory protection boil down to "the language provides an 'unsafe', and then people write libraries to do things with it"?

I think this is an unfair characterization, but I gave it to you, so that's a criticism of me, not you. I tried to point out that unsafe exists for exceptional cases only: it's not something that you need unless you're doing something dangerous for a specific reason. I personally have only ever written unsafe when wrapping something with FFI.

Introductions are hard because you never know how much depth to go into; maybe I should go into these bits a little more in depth.

Thanks for the great feedback. :)


> I happen to write like I talk, it gets good and bad reviews.

There's nothing wrong with writing casually for a web page that is meant to be inviting to newcomers.

I've ignored Rust links on HN until the subject of your link got me to take a look. The style was fine and you did a good job in at least showing a few snippets of code while advocating the benefits of the language.

Contrary to tptacek's criticisms, I wasn't looking for a dry or comprehensive manual - just something to read in a couple of minutes that would give me an indicator of whether or not I should look further into the language.

Thanks for the article.


Oh, don't get me wrong, I'm sold on memory protection as a type system feature. I'm just suggesting that the example you're using might make it sound less valuable, because returning stack variable references isn't the most common kind of error made by C programmers; when you do that, more often than not your program doesn't work at all.


Absolutely, I don't want people to think I'm attacking a straw man. Maybe a heap allocated example would be better?


Probably, with a slightly tricky ownership issue leading to use after free (a well known source of exploits http://cwe.mitre.org/data/definitions/416.html) e.g. allocate to the heap, pass to a function which deallocates it (assuming that it has ownership) and use it after the call, e.g.

    #include <stdlib.h>
    #include <stdio.h>


    void destroyer(int* val) {
      printf("%d\n", *val);
      free(val);
    }

    int main(int argc, char** argv) {
      int* v = malloc(sizeof(int));
      *v = 3;
      destroyer(v);
      printf("%d\n", *v);
      return 0;
    }
Which compiles without warnings using Clang (unless -Weverything, and even then the warnings are not related to use-after-free), works "correctly" in O0 and O1 (prints "3" twice) then breaks starting at O2 (prints "3" then "0"). (note: it always prints "3" twice with GCC 4.8, showing how fun these things are)

meanwhile the equivalent

    fn main() {
        let v = ~3;
        destroyer(v);
        println!("{}", *v)
    }

    fn destroyer(val: ~int) {
        println!("{}", *val)
    }
refuses to compile and explains why:

    test.rs:4:20: 4:21 error: use of moved value: `v`
    test.rs:4     println!("{}", *v)
                                  ^
    note: in expansion of format_args!
    <std-macros>:224:8: 224:50 note: expansion site
    <std-macros>:223:4: 225:6 note: in expansion of format!
    <std-macros>:241:45: 241:63 note: expansion site
    <std-macros>:240:4: 242:5 note: in expansion of println!
    test.rs:4:4: 5:1 note: expansion site
    test.rs:3:14: 3:15 note: `v` moved here because it has type `~int`, which is non-copyable (perhaps you meant to use clone()?)
    test.rs:3     destroyer(v);
                            ^
    error: aborting due to previous error
which can be fixed either by explicitly cloning the value, or by altering the sub-function to not consider it owns the pointer (or by removing the `println!` call in `main()`, thus transferring the ownership of the pointer to the sub-function safely, of course)
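For readers on a current compiler, here is a sketch of those three fixes side by side. It assumes modern syntax, where the `~` owned pointer is written `Box`; `consume`, `inspect`, and `run` are made-up names.

```rust
// Takes ownership: the box is freed when `val` goes out of scope.
fn consume(val: Box<i32>) -> i32 {
    *val
}

// Only borrows: the caller keeps ownership.
fn inspect(val: &i32) -> i32 {
    *val
}

fn run() -> (i32, i32, i32) {
    let v = Box::new(3);

    // Fix 1: have the sub-function borrow instead of taking ownership.
    let borrowed = inspect(&v);

    // Fix 2: clone explicitly and hand over the copy.
    let cloned = consume(v.clone());

    // Fix 3: transfer ownership and simply stop using `v` afterwards.
    let moved = consume(v);
    // println!("{}", *v); // still rejected at compile time: `v` was moved

    (borrowed, cloned, moved)
}

fn main() {
    println!("{:?}", run());
}
```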


Analogous example in modern C++ (std::make_unique needs C++14), using unique_ptr (which Rust's owned pointer models):

    #include <stdio.h>
    #include <memory>

    void destroyer(std::unique_ptr<int> x)
    {
        printf("%d\n", *x);
        // x is deallocated here
    }

    int main()
    {
        auto x = std::make_unique<int>(3);
        // destroyer(x); ERROR: ownership must be transferred explicitly
        destroyer(std::move(x)); // Override with an explicit move
        printf("%d\n", *x); // Guaranteed to segfault
        return 0;
    }
This is somewhat more helpful, but sadly the compiler cannot track the state of x at compile time, as Rust does. The upside is that this bug is easy to catch since the invalid access is guaranteed to try to access a null pointer, and will crash instead of giving garbled results.


> The upside is that this bug is easy to catch since the invalid access is guaranteed to try to access a null pointer, and will crash instead of giving garbled results.

GCC and LLVM optimize based on the assumption that null pointers are never dereferenced, so this is actually undefined behavior, no? Anything can happen.


You're right that it's UB, good point. It's hard to think what a sane compiler would do in this case besides going through with the dereference, though.


A pointer dereference allows the compiler to assume that a pointer is non-NULL, so it can then perform "invalid" optimisations (like dead-code removal). This post contains a very small example: http://blog.llvm.org/2011/05/what-every-c-programmer-should-...


Does `unique_ptr` incur a run-time penalty for those checks?


For all intents and purposes no. unique_ptr is nothing more than a pointer container that disables copying but allows moving. Therefore only one unique_ptr should be owning a pointer at any given time (unless, like everything in C++, you go around it).


Thanks so much. I'll certainly have to use this if this makes it to Rust proper.


The downside of picking subtler, more intricate examples is that you waste your reader's time trying to understand the subtleties of the example, which isn't teaching them anything about Rust.

Another option is to say something like, "For the sake of brevity, this is a very simple and arguably obvious violation of safety. In practice, there are many subtle and hard-to-diagnose sources of unsafety in C++, even when you use safer abstractions like shared_ptr." This allows you to avoid getting sidetracked and losing your reader, while heading off skepticism of readers with more knowledge about C++.


Your C example is caught by the clang static analyzer:

  main.c:14:20: warning: Use of memory after it is freed
    printf("%d\n", *v);
                   ^~


One thing I did when trying to come up with a "What features of language X help safety" article for ATS was to work through a simple example from unsafe, to safer, to safe. This is one such article I did:

http://bluishcoder.co.nz/2012/08/30/safer-handling-of-c-memo...

I think something similar for Rust would make for a great article subject too.


A conversational style is fine, but those particular sentences actually get in the way. The style comes from the overall tone of the writing, not just sentences which talk to the reader directly.


> I happen to write like I talk, it gets good and bad reviews.

I found it unusually clear, which I liked a lot. There was a bit at the end to do with unsafe that was a bit difficult to follow, but a sentence like 'but wait, RWArc is just a library, how come it was able to do the thing that I said the compiler wouldn't let you do?' would have cleared it up just fine.


> but a sentence like 'but wait, RWArc is just a library, how come it was able to do the thing that I said the compiler wouldn't let you do?' would have cleared it up just fine.

Hmmm. Isn't that what this says?

> So, the Rust language does not allow for shared mutable state, yet I just showed you some code that has it. How’s this possible? The answer: unsafe.


Well for me at least, you're introducing the language so it seems like RWArc is part of that. I have no idea as to whether or not there are any little cheats put into things like RWArc that are not available to other code, so emphasizing that RWArc is the kind of thing I could have written myself would have helped me understand.

It's that jump from being introduced to RWArc to considering how it might have been implemented that wasn't obvious to me.


Gotcha. It seems a few other people had this issue too, I'll take care of it in the next iteration. Thank you!


Thanks for doing this, I've got much more appetite for reading about rust than can easily be sated, so it's very gratifying to have someone explain the key concepts so clearly and well.


Any time. If only there were five of me!


>I happen to write like I talk, it gets good and bad reviews. A more neutral style would be more appropriate if/when this gets pulled into Rust itself, thanks, that's a great point

I think the style's alright. I also like the tone of the tutorial on the Rust website, don't know if you were involved with that too.

>Thanks for the great feedback. :)

FWIW, I found the feedback to be a little too condescending for what is a volunteer effort to help people understand a new language.


> I personally have only ever written unsafe when wrapping something with FFI.

Is discussion of unsafe in an introduction article esoterica?

By the way, loved the proposal video [1] you did on docs for rust: it's a pretty solid proposal for any language.

[1] https://air.mozilla.org/rust-meetup-december-2013/


Unsure. I think it's really important that people know the rules _can_ be broken if you think you know better than the compiler.

Thanks. :)


tptacek, I've been meaning to ask this question to someone with some extensive security experience: Is there a compelling story for security researchers and engineers for low-level languages with an emphasis on memory safety (like Rust or Cyclone)? From my admittedly limited perspective, it seems like it could eliminate a lot of mistakes that lead to insecure software, but then again, I don't know how common memory-flaw exploits are.


> From my admittedly limited perspective, it seems like it could eliminate a lot of mistakes that lead to insecure software, but then again, I don't know how common memory-flaw exploits are.

We have done measurements on this for Firefox code. 100% of the security vulnerabilities for Web Audio were memory safety flaws.


How many bugs in total? And, memory safety that Rust would've protected against?


I forget the exact number, but it was at least 20. And I believe they concluded that, yes, Rust would have caught them. I'll need to ask pcwalton to be sure though.


Absolutely it does.


Green threads are the standard term for language-level threads that are not OS-level threads.


> since every competent C/C++ programmer knows not to do it.

They are hard to come by in this day and age of cutting costs everywhere while offshoring components.


They have always been hard to come by because not everyone wants to spend 10 years banging their heads against the wall of memory management errors unnecessarily.


When I started coding, GC-enabled languages only existed on very expensive workstations in research places like Xerox PARC.

Somehow I have this memory, maybe false, that in those days developers had better skills than most of the younger developers I have met in the last five years.


Sure, because it was expected that certain features would take an exponentially longer time to develop because many features were not created back then and the market was not as competitive.

If I'm asked to make an image gallery for a GUI you can bet 99% of people will go for an existing solution out of a matter of productivity, and the reality is that even if I was interested in making a gallery of my own or acquiring a deep understanding of how to make a proper piece of code that solves that problem, it is extremely unlikely I could come up with something better in the span of time I've been allocated to solve it.

Similarly, there's no reason to deal with memory management errors unless you cannot avoid it at all costs. None.

This sounds like vinyl DJs complaining that kids nowadays have DJ software that does beat detection and automatic loops n' shit. Sure, but the production value of your average mix has gone up tremendously now that you don't have to spend 30% of your time beat matching and instead you can now add samples and synths.

For the record, I can program in Assembly and I can DJ with vinyl, but it's been years since I've had to resort to either. At least the vinyl has some aesthetic, subjective vintage value.


> At least the vinyl has some aesthetic, subjective vintage value.

For some of us so does working with assembly and other low-level quirks and domains. :) The challenges and problems, tools and solutions are very different and IMO much more interesting than "getting shit done for real life value". To each their own I guess.


Rust's tasks can either use native threads or green threads multiplexed on top of a pool of native threads, without change in API.


I like this tutorial because it dives straight into the most unique/unfamiliar parts of Rust (ownership/references) and gets them out of the way. It's a "learn the hard way"-style tutorial, and I think that's the best approach. Once you learn how ownership and borrowing work, along with ARCs and concurrency, everything else is really simple and just naturally falls out.


Agreed. I'd love to see an even more in depth document that takes a wide range of ownership/allocation patterns that are common in C and C++, shows Rust equivalents, and analyses why Rust can or cannot prove that they are safe (i.e. whether they require unsafe blocks or not). I don't have an intuitive sense yet for the boundaries of what Rust can automatically prove safe. How much C and C++ could be directly translated into safe Rust and how much would need to be reworked or put in unsafe blocks?


I like this a lot, and think it's the best intro to Rust yet. The thing that concerns me a bit is that it presents the special cases in concurrency without impressing some of the most important points. Primarily, the channel example presents the send as copying, which in this case it is, but one of the main advantages of Rust's channels and owned types is that message passing of heap-allocated types does not need to copy. It probably doesn't stress hard enough that Rust tasks do not share memory before saying, 'oh, but really you can share memory if you need to', though I see that the Arc and RWArc examples are good ways to introduce the concept of using unsafe code to provide safe abstractions.
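
To make the no-copy point concrete, here is a minimal sketch in present-day Rust rather than the 2014 syntax used in the article. It assumes `std::sync::mpsc` channels and `std::thread` as stand-ins for the channels and tasks of the time, and `send_without_copy` is a hypothetical name. Sending a `Vec` moves ownership of its heap buffer, so the receiver should see the same buffer address the sender had:

```rust
use std::sync::mpsc;
use std::thread;

/// Sends a heap-allocated vector to another thread and reports whether the
/// receiver sees the same heap buffer (same address) the sender created.
/// Hypothetical helper, written only for illustration.
fn send_without_copy() -> (bool, Vec<i32>) {
    let (tx, rx) = mpsc::channel::<Vec<i32>>();

    let numbers = vec![1, 2, 3];
    // Record the address of the heap buffer before sending.
    let addr_before = numbers.as_ptr() as usize;

    let receiver = thread::spawn(move || {
        let received = rx.recv().expect("sender hung up");
        let addr_after = received.as_ptr() as usize;
        (addr_after, received)
    });

    // Ownership of `numbers` moves into the channel here. Only the small
    // (pointer, length, capacity) handle is moved, not the elements.
    tx.send(numbers).expect("receiver hung up");
    // Using `numbers` after this line would be a compile error.

    let (addr_after, received) = receiver.join().expect("receiver panicked");
    (addr_before == addr_after, received)
}
```

Calling `send_without_copy()` gives back a flag saying whether the two addresses matched, plus the vector as the receiver saw it.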


The focus on C++ as point of comparison is understandable given Mozilla's background, but in Internet land most systems software runs on the JVM, and is written in Java, or increasingly, Scala (see LinkedIn and Twitter, for example).

The issues of memory layout and the like come up here, and unlike Rust the JVM doesn't give much control of this aspect. See Martin Thompson's blog for an example of someone very concerned with issues of performance on the JVM (http://mechanical-sympathy.blogspot.co.uk/) I believe Rust could see a lot of adoption within this community as a "better" Scala -- a modern high-level language that allows dropping down to bit-twiddling when performance is an issue. It needs higher kinded types before it will work for me, but I hear that is on the road-map.

BTW, I've read a few Rust tutorials and they all fail for me in the same way: too much waffle and not enough getting down to the details. I understand the difference between stack allocation, reference counting, and GC, I get why shared mutable state is a bad idea, etc. What I want is a short document laying out the knobs Rust provides (mutable vs immutable, ownership, allocation) and how I can twiddle said knobs.


but in Internet land most systems software runs on the JVM, and is written in Java

[[Citation needed]].

The Internet land I've lived in mostly lives in C with a smattering of non-JVM scripting languages (Python, Ruby, PHP, etc) on top.


It's probably more accurate to say that in the Enterprise most software is running on the JVM. But there are certain (large) internet companies that have significant system software running on the JVM (Google, Twitter).


I think that's totally true. It's not just so much because of Mozilla, but also because Rust is probably closer to C++ than Java...

The official tutorial contains much of that information.


Funnily, I see it as being closer to Scala than either C++ or Java. I think it all comes down to background.

The official tutorial isn't very good on issues of memory management. The first mention of managed references (I assume that means GCed) is in an example in section 11. Nowhere does it actually explain what managed means (and if I missed it, I blame the tutorial for not making it explicit enough!)


Yeah, so the syntax for managed references was removed, yet some of it lingers in the docs. I'll make a note to clean that up.

The official tutorial is really bad but at least has a large bulk of information. I'd love to re-write it, but I haven't had the time.

(And it was more reference counted than actually garbage collected: now we have Rc<T> and Gc<T> for both strategies.)


(Gc<T> still uses @ internally; i.e. it's reference counting and not garbage collection too. This will be fixed.)


Well that's the thing: while Rust occupies the same space as C++, it doesn't just take its inspiration from there. It also takes it from functional languages like ML or Haskell (the compiler was originally written in OCaml).


I think the emphasis on "unsafe" isn't helpful. As far as I can tell, the only thing that "unsafe" is enabling is that Arc and RWArc are written in Rust rather than in C in the runtime (the way they'd be in Go, or Erlang, or Haskell). The things that make Rust able to do what it does are ownership and tasks and lifetimes and affine types -- all the things the post covers before talking about "unsafe".

Also, it gives the impression that there's something fundamentally unsafe about all of this, whereas the whole point is that these abstractions are _safe_ to use.


I think `unsafe` is important to talk about. It's the escape hatch that lets you subvert the type system and do things that the compiler cannot statically reason about. The ability to implement Arc in pure Rust code is very important. But I do agree that it could have been presented perhaps in a different way.


Agreed. My experiments with rust have been uphill because I've been fighting with lifetimes, ownership and borrowing.

For the most part I failed to see that I -needed- unsafe code in some situations. Instead I was trying (failing) to annotate my code to ridiculous levels with lifetimes. It was really frustrating.

I don't think the current docs do a great job of putting unsafe in a suitable perspective. It's somewhat downplayed IMO.

Still, I've learned now and it's been pretty pleasant after that.

FWIW I've been doing c++ for maybe 18 years, writing device drivers, game engines, compiler development. I thought rust was made for me but it's been tough, much more so than any other language except maybe SML!


> Also, it gives the impression that there's something fundamentally unsafe about all of this, whereas the whole point is that these abstractions are _safe_ to use.

Right, this is my point. I should find a way to make it a bit more clear.

I wrote it this way because the systems people I talk to are skeptical at times that a compiler knows best. After all, there's a reason you want that low-level control in the first place, right? The ability to escape things when you have to relaxes people.


> I wrote it this way because the systems people I talk to are skeptical at times that a compiler knows best.

Well, the best way to combat that would be demonstrate that cool things can be done safely in Rust, but that probably requires a lot more than fits in your introduction.

I think you could allay that fear by addressing it directly, rather than saying that Rust's secret sauce is unsafety.


Hm. I thought that talking about building ARC and discussing its implementation was something cool that directly addresses it. What would you like to see?


I think talking about how ARC is implemented in Rust is cool. I would just pitch it differently.

I think what I don't like about your current description is that it makes it seem like unsafety is what allows you to _have_ Arc in the language. Instead, unsafe blocks are what allow you to _implement_ Arc inside the language.

I'd start the footnote this way:

---------------

A footnote: Implementing Arc

So, the Rust language doesn't let us use shared mutable state in dangerous ways, but what happens if we really need to get down and dirty? For example, what if we wanted to _implement_ `Arc` ourselves?

In fact, `Arc` and `RWArc` are both implemented in Rust. Inside their implementations, they use locks, low-level memory operations, and everything else you might see in a C++ program. However, anytime we use these features, we have to wrap them in an `unsafe` block.

...

---------------

I hope that conveys the different emphasis that I'm talking about.
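
For what it's worth, here is a rough sketch of that emphasis in present-day Rust. It is not how `Arc` or `RWArc` are actually implemented; `SpinLock`, `with`, and `count_with_spinlock` are hypothetical names for a toy lock. The idea it tries to show is that the `unsafe` parts live inside the implementation, while callers only ever see a safe, closure-based API:

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Toy lock for illustration only (hypothetical, not a std type).
pub struct SpinLock<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>,
}

// Unsafe promise to the compiler: sharing a SpinLock across threads is fine,
// because `with` never lets two callers touch the value at the same time.
unsafe impl<T: Send> Sync for SpinLock<T> {}

impl<T> SpinLock<T> {
    pub fn new(value: T) -> Self {
        SpinLock {
            locked: AtomicBool::new(false),
            value: UnsafeCell::new(value),
        }
    }

    /// Runs `f` with exclusive access to the value. Callers write no `unsafe`.
    /// Note: if `f` panics the lock stays held, which this sketch ignores.
    pub fn with<U>(&self, f: impl FnOnce(&mut T) -> U) -> U {
        // Spin until we flip `locked` from false to true.
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop();
        }
        // The one unsafe operation: turning the raw pointer into `&mut T`.
        // It is sound only because we hold the lock at this point.
        let result = f(unsafe { &mut *self.value.get() });
        self.locked.store(false, Ordering::Release);
        result
    }
}

/// Safe code using the abstraction from several threads.
fn count_with_spinlock(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(SpinLock::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    counter.with(|n| *n += 1);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker panicked");
    }
    counter.with(|n| *n)
}
```

Nothing in `count_with_spinlock` needs an `unsafe` block, which is the distinction between having the abstraction and implementing it.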


Certainly. Thanks. That's much more clear.


I think that's a necessary approach. The first time I encountered Rust, in a treatment that covered unique and managed but not unsafe/raw pointers, my impression was: ‘You promised GC was optional, but I can't even write a DAG’.


A little OT...but what's with Svbtle's apparent default styling of links? There's no indication that any particular word or sentence contains a link, which basically makes those links invisible to readers. Or do lots of people read web articles by randomly hovering the mouse around the text?

But relevant to the OP...I generally try to save useful tutorials like this on my pinboard, which often doesn't pick up the meta-description text. So I double-click to copy the first paragraph and paste it into pinboard...except in the OP, I kept on clicking on text that was hiding links underneath.

It's a strange UI decision, and one that seems to discourage the use of outbound links...if you can't see the links, then what is the purpose of them? For spiders?


> There's no indication that any particular word or sentence contains a link

There is a subtle grey underline, which I'm sure can be nearly invisible depending on your screen.


Allow me to save everyone from opening their devtools:

  background-color: #FFF;
  border-bottom: 2px solid #F4F4F4;

In RGB, that's the difference between 255 and 244. It's more than a little absurd.


Yeah, I think that they have been made lighter recently. Bummer :/


"Rust does not have the concept of null."

How can I have the pointer to something that is maybe allocated or maybe present? Do I have to have additional booleans for such uses? Isn't that a waste?

How can I effectively build complex data structures like graphs, tries etc then?

I'd like to see that covered too.


Rust uses the `Option` type for that (name taken from Scala and ML, it's called `Maybe` in Haskell).

http://static.rust-lang.org/doc/master/std/option/index.html

What's neat is that if you stuff a pointer inside an `Option`, then not only is it guaranteed to be memory-safe but it also compiles down to a plain old nullable pointer at runtime, so there's no extra overhead while still retaining safety.
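
On the tries question, a minimal sketch in present-day Rust (so `Box` instead of the `~` pointer of the time). `TrieNode`, `insert`, and `contains` are hypothetical names, and the code assumes words made only of lowercase ASCII letters. Each child slot is an `Option<Box<TrieNode>>`, so "no child here" is `None` rather than a null pointer or an extra boolean:

```rust
/// One node of a trie over the letters 'a'..='z' (illustrative only).
#[derive(Default)]
struct TrieNode {
    // `None` means "no child for this letter"; no separate flag is needed.
    children: [Option<Box<TrieNode>>; 26],
    is_word: bool,
}

impl TrieNode {
    /// Inserts `word`. Assumes lowercase ASCII letters only; anything else
    /// would index out of range (or underflow) and panic.
    fn insert(&mut self, word: &str) {
        let mut node = self;
        for byte in word.bytes() {
            let idx = (byte - b'a') as usize;
            // Allocate the child on first use, then step into it.
            node = &mut **node.children[idx].get_or_insert_with(Box::default);
        }
        node.is_word = true;
    }

    /// Returns true if `word` was inserted earlier (same input assumption).
    fn contains(&self, word: &str) -> bool {
        let mut node = self;
        for byte in word.bytes() {
            let idx = (byte - b'a') as usize;
            // The compiler makes us handle the "absent" case explicitly.
            match node.children[idx].as_deref() {
                Some(child) => node = child,
                None => return false,
            }
        }
        node.is_word
    }
}
```

Usage is `let mut root = TrieNode::default(); root.insert("rust");` followed by `root.contains("rust")`.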


So it doesn't have "null" but it has "None." On the linked page:

     // Remove the contained string, destroying the Option
     let unwrapped_msg = match msg {
         Some(m) => m,
         None => ~"default message"
     };

Now why is there that ";" at the end? Before that there is a construct without it:

    // Take a reference to the contained string
    match msg {
        Some(ref m) => println!("{}", *m),
        None => ()
    }

And did we have to take the "reference" to print the value of m?

And one note more: the linked page doesn't explain that the Some actually introduces "Option" type. It writes about the Option but the code uses just "Some."

     let msg = Some(~"howdy");

Some as a "keyword" seems to have two different semantical purposes, depending if it's in the "match" or not. I don't see that explained too.


Most of this is out of the scope of this tutorial, but is in the comprehensive one.

match is an expression, but let is a statement. In the first example, the match expression is used inside of the let statement, in order to produce what is assigned.

I'm not 100% sure if you _must_ take that reference, but given that it's a pointer, that makes sense. In the previous version, it's simply returning a value, but println! needs the contents, not the pointer itself.

Inside the linked Option enum, it shows both: http://static.rust-lang.org/doc/master/std/option/enum.Optio...

It's an enum like any other.

That said, these are all good points, and this documentation should be improved. Thanks, I'll add this to my list.


The first example is an expression. The result of `match` will be assigned to `unwrapped_msg`.

The second example is a statement. Thus, we're not using the result of the `match` statement.
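
A small sketch of both uses in present-day Rust (`String` in place of the old `~"..."` strings; the function names are made up for illustration). `Some` is not a keyword: it is one variant of the `Option` enum, used as a constructor when building a value and as a pattern inside `match`:

```rust
/// `match` used as an expression: its value initialises the `let`.
/// The `;` belongs to the `let` statement, not to the `match`.
fn unwrap_or_default(msg: Option<String>) -> String {
    let unwrapped_msg = match msg {
        // `Some(m)` here is a pattern that pulls the string out.
        Some(m) => m,
        None => String::from("default message"),
    };
    unwrapped_msg
}

/// `match` used only for its side effect: every arm evaluates to `()`,
/// so nothing is assigned and no trailing `;` is required.
/// Returns how many lines were printed, just so the effect is observable.
fn print_if_present(msg: &Option<String>) -> usize {
    let mut printed = 0;
    match msg {
        // Matching through a reference borrows `m` instead of moving it.
        Some(m) => {
            println!("{}", m);
            printed += 1;
        }
        None => (),
    }
    printed
}

/// `Some(...)` used as a constructor: it builds an `Option<String>`.
fn make_greeting() -> Option<String> {
    Some(String::from("howdy"))
}
```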


You use the Option type. Option can either be None or Some(T). I think Rust optimizes this to (non)null, so there is very little, if any, overhead compared to the equivalent C++ code.


It compiles down to a nullable pointer if possible, and if not, it needs a tag, so yes, it's very minimal if any.
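
One way to check this yourself, sketched in present-day Rust with `std::mem::size_of` (`option_sizes` is a hypothetical helper). A `Box` can never be null, so `None` can reuse the null bit pattern; a `u32` has no spare bit pattern, so the `Option` has to carry a tag:

```rust
use std::mem::size_of;

/// Returns (pointer case costs nothing, integer case needs extra space).
fn option_sizes() -> (bool, bool) {
    // Option around a non-nullable pointer: same size as the bare pointer.
    let pointer_is_free = size_of::<Option<Box<i32>>>() == size_of::<Box<i32>>();
    // Option around a plain integer: every bit pattern is already a valid
    // u32, so room for a tag has to be added.
    let integer_needs_tag = size_of::<Option<u32>>() > size_of::<u32>();
    (pointer_is_free, integer_needs_tag)
}
```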


I personally dislike the style of tutorial that has lots of 'we' and 'lets' in it.

I suppose part of that comes from the tendency for such tutorials to provide revelations instead of motivators. For example, in this tutorial there is 'look at this C++ code because I said to' and then two sentences later it explains that the C++ code ends up in a garbage value.

But this is probably very much a point of style and I'm sure lots of people think my view is stupid.


I tend to be very collectively focused, so I do tend to write this way. Thanks for the feedback; the style may not be appropriate for an official tutorial.

> the tendency for such tutorials to provide revelations instead of motivators.

I'm going to have to think about this, that's very interesting. I would like to say that my revelations provide motivation, but that may be wishful thinking...

Do you think there's a way to demonstrate these concepts in a way without 'the reveal'? It seems to me that comparison will always feel a bit reveal-y, as demonstrating some kind of difference is inherent in comparison.


Well, try to state the motivation prior to the explanation.

"The second function in this C++ code does not properly initialize num":

...

"How does that happen?"

...

"Rust avoids this by"

...


Great, thank you.


Thanks for the tutorial! Rust seems a bit too complex to me. Like a C++ on steroid that wants to do and be everything. Nothing wrong with that but not my cup of tea. I'd rather stick to C if I need tight memory management, it is way simpler and straight forward. And if I need concurrency, I'll stick to Golang (or erlang). Really, it's such a pleasure to read some golang after reading this 30 minutes of Rust. Anyway, just my opinion.


> Rust seems a bit too complex to me. Like a C++ on steroid that wants to do and be everything.

No, that's explicitly not a goal. The goal is to enable zero-cost abstractions and memory safety. Everything here is in service of that goal.

> I'd rather stick to C if I need tight memory management, it is way simpler and straight forward.

The problem is that the "simple and straightforward" model of C leads to a lot of very not-simple-and-straightforward time in front of Valgrind to get the program to work, or worse, to fix security vulnerabilities.


I would argue that rust is both simpler and more consistent than either C or C++. No preprocessor, no pre/post increment, a real type system (instead of templates)—there's just not the mountain of undefined parts of the language that tends to be a large problem. Scala, in spite of being on a VM, is far closer to C++ IMHO in terms of attitude.

I actually find golang harder to read without parametric types (sans built-in slices and maps).


> Like a C++ on steroid

Rust most definitely isn't "like a C++ on steroid" from a complexity standpoint. It tries (and, I think, mostly succeeds) to be a significantly simpler and more coherent language.

It does make things which are implicit in C or C++ (e.g. ownership) explicit. That's a good thing, you need to know your ownership in C or C++, the language just doesn't help you much.


> I'd rather stick to C if I need tight memory management, it is way simpler and straight forward.

Well, if you don't want your C to segfault, you have to understand these concepts anyway, they're just implicit to the language. C's memory management may be simple, but it's surely not easy.


> C if I need tight memory management, it is way simpler and straight forward

If you work alone, yes. Good luck on a team of 50+ developers with high attrition rates.


> Really, it's such a pleasure to read some golang after reading this 30 minutes of Rust.

Incidentally, I find reading Go to be much more verbose and even heavier cognitively compared to reading Rust code.


Thanks for this straightforward and accessible intro to some of Rust's unique features!


You're very welcome.


I like the overall structure, but I'm not sure about throwing so much syntax without explaining it in detail.


I think the amount of syntax you've shown is fine. Since there's C++ and Rust code that both do the same thing, it allows readers familiar with C++ to infer what the Rust syntax means. This is a much quicker process than reading "Here's how you write an if clause ..." prose.

When it comes to showing new languages to experienced programmers, I prefer showing code, and explaining what it accomplishes. Experienced programmers will start building up a Bayesian model of the syntax without being explicitly told.

I think of this as the "Dive Into Python" approach: http://www.diveintopython.net/


I actually liked the amount of detail in explaining the syntax. For the most part, the code samples are very readable and understandable as they are (and I haven't written any Rust).

I think explaining a lot of the syntax would have made this more of a "how to Rust" article, rather than a "why to Rust" article. I really appreciated the "why to Rust" tone of this article, and it provides a great explanation of why the language is different from C / C++ and why I should care.


Excellent. Thank you.

It's hard to have a beginner's mindset after you've done something for a year.


I've only written a little C and C++, about six years ago, but enough to understand the syntax well enough to understand C and non-templated C++ code. The introduction was at just the right level for me, and I didn't have any problems with the syntax.

I think it's well written, and it makes me interested in learning more about Rust. The one thing I'd like to have seen addressed is type inference.

Also, you don't credit the original author.


Great, thank you.

"This excellent presentation" is a link.


From the article:

> But wait, how is that possible? We can’t both allow and disallow mutable state. What gives?

I had to re-read the above a few times. And still don't get it. Steve, what do you mean with it? Are you talking about how are Arc/RWArc implemented? Or is it something else?


The previous sentence is "We gain the efficiency of shared mutable state, while retaining the safety of disallowing shared mutable state."

I also wasn't sure about the part you quoted, but I was trying to explain that it's not that Rust _doesn't_ allow shared mutable state, it's that while the language doesn't, you can use unsafe to build safe abstractions, so in practice, it does. Hmmm.


I still don't get it... And the previous sentence is now confusing for me too :)

First you showed two examples: Arc (to share immutable data) and RWArc (to share mutable data with enforced mutexes around closures). Then you talked about `unsafe`. Seems easy, one needs a "backdoor" to implement RWArc in Rust (at first I thought it was implemented in C/C++).

But (quoted) sentences between Arc/RWArc part and `unsafe` part don't really connect them, at least to me.


Ha! Bummer, maybe I will just need to re-write this paragraph.

A RWArc is shared mutable state: you can have two references to the Arc in two different tasks. Yet I said that Rust throws a compiler error for shared mutable state.

> (at first I thought it was implemented in C/C++).

There's very little C++ in Rust anymore. :)

> Seems easy, one needs a "backdoor" to implement RWArc in Rust

Yup, this is exactly the point with those two sentences. Maybe I should just straight-up remove them.


> There's very little C++ in Rust anymore.

I think we've now got none, except for LLVM in the compiler, which means any binaries built using the stdlib don't need any C++ libraries (to be clear, libraries without the stdlib have never needed any).


> Maybe I should just straight-up remove them.

Or be explicit that you'll now talk about RWArc implementation.


Yes. This.

I found it a bit worrying, as when I read that I had to go back and re-read the code a few times to make sure that there wasn't an unsafe block in the code, or perhaps that the use of RWArc implied unsafe, and would then turn that whole function into an automatically unsafe block.

Nevertheless, very cool. I really liked the tutorial, I'm getting quite excited by Rust.


Thanks for this. I've been thinking about getting into Rust recently and this motivates me to do so now.


Excellent. You're very welcome. :)


The last time I touched C code was my sophomore year in college, so maybe 12 years ago? As a result, the last time I had to deal with pointers and such was back then, as well.

I'm primarily a web-dev. Ruby, PHP, and Javascript are the languages I'm most familiar with at the moment.

Are there any Rust for Dummies-style tutorials floating around? As simple as this introduction is, it was still over my head...


I have you covered in that case as well: http://www.rustforrubyists.com/

I want to provide a version of the 30 minute intro that's not strictly for systems people as well, but you have to start somewhere.


Do you see Rust as a language that people will write the business domain type applications in or as a supplement to languages like Ruby, Python, etc. to write those "gotta have performance here" parts of the application?


The second ever (that we know of) use of Rust in production is by Tilde, who are using Rust embedded in Ruby for Skylight to help ensure that memory usage stays down and to get more consistent performance. Garbage collection while doing performance monitoring is bad news.

I'm not entirely sure yet. I wrote another post that touches on this: http://words.steveklabnik.com/rust-is-surprisingly-expressiv...

I think that it's possible that Rust might eventually be useful as an application level language. We'll see how it shakes out. We've been in love with scripting languages for the past few years, but now their drawbacks are becoming more apparent. Look at all the Rubyists and Pythonistas flocking to Go, for example...

Interesting times indeed.


I'm not the person you asked, but I see Rust as making a play in these areas:

- embedded systems

- games

- high performance + high correctness environments

Personally, I don't see it as a web-app or line of business app development languages. But it will allow you to create the components that the webapp and LoB apps call into. But I imagine if some people really like Rust for those purposes, they will start building the libraries and code infrastructure to make it easy and away we go.


Is Rust borrowing any kind of code from what is used for Objective-C's ARC technology for detecting the lifetime of a variable and automatically freeing the resource? Is it a commonly known algorithm?


Everything is written in Rust, so we're not borrowing the code.

Reference counting is very common: http://en.wikipedia.org/wiki/Reference_counting


I had the impression that Rust and ARC were doing a bit of static code analysis to detect when a "reference count decrease/increase" could be performed (thus the "automatic" part in ARC), and inserting code at compile time. Which seems a bit more complicated than simply decreasing them dynamically when a pointing object is destroyed. But now that I come to think of it, things look a little blurry for me on that part.
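
A small sketch that may help unblur it, in present-day Rust using `std::rc::Rc` (`observe_counts` is a hypothetical name). As far as I understand it, the compile-time part is knowing where each handle goes out of scope; the count itself is an ordinary runtime counter that goes up on `clone` and down on drop:

```rust
use std::rc::Rc;

/// Returns the strong count after creation, after a clone, and after the
/// clone has gone out of scope.
fn observe_counts() -> (usize, usize, usize) {
    let first = Rc::new(String::from("shared"));
    let after_new = Rc::strong_count(&first);

    let after_clone;
    {
        // Cloning the handle bumps the counter at runtime; the string
        // itself is not copied.
        let second = Rc::clone(&first);
        after_clone = Rc::strong_count(&second);
        // `second` is dropped at the closing brace. The compiler decides
        // statically that the drop goes here; the decrement is dynamic.
    }
    let after_drop = Rc::strong_count(&first);

    (after_new, after_clone, after_drop)
}
```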


I'm a big fan of that kudos button! Very nice website and an interesting introduction to Rust.


I find it annoying to the point of offensive. Actions being activated on mouse-hover is terrible; I did not intend to give "kudos" but was curious what might be linked under it, and suddenly an action is recorded. Now I can't consider any webpage safe and have to watch where my mouse goes; that's wrong.


Three cheers for NoScript!

I understand that it doesn't really solve your problem, but if it's safety you're worried about...


Thanks! I don't make any of that, it's just Svbtle.


> You can see how this makes it impossible to mutate the state without remembering to aquire the lock.

Not quite true. Looking at the type signature of e.g. RWArc::write I see this:

    fn write<U>(&self, blk: |x: &mut T| -> U) -> U

which means I could probably do:

    let mut n = local_arc.write(|nums| {
         nums[num] += 1;
         return ~(*nums);
     });
    n[2] = 42;


You're not letting anything escape. You're copying `nums` and then mutating your copy. And this only works because `nums` can be implicitly copied like this. If you changed it from `[int, ..3]` to `~[int]` you'd get a compiler error (as `~[T]` cannot be implicitly copied).
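
The same point sketched in present-day Rust, with `Arc<RwLock<...>>` as an assumed stand-in for `RWArc` (the function name is made up). The array is `Copy`, so what leaves the locked region is an independent copy, and changing it afterwards does not touch the shared state:

```rust
use std::sync::{Arc, RwLock};

/// Returns (the array still inside the lock, the copy mutated outside it).
fn copy_out_then_mutate() -> ([i32; 3], [i32; 3]) {
    let shared = Arc::new(RwLock::new([1, 2, 3]));

    let mut n = {
        let mut nums = shared.write().expect("lock poisoned");
        nums[0] += 1; // mutation under the lock
        let copy = *nums; // `[i32; 3]` is `Copy`: this duplicates the array
        copy
    }; // the write guard is released here

    // Only the local copy changes; the shared array is untouched.
    n[2] = 42;

    let inside = *shared.read().expect("lock poisoned");
    (inside, n)
}
```

With a `Vec<i32>` in place of the array, `*nums` would not compile as a copy; you would have to write an explicit `nums.clone()`, which makes the duplication visible.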



