
jmillikin · on March 8, 2025

Rust's inline assembly syntax is part of the language, and in principle the same Rust source would compile on any conforming compiler (rustc, gccrs).

C/C++ doesn't have a standard syntax for inline assembly. Clang and GCC have extensions for it, with compiler-specific behavior and syntax.

wakawaka28 · on March 8, 2025

I mentioned somewhere else but I might as well mention here too: there is no standard assembler that everyone uses. Each one may have a slightly different syntax, even for the same arch, and at least some C++ compilers allow you to customize the assembler used during compilation. Therefore, one would assume that inline assembly can't be uniform in general, without picking a single assembler (even assembler version) for each arch.

jmillikin · on March 8, 2025

You're talking about the syntax of the assembly code itself. In practice small variations between assemblers aren't much of a problem for inline assembly in the same way they would be for standalone .s sources, because inline assembly rarely has implementation-specific directives and macros and such. It's not like the MASM vs NASM split.
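As a rough illustration of how little the assembly dialect matters inside a short inline block, here is a sketch (assuming an x86-64 target; the function names `copy_intel` and `copy_att` are made up for this example) that writes the same register move twice, once in rustc's default Intel operand order and once with `options(att_syntax)`. On other architectures it falls back to plain Rust so the snippet still compiles.

```rust
// Hypothetical helpers: the same register-to-register move in two dialects.

#[cfg(target_arch = "x86_64")]
pub fn copy_intel(x: u64) -> u64 {
    let y: u64;
    // Intel operand order (destination first) is the default for asm! on x86.
    unsafe {
        std::arch::asm!(
            "mov {dst}, {src}",
            dst = out(reg) y,
            src = in(reg) x,
            options(pure, nomem, nostack, preserves_flags),
        );
    }
    y
}

#[cfg(target_arch = "x86_64")]
pub fn copy_att(x: u64) -> u64 {
    let y: u64;
    // AT&T operand order (source first), selected per block with att_syntax.
    unsafe {
        std::arch::asm!(
            "movq {src}, {dst}",
            src = in(reg) x,
            dst = out(reg) y,
            options(att_syntax, pure, nomem, nostack, preserves_flags),
        );
    }
    y
}

// Fallbacks so the sketch builds on targets other than x86-64.
#[cfg(not(target_arch = "x86_64"))]
pub fn copy_intel(x: u64) -> u64 {
    x
}

#[cfg(not(target_arch = "x86_64"))]
pub fn copy_att(x: u64) -> u64 {
    x
}
```

Only the template string changes between the two functions; the operand declarations and options around it are the same Rust syntax either way.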

This thread is about the compiler-specific syntax used to indicate the boundary between C and assembly and the ABI of the assembly block (register ins/outs/clobbers). Take a look at the documentation for MSVC vs GCC:

https://learn.microsoft.com/en-us/cpp/assembler/inline/asm?v...

https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html

Rust specifies the inline assembly syntax at https://doc.rust-lang.org/reference/inline-assembly.html in great detail. It's not a rustc extension, it's part of the Rust language spec.
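To make "the boundary" concrete, here is a minimal sketch of what the Rust side looks like, assuming an x86-64 target (the function name `add_asm` is invented for illustration, and other targets get a plain-Rust fallback). The register inputs and outputs are declared with `in`, `out` and `inout` operands and `options(...)`, all of which is syntax described by the Reference rather than by a particular backend.

```rust
// Hypothetical example: add two integers with a single `add` instruction.

#[cfg(target_arch = "x86_64")]
pub fn add_asm(a: u64, b: u64) -> u64 {
    let mut sum = a;
    unsafe {
        std::arch::asm!(
            // Template string: opaque to the compiler apart from {operand} substitution.
            "add {sum}, {b}",
            // `sum` is read and then overwritten, so it is declared inout.
            sum = inout(reg) sum,
            // `b` is only read.
            b = in(reg) b,
            // No memory access, no stack use, result depends only on inputs.
            // Flags are left as clobbered (the default) because `add` writes them.
            options(pure, nomem, nostack),
        );
    }
    sum
}

// Fallback so the sketch builds on targets other than x86-64.
#[cfg(not(target_arch = "x86_64"))]
pub fn add_asm(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}
```

The equivalent GCC extended-asm block would express the same information with constraint strings such as "+r" and "r" plus a clobber list, and MSVC's `__asm` uses a different scheme again.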

wakawaka28 · on March 8, 2025

>This thread is about the compiler-specific syntax used to indicate the boundary between C and assembly and the ABI of the assembly block (register ins/outs/clobbers).

I see... Nevertheless, this is a really weird issue to get bent out of shape over. How many people are really writing so much inline assembly and also needing to support multiple compilers with incompatible syntax?

jmillikin · on March 9, 2025

The biggest category of libraries that need inline assembly with compiler portability is compression/decompression codecs (like the linked article) -- think of images (PNG, JPEG), audio (MP3, Opus, FLAC), video (MPEG4, H.264, AV1).

Also important is cryptography, where inline assembly provides more deterministic performance than compiler-generated instructions.

Compiler intrinsics can get you pretty far, but sometimes dropping down to assembly is the only solution. In those times, inline assembly can be more ergonomic than separate .s source files.
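For comparison, this is roughly what the intrinsics route looks like in Rust. It is only a sketch, assuming an x86-64 target (where SSE2 is part of the baseline); `add4` is a made-up name, and other targets fall back to scalar code.

```rust
// Hypothetical example: add four 32-bit lanes at once via SSE2 intrinsics.

#[cfg(target_arch = "x86_64")]
pub fn add4(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    use std::arch::x86_64::{__m128i, _mm_add_epi32, _mm_loadu_si128, _mm_storeu_si128};

    let mut out = [0i32; 4];
    unsafe {
        // Unaligned loads, so the arrays need no special alignment.
        let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
        let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
        // Lane-wise wrapping addition (the `paddd` instruction).
        let sum = _mm_add_epi32(va, vb);
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, sum);
    }
    out
}

// Scalar fallback so the sketch builds on targets other than x86-64.
#[cfg(not(target_arch = "x86_64"))]
pub fn add4(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    std::array::from_fn(|i| a[i].wrapping_add(b[i]))
}
```

With intrinsics the compiler still chooses registers and schedules the instructions, which is convenient but is also why the exact instruction sequence is not guaranteed the way it is with hand-written assembly.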

vlovich123 · on March 8, 2025

Exactly. It picks a single assembler:

> Currently, all supported targets follow the assembly code syntax used by LLVM’s internal assembler which usually corresponds to that of the GNU assembler (GAS)

Uniformity like that is a good thing when you need to ensure that your code compiles consistently in a supported manner forever. Swapping out assemblers isn’t helpful for inline assembly.

jmillikin · on March 8, 2025

The quoted statement is weaker than what you're reading it as, I think. It's not a statement that emitted assembly code is guaranteed to conform to LLVM syntax, it's just noting that (1) at present, (2) for supported targets of the rustc implementation, the emitted assembly uses LLVM syntax.

Non-LLVM compilers like gccrs could support platforms that LLVM doesn't, which means the assembly syntax they emit would definitionally be non-LLVM. And even for platforms supported by both backends, gccrs might choose to emit GNU syntax.

Note also that using a non-builtin assembler is sometimes necessary for niche platforms, like if you've got a target CPU that is "MIPS plus custom SIMD instructions" or whatever.

estebank · on March 8, 2025

I didn't follow the stabilization process very closely, but I believe you're wrong. What you're describing is what used to be asm! and is now llvm_asm!. The current stable asm! syntax actually parses its own assembly instead of passing it through to the backend unchanged. This was done explicitly to allow non-LLVM backends to work, and to let alternative front-ends be compatible. I saw multiple statements on this thread about alternative compilers or backends causing trouble here, and that's just not the case given the design was delayed for ages until those issues could be addressed.

Given that not all platforms supported by Rust currently support asm!, I believe your last paragraph does still apply.

https://rust-lang.github.io/rfcs/2873-inline-asm.html

jmillikin · on March 8, 2025

This sentence from the Reference is important:

  > The exact assembly code syntax is target-specific and opaque to the compiler
  > except for the way operands are substituted into the template string to form
  > the code passed to the assembler.

You can verify that rustc doesn't validate the contents of asm!() by telling it to emit the raw LLVM IR:

  % cat bogus.rs
  #![no_std]
  pub unsafe fn bogus_fn() {
   core::arch::asm!(".bogus");
   core::arch::asm!("bogus");
  }
  % rustc --crate-type=lib -C panic=abort --emit=llvm-ir -o bogus.ll bogus.rs
  % cat bogus.ll
  [...]
  ; bogus::bogus_fn
  ; Function Attrs: nounwind
  define void @_ZN5bogus8bogus_fn17h0e38c0ae539c227fE() unnamed_addr #0 {
  start:
    call void asm sideeffect alignstack ".bogus", "~{cc},~{memory}"(), !srcloc !2
    call void asm sideeffect alignstack "bogus", "~{cc},~{memory}"(), !srcloc !3
    ret void
  }

That IR is going to get passed to LLVM's code generator (whose integrated assembler parses the template string) and possibly onward to an external assembler, which is where the actual validation of instruction mnemonics and assembler directives happens.

---

The difference between llvm_asm!() and asm!() is in the syntax of the stuff outside of the instructions/directives -- LLVM's "~{cc},~{memory}" is what llvm_asm!() accepts more-or-less directly, and asm!() generates from backend-independent syntax.

I have an example on my blog of calling Linux syscalls via inline assembly in C, LLVM IR, and Rust. Reading it might help clarify the boundary: https://john-millikin.com/unix-syscalls#inline-assembly
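For readers who don't want to click through, here is a small sketch in the same spirit (written for this thread, not copied from the post), assuming x86-64 Linux. The name `raw_getpid` is invented, and other platforms fall back to the standard library. Everything outside the "syscall" string is backend-independent asm!() syntax: the fixed registers, the clobbers, and the options.

```rust
// Hypothetical example: call getpid(2) directly via the `syscall` instruction.

#[cfg(all(target_arch = "x86_64", target_os = "linux"))]
pub fn raw_getpid() -> u32 {
    let ret: i64;
    unsafe {
        std::arch::asm!(
            "syscall",
            // rax carries the syscall number in (39 = getpid on x86-64 Linux)
            // and the return value out.
            inlateout("rax") 39i64 => ret,
            // The syscall instruction overwrites rcx and r11.
            out("rcx") _,
            out("r11") _,
            options(nostack),
        );
    }
    ret as u32
}

// Fallback so the sketch builds on other platforms.
#[cfg(not(all(target_arch = "x86_64", target_os = "linux")))]
pub fn raw_getpid() -> u32 {
    std::process::id()
}
```

rustc lowers those operand declarations into an LLVM constraint string much like the "~{cc},~{memory}" one in the IR dump above, which is the part llvm_asm!() made you write by hand.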

vlovich123 · on March 8, 2025

Assembly by definition is platform specific. The issue isn’t that it’s the same syntax on every platform but that it’s a single standardized syntax on each platform.

wakawaka28 · on March 8, 2025

I understood it that way too. I just expect that if there were more Rust compilers (a benefit which C++ has in spades) then there would most likely be many annoying differences between them as well. There isn't an ISO standard for Rust. For that matter I guess most programming languages with multiple implementations have basically the same pro and con: there's more than one way to do things.

jmillikin · on March 8, 2025

Note that becoming an international standard (via ISO, ECMA, IETF, or whatever) isn't necessary or sufficient to avoid dialects.

If the Rust language specification is precise enough to avoid disagreements about intended behavior, then multiple compilers can be written against that spec and they can all be expected to correctly compile Rust source code to equivalent output. Even if no international standards body has signed off on it.

On the other hand, if the spec is incomplete or underspecified, then even an ANSI/ISO/IETF stamp of approval won't help bring different implementations into alignment. C/C++ has been an ISO standard for >30 years and it's still difficult to write non-trivial codebases that can compile without modification on MSVC, GCC, Clang, and ICC because the specified (= portable) part of the language is too small to use exclusively.

Or hell, look at JSON, it's tiny and been standardized by the IETF but good luck getting consistent parsing of numeric values.
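A quick way to see the numeric problem, sketched with nothing but the standard library (`parse_both` is a made-up helper): the same JSON number token yields different values depending on whether a parser stores it as an integer or as an IEEE 754 double, and both choices are common in practice.

```rust
// Hypothetical helper: interpret one numeric token the way two different
// JSON parsers might, as an exact u64 and as a double truncated back to u64.
pub fn parse_both(text: &str) -> (u64, u64) {
    let exact: u64 = text.parse().expect("valid integer literal");
    let via_double = text.parse::<f64>().expect("valid number") as u64;
    (exact, via_double)
}
```

Feeding it "9007199254740993" (2^53 + 1) gives 9007199254740993 for the integer path and 9007199254740992 for the double path, because that value is not representable as a double.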

vlovich123 · on March 8, 2025

You see it as a benefit, I see it as ridiculously user hostile. Porting your code to a new platform isn’t just “implement new APIs”, it’s also “adjust your usage of the language to the dialect this vendor understands”. There is no benefit whatsoever to the end user and ecosystem of the language to having multiple frontends to contend with.

I’m all for multiple backends but there should be only 1 frontend. That’s why I hope gccrs remains forever a research project - it’s useful to help the Rust language people find holes in the spec but if it ever escapes the lab expect Rust to pick up C++ disease. Rust with a gcc backend is fine for when you want gcc platform support - a duplicate frontend with its own quirks serves no purpose.

I also hope Rust never moves to an ISO standard for similar reasons. As someone who has participated in an ISO committee (not language) it was a complete and utter shitshow and a giant waste of time taking forever to get simple things done.

jmillikin · on March 8, 2025

  > I’m all for multiple backends but there should be only 1 frontend. That’s
  > why I hope gccrs remains forever a research project - it’s useful to help
  > the Rust language people find holes in the spec but if it ever escapes the
  > lab expect Rust to pick up C++ disease.

An important difference between Rust and C++ is that Rust maintains a distinction between stable and unstable features, with unstable features requiring a special toolchain and compiler pragma to use. The gccrs developers have said on record that they want to avoid creating a GNU dialect of Rust, so presumably their plan is to either have no gccrs-specific features at all, or to put such features behind an unstable #![feature] pragma.

  > Rust with a gcc backend is fine for when you want gcc platform support
  > - a duplicate frontend with its own quirks serves no purpose.

A GCC-based Rust frontend would reduce the friction needed to adopt Rust in existing large projects. The Linux kernel is a great example, many of the Linux kernel devs don't want a hard dependency on LLVM, so they're not willing to accept Rust into their part of the tree until GCC can compile it.

vlovich123 · on March 8, 2025

Dialects are created not just because of different feature sets, but also because of different interpretations of the spec and bugs. Similarly, if Rust adds a feature, it’ll take time for gccrs to port that feature. Either that lag is itself a dialect, or adding features to Rust becomes a negotiation to get gccrs to adopt them, unless you really think gccrs will follow the Rust compiler with the same set of features implemented in each version (i.e. tightly coupled release cycles). The intentions are irrelevant; that’s going to be the outcome.

> A GCC-based Rust frontend would reduce the friction needed to adopt Rust in existing large projects. The Linux kernel is a great example, many of the Linux kernel devs don't want a hard dependency on LLVM, so they're not willing to accept Rust into their part of the tree until GCC can compile it.

How is that use case not addressed by rustc_codegen_gcc? That seems like a much more useful effort for the broader community to focus on, one that delivers the benefits of gcc without bifurcating the frontend.