The Case for Rust (in the base system)

msplsh · May 14, 2024

kpedersen said:
I think they said the same thing about Java.

In userspace, sure. I mean what's Android.

kpedersen · May 14, 2024

msplsh said:
In userspace, sure. I mean what's Android.

Still reliant on the NDK for many of its underlying libraries.

Just look at any substantial project and you will see a tonne of jni and native C/C++ libraries. Turns out Java wasn't a better tool.

With the introduction of C/C++ NativeActivity and Kotlin, it seems that Java usage is even becoming quite legacy now in the Android ecosystem.

msplsh · May 14, 2024

IMO that's just a symptom of Java not reaching far enough down. Hence stuff like Rust and Golang.

kpedersen · May 14, 2024

msplsh said:
IMO that's just a symptom of Java not reaching far enough down. Hence stuff like Rust and Golang.

Indeed. And C reaching further still (including deep into the JVM that is primarily written in it).

Plus we have another issue. Calling Rust from Java is quite tricky. Calling Java from Rust needs to go through C anyway, hence Rust bindings to JNI. That is another use-case this completely crap guide overlooks: binding against VM languages written in C (or anything else).

msplsh · May 14, 2024

You seem intent on muddying these things. Rust goes as far as C does before both of them have to bang rocks together to run an opcode.

Rust from Java should be as easy as a C JNI. IIRC Rust should be mostly indistinguishable from any other C library. Java from Rust is going to be difficult to do in order to reconcile memory models. Directions matter.
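For the primitive-only case, a rough sketch of what a JNI entry point written in Rust might look like is below. It assumes a hypothetical Java class `Demo` declaring `static native int add(int a, int b);`. The symbol name `Java_Demo_add` follows the JNI naming convention, and the `JNIEnv*` and `jclass` arguments are taken as opaque pointers because nothing here calls back into the JVM. Anything involving strings, arrays or objects would need the `JNIEnv` function table (or a binding crate), which this sketch does not attempt. The `#[unsafe(no_mangle)]` spelling assumes a recent toolchain; older ones write `#[no_mangle]`.

```rust
use std::ffi::c_void;

/// Native side of the hypothetical `Demo.add(int, int)`.
/// JNI's `jint` is a 32-bit signed integer, so `i32` matches it.
#[unsafe(no_mangle)]
#[allow(non_snake_case)]
pub extern "system" fn Java_Demo_add(
    _env: *mut c_void,   // JNIEnv*, unused in this sketch
    _class: *mut c_void, // jclass, unused in this sketch
    a: i32,
    b: i32,
) -> i32 {
    // Java `int` addition wraps on overflow, so mirror that here.
    a.wrapping_add(b)
}

fn main() {
    // Called directly for illustration only. In real use the JVM would
    // resolve this symbol after the library is loaded on the Java side.
    let sum = Java_Demo_add(std::ptr::null_mut(), std::ptr::null_mut(), 2, 3);
    println!("{}", sum);
}
```

The direction matters here: this only covers Java calling into Rust with plain integers, not Rust driving the JVM.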

Jose · May 14, 2024

kpedersen said:
Just look at any substantial project and you will see a tonne of jni and native C/C++ libraries. Turns out Java wasn't a better tool.

I work on plenty of substantial projects that are pure Java. They're not open source.

Kotlin is just the latest Scala, Clojure, Groovy, etc. Some lame semi "functional" or "declarative" language bolted on to the JVM badly. Hard pass.

kpedersen · May 14, 2024

msplsh said:
You seem intent on muddying these things. Rust goes as far as C does before both of them have to bang rocks together to run an opcode.

It's actually the mud you want to focus on more to understand Rust's limitations. With C you can tap into the underlying OS directly, using its C APIs. Opcodes are easy; it is the muddy layers in between that are complex to interface with if you are not using the language the underlying OS API was written for. This is why Rust's crates.io is filled with C bindings rather than tapping into e.g. inline assembly directly.

In short, the depths of the OS internals are out of reach unless Rust can consume C headers.

msplsh said:
Rust from Java should be as easy as a C JNI.

No, Rust has the same complexity as C++ does with JNI. One of the main use-cases of C exists because other languages can bind against it easily. With C++ and Rust you have no stable ABI, for one. Passing an std::string or equivalent is also no good.

With Rust you will need to do the same as C++. That is expose your *entire* API through extern "C" "unmangled" function wrappers. This comes with numerous limitations too.
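A minimal sketch of what such a wrapper layer can look like, assuming a hypothetical Rust function `greet` that takes and returns Rust string types. The names `greet_c` and `greet_free` are made up for illustration. The wrapper has to translate to NUL-terminated C strings at the boundary and hand ownership back through an explicit free function, which is one of the limitations mentioned above. The `#[unsafe(no_mangle)]` spelling assumes a recent toolchain; older ones write `#[no_mangle]`.

```rust
use std::ffi::{c_char, CStr, CString};

// Hypothetical idiomatic Rust API. `&str` and `String` have no stable ABI,
// so C (or JNI glue written in C) cannot call this directly.
pub fn greet(name: &str) -> String {
    format!("Hello, {}!", name)
}

/// C-ABI wrapper around `greet`. Returns a heap-allocated C string that the
/// caller must release with `greet_free`, or null on invalid input.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn greet_c(name: *const c_char) -> *mut c_char {
    if name.is_null() {
        return std::ptr::null_mut();
    }
    // Borrow the caller's bytes; invalid UTF-8 is replaced, not rejected.
    let name = unsafe { CStr::from_ptr(name) }.to_string_lossy();
    match CString::new(greet(&name)) {
        Ok(s) => s.into_raw(),
        Err(_) => std::ptr::null_mut(),
    }
}

/// Releases a string previously returned by `greet_c`.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn greet_free(s: *mut c_char) {
    if !s.is_null() {
        drop(unsafe { CString::from_raw(s) });
    }
}

fn main() {
    let input = CString::new("forum").expect("no interior NUL");
    unsafe {
        let out = greet_c(input.as_ptr());
        println!("{}", CStr::from_ptr(out).to_string_lossy());
        greet_free(out);
    }
}
```

Every function in the real API would need a pair like this, plus a matching C header for the callers.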

msplsh said:
Java from Rust is going to be difficult to do in order to reconcile memory models. Directions matter.

Yep, hence why I stated both directions. Rust has no native JNI of its own and never can, so is at a big disadvantage to C (and close supersets) there.

msplsh · May 14, 2024

You're telling me Rust doesn't access libc unless it's through "mud"? Maybe the problem is what is this "mud" and how is it different than a "shim" or a "wrapper", or "glue." Is "mud" just a "shim" for things people don't like?

kpedersen · May 14, 2024

msplsh said:
You're telling me Rust doesn't access libc unless it's through "mud"?

Rust indeed has some fairly minor inbuilt "mud" for libc. But when was the last time you were able to write a large program using *just* libc?

So you would need more mud for any dependency you use. Gtk, SDL, oci8, sqlite, libevent, etc, etc.

A little bit of mud is fine, but it all starts to build up and is ultimately why Rust solutions pull in a shedload of dependencies to do trivial things. It isn't about avoiding reinventing the wheel anymore; it is about covering the wheel in mud so that Rust can use it.

A more specific example: if you wanted to use Rust to improve e.g. the JVM garbage collector, how would you even begin? Sure, you can execute opcodes. You can access libc. But how would you actually start to add code that integrates with the existing (monstrous) JVM GC code? You would need to basically expose everything in it to Rust through some substantial boilerplate. If this isn't mud, I don't know what is.

msplsh said:
Maybe the problem is what is this "mud" and how is it different than a "shim" or a "wrapper", or "glue."

No difference. It's all the same. Bindings, whether fat or thin.

msplsh said:
Is "mud" just a "shim" for things people don't like?

Absolutely. I don't think anyone likes shims, wrappers or glue. Especially when they tend to rot and go unmaintained quite often.

Don't get me wrong. If everything was written in Rust rather than C and the roles were reversed, absolutely I would find people trying to use C to be completely mad. But unfortunately this is not what happened.

The Rust team do acknowledge this interoperability issue higher up the chain. Thus why projects such as this exist. However it is just a little silly to see the Rust community (ignoring the noisy reddit kids or students who don't know better) trying to pretend that this problem doesn't exist.

kpedersen · May 14, 2024

Jose said:
I work on plenty of substantial projects that are pure Java. They're not open source.

On Android? That is surprising.

Certainly for large scale enterprise systems, Java is still going strong. In many ways this is one of the use-cases where it can be "pure" because it doesn't need to interop with lower level systems (by design!). Pretty much just files and sockets (as a gross oversimplification).

Jose said:
Kotlin is just the latest Scala, Clojure, Groovy, etc. Some lame semi "functional" or "declarative" language bolted on to the JVM badly. Hard pass.

I have never actually tried it. Frankly the java-like build systems scare me off before I even get to the languages XD

msplsh · May 14, 2024

I think Java's build systems are the way they are because giant corporations projected their needs on them. Same goes with most of their design paradigms.

Jose · May 15, 2024

kpedersen said:
On Android? That is surprising.

Nope. GUI programming in Java is a joke. Then again, looking at things like Electron, maybe it's not so bad.

kpedersen said:
I have never actually tried it. Frankly the java-like build systems scare me off before I even get to the languages XD

Yeah, they're a nightmare. One of the original sins of Java was to break the invariant that 1 source file = 1 compiled file. It was a side effect of trying to simplify by eliminating header files. That made it apparently impossible to write a reasonable build tool because javac foo.java can wind up generating any number of .class files.

There's also a stunning level of ignorance among "senior" Java programmers. You wouldn't believe how many times I've had trouble explaining how to build a simple test case I'd written in pure core Java by running

Code:

javac MyTest.java
java -cp . MyTest

EDIT: And... I screwed up the build/run commands. Maybe I'm not as smart as I think I am.

kpedersen · May 15, 2024

Jose said:
Yeah, they're a nightmare. One of the original sins of Java was to break the invariant that 1 source file = 1 compiled file. It was a side effect of trying to simplify by eliminating header files. That made it apparently impossible to write a reasonable build tool

I see. Yep, one I used to run into was that "anonymous inner classes" generate extra .class files that the build tool doesn't know about (and can't infer). It makes it a pain to even use Makefiles with it. That said, computers are so fast now and Java compiles relatively quickly that I would be tempted to still use Makefiles and just compile up the whole module per change, keeping the build system simple (probably why I am never invited to join Java projects).

Jose said:
You wouldn't believe how many times I've had trouble explaining how to build a simple test case I'd written in pure core Java by running

Heh, for some (annoyingly veteran) developers, if you know more than "just click the green 'play' button" to build, you are basically a wizard.

Jose · May 17, 2024

kpedersen said:
I see. Yep, one I used to run into was that "anonymous inner classes" generate extra .class files that the build tool doesn't know about (and can't infer).

Yep, but that's not the only problem. Consider the following trivial Java classes

Foo.java

Java:

class Foo{public static void main(String[] args){Bar.mess();}}

Bar.java

Java:

class Bar{static void mess(){System.out.println("Hello, world!");}}

Now if you run javac Foo.java you'll notice that both Bar.class and Foo.class are generated. If you compile Bar.java first, its .class file will just be used without updating it. Thus the objects generated by compilation depend on the order in which things are compiled!

This makes javac both a compiler and primitive build tool. The intention was good. There were a lot of ugly pre-processor mazes in the '90s that caused many, many hard to debug problems. However, this violation of the principle to do only one thing and do it well makes it impossible to come up with a nice clean DAG for Java builds. Hence the proliferation of monstrous and abstruse build systems for it.

patmaddox · May 20, 2024

For people that don't understand the "mud" involved in Rust->C integration, here's my simplest explanation of it: you have to re-implement headers in Rust for any functions or structs that you use, and you have to keep those definitions up-to-date with any changes in the underlying library. aka bindings
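As a small illustration of that point, here is a sketch of thin, hand-written bindings for two libc functions. The declarations restate by hand what `<string.h>` and `<stdlib.h>` already say in C, and nothing checks them against the real headers, so a wrong type here would compile fine and misbehave at runtime. Structs would need the same treatment with `#[repr(C)]` definitions. The `unsafe extern` spelling assumes a recent toolchain; older ones write plain `extern "C"`.

```rust
use std::ffi::{c_char, c_int, CString};

// Hand-written stand-in for the C prototypes:
//   size_t strlen(const char *s);
//   int    abs(int n);
// If the library ever changed these, this block would have to be updated
// by hand (or regenerated) to stay correct.
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
    fn abs(n: c_int) -> c_int;
}

fn main() {
    let text = CString::new("bindings").expect("no interior NUL");
    // Every call through a raw binding is unsafe: the compiler trusts the
    // declarations above without being able to verify them.
    let len = unsafe { strlen(text.as_ptr()) };
    let magnitude = unsafe { abs(-7) };
    println!("strlen = {}, abs = {}", len, magnitude);
}
```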

C, C++, Go, and Zig are the only languages I know of that can include C headers directly without having to define bindings.

msplsh · May 20, 2024

I was thinking "isn't this a thing you could just do automatically?"

kpedersen · May 20, 2024

msplsh said:
I was thinking "isn't this a thing you could just do automatically?"

As discussed, unfortunately not. But it can help maybe ~30% of the way.

If swig/cbindgen could do the job, most Rust crates on crates.io wouldn't even need to exist.

And if you want safe, Rust-centric bindings (i.e. fat bindings), then SDL2-rust provides a good example of the kind of work required.
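To give a feel for the difference on a much smaller scale than SDL2, here is a sketch of a "fat" binding around libc's `strtol`. This is not taken from any existing crate; `parse_long` is a hypothetical name. The raw declaration is one line, while the safe wrapper has to deal with string conversion, the out-pointer, and rejecting partial parses. It deliberately ignores `errno`, so overflow is not detected, which a real wrapper would have to handle as well.

```rust
use std::ffi::{c_char, c_int, c_long, CString};

// Thin binding: a direct restatement of the C prototype.
unsafe extern "C" {
    fn strtol(nptr: *const c_char, endptr: *mut *mut c_char, base: c_int) -> c_long;
}

/// Fat binding: safe to call from ordinary Rust.
/// Returns `None` for an unsupported base, an interior NUL, an input with
/// no digits, or trailing characters after the number.
pub fn parse_long(text: &str, base: u32) -> Option<c_long> {
    // strtol only accepts base 0 or 2..=36.
    if base != 0 && !(2..=36).contains(&base) {
        return None;
    }
    let c_text = CString::new(text).ok()?;
    let mut end: *mut c_char = std::ptr::null_mut();
    let value = unsafe { strtol(c_text.as_ptr(), &mut end, base as c_int) };

    // If nothing was consumed, `end` still points at the start of the input.
    if end as *const c_char == c_text.as_ptr() {
        return None;
    }
    // `end` points inside `c_text`, which is still alive here.
    let next = unsafe { *end };
    if next != 0 {
        return None; // trailing garbage such as "12abc"
    }
    Some(value)
}

fn main() {
    println!("{:?}", parse_long("42", 10));
    println!("{:?}", parse_long("12abc", 10));
}
```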

cracauer@ · May 20, 2024

patmaddox said:
For people that don't understand the "mud" involved in Rust->C integration, here's my simplest explanation of it: you have to re-implement headers in Rust for any functions or structs that you use, and you have to keep those definitions up-to-date with any changes in the underlying library. aka bindings

C, C++, Go, and Zig are the only languages I know of that can include C headers directly without having to define bindings.

The Clasp implementation of Common Lisp uses the llvm libraries to do this. It even has a bunch of C++ specific binding capabilities.

bakul · May 21, 2024

patmaddox said:
C, C++, Go, and Zig are the only languages I know of that can include C headers directly without having to define bindings.

You can add V to your list.

patmaddox · May 21, 2024

msplsh said:
I was thinking "isn't this a thing you could just do automatically?"

Sometimes, maybe.

I admittedly haven’t used rust-bindgen (or maybe I have, I don’t remember). I used an auto-convert-c-header-to-bindings library in some language on a more complicated file (maybe jail.h) and it OOMed after about 20 minutes.

For super simple includes, sure those tools work no problem. But include files can include other files, and can get big, and the tools can struggle with them.

cracauer@ · May 21, 2024

patmaddox said:
Sometimes, maybe.

I admittedly haven’t used rust-bindgen (or maybe I have, I don’t remember). I used an auto-convert-c-header-to-bindings library in some language on a more complicated file (maybe jail.h) and it OOMed after about 20 minutes.

For super simple includes, sure those tools work no problem. But include files can include other files, and can get big, and the tools can struggle with them.

There is also the problem of C preprocessor macro expansion. That causes problems both because of size of expanded stuff and because of renamings that happen that make the resulting non-C API more remote from the original C (and its documentation) and sometimes hard to use.
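A small sketch of why macros are awkward for bindings, using a made-up header (shown only in the comment). Macros are expanded by the C preprocessor and leave no symbol in the compiled library, so there is nothing to link against. Simple constants can sometimes be translated by a generator, but function-like macros generally have to be re-expressed by hand on the Rust side, often under a different name than the C documentation uses.

```rust
// Hypothetical C header, for illustration only:
//
//   #define FLAG_READ   0x1
//   #define FLAG_WRITE  0x2
//   #define MAKE_VERSION(maj, min) (((maj) << 16) | (min))

// Object-like macros become typed constants. The type (u32 here) is a guess
// the binding author has to make; the C macro itself is untyped.
pub const FLAG_READ: u32 = 0x1;
pub const FLAG_WRITE: u32 = 0x2;

// The function-like macro becomes a hand-written function. Note the rename
// to Rust naming conventions, which is exactly the kind of drift from the
// original C API described above.
pub const fn make_version(major: u32, minor: u32) -> u32 {
    (major << 16) | minor
}

fn main() {
    println!("{:#x}", FLAG_READ | FLAG_WRITE);
    println!("{:#x}", make_version(1, 2));
}
```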

You really want to go all the way here and have the LLVM libraries do the C (or C++) "analysis".

Crivens · May 21, 2024

Once upon a time when I was doing commercial compiler construction, we had a look at the lingua franca of this - the debug information. That worked pretty good for automatic API stub generating.

kpedersen · May 21, 2024

cracauer@ said:
You really want to go all the way here and have the LLVM libraries do the C (or C++) "analysis".

Agreed. Almost similar to .NET IL, it would provide the benefit that any language could consume any language.

Tightly depending on LLVM is risky(ish...) but is a good first step for Rust if it wants to make progress outside of C's (slightly janky) shadow.

Jose · May 21, 2024

kpedersen said:
Tightly depending on LLVM is risky(ish...) but is a good first step for Rust if it wants to make progress outside of C's (slightly janky) shadow.

Yeah. The package set I use on this workstation required compiling three different versions of LLVM. Please don't add 5 versions of Rust to the mess.

patmaddox · May 21, 2024

Jose said:
The package set I use on this workstation required compiling three different versions of LLVM.

I'm in the same boat, and I don't really get it - LLVM is fully backwards-compatible, right? So it should be possible to only build the highest version LLVM needed, not need to build three different versions.

I haven't looked closely at what packages are initiating the different LLVM builds. I suppose it's because they're configured with specific versions - perhaps greater than LLVM_DEFAULT? One thing I've not yet tried is setting LLVM_DEFAULT to 17... hopefully the packages check for a version >= rather than ==?
