Fascinating--I wonder to what extent the performance improvements are due to the better constraints Rust hands over to LLVM for optimization (I'm specifically thinking of the `noalias` rules) vs. aggressive hand-optimization by the developers to the point where they beat the C zlib. The latter seems harder to achieve given how long the original zlib has been getting incrementally improved, though.
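To illustrate the aliasing point, here's a toy sketch of my own (not code from zlib-rs): because a `&mut` reference is guaranteed not to alias anything else, rustc can emit LLVM `noalias` attributes automatically, whereas the equivalent C would need `restrict`-qualified pointers to grant the same freedom.

    // Toy example, not zlib-rs code: `dst` and `src` cannot alias, so LLVM
    // may hoist the `src[0]` load (and its bounds check) out of the loop.
    // In C, the same optimization would require `restrict` pointers.
    pub fn add_first(dst: &mut [u32], src: &[u32]) {
        for d in dst.iter_mut() {
            // Without the no-alias guarantee, every store through `d` could
            // in principle change `src[0]`, forcing a reload each iteration.
            *d += src[0];
        }
    }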
zlib is really old and out of date, no real reason to use it except legacy. I think most optimization effort goes into newer compression libs like zstd.
Legacy is a bit much. Deflate is probably the only supported compression format in many existing/live standards. For a greenfield project where you can control all clients, sure, use whatever, but zlib is likely going to continue to be used for decades.
EDIT: doh! Mixed up Zstd and Zlib. Nothing to see here. (Can't delete with a reply.)
I think you've confused Zstd (which they were not discussing) for Zlib (the primary implementation of DEFLATE, which is most commonly used by Gzip)
Gzip/DEFLATE is mediocre on most metrics relevant to compression but it's widely used of course. Zstd is better all-around, but much newer and thus doesn't have the same level of adoption.
The comparison is against zlib-ng, not the OG zlib. So this is still very impressive.
I suspect the results won't be quite as good on Aarch64 or other architectures though. zlib-ng has a pretty wide range of hand-optimized intrinsics whereas zlib-rs seems to only really have x86_64.
zlib-rs does use ARM CRC32 instructions though.
Reading the linked blog post:
> ... -Cllvm-args=-enable-dfa-jump-thread option, which recovers most of the performance here. It performs a kind of jump threading for deterministic finite automata, and our decompression logic matches this pattern.
https://en.wikipedia.org/wiki/Jump_threading
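For anyone wondering what kind of code that option targets, here's a rough sketch of my own (not actual zlib-rs code) of an interpreter-style state machine where each arm deterministically picks the next state. With the flag enabled (passed via e.g. RUSTFLAGS="-Cllvm-args=-enable-dfa-jump-thread"), LLVM can jump from one arm straight to the arm that is statically known to follow, instead of going back through the central match dispatch every iteration.

    // Illustrative sketch only: the shape of loop that DFA jump threading
    // targets, not zlib-rs's real decompression state machine.
    enum State {
        Header,
        Payload,
        Checksum,
        Done,
    }

    fn run(input: &[u8]) -> usize {
        let mut state = State::Header;
        let mut consumed = 0;
        loop {
            // Without DFA jump threading, every iteration branches back here
            // and re-dispatches on `state`; with it, LLVM threads the jump
            // from one arm directly to its known successor.
            match state {
                State::Header => {
                    consumed += 1;
                    state = State::Payload;
                }
                State::Payload => {
                    consumed += input.len();
                    state = State::Checksum;
                }
                State::Checksum => {
                    consumed += 4;
                    state = State::Done;
                }
                State::Done => return consumed,
            }
        }
    }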
Really interesting. I've never used the Rust version, and I can only assume I've used the C implementation indirectly dozens if not hundreds of times, but does anyone know how 1:1 the Rust implementation is?
The one area where C will probably always beat Rust (at least today) is portability: C compilers run everywhere, on everything, and target anything... unless Rust has similar capabilities via LLVM?
From the GitHub page, they provide libz-rs-sys, a zlib-compatible C API for usage in non-Rust applications.
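For anyone unfamiliar with how a C-compatible surface gets exposed from Rust, here's a generic sketch (not the actual libz-rs-sys source; `example_compress_bound` and its formula are made up for illustration): `#[no_mangle]` plus `extern "C"` produce an unmangled, C-ABI symbol that a C program can declare in a header and link against, provided the crate is built as a cdylib or staticlib.

    use std::ffi::c_ulong;

    // Generic sketch of a C-callable export from Rust. The real libz-rs-sys
    // mirrors zlib's own function names and signatures instead of this
    // placeholder, and the bound below is not zlib's exact formula.
    #[no_mangle]
    pub extern "C" fn example_compress_bound(source_len: c_ulong) -> c_ulong {
        source_len + source_len / 1000 + 64
    }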
That's the most interesting thing to me; is this to the point where you could, e.g., drop it into a Gentoo box with nothing but a USE flag and switch everything over to use it? (Sadly it's API compatible but not ABI compatible, so we're still talking about needing to recompile, but sometimes that's a minor matter.)
Many of those C compilers are stuck on old standards like C89 and full of vendor extensions relevant only to one embedded vendor's particular CPU, compiling code that can hardly be recognized as ISO C.
Many of those vendors are slowly throwing away these proprietary toolchains (some of them are actually GCC forks that are never updated) and replacing them with clang.
So it depends on how much those vendors care to support Rust for their customers, and whether those kinds of customers even care about anything other than their beloved C dialect.
Here’s the set of platforms Rust supports [1], and it indeed has broad support thanks to LLVM. It obviously still needs the platform-specific parts of the stdlib ported for any target that isn’t no_std-only. There’s also [2], which broadens this to the set of platforms GCC supports, although I think it might be more bleeding-edge than even tier 3 platforms at the moment.
I think the only place C really beats Rust is with vendors whose toolchains they’ve stopped investing in. More broadly, there’s clearly pressure for the industry to abandon C/C++ in favor of Rust, given the inability of those languages to be modernized into safe versions.
[1] https://doc.rust-lang.org/nightly/rustc/platform-support.htm...
[2] https://github.com/rust-lang/rustc_codegen_gcc
Are they exporting their library with a C ABI? It's cool that it's fast, and it would be nice if I could link to it from non-Rust code.
https://news.ycombinator.com/item?id=43172364