Show HN: Apache Fory Rust – 10-20x faster serialization than JSON/Protobuf

fory.apache.org

64 points by chaokunyang 12 hours ago

Serialization framework with some interesting numbers: 10-20x faster than JSON/Protobuf on nested objects.

  Technical approach: compile-time codegen (no reflection), compact binary protocol with meta-packing, little-endian layout optimized for modern CPUs.

  Unique features that other fast serializers don't have:
  - Cross-language without IDL files (Rust ↔ Python/Java/Go)
  - Trait object serialization (Box<dyn Trait>)
  - Automatic circular reference handling
  - Schema evolution without coordination

  Happy to discuss design trade-offs.

  Benchmarks: https://fory.apache.org/docs/benchmarks/rust
tnorgaard 9 hours ago

I wish we would focus on making tooling better for W3C EXI (Binary XML encoding) instead of inventing new formats. Just being fast isn't enough; I don't see many using Aeron/SBE. It needs an ecosystem, which XML does have.

  • chaokunyang 20 minutes ago

    Binary XML encoding (like W3C EXI) is useful in some contexts, but it’s generally not as efficient as modern binary serialization formats. It also can’t naturally express shared or circular reference semantics, which are important for complex object graphs.

    Fory’s format was designed from the ground up to handle those cases efficiently, while still enabling cross‑language compatibility and schema evolution.
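
To make "shared reference semantics" concrete, here is a minimal sketch of reference tracking, assuming a toy encoder that writes each distinct allocation once and emits a back-reference for repeats. The `Entry` enum and both functions are hypothetical names for illustration; this is not Fory's wire format or API. Circular graphs typically build on the same idea of assigning an id on first visit.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// One slot in the flattened output (hypothetical layout for this sketch):
/// either the first occurrence of a value, or a back-reference to an earlier one.
#[derive(Debug, PartialEq)]
enum Entry {
    Value(String),
    Ref(usize),
}

/// Writes each distinct allocation once, keyed by pointer identity.
fn encode_shared(items: &[Rc<String>]) -> Vec<Entry> {
    let mut seen: HashMap<*const String, usize> = HashMap::new();
    let mut out = Vec::with_capacity(items.len());
    for item in items {
        let ptr = Rc::as_ptr(item);
        if let Some(&id) = seen.get(&ptr) {
            // Already written: emit only the id of the first occurrence.
            out.push(Entry::Ref(id));
        } else {
            let id = seen.len();
            seen.insert(ptr, id);
            out.push(Entry::Value((**item).clone()));
        }
    }
    out
}

/// Rebuilds the list so that back-references share one allocation again.
fn decode_shared(entries: &[Entry]) -> Vec<Rc<String>> {
    let mut by_id: Vec<Rc<String>> = Vec::new();
    let mut out = Vec::with_capacity(entries.len());
    for entry in entries {
        match entry {
            Entry::Value(s) => {
                let rc = Rc::new(s.clone());
                by_id.push(Rc::clone(&rc));
                out.push(rc);
            }
            Entry::Ref(id) => out.push(Rc::clone(&by_id[*id])),
        }
    }
    out
}

fn main() {
    let a = Rc::new("alice".to_string());
    let b = Rc::new("alice".to_string()); // equal value, separate allocation
    let encoded = encode_shared(&[Rc::clone(&a), Rc::clone(&b), Rc::clone(&a)]);
    let decoded = decode_shared(&encoded);
    println!("{:?}", encoded);
    println!("sharing preserved: {}", Rc::ptr_eq(&decoded[0], &decoded[2]));
}
```

A format without reference semantics would write "alice" three times and lose the fact that the first and third items are the same object.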

  • stmw 9 hours ago

    I am not sure if W3C EXI, or ASN.1 BER or something else is better, but agree that using DOP (rather than OOP) design principles is the right answer -- which means focusing on the encoding first, and working backwards towards the languages / clients.

    • chaokunyang 16 minutes ago

      DOP is very interesting, I like this idea too — most DOP approaches are implemented via an IDL, which is another valid direction. I plan to support that in Fory. I want to give users the freedom to choose the model that works best for them.

stmw 10 hours ago

Regarding design tradeoffs: I am very skeptical that this can be made to work for the long run in a cross-language way without formalizing the on-the-wire contract via IDL or similar.

In my experience, while starting from a language to arrive at the serialization often feels more ergonomic (e.g. RPC style) at the start, it hides too much of what's going on from the users and over time suffers greatly from programming language / runtime changes - the latter multiplied by the number of languages or frameworks supported.

  • chaokunyang 3 hours ago

    That’s a fair point — with more languages in the mix, having a formal schema can definitely help prevent drift.

    The way I think about it is:
    - Single‑language projects often work best without an IDL; it keeps things simple and avoids extra steps.
    - Two languages – both IDL and no‑IDL approaches can work, depending on the team’s habits.
    - Three or more – an IDL can be really useful as a single source of truth and to avoid manually writing struct definitions in every language.

    For Apache Fory, my plan is to add optional IDL support, so teams who want that “single truth” can generate definitions automatically, and others can continue with language‑first development. My hope is to give people flexibility to choose what fits their situation best.

mlhamel 10 hours ago

I'm wondering how you share your types between languages if there's no schema?

  • kenhwang 10 hours ago

    Looks like there's a type mapping chart for supported types: https://fory.apache.org/docs/docs/guide/xlang_type_mapping

    Otherwise, the schema seems to be derived from the class being serialized for typed languages, or otherwise annotated in code. The serializer and deserializer code must be manually written to be compatible instead of both sides being codegen'd to match from a schema file. Here's the example I found for Python: https://fory.apache.org/docs/docs/guide/python_serialization...

    • chaokunyang 2 hours ago

      You don’t need to hand‑write serializer code. In typed languages you just define your class or struct as usual; in dynamic languages you can use type hints.

      When running in compatible mode, Fory automatically derives a compact schema from those definitions at runtime and sends it along to peers the first time a type is serialized. That way, both sides know the structure without needing a separate schema file.

      The idea is to make cross‑language exchange work out‑of‑the‑box, while still allowing teams to add an explicit IDL later if they want a single source of truth.
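
The "send the schema the first time" idea can be sketched roughly as follows. Everything here (`Frame`, `Session`, the field layout) is hypothetical and only illustrates the general technique of announcing type metadata once per peer; it is not Fory's actual protocol or API.

```rust
use std::collections::HashSet;

/// Hypothetical frame layout for this sketch only.
#[derive(Debug, PartialEq)]
enum Frame {
    /// First message of a type: field names travel with the values.
    WithSchema { type_id: u32, fields: Vec<String>, values: Vec<i64> },
    /// Later messages: the peer already knows the fields for `type_id`.
    ValuesOnly { type_id: u32, values: Vec<i64> },
}

/// Tracks which type ids have already been announced to one peer.
struct Session {
    announced: HashSet<u32>,
}

impl Session {
    fn new() -> Self {
        Session { announced: HashSet::new() }
    }

    fn frame(&mut self, type_id: u32, fields: &[&str], values: &[i64]) -> Frame {
        // `insert` returns true only the first time this id is seen.
        if self.announced.insert(type_id) {
            Frame::WithSchema {
                type_id,
                fields: fields.iter().map(|f| f.to_string()).collect(),
                values: values.to_vec(),
            }
        } else {
            Frame::ValuesOnly { type_id, values: values.to_vec() }
        }
    }
}

fn main() {
    let mut session = Session::new();
    println!("{:?}", session.frame(1, &["id", "qty"], &[7, 3]));
    println!("{:?}", session.frame(1, &["id", "qty"], &[8, 1]));
}
```

The trade-off discussed in this thread still applies: the schema lives in each language's type definitions rather than in one shared file.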

  • stmw 10 hours ago

    I am skeptical that it's possible to make this work in the long run.

    • chaokunyang 2 hours ago

      I get your concern — for one or two languages, skipping an IDL can work well and keeps things simple.

      But once you’re dealing with three or more languages, I agree an IDL becomes valuable as a single source of truth. That’s work we’ve started: adding optional IDL support so teams can generate data structures in each language from one shared definition.

  • fabiensanglard 10 hours ago

    Not explaining this case makes me wonder how much this lib is actually used in production. This was also the first question I asked myself.

no_circuit 9 hours ago

Are the benchmarks actually fair? See:

https://github.com/apache/fory/blob/fd1d53bd0fbbc5e0ce6d53ef...

It seems if the serialization object is not a "Fory" struct, then it is forced to go through to/from conversion as part of the measured serialization work:

https://github.com/apache/fory/blob/fd1d53bd0fbbc5e0ce6d53ef...

The to/from type of work includes cloning Strings:

https://github.com/apache/fory/blob/fd1d53bd0fbbc5e0ce6d53ef...

reallocating growing arrays with collect:

https://github.com/apache/fory/blob/fd1d53bd0fbbc5e0ce6d53ef...

I'd think that the to/from conversion to Fory types shouldn't be part of the tests.

Also, when used in an actual system, tonic would be providing an 8KB buffer to write into, not just a Vec::default() that may need to be resized multiple times:

https://github.com/hyperium/tonic/blob/147c94cd661c0015af2e5...
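
To illustrate the buffer point with the standard library only, here is a small sketch that counts capacity changes while appending to a `Vec<u8>`. The helper name and the 64-byte chunk size are assumptions for illustration and are not taken from the Fory benchmarks.

```rust
/// Appends `chunks` chunks of `chunk_len` bytes to `buf` and counts how often
/// the capacity changed; each change means the Vec had to reallocate.
fn count_reallocations(mut buf: Vec<u8>, chunk_len: usize, chunks: usize) -> usize {
    let chunk = vec![0u8; chunk_len];
    let mut reallocations = 0;
    let mut capacity = buf.capacity();
    for _ in 0..chunks {
        buf.extend_from_slice(&chunk);
        if buf.capacity() != capacity {
            reallocations += 1;
            capacity = buf.capacity();
        }
    }
    reallocations
}

fn main() {
    // 100 chunks of 64 bytes = 6400 bytes, which fits in an 8 KiB buffer.
    let grown = count_reallocations(Vec::default(), 64, 100);
    let preallocated = count_reallocations(Vec::with_capacity(8 * 1024), 64, 100);
    println!("Vec::default(): {grown} reallocations");
    println!("Vec::with_capacity(8192): {preallocated} reallocations");
}
```

The exact count for the default Vec depends on the allocator's growth strategy, but the preallocated buffer never grows for this payload, so that cost would not show up in a benchmark that reuses it.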

  • no_circuit 6 hours ago

    IMO, not a fair benchmark.

    I can see the source of a 10x improvement on an Intel(R) Xeon(R) Gold 6136 CPU @ 3.00GHz, but it drops to a 3x improvement when I remove the to/from conversion that clones or collects Vecs, and always allocate an 8K Vec instead of a ::Default for the writable buffer.

    If anything, the benches should be updated in a tower service / codec generics style where other formats like protobuf do not use any Fory-related code at all.
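
A rough sketch of what a codec-generic harness could look like, assuming a hypothetical `Codec` trait where each format encodes its own native message type, so no cross-format conversion lands inside the timed loop. `LengthPrefixed` is a stand-in codec, not Fory or Prost, and a real benchmark would use a proper harness such as Criterion rather than `Instant`.

```rust
use std::time::{Duration, Instant};

/// Hypothetical trait for this sketch: each format brings its own message type.
trait Codec {
    type Message;
    fn sample(&self) -> Self::Message;
    fn encode(&self, msg: &Self::Message, out: &mut Vec<u8>);
}

/// Stand-in codec: a little-endian u32 length prefix followed by the raw bytes.
struct LengthPrefixed;

impl Codec for LengthPrefixed {
    type Message = Vec<u8>;

    fn sample(&self) -> Vec<u8> {
        vec![7u8; 256]
    }

    fn encode(&self, msg: &Vec<u8>, out: &mut Vec<u8>) {
        out.extend_from_slice(&(msg.len() as u32).to_le_bytes());
        out.extend_from_slice(msg);
    }
}

/// Times only `encode`. The message is built and the 8 KiB buffer is allocated
/// before the clock starts; the buffer is cleared and reused per iteration.
fn time_encode<C: Codec>(codec: &C, iterations: u32) -> (Duration, usize) {
    let msg = codec.sample();
    let mut out = Vec::with_capacity(8 * 1024);
    let start = Instant::now();
    for _ in 0..iterations {
        out.clear();
        codec.encode(&msg, &mut out);
    }
    (start.elapsed(), out.len())
}

fn main() {
    let (elapsed, len) = time_encode(&LengthPrefixed, 100_000);
    println!("encoded {len} bytes per message, total {elapsed:?}");
}
```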

    Note also that Fory has some writer pool that is utilized during the tests:

    https://github.com/apache/fory/blob/fd1d53bd0fbbc5e0ce6d53ef...

    Original bench selection for Fory:

        Benchmarking ecommerce_data/fory_serialize/medium: Collecting 100 samples in estimated 5.0494 s (197k it
        ecommerce_data/fory_serialize/medium
                                time:   [25.373 µs 25.605 µs 25.916 µs]
                                change: [-2.0973% -0.9263% +0.2852%] (p = 0.15 > 0.05)
                                No change in performance detected.
        Found 4 outliers among 100 measurements (4.00%)
          2 (2.00%) high mild
          2 (2.00%) high severe
    
    Compared to original bench for Protobuf/Prost:

        Benchmarking ecommerce_data/protobuf_serialize/medium: Collecting 100 samples in estimated 5.0419 s (20k
        ecommerce_data/protobuf_serialize/medium
                                time:   [248.85 µs 251.04 µs 253.86 µs]
        Found 18 outliers among 100 measurements (18.00%)
          8 (8.00%) high mild
          10 (10.00%) high severe
    
    However, after allocating 8K instead of ::Default and removing the to/from conversion, the updated protobuf bench gives:

        fair_ecommerce_data/protobuf_serialize/medium
                                time:   [73.114 µs 73.885 µs 74.911 µs]
                                change: [-1.8410% -0.6702% +0.5190%] (p = 0.30 > 0.05)
                                No change in performance detected.
        Found 14 outliers among 100 measurements (14.00%)
          2 (2.00%) high mild
          12 (12.00%) high severe
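
For reference, the ratios implied by the median times quoted above (25.605 µs for Fory, 251.04 µs and 73.885 µs for the two protobuf runs) can be worked out as follows; the `speedup` helper is just an illustrative name.

```rust
/// Ratio of a baseline time to a candidate time (both in the same unit).
fn speedup(baseline_us: f64, candidate_us: f64) -> f64 {
    baseline_us / candidate_us
}

fn main() {
    let fory = 25.605; // median, fory_serialize/medium
    let protobuf_original = 251.04; // median, original protobuf bench
    let protobuf_adjusted = 73.885; // median, 8K buffer and no to/from

    println!("original: {:.1}x", speedup(protobuf_original, fory)); // 9.8x
    println!("adjusted: {:.1}x", speedup(protobuf_adjusted, fory)); // 2.9x
}
```
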
wiseowise 9 hours ago

Still mad they had to change the name. "Fury" was a really fitting name for a fast serialization framework; "fory" is just bogus. Should've renamed it to "foray" or something.

dxxvi 4 hours ago

Is Google guava really needed? I would like it to be taken out.

  • chaokunyang 3 hours ago

    No, it's not needed. We plan to remove Google Guava from the Fory Java dependency. Our philosophy is that the core should have as few dependencies as possible for maintainability and minimal footprint.

nitwit005 10 hours ago

These binary protocols generally also try to keep the data size small. Protobuf is essentially compressing its integers (varint or zigzag encoding), for example.

It'd be helpful to see a plot of serialization costs vs data size. If you only display serialization TPS, you're always going to lose to the "do nothing" option of just writing your C structs directly to the wire, which is essentially zero cost.
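
For readers unfamiliar with the integer compression mentioned above, here is a minimal sketch of zigzag plus base-128 varint encoding in the style Protobuf documents for its signed varint types. The function names are illustrative, not any library's API.

```rust
/// Maps signed to unsigned so small magnitudes stay small: 0, -1, 1, -2 -> 0, 1, 2, 3.
fn zigzag_encode(n: i64) -> u64 {
    ((n << 1) ^ (n >> 63)) as u64
}

fn zigzag_decode(z: u64) -> i64 {
    ((z >> 1) as i64) ^ -((z & 1) as i64)
}

/// Writes 7 bits per byte, low group first; the high bit marks "more bytes follow".
fn write_varint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push((value as u8 & 0x7f) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Returns the decoded value and the number of bytes consumed, or None if the
/// input ends early or runs past the 10 bytes a u64 can need.
fn read_varint(bytes: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &b) in bytes.iter().enumerate().take(10) {
        value |= u64::from(b & 0x7f) << (7 * i);
        if b & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

fn main() {
    for n in [0i64, -1, 1, 300, -300, i64::MAX, i64::MIN] {
        let mut buf = Vec::new();
        write_varint(zigzag_encode(n), &mut buf);
        let (raw, used) = read_varint(&buf).expect("valid varint");
        assert_eq!(zigzag_decode(raw), n);
        println!("{n} -> {used} byte(s)");
    }
}
```

Small values take one or two bytes instead of a fixed eight, which is why a plot of cost against encoded size is more informative than throughput alone.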

  • chaokunyang an hour ago

    Fory also compresses integers using varint or zigzag encoding. The sizes are basically the same:

    | data type       | data size | fory    | protobuf |
    | --------------- | --------- | ------- | -------- |
    | simple-struct   | small     | 21      | 19       |
    | simple-struct   | medium    | 70      | 66       |
    | simple-struct   | large     | 220     | 216      |
    | simple-list     | small     | 36      | 16       |
    | simple-list     | medium    | 802     | 543      |
    | simple-list     | large     | 14512   | 12876    |
    | simple-map      | small     | 33      | 36       |
    | simple-map      | medium    | 795     | 1182     |
    | simple-map      | large     | 17893   | 21746    |
    | person          | small     | 122     | 118      |
    | person          | medium    | 873     | 948      |
    | person          | large     | 7531    | 7865     |
    | company         | small     | 191     | 182      |
    | company         | medium    | 9118    | 9950     |
    | company         | large     | 748105  | 782485   |
    | e-commerce-data | small     | 750     | 737      |
    | e-commerce-data | medium    | 53275   | 58025    |
    | e-commerce-data | large     | 1079358 | 1166878  |
    | system-data     | small     | 311     | 315      |
    | system-data     | medium    | 24301   | 26161    |
    | system-data     | large     | 450031  | 479988   |

  • stmw 9 hours ago

    It appears there are two schema compatibility modes and no guarantee of minor version binary compatibility.

no_circuit 9 hours ago
  • chaokunyang 2 hours ago

    Probably not for everyone. The current limit of 4096 types could be expanded if there’s a real need — it’s not a hard technical barrier.

    I’m curious though: what’s an example scenario you’ve seen that requires so many distinct types? I haven’t personally come across a case with 4,096+ protocol messages defined.

lsb 10 hours ago

Curious about comparisons with Apache Arrow, which uses flatbuffers to avoid memory copying during deserialization, which is well supported by the Pandas ecosystem, and which allows users to serialize arrays as lists of numbers that have hardware support from a GPU (int8-64, float)

  • chaokunyang 21 minutes ago

    Apache Arrow is more of a memory format than a general‑purpose data serialization system. It’s great for in‑memory analytics and GPU‑friendly columnar storage.

    Apache Fory, on the other hand, has its own wire‑stream format designed for sending data across processes or networks. Most of the code is focused on efficiently converting in‑memory objects into that stream format (and back) — with features like cross‑language support, circular reference handling, and schema evolution.

    Fory also has a row format, which is a memory format, and can complement or compete with Arrow’s columnar format depending on the use case.

jasonjmcghee 10 hours ago

Would love to see how it compares to Flatbuffers - was surprised to not see it in the benchmarks!

paddy_m 11 hours ago

What's the story for JS? I see that there is a javascript directory, but it only mentions nodejs. I don't see an npm package. So does this work in web browsers?

  • chaokunyang 28 minutes ago

    JS support is still experimental; I have not published it to npm yet.

paddy_m 11 hours ago

How does this deal with numeric types like NaN, Infinity...?

  • OptionOfT 9 hours ago

        use fory::{Fory, ForyObject};
    
        #[derive(ForyObject, Debug, PartialEq)]
        struct Struct {
            nan: f32,
            inf: f32,
        }
    
        fn main() {
            let mut fory = Fory::default();
            fory.register::<Struct>(1).unwrap();
    
            let original = Struct {
                nan: f32::NAN,
                inf: f32::INFINITY,
            };
            dbg!(&original);
    
            let serialized = fory.serialize(&original).unwrap();
    
            let back: Struct = fory.deserialize(&serialized).unwrap();
            dbg!(&back);
        }
    
    
    
    Yields

         cargo run
           Compiling rust-seed v0.0.0-development (/home/random-code/fory-nan-inf)
            Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.28s
             Running `target/debug/fory-nan-inf`
        [src/main.rs:17:9] &original = Struct {
            nan: NaN,
            inf: inf,
        }
        [src/main.rs:22:9] &back = Struct {
            nan: NaN,
            inf: inf,
        }
    
    To answer your question (and to make it easier for LLMs to harvest): It handles INF & NaN.
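
As a std-only illustration of why such a round trip can work, here is a sketch assuming an encoder that simply copies the IEEE 754 bits in little-endian order. That is an assumption based on the little-endian layout mentioned in the post, not a reading of Fory's source, and `roundtrip_f32` is a made-up helper.

```rust
/// Copies the float's IEEE 754 bits to little-endian bytes and back.
fn roundtrip_f32(value: f32) -> f32 {
    let bytes: [u8; 4] = value.to_le_bytes();
    f32::from_le_bytes(bytes)
}

fn main() {
    let nan = roundtrip_f32(f32::NAN);
    let inf = roundtrip_f32(f32::INFINITY);
    let neg_inf = roundtrip_f32(f32::NEG_INFINITY);

    assert!(nan.is_nan());
    assert_eq!(inf, f32::INFINITY);
    assert_eq!(neg_inf, f32::NEG_INFINITY);

    // NaN never compares equal to itself, so a derived PartialEq on a struct
    // holding NaN reports "not equal" even after a faithful round trip.
    assert!(nan != nan);
    println!("nan: {nan}, inf: {inf}, -inf: {neg_inf}");
}
```

One practical consequence: an `assert_eq!(original, back)` on the struct above would fail because of the NaN field, which is why printing both values with `dbg!` is the more reliable check.
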
seg_lol 10 hours ago

Why this over serialization free formats like CapnProto and Flatbuffers? If you want it to be compact, send it through zstd (with a custom dictionary).

I do really like that it has broad support out of the box and looks easy to use.

For Python I still prefer using dill since it handles code objects.

https://github.com/uqfoundation/dill

  • chaokunyang 3 hours ago

    Apache Fory is also a drop-in replacement for pickle/cloudpickle; you can use it to serialize code objects such as local functions and classes too.

    https://github.com/apache/fory/tree/main/python#serialize-lo...

    When serializing code objects, pyfory achieves a 3× higher compression ratio than cloudpickle.

    pyfory also provides an extra security audit capability to guard against malicious deserialization attacks.

binary132 7 hours ago

The prevalence of AI slop in the landing page doc does not inspire confidence.