For over two decades, much of the software industry has relied on libxml2 to parse XML. It ships inside browsers, operating systems, programming language runtimes, and countless enterprise systems. It is, by any measure, critical infrastructure for the internet. But its future came into question recently when its long-time maintainer stepped down.
SupXML is our attempt to carry that legacy forward: a modern, memory-safe XML library built for the next twenty years.
Standing on libxml2's shoulders
libxml2 is a genuinely remarkable piece of engineering. Written in C in the late 1990s, it became the default XML toolkit. If you've parsed an XML config file, validated a document against an XSD, signed a SAML assertion, or rendered a web page, there's a good chance libxml2 was somewhere in the stack. We have enormous respect for it.
What's changed since the late 90s is the language we'd choose to write it in. Microsoft and the Chromium project have each reported that roughly 70% of their serious security vulnerabilities come from memory-safety bugs: buffer overflows, use-after-free, and the like. Any large C codebase that parses untrusted input lives with that risk, because C neither bounds-checks array accesses nor tracks how long a pointer stays valid.
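To make the bounds-checking point concrete, here is a minimal sketch of reading from an untrusted byte buffer in safe Rust. The helper names `peek_byte` and `tag_name` are hypothetical and exist only for illustration; this is not SupXML's parser code.

```rust
/// Reads the byte at `index`, or returns `None` when `index` is past the end.
/// (`peek_byte` is a hypothetical helper, not part of any real library.)
fn peek_byte(input: &[u8], index: usize) -> Option<u8> {
    // `get` performs the bounds check that a raw C array read would skip.
    input.get(index).copied()
}

/// Returns the bytes between `<` and the next `>` for a tag that starts at
/// `start`, or `None` if the input is truncated or does not start a tag.
fn tag_name(input: &[u8], start: usize) -> Option<&[u8]> {
    if peek_byte(input, start)? != b'<' {
        return None;
    }
    // Range access is checked too, so a truncated buffer yields `None`
    // instead of a read past the end of the allocation.
    let rest = input.get(start + 1..)?;
    let end = rest.iter().position(|&b| b == b'>')?;
    Some(&rest[..end])
}
```

Calling `tag_name(b"<root", 0)` on a truncated document returns `None`. The equivalent unchecked loop in C would keep scanning past the end of the buffer looking for `>`.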
SupXML: memory-safe by construction
SupXML is a memory-safe, fast, spec-compliant XML library for Rust, with a C ABI that works as a drop-in replacement for libxml2. It's written in Rust, so outside of explicitly marked unsafe blocks an entire class of vulnerabilities simply can't happen: the memory-safety bugs that account for the majority of XML-parser CVEs are ruled out by the language itself, through compile-time ownership checks and bounds-checked indexing.
SupXML isn't memory-safe by accident, either. The small amount of unsafe code it does contain is a tightly audited core where every block carries a safety comment, checked with Miri and fuzzing to catch the unknown unknowns.
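For readers who haven't seen the convention, this is roughly what an unsafe block with a safety comment looks like. It is a sketch only: `ascii_name_prefix` is a hypothetical function written for this post, not code taken from SupXML.

```rust
/// Returns the longest prefix of `input` made of ASCII letters, as a `&str`.
/// (Hypothetical example function, for illustration only.)
fn ascii_name_prefix(input: &[u8]) -> &str {
    // Count the leading bytes that are ASCII letters.
    let len = input
        .iter()
        .take_while(|b| b.is_ascii_alphabetic())
        .count();
    let prefix = &input[..len];

    // SAFETY: every byte in `prefix` passed `is_ascii_alphabetic`, so it is
    // ASCII, and any sequence of ASCII bytes is valid UTF-8.
    unsafe { std::str::from_utf8_unchecked(prefix) }
}
```

The comment states the invariant a reviewer has to verify. Running the tests that exercise a function like this under Miri (`cargo miri test`) is one way to check that the invariant holds on the paths those tests cover.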
We're faster, too
Safety often comes with a performance tax, but SupXML is built for speed as well. On a full-validation DOM parse, SupXML is faster than libxml2 on every fixture in our benchmark set, with a median speedup of about 2.1×. Entity-heavy and attribute-dense documents see even larger gains: a 599 KB Arabic text parses ~3.75× faster, and an English Wikipedia article ~2.96× faster.
It's strict on correctness, as well. On the W3C XML conformance suite, SupXML passes all the tests. Across a broader 1,147-file corpus drawn from multiple vendors, SupXML scores 95.6%. On XSD 1.0 validation it reaches 99.2% conformance. These results are all as good as, if not better than, libxml2's.
This isn't meant to single out libxml2, which is actually significantly better than most other XML parsers. I've tested them, and basically nothing comes close to the scores of libxml2 or SupXML. Most XML parsers are not very correct, which has security implications for the companies relying on them. Even if you think you trust your XML files, the trust boundary is fuzzy, and there are lots of ways a malicious file can get into what you think is a trusted space.
Drop-in replacement for libxml2
When I set out to make SupXML, I chose Rust for the implementation but added a C ABI (Application Binary Interface) that matches libxml2's functions, to make migration easy. This means that anyone using libxml2, from any language, can migrate to SupXML without having to update their code. The existing code will call our C-compatible functions, which then use our Rust library.
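As a rough sketch of how that layering can work, here is a Rust function exposed with the C calling convention and the same signature as libxml2's `int xmlStrlen(const xmlChar *str)`. This is an illustration under assumptions, not SupXML's actual source. In particular, saturating at `c_int::MAX` is a choice made for this example. A real shim would also mark the function with Rust's `no_mangle` attribute so the linker sees the plain symbol name; that attribute is left out to keep the sketch standalone.

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

/// C-callable wrapper mirroring `int xmlStrlen(const xmlChar *str)`.
/// libxml2's `xmlChar` is `unsigned char`, hence the `*const u8` parameter.
///
/// # Safety
/// `str_ptr` must be null or point to a NUL-terminated byte string that
/// stays valid for the duration of the call.
#[allow(non_snake_case)]
pub unsafe extern "C" fn xmlStrlen(str_ptr: *const u8) -> c_int {
    // Like the C function, treat a null pointer as an empty string.
    if str_ptr.is_null() {
        return 0;
    }
    // SAFETY: the pointer is non-null, and the caller guarantees it points
    // to a NUL-terminated string (see the safety contract above).
    let bytes = unsafe { CStr::from_ptr(str_ptr as *const c_char) }.to_bytes();

    // From here on everything is ordinary safe Rust operating on a slice.
    let len = bytes.len();
    if len > c_int::MAX as usize {
        c_int::MAX // Assumption for this sketch: saturate rather than wrap.
    } else {
        len as c_int
    }
}
```

The unsafe part is confined to the boundary where a raw pointer becomes a slice. Everything behind it can stay in safe Rust.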
Why does critical software depend on volunteers?
The technical case for SupXML is strong, but it isn't the most important part of the story. The most important part is why a library this critical was carried for so long by so few.
Open source built the modern world, and it did so without much financial support at all. The people who write and maintain the libraries everyone depends on are, with very few exceptions, not paid for it. They do it on nights and weekends, out of a sense of duty, until they burn out and walk away, at which point the multi-billion-dollar companies depending on their work scramble to find someone else to do it for free.
The fact is, open source software is worth billions, if not trillions, of dollars to the global economy, but gets barely any monetary support. Companies are free riders when it comes to critical software. It's the tragedy of the commons.
libxml2 is a case in point. In recent years, its long-time maintainer was candid about the fact that a single unpaid volunteer can't be expected to provide urgent, ongoing security work indefinitely. He was right, and it wasn't a complaint about the code or about the people who use it. It's a structural problem: a foundational library used by the largest companies in the world should not depend on one person's goodwill. It's not the maintainer's fault, really. It's a failure of how the world funds (or rather, mostly doesn't fund) open source.
That's what Supported Source is trying to fix
Supported Source exists to fix exactly this problem. Companies are willing to pay for software that's reliable, supported, and secure. They do it all the time. What's been missing is a good way to do that with their dependencies, which for the most part are free open source projects.
SupXML is made by and released through Supported Source. Commercial use requires a paid license, while hobbyists and evaluators can use it for free. The revenue from companies flows back to the maintainers, so the work that everyone depends on is funded and supported.
That changes the incentives for the better. When maintainers get paid, security issues get fixed quickly because that's someone's job. New features and bug fixes get prioritized. The library stays healthy because it becomes a product with actual paid engineers behind it, like any other paid product a company might use.
SupXML is a fast, safe, modern XML library. But the reason it'll stay that way, and improve year after year, is that there's a paid business model behind it.
You can read more about SupXML at the docs page. And if you maintain a project companies depend on, get in touch. We'd love to help you get paid for it too.
Open source built the world. Supported Source helps the people who build it.