regolith

Jake Roggenbuck
Working at Capital One as a SWE Intern as well as Programming Language Research during the school year.

Regolith #

A server-side TypeScript and JavaScript library immune to Regular Expression Denial of Service (ReDoS) attacks by using Rust and linear Regex under the hood. Regolith has a linear worst case time complexity, compared to the default RegExp found in TypeScript and JavaScript, which has an exponential worst case.

Motivation: I wanted a Regex library for TypeScript and JavaScript where I didn’t have to worry about ReDoS attacks.

[!IMPORTANT] Regolith is still early in development! We need help building and getting developer adoption!

Drop-in Replacement #

Regolith attempts to be a drop-in replacement for RegExp and requires minimal (to no) changes to be used instead. The goal of Regolith is to allow developers to easily build software that is immune to ReDoS attacks.
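
As a minimal sketch of what "drop-in" can look like in practice, calling code can depend on a small structural type instead of on RegExp directly. The `PatternLike` interface and the `isValidUsername` function are hypothetical names used only for illustration. The sketch uses the built-in RegExp so it runs without extra dependencies, and it assumes a Regolith instance exposes the same `test(string)` method shown in the usage examples.

```typescript
// Hypothetical structural type: anything with a RegExp-style test() method.
interface PatternLike {
  test(input: string): boolean;
}

// Hypothetical validator that does not care which engine backs the pattern.
function isValidUsername(pattern: PatternLike, name: string): boolean {
  return pattern.test(name);
}

// The built-in RegExp satisfies the interface (no "g" flag, so test() is stateless).
const builtin: PatternLike = /^[a-z0-9_]{3,16}$/;

console.log(isValidUsername(builtin, "ferris_42")); // true
console.log(isValidUsername(builtin, "no spaces!")); // false
```

Under that assumption, switching engines would only change the line that constructs the pattern, while `isValidUsername` and its callers stay untouched.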

Preventing ReDoS Attacks #

What are ReDoS attacks? #

Regular Expression Denial of Service (ReDoS) attacks occur when vulnerable Regex patterns are executed with specifically constructed inputs that result in an inefficient execution. This can be exploited to cause services to become unavailable because the services are stuck trying to compute the inefficient Regex.
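
To make this concrete, here is a small sketch that times a classic vulnerable pattern against the built-in RegExp. The pattern `^(a+)+$`, the `timeMatch` helper, and the input sizes are illustrative choices, not taken from any specific CVE. Exact timings depend on the machine and JavaScript engine, so none are listed here, but on a backtracking engine the time typically grows sharply with each added pair of characters.

```typescript
// Nested quantifiers let a backtracking engine split the run of "a"
// characters in exponentially many ways before it can report a failure.
const vulnerable: RegExp = /^(a+)+$/;

// Hypothetical helper: time one failed match for an input of n "a"s plus "!".
function timeMatch(n: number): { matched: boolean; ms: number } {
  const input = "a".repeat(n) + "!"; // the trailing "!" forces the match to fail
  const start = Date.now();
  const matched = vulnerable.test(input);
  return { matched, ms: Date.now() - start };
}

// Sizes are kept small so the script still finishes quickly.
for (const n of [14, 16, 18, 20, 22]) {
  const { matched, ms } = timeMatch(n);
  console.log(`n=${n} matched=${matched} time=${ms}ms`);
}
```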

Read more: owasp.org & learn.snyk.io

Exponential Worst Case #

Python has an exponential increase in execution time for the worst case.

The same is true for TypeScript and JavaScript: both use RegExp, which has an exponential worst case.

Linear vs Exponential Regex Libraries #

This table shows popular languages and if their Regex library has a linear worst case or an exponential worst case. It also includes experimental results for how long execution took for a vulnerable Regex pattern that can be attacked with ReDoS and an input of size 30.

Note that TypeScript and JavaScript do not have a linear worst case for Regex, making them vulnerable to these types of attacks.

More information and images: Jake Roggenbuck - Preventing ReDoS Attacks - 2025

Impact #

[!NOTE]
ReDoS attacks happen relatively frequently to popular libraries costing millions of hours of work. This can be prevented with linear regex engines like Regolith.

These vulnerabilities happen relatively often in popular libraries. It’s no one’s fault specifically; it comes down to the fact that the language allows these things to happen.

A recent example of a ReDoS vulnerability is CVE-2025-5889 in brace-expansion. Again, this isn’t the fault of that project; it’s simply an issue with the language allowing this to happen. Measures can be put into place to reduce the risk, but these issues are hard to spot and test for.
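
One such measure, sketched here as an illustration and not taken from brace-expansion or Regolith, is to cap the input length before a backtracking engine ever sees it. The `testWithLimit` helper, the sample pattern, and the 254-character limit are hypothetical choices for this example.

```typescript
// Hypothetical guard: refuse oversized input instead of evaluating it.
function testWithLimit(pattern: RegExp, input: string, maxLength: number): boolean {
  if (input.length > maxLength) {
    return false; // treat oversized input as a non-match
  }
  return pattern.test(input);
}

// Illustrative pattern only; it is not meant as a complete email validator.
const emailish = /^[^@\s]+@[^@\s]+$/;

console.log(testWithLimit(emailish, "ferris@example.com", 254)); // true
console.log(testWithLimit(emailish, "a".repeat(10000) + "@x", 254)); // false
```

A cap like this bounds the damage but does not remove the exponential worst case, and every call site has to remember to apply it.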

The brace-expansion project is used by 42.5 million other projects on GitHub. If everyone were to patch their software (which they hopefully will), that would mean 42.5 million pull requests, roughly 42.5 million build minutes, and probably more than 42 million engineering minutes as well. All of that for a single vulnerability, and that’s just a lower bound on the effort required for everyone to keep their software patched.
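
For a sense of scale, the estimate above can be converted into hours and years. This is back-of-the-envelope arithmetic that takes the text's rough figures at face value: one build minute and one engineering minute per dependent project, with the engineering figure rounded to 42.5 million as an assumption.

```typescript
// Figures from the paragraph above; per-project costs are rough assumptions.
const dependentProjects = 42.5e6;
const buildMinutesPerProject = 1;
const engineeringMinutesPerProject = 1;

const totalMinutes =
  dependentProjects * (buildMinutesPerProject + engineeringMinutesPerProject);
const totalHours = totalMinutes / 60;
const totalYears = totalHours / (24 * 365);

console.log(Math.round(totalHours)); // 1416667
console.log(totalYears.toFixed(1)); // 161.7
```

That is about 1.4 million hours, or roughly 160 years of continuous machine and engineer time, for one vulnerability.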

Other versions of brace-expansion had these patches backported to them, needing updates for versions 1, 2, 3, and the current version 4.

Having a library or project that is immune to these vulnerabilities would save this effort for each project that adopted it, and would save the whole package ecosystem that effort if widely adopted. Adoption of libraries is difficult, especially when they aren’t very flashy, but helping library maintainers and engineers not worry about ReDoS for one library, one project at a time, is our goal.

Trade-off #

The Rust Regex library purposefully excludes features that make Regex engines particularly vulnerable to ReDoS attacks. Those features are backreferences and look-around. Excluding those features allows Regex to guarantee linear time execution.

Since Regolith uses bindings to the Rust Regex library to achieve a linear worst case, backreferences and look-around aren’t available in Regolith either.
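
Many uses of look-around can be rewritten with a capture group instead. The sketch below shows one such rewrite using the built-in RegExp so it runs without extra dependencies. The patterns and the `amountWithLookahead` and `amountWithCapture` helpers are hypothetical examples, and how Regolith reports an unsupported pattern is not covered here.

```typescript
// Look-ahead version: matches digits only when followed by " USD".
// This relies on look-around, which the Rust Regex library excludes.
const withLookahead = /\d+(?= USD)/;

// Rewrite without look-around: match the surrounding context and
// capture just the part that is wanted.
const withCapture = /(\d+) USD/;

function amountWithLookahead(text: string): string | null {
  const m = withLookahead.exec(text);
  return m ? m[0] : null; // the whole match is the amount
}

function amountWithCapture(text: string): string | null {
  const m = withCapture.exec(text);
  return m ? m[1] : null; // the first capture group is the amount
}

console.log(amountWithLookahead("total: 250 USD")); // 250
console.log(amountWithCapture("total: 250 USD")); // 250
```

The capture-group form consumes the trailing " USD" as part of the match, which matters if the code also relies on match positions or repeated matching.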

This trade-off has proven to be worth it for the Rust community of libraries and projects.

Results #

Since ReDoS vulnerabilities are hard to spot, there are rather frequent CVEs that get submitted. Having a Regex library that has a linear worst case time would completely prevent all of these potential issues for downstream projects.

Rust Regex under the hood #

Regolith provides JavaScript bindings (using napi-rs) that expose the features of the very popular Regex library for Rust. Initially, when I had the idea for this library, I wanted to implement my own linear time regex engine. While there is still a chance I end up doing that, I realized it’s better not to duplicate the work of the already excellent Regex library and to focus on making these JavaScript and TypeScript bindings the best they can be. The focus of this project is to deliver the best linear time regex engine to TypeScript and JavaScript as a drop-in replacement for the default RegExp.

My full appreciation goes out to the developers of the Rust Regex library, who enabled this project to exist.

Limitations #

I’m still working on getting this to link into client-side applications, such as when you use React with “use client”. This may mean either finding a new way to link Rust (possibly with WASM) or making my own Regex engine. Currently, everything works for server-side JavaScript and TypeScript, which is the main focus of this project, because servers rather than clients are what usually get attacked in ReDoS attacks.

You might get a message like: No loader is configured for ".node" files: node_modules/@regolithjs/regolith-linux-x64-gnu/regolith.linux-x64-gnu.node.

I am actively working to add this functionality, and it is tracked as issue #40. For now, I recommend using Regolith for server-side and systems applications, since servers are the usual targets of Denial of Service attacks.

Usage (Quick Start) #

1. Install #

npm i @regolithjs/regolith

2. Try it out #

import { Regolith } from '@regolithjs/regolith';

const pattern = new Regolith("^\\d+$");

pattern.test("12345");  // true
pattern.test("Hello");  // false

Examples #

A simple pattern-matching example that matches "crab" in the sentence "my crab ferris".

import { Regolith } from '@regolithjs/regolith';

const pattern = new Regolith('crab', 'g');
console.log(pattern.test('my crab ferris')); // true

Here the 'g' flag passed to the Regolith constructor marks the pattern as global.

Match method #

const sentence = 'crab, snail, crab';
const crabPattern = new Regolith('crab', 'g');

// Find all matches
console.log(crabPattern.match(sentence));
// Output: ['crab', 'crab']

Replace method #

const sentence = 'crab, snail, crab';
const crabPattern = new Regolith('crab', 'g');

// Replace all occurrences
console.log(crabPattern.replace(sentence, 'snake'));
// Output: 'snake, snail, snake'

Search method #

const sentence = 'crab, snail, crab';

const snailPattern = new Regolith('snail');
console.log(snailPattern.search(sentence));
// Output: 6 (index where 'snail' is found)

Split method #

const splitPattern = new Regolith('[,\\|]');
console.log(splitPattern.split('apple,banana|orange'));
// Output: ['apple', 'banana', 'orange']

Express Example #

Build a backend Express app that tells you whether a value is an int or a float. View the complete guide for the Express example.

import express from "express";
import { Regolith } from "@regolithjs/regolith";

const app = express();
const port = 3000;

// Create Regolith Regex patterns
const intPattern = new Regolith("^\\d+$");
const floatPattern = new Regolith("^\\d*\\.\\d+$");

app.get("/check", (req, res) => {
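    const value = req.query.value;

    // Express may parse a query value as an array or object, so accept only non-empty strings
    if (typeof value !== "string" || value === "") {
        return res.status(400).send("Please provide a value query parameter");
    }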

    // Run the test with Regolith pattern
    const isInt = intPattern.test(value);
    const isFloat = floatPattern.test(value);

    res.json({
        value,
        isInt,
        isFloat,
    });
});

app.listen(port, () => {
    console.log(`Example app listening on port ${port}`);
});

Development #

These are instructions only if you want to build this library yourself (e.g. for development).

Building #

  1. Before you build, you will need to have yarn installed. Here is a guide for installing yarn.
  2. You will also need Rust, and you can install Rust with rustup.

To build the project, use this command:

yarn build

Running yarn build will build the Rust package, and you should see the Rust compiler complete the build process.

Running #

Now we can test to see if Regolith was built correctly. We can open the node REPL and load the library.

node

After opening the shell, you can load the library with:

const { Regolith } = await import("./index.js");

After that, you can use Regolith as normal.

const integerPattern = new Regolith("^\\d+$");
integerPattern.test("123");

[Image: an example of running Regolith in the REPL to test if it built correctly]

Testing #

Testing the TS/JS library #

yarn test

You should see the tests complete. Currently, there are 93 tests that get run.

These tests can be found in the __test__ directory.

Testing the Rust bindings #

cargo test

[Image: what the output of cargo test should look like]

These tests can be found in the source files in src/lib.rs.

Publishing Checklist #

  1. Increment the version in package.json
  2. All changes are merged into main
  3. Run the tests with yarn test
  4. Run npm login
  5. Run npm publish

Docs #

1. Important Files #

| name | purpose | docs |
| --- | --- | --- |
| build.rs | Runs the setup for napi-rs | |
| Cargo.lock | Automatically generated by Cargo to keep track of Rust package versions | |
| Cargo.toml | Contains information about the Rust crate, like the name, version, and dependencies | |
| index.d.ts | Type information automatically generated by napi-rs | |
| index.js | The main entry point for the library, automatically generated by napi-rs | |
| package.json | Information about the Regolith package | |
| rustfmt.toml | A config for the Rust formatter | |
| yarn.lock | Automatically generated by yarn to keep track of dependency versions | |
| .npmignore | Keeps files and directories out of what is shipped in the library | Link |
| .yarnrc.yml | Configures yarn settings | |

2. Formatting #

2.1 Rust Format #

Use cargo fmt. This is actually checked in the automated tests when you create a pull request. You can also see rustfmt.toml for the config for cargo fmt.

2.2 TypeScript / JavaScript Format #

Use prettier with prettier --write <file> or prettier --write . to format all .ts and .js files.

3. Website #

The source code for the Regolith website can be found at github.com/JakeRoggenbuck/regolith-website. The URL for the website is regolithjs.com.

4. Platforms Tested #

These are the platforms that Regolith has been tested on. These checks happen automatically in the CI.

| Platform | Status |
| --- | --- |
| Arm 64 Apple Darwin | Working |
| Arm 64 Linux Android | Working |
| Arm 64 Linux GNU | Working |
| Arm 64 Linux MUSL | Working |
| Arm 64 PC Windows MSVC | Working |
| Arm v7 Linux GNU | Working |
| Arm v7 Linux MUSL | Working |
| x86-64 Linux MUSL | Working |
| x86-64 FreeBSD | Not Tested |
| i686 PC Windows MSVC | Working |
| Arm v7 Linux Android | Working |
| Universal Apple Darwin | Working |
| RISC-V 64 GC Linux GNU | Working |

Report a Bug #

If you find a bug, please send me an email at bug at jr0 dot org and/or open an issue.

Name Origin #

When trying to think of words that start with reg, I thought of the word regolith, which describes the top layer of a planet that’s made of dust and rock. I likely became familiar with this word in a class I took about dinosaurs.

Motivation and Background #

I was initially inspired to build this library after doing undergraduate research to learn more about why certain languages have problems with ReDoS and others don’t. This led me to a question I couldn’t answer: “Why isn’t the most popular Regex library a linear time engine for languages like TypeScript, JavaScript, and Python?” You’d think that a library that cannot be attacked in this common way would be more widely used.

I found an example called regexy in Python, but there hasn’t been an update in 8 years, and it was archived in 2024. There is also rure-python, but this has not been updated in 6 years either.

JavaScript has some more popular libraries that address this issue too. One is called re2js, which wraps Google’s RE2 library written in C++. re2js has a different API than the default RegExp from JavaScript, requiring some reworking of code that needs to be migrated over. The other is called node-re2, and this library also provides bindings for Google’s RE2 library. node-re2 does have an API similar to JavaScript’s RegExp.

Even with these libraries, anecdotally it feels like the vast majority of projects still use the default regex engine for their respective languages, engines that are vulnerable to ReDoS attacks. I could not find an exact percentage for how many projects use linear time engines vs exponential engines, so this is something to either find in the literature or try to answer directly by reviewing published packages and calculating a percentage.

Ultimately, I wanted a Regex library that is a drop-in replacement for RegExp in TypeScript and JavaScript where I didn’t have to worry about ReDoS attacks. My hope is that this library brings value to your software as well.