dragostis/chili

Rust port of Spice, a low-overhead parallelization library

dragostis/chili.json
{
  "createdAt": "2024-08-27T18:12:53Z",
  "defaultBranch": "main",
  "description": "Rust port of Spice, a low-overhead parallelization library",
  "fullName": "dragostis/chili",
  "homepage": "",
  "language": "Rust",
  "name": "chili",
  "pushedAt": "2025-05-05T14:10:57Z",
  "stargazersCount": 699,
  "topics": [],
  "updatedAt": "2026-01-06T21:49:02Z",
  "url": "https://github.com/dragostis/chili"
}

Rust port of [Spice], a low-overhead parallelization library

chili is a very low-overhead parallelization primitive, almost identical to [rayon::join]. At any fork point during computation, it may run the two closures passed to it in parallel.

It works best when there are many small computations and when it is expensive to estimate how much work is left on the current branch, an estimate that would otherwise be needed to decide when to stop trying to share work across threads.

The following example sums up all nodes in a binary tree in parallel.

    fn sum(node: &Node, scope: &mut Scope<'_>) -> u64 {
        let (left, right) = scope.join(
            |s| node.left.as_deref().map(|n| sum(n, s)).unwrap_or_default(),
            |s| node.right.as_deref().map(|n| sum(n, s)).unwrap_or_default(),
        );
        node.val + left + right
    }

This is the ideal example since per-node computation is very cheap and the nodes don’t keep track of how many descendants are left.
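To make the contrast concrete, here is a minimal sketch of the same tree sum written without chili, using only the standard library. The `Node` layout is an assumption inferred from the `as_deref()` calls in the example above, and `tree`, `sum_seq`, and `sum_cutoff` are hypothetical names introduced for illustration. The `depth` parameter is the kind of manual bookkeeping that a join primitive like chili's is meant to make unnecessary. This is not how chili or Rayon are implemented.

```rust
use std::thread;

// Hypothetical node layout, inferred from the `as_deref()` calls in the README example.
struct Node {
    val: u64,
    left: Option<Box<Node>>,
    right: Option<Box<Node>>,
}

// Builds a balanced tree with `layers` levels (2^layers - 1 nodes), every value set to 1.
fn tree(layers: u32) -> Option<Box<Node>> {
    if layers == 0 {
        return None;
    }
    Some(Box::new(Node {
        val: 1,
        left: tree(layers - 1),
        right: tree(layers - 1),
    }))
}

// Sequential baseline: plain recursion, no parallelism.
fn sum_seq(node: &Node) -> u64 {
    let left = node.left.as_deref().map(sum_seq).unwrap_or_default();
    let right = node.right.as_deref().map(sum_seq).unwrap_or_default();
    node.val + left + right
}

// Manual fork-join: while `depth` is non-zero, sum the left subtree on a freshly
// spawned scoped thread and the right subtree on the current thread. Once `depth`
// reaches zero, fall back to the sequential version.
fn sum_cutoff(node: &Node, depth: u32) -> u64 {
    if depth == 0 {
        return sum_seq(node);
    }
    let (left, right) = thread::scope(|s| {
        let handle = s.spawn(|| {
            node.left
                .as_deref()
                .map(|n| sum_cutoff(n, depth - 1))
                .unwrap_or_default()
        });
        let right = node
            .right
            .as_deref()
            .map(|n| sum_cutoff(n, depth - 1))
            .unwrap_or_default();
        (handle.join().expect("worker thread panicked"), right)
    });
    node.val + left + right
}
```

For a tree built with `tree(10)`, both `sum_seq(&root)` and `sum_cutoff(&root, 3)` return 1023. Choosing a good `depth` requires knowing the shape of the tree in advance, which is exactly the information the nodes in this example do not carry.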

The following benchmarks measure the time it takes to sum up all the values in a balanced binary tree with a varying number of nodes.

While the improvement over the baseline in the 134M nodes case is close to the theoretical maximum, it is worth noting that the actual time per node is 0.8 ns versus a theoretical 1.8 / 8 = 0.2 ns, where 1.8 ns is the baseline time per node in the 1K nodes case and 8 is the number of cores.

| Number of nodes | Baseline | Rayon | chili | Baseline / chili |
| --- | --- | --- | --- | --- |
| 1023 | 1.8 µs | 51.1 µs | 3.4 µs | x0.53 |
| 16777215 | 94.4 ms | 58.1 ms | 13.6 ms | x6.94 |
| 134217727 | 797.5 ms | 497.2 ms | 101.8 ms | x7.83 |
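
The per-node figures quoted above can be reproduced from the table with a little arithmetic. The sketch below assumes perfect scaling across 8 cores for the ideal case. The helper names `ns_per_node` and `per_node_times` are hypothetical and exist only for this illustration.

```rust
// Average time per node in nanoseconds, given a total time in nanoseconds.
fn ns_per_node(total_ns: f64, nodes: u64) -> f64 {
    total_ns / nodes as f64
}

// Returns (measured, ideal) time per node in nanoseconds, using the table values.
fn per_node_times() -> (f64, f64) {
    // chili on 134,217,727 nodes took 101.8 ms.
    let measured = ns_per_node(101.8e6, 134_217_727);
    // The baseline on 1,023 nodes took 1.8 µs. Dividing the per-node time by 8
    // assumes the work splits perfectly across 8 cores.
    let ideal = ns_per_node(1.8e3, 1_023) / 8.0;
    (measured, ideal)
}
```

This gives roughly 0.76 ns measured against roughly 0.22 ns ideal, which the text rounds to 0.8 ns and 0.2 ns.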

| Number of nodes | Baseline | Rayon | chili | Baseline / chili |
| --- | --- | --- | --- | --- |
| 1023 | 1.6 µs | 29.2 µs | 3.5 µs | x0.46 |
| 16777215 | 39.4 ms | 40.5 ms | 11.2 ms | x3.51 |
| 67108863 | 156.5 ms | 167.1 ms | 44.3 ms | x3.53 |

chili overhead on AMD Ryzen 7 4800HS (8 cores)

The overhead in the 1K nodes case remains approximately constant with respect to the number of threads.

| Number of nodes | Baseline | 1 thread | 2 threads | 4 threads | 8 threads |
| --- | --- | --- | --- | --- | --- |
| 1023 | 1.8 ns | 3.5 ns | 3.5 ns | 3.5 ns | 3.5 ns |

[Spice]: https://github.com/judofyr/spice
[rayon::join]: https://docs.rs/rayon/latest/rayon/fn.join.html