Rust sort_unstable_by with more complex closure unexpectedly shrunk binary
Models agree on
- ✓`sort_unstable_by` is a generic function that undergoes monomorphisation
- ✓Simple closures are often inlined, leading to code duplication
- ✓Complex closures may not be inlined, resulting in shared code and smaller binaries
- ✓Higher optimization levels increase inlining, potentially bloating binaries
- ✓LTO can deduplicate code and reduce binary size
- ✓Tools like `cargo bloat` help diagnose binary size issues
The unexpected binary size reduction when using sort_unstable_by with a more complex closure is a direct result of Rust's compilation and optimization behavior. Here's why:
- Generic Monomorphisation: `sort_unstable_by` is a generic function, so the compiler generates a separate copy of the sorting algorithm for each distinct closure type it encounters. This is known as monomorphisation, and because every closure has its own unique type, it happens whether the closure is simple or complex.
- Inlining Heuristics: What differs is how much code the optimizer (LLVM) inlines. Simple closures (small, no captures, trivial operations) are often inlined into every comparison inside their sort instance, and the instance itself may be inlined into its caller, so the same code is duplicated many times. More complex closures (with captures, additional operations, or a larger body) may not be inlined: the comparator body is then emitted once and reached through ordinary calls, which can reduce binary size.
- Impact on Binary Size:
  - Inlined Closures: Each unique closure type produces its own copy of the sorting algorithm with the comparator folded in (roughly 1 KB per closure, by this answer's estimate). Multiple call sites with different closures can significantly bloat the binary.
  - Out-of-Line Closures: The comparator is called rather than copied into the sort code, so less code is duplicated and mostly the comparator itself is added (roughly a few hundred bytes, by the same estimate). This can lead to a smaller binary, especially with multiple call sites.
- Optimization and LTO:
  - Optimization Level (`opt-level`): Higher optimization levels (e.g., `opt-level=3`) inline more aggressively, potentially increasing binary size. Lower levels (`opt-level=0` or `1`) inline less, although unoptimized code is not necessarily smaller overall.
  - Link-Time Optimization (LTO): Enabling LTO (`-C lto=yes`) lets the optimizer work across crate boundaries, which can merge duplicate generic instantiations and remove unused code, often reducing binary size.
- Practical Recommendations:
  - For Smaller Binaries: Use comparators that stay out of line (e.g., a standalone `fn` or a closure with captures) to avoid duplication. Enable LTO or use `opt-level=z` for size optimization.
  - For Maximum Speed: Keep closures small and inlineable. Mark a named comparator function `#[inline(always)]` if necessary.
  - Diagnostics: Tools like `cargo bloat` or `cargo size` can help identify binary size contributors. Compiling with `--emit=llvm-ir` allows inspection of the LLVM IR for inlining decisions.
Example:

```rust
// Likely inlined (larger binary)
v.sort_unstable_by(|a, b| a.cmp(b));

// May stay out of line (smaller binary)
let factor = 7;
v.sort_unstable_by(|a, b| (a * factor).cmp(&(b * factor)));
```

The first example may end up with the comparator and sort code duplicated through inlining, while the second may keep its comparator out of line. Which one the optimizer picks depends on the compiler version and flags, so measure rather than assume.
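One way to check the monomorphisation claim, and to force comparators to share code, is sketched below. The helpers `comparator_type` and `sort_shared` and the alias `Cmp` are hypothetical names introduced for illustration. The sketch relies on two facts: every closure expression has its own type, and all functions with the same signature coerce to one function-pointer type.

```rust
use std::any::TypeId;
use std::cmp::Ordering;

// Hypothetical helper: identifies the concrete type that would be
// substituted for `F` in `sort_unstable_by::<F>`.
fn comparator_type<F: 'static>(_: &F) -> TypeId {
    TypeId::of::<F>()
}

fn ascending(a: &i32, b: &i32) -> Ordering {
    a.cmp(b)
}

fn descending(a: &i32, b: &i32) -> Ordering {
    b.cmp(a)
}

// Function-pointer type shared by every comparator with this signature.
type Cmp = fn(&i32, &i32) -> Ordering;

// Hypothetical non-generic wrapper: every caller goes through the
// `sort_unstable_by::<Cmp>` instantiation, trading an indirect call per
// comparison for less generated code.
fn sort_shared(v: &mut [i32], cmp: Cmp) {
    v.sort_unstable_by(cmp);
}

fn main() {
    // Two textually identical closures still have two distinct types,
    // so each one requests its own sort instantiation.
    let c1 = |a: &i32, b: &i32| a.cmp(b);
    let c2 = |a: &i32, b: &i32| a.cmp(b);
    assert_ne!(comparator_type(&c1), comparator_type(&c2));

    // Coerced to function pointers, different comparators share one type.
    let p1: Cmp = ascending;
    let p2: Cmp = descending;
    assert_eq!(comparator_type(&p1), comparator_type(&p2));

    let mut v = vec![3, 1, 2];
    sort_shared(&mut v, ascending);
    assert_eq!(v, [1, 2, 3]);
    sort_shared(&mut v, descending);
    assert_eq!(v, [3, 2, 1]);
}
```

The function-pointer route usually prevents the comparator from being inlined into the sort, so it tends to cost some speed. Whether it actually shrinks a given binary is worth confirming with `cargo bloat`.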
In summary, the binary size reduction is plausible and stems from how Rust handles generics and inlining. A more complex closure may not be inlined, so less code is duplicated and the binary can end up smaller. To control size, manage inlining behavior through closure design or compiler flags.
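For the compiler-flag side, a size-oriented release profile might look like the fragment below. This is a sketch that assumes a Cargo project; the values are starting points to measure against, not tuned settings.

```toml
[profile.release]
opt-level = "z"   # optimize for size rather than speed
lto = true        # whole-program ("fat") link-time optimization
```

Comparing `cargo bloat --release` output before and after the change shows whether the profile helps for a particular crate.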