Sridhar Ratnakumar 2026-03-27 14:11:48 -04:00
parent f9d41fdd65
commit bdce9d338e

@@ -5,205 +5,216 @@ description: Optimize Nix flake evaluation and `nix develop` startup time. Use w
# Nix Flake Evaluation Optimization
Systematic approach to reducing `nix develop` / `nix-shell` startup time.
Systematic approach to reducing `nix develop` startup time.
## Workflow
This is an iterative optimization loop. For each optimization:
Iterative optimization loop. For each change:
1. **Measure** baseline (3 samples, cold eval cache)
1. **Measure** baseline (5 samples, isolated `$HOME` to avoid cache interference)
2. **Change** one thing
3. **Measure** again
4. **Commit** with measurements in the commit message
4. **Commit** with before/after measurements in the commit message
5. **Push** to the PR branch
6. Repeat
6. **CI** after each significant change
7. Repeat
### Setup
Create a branch and open a **draft PR** before starting. Each optimization gets its own commit with before/after measurements in the message body.
Create a branch and open a **draft PR** before starting. Each optimization gets its own commit.
### Measurement
Use isolated `$HOME` so measurements don't destroy the developer's caches:
```bash
git checkout -b optimize/nix-develop-time
git push -u origin optimize/nix-develop-time
# Create draft PR (use github-pr skill for title/body)
for i in 1 2 3 4 5; do
rm -rf /tmp/nix-bench/home && mkdir -p /tmp/nix-bench/home
{ time HOME=/tmp/nix-bench/home nix develop -c echo "s$i" 2>/dev/null; } 2>&1 | grep real
done
```
Also measure warm (daemon-cached) performance — just run without resetting `$HOME`:
```bash
# warm up first
HOME=/tmp/nix-bench/home nix develop -c echo warmup 2>/dev/null
for i in 1 2 3; do
{ time HOME=/tmp/nix-bench/home nix develop -c echo "s$i" 2>/dev/null; } 2>&1 | grep real
done
```
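To turn the `real` lines into a single number, a small helper can average them. This is a minimal sketch, assuming bash's default `time` format (`real 0m7.350s`) and a `.` decimal separator; `avg_real_seconds` is a hypothetical name, not part of Nix:

```bash
#!/usr/bin/env bash
# Hypothetical helper: average the "real" lines printed by bash's `time`.
# Reads stdin; lines that do not start with "real" are ignored.
avg_real_seconds() {
  awk '
    $1 == "real" {
      # $2 looks like "0m7.350s": minutes before "m", seconds before "s".
      split($2, parts, "m")
      sub(/s$/, "", parts[2])
      total += parts[1] * 60 + parts[2]
      n++
    }
    END {
      if (n == 0) { print "no samples"; exit 1 }
      printf "%d samples, avg %.2fs\n", n, total / n
    }
  '
}
```

Usage: pipe a measurement loop into it, e.g. `for i in 1 2 3; do ...; done | avg_real_seconds`.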
### Commit Format
Use conventional commits. Include measurements in every commit:
```
perf: switch devShell from nix develop to nix-shell
`nix develop` pays ~7s in flake fetcher-cache verification.
`nix-shell` with fetchTarball skips this entirely.
Measurement (3 samples, cold eval cache / warm store):
Before: 7.35s, 7.22s, 7.20s (avg ~7.3s)
After: 2.34s, 2.34s, 2.36s (avg ~2.35s)
```
### CI
Run CI after each significant change to catch regressions (especially `nix flake lock` silently bumping nixpkgs — see Pitfalls).
## Measurement Protocol
Always measure before and after each change. The metric is wall-clock time with cold eval cache:
```bash
# Cold eval cache, warm nix store (typical dev scenario)
rm -rf ~/.cache/nix; time nix develop -c echo "test"
# Take 3 samples for stability
for i in 1 2 3; do rm -rf ~/.cache/nix; { time nix develop -c echo "s$i" 2>/dev/null; } 2>&1 | grep real; done
```
Profile with `NIX_SHOW_STATS=1 nix eval ...` to get eval stats (function calls, thunks, set operations).
Use conventional commits. Include measurements in every commit body.
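The `Before:` / `After:` lines for the commit body can be generated rather than typed by hand. A minimal sketch, assuming samples are passed in seconds; `fmt_measurement` is a hypothetical helper name:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print one measurement line for a commit body.
# Usage: fmt_measurement LABEL SAMPLE...   (samples in seconds)
fmt_measurement() {
  local label=$1
  shift
  [ "$#" -gt 0 ] || return 1   # need at least one sample
  local list="" avg s
  for s in "$@"; do
    list+="${list:+, }${s}s"   # join samples as "7.35s, 7.22s, ..."
  done
  avg=$(printf '%s\n' "$@" | awk '{ t += $1 } END { printf "%.2f", t / NR }')
  printf '%s: %s (avg ~%ss)\n' "$label" "$list" "$avg"
}
```

Usage: `fmt_measurement Before 7.35 7.22 7.20` and `fmt_measurement After 2.34 2.34 2.36`, pasted under a `Measurement (...)` line in the commit body.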
## Where Time Goes
`nix develop` time breaks down into:
1. **Flake fetcher-cache verification** (~1.5s per input) — nix verifies each flake input against its fetcher cache even when nothing changed. This is the #1 bottleneck.
2. **Nixpkgs evaluation** (~0.2-0.5s) — importing and evaluating the nixpkgs tree.
3. **Package resolution** (~0.5-1.5s) — resolving specific packages (playwright-driver is notably expensive at ~0.4s).
4. **Shell environment realization** (~0.5-0.7s) — nix-shell/nix-develop daemon overhead for building the shell env derivation.
5. **Flake-parts / module system** — adds eval overhead but is NOT the main bottleneck despite appearances.
1. **Flake fetcher-cache verification** (~1.5s per input) — nix re-verifies each input on every cold invocation. This is the #1 bottleneck.
2. **Nixpkgs evaluation** (~0.2-0.5s) — importing the nixpkgs tree.
3. **Package resolution** (~0.5-1.5s) — resolving specific packages.
4. **Shell env realization** (~0.3-0.7s) — daemon overhead for building the shell derivation.
### Key Insight
The **fetcher-cache verification** dominates. Even a flake with a single `nixpkgs` input costs ~7s on cold eval cache. Each additional input adds ~1.5s. This cost is inherent to `nix develop` and cannot be optimized away within the flake framework.
With **zero flake inputs**, `nix develop` is fast: ~2.6s cold, ~0.3s warm (daemon eval cache). Each flake input adds ~1.5s. A typical flake with 4+ inputs (nixpkgs, flake-parts, git-hooks, systems) costs 7-18s.
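As a back-of-the-envelope check, the figures above can be read as a linear model: a base cost plus a fixed cost per input. This is a rough sketch under that assumption, not a measured law; `estimate_cold_seconds` is a hypothetical helper, and the defaults are the approximate numbers quoted above:

```bash
#!/usr/bin/env bash
# Hypothetical helper: rough cold-start estimate for `nix develop`.
# Usage: estimate_cold_seconds INPUTS [BASE_SECONDS] [SECONDS_PER_INPUT]
# Defaults (2.6s base, 1.5s per input) are the approximate figures from this document.
estimate_cold_seconds() {
  local inputs=$1 base=${2:-2.6} per_input=${3:-1.5}
  awk -v n="$inputs" -v b="$base" -v p="$per_input" \
    'BEGIN { printf "%.1f\n", b + n * p }'
}

estimate_cold_seconds 0   # zero-input flake
estimate_cold_seconds 4   # nixpkgs, flake-parts, git-hooks, systems
```

With four inputs the model gives 2.6 + 4 × 1.5 = 8.6s, at the lower end of the 7-18s range reported for typical flakes; real timings will vary with the machine and the inputs.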
## Optimization Tiers
## The Fix: Zero-Input Flake
### Tier 1: Reduce Flake Inputs (moderate gain)
**Import nixpkgs via `fetchTarball` instead of as a flake input.** This eliminates all fetcher-cache overhead while keeping `nix develop` as the interface.
Remove unnecessary flake inputs. Each removed input saves ~1.5s on cold eval cache.
### Architecture
Common removable inputs:
- **flake-parts** → replace with a simple `eachSystem` helper (3 lines of nix)
- **systems** → inline the systems list
- **git-hooks.nix** → use a static `.pre-commit-config.yaml` with tools from the devShell
- **process-compose-flake** → use just's `[parallel]` attribute or shell-based concurrency
```nix
# Replace flake-parts with:
eachSystem = f: nixpkgs.lib.genAttrs systems (system: f nixpkgs.legacyPackages.${system});
```
nix/nixpkgs.nix ← single nixpkgs pin (fetchTarball)
default.nix ← all packages + shared koluEnv
shell.nix ← devShell (imports default.nix, accepts { pkgs } arg)
flake.nix ← zero inputs, imports the above, exports packages + devShells
```
### Tier 2: Switch from `nix develop` to `nix-shell` (large gain)
### Source management with npins
`nix-shell` with `fetchTarball` bypasses flake fetcher-cache entirely. Typical improvement: **7s → 2.5s**.
Use [npins](https://github.com/andir/npins) to manage all fetched sources (nixpkgs, GitHub repos, etc.). This replaces hardcoded `fetchTarball`/`fetchFromGitHub` calls with a single `npins/sources.json`.
Create a `shell.nix` alongside the flake:
```nix
# nix/nixpkgs.nix — single source of truth for the nixpkgs pin
import (fetchTarball {
url = "https://github.com/NixOS/nixpkgs/archive/<REV>.tar.gz";
sha256 = "<NARHASH>";
})
```bash
npins init --bare
npins add github nixos nixpkgs --branch nixpkgs-unstable --at <REV>
npins add github owner repo --branch main --at <REV>
npins update # bump all sources
npins update nixpkgs # bump just nixpkgs
```
### `nix/nixpkgs.nix` — Single source of truth
```nix
# shell.nix
let pkgs = import ./nix/nixpkgs.nix { };
in pkgs.mkShell { ... }
# Managed by npins. To update: npins update nixpkgs
let sources = import ../npins;
in import sources.nixpkgs
```
Keep `flake.nix` as a thin compat wrapper for CI and downstream consumers:
Other nix files can also use `sources`:
```nix
# flake.nix — zero inputs, imports nix/nixpkgs.nix directly
# nix/some-dep/default.nix
{ pkgs }:
let sources = import ../../npins;
in pkgs.runCommand "foo" {} ''
cp -r ${sources.some-repo}/bar $out
''
```
Note: npins handles GitHub repos and tarballs. Plain `fetchurl` (e.g. font files from CDNs) stays hardcoded.
### `flake.nix` — Zero inputs
```nix
# IMPORTANT: This flake intentionally has ZERO inputs.
#
# nixpkgs is imported via fetchTarball in nix/nixpkgs.nix, bypassing the
# flake input system. Each flake input adds ~1.5s of fetcher-cache
# verification. With zero inputs, `nix develop` cold is ~2.6s, warm ~0.3s.
#
# DO NOT add flake inputs (nixpkgs, flake-parts, git-hooks, etc.).
# Instead, use fetchTarball or callPackage in nix/ files.
{
outputs = { self, ... }:
let
eachSystem = f: builtins.listToAttrs (map (system: {
name = system;
value = f (import ./nix/nixpkgs.nix { inherit system; });
}) [ "x86_64-linux" "aarch64-darwin" ]);
in {
packages = eachSystem (pkgs: import ./default.nix { inherit pkgs; });
systems = [ "x86_64-linux" "aarch64-darwin" ];
eachSystem = f: builtins.listToAttrs (map
(system: {
name = system;
value = f (import ./nix/nixpkgs.nix { inherit system; });
})
systems);
commitHash = self.shortRev or self.dirtyShortRev or "dev";
in
{
packages = eachSystem (pkgs:
let all = import ./default.nix { inherit pkgs commitHash; };
in removeAttrs all [ "koluEnv" ]); # koluEnv is not a derivation
devShells = eachSystem (pkgs:
{ default = import ./shell.nix { inherit pkgs; }; });
};
}
```
Update `.envrc` from `use flake` to `use nix`.
### `shell.nix` — Shared devShell
Update justfile's nix_shell prefix to use a wrapper script instead of `nix develop -c`.
### Tier 3: Environment Caching (massive gain)
Cache the full shell environment (`export -p`) keyed on content hashes of nix input files. Typical improvement: **2.5s → 0.015s**.
Create a `nix-shell-fast` wrapper script:
```bash
#!/usr/bin/env bash
# nix-shell-fast — cached nix-shell replacement (~0.015s vs ~2.5s)
#
# Runs a command inside the shell.nix environment, caching the full set of
# exported env vars on first use. Subsequent calls eval the cache and exec
# directly, skipping nix entirely.
#
# Cache invalidation:
# Content hash of listed nix files. Any byte change → cache miss → re-eval.
# IMPORTANT: add new .nix files to CACHE_KEY when shell.nix/default.nix imports them.
#
# Known limitations:
# - shellHook side-effects (pre-commit install, symlinks) only run on cache miss.
# - Only one store path checked for GC validity. rm cache dir to force re-eval.
set -euo pipefail
DIR="$(cd "$(dirname "$0")" && pwd)"
CACHE_DIR="${XDG_CACHE_HOME:-$HOME/.cache}/<project>-shell"
CACHE_KEY=$(cat "$DIR/shell.nix" "$DIR/default.nix" "$DIR/nix/nixpkgs.nix" \
<other nix files> 2>/dev/null | sha256sum | cut -d' ' -f1)
ENV_FILE="$CACHE_DIR/$CACHE_KEY.env"
cache_valid() {
[[ -f "$ENV_FILE" ]] || return 1
local store_path
store_path=$(grep '^declare -x PATH=' "$ENV_FILE" | grep -o '/nix/store/[^/:]*' | head -1)
[[ -n "$store_path" ]] && [[ -e "$store_path" ]]
```nix
{ pkgs ? import ./nix/nixpkgs.nix { } }:
let packages = import ./default.nix { inherit pkgs; };
in pkgs.mkShell {
name = "my-shell";
# Use mkShell's env attr — no duplicate export lines
env = packages.koluEnv // { ... };
shellHook = ''...'';
packages = with pkgs; [ ... ];
}
if ! cache_valid; then
mkdir -p "$CACHE_DIR"
find "$CACHE_DIR" -maxdepth 1 -type f -not -name "$CACHE_KEY.*" -delete 2>/dev/null || true
nix-shell "$DIR/shell.nix" --run 'export -p' 2>/dev/null | grep '^declare -x ' > "$ENV_FILE.tmp"
mv "$ENV_FILE.tmp" "$ENV_FILE"
fi
eval "$(cat "$ENV_FILE")"
exec "$@"
```
Then in justfile:
### DRY shared env vars
Define env vars once in `default.nix` as an attrset, use in both the build derivation's `env` and `mkShell`'s `env`:
```nix
# In default.nix
koluEnv = {
KOLU_THEMES_JSON = "${ghosttyThemes}/themes.json";
KOLU_FONTS_DIR = "${fonts}";
};
kolu = pkgs.stdenv.mkDerivation {
env = { npm_config_nodedir = nodejs; } // koluEnv;
...
};
```
**Important:** `koluEnv` is not a derivation — filter it out of flake `packages` output with `removeAttrs` or devour-flake will fail trying to build it.
### Replacements for common flake inputs
- **flake-parts** → `eachSystem` helper (3 lines)
- **systems** → inline `[ "x86_64-linux" "aarch64-darwin" ]`
- **git-hooks.nix** → static `.pre-commit-config.yaml` + tools in devShell
- **process-compose-flake** → just's `[parallel]` attribute
### `.envrc`
```
use flake
```
### justfile
```just
nix_shell := if env('IN_NIX_SHELL', '') != '' { '' } else { justfile_directory() + '/nix-shell-fast' }
nix_shell := if env('IN_NIX_SHELL', '') != '' { '' } else { 'nix develop path:' + justfile_directory() + ' -c' }
```
## Why Not nix-shell or env caching?
We benchmarked all approaches with a zero-input flake:
| Approach | Cold | Warm |
|---|---|---|
| `nix develop` (zero inputs) | **2.6s** | **0.3s** |
| `nix-shell` (fetchTarball) | 2.4s | 2.4s (no daemon cache) |
| env caching script | 5s miss | 0.014s hit |
`nix develop` wins: 0.3s warm with zero maintenance. `nix-shell` has no daemon eval cache so it's always 2.4s. Env caching (nix-shell-fast) achieves 0.014s but introduces staleness bugs, manual cache key maintenance, and shellHook side-effects not re-running — complexity not worth the gain.
## Pitfalls
- **`nix flake lock` silently bumps inputs** — when you remove inputs and run `nix flake lock`, nix may update remaining inputs to latest. Always verify the nixpkgs rev matches what was in the original flake.lock. Pin explicitly. Run CI after any flake.lock change.
- **`fetchTarball` sha256 format** — use the `narHash` from flake.lock (SRI format `sha256-...`), not the base32 nix hash.
- **Cache key completeness** — include ALL files that affect the shell evaluation: shell.nix, default.nix, nix/nixpkgs.nix, and any files imported by callPackage (fonts, themes, etc.). When adding a new .nix file that shell.nix imports, add it to the CACHE_KEY line.
- **macOS compatibility**: `sha256sum` exists on NixOS but may be missing on stock macOS. If targeting macOS without nix in PATH, use `shasum -a 256` as a fallback.
- **nix-shell `export -p` output** — contains shellHook stdout (e.g., "Sourcing pytest-check-hook"). Filter with `grep '^declare -x '`.
- **Store path validation** — after `nix-collect-garbage`, cached env vars may point to deleted store paths. Validate at least one store path from PATH before using the cache.
- **shellHook side-effects** — the env cache stores the *result* of shellHook (env vars), not the commands. Things like `pre-commit install` or `ln -sfn` only run on cache miss. To force: `rm -rf ~/.cache/<project>-shell`.
- **`nix flake lock` silently bumps inputs** — when removing inputs, nix may update remaining ones to latest. Always verify the nixpkgs rev matches master's. Run CI after any lock change.
- **`fetchTarball` sha256** — use SRI format (`sha256-...`). With npins, hashes are managed automatically.
- **npins `default.nix` is auto-generated** — don't edit it manually; `npins` overwrites it. Mark it in `.gitattributes` as `linguist-generated`.
- **Non-derivation in packages** — if `default.nix` exports non-derivations (like a `koluEnv` attrset), filter them out in `flake.nix` or devour-flake/`nix flake check` will fail.
- **direnv cache staleness** — after changing `nix/nixpkgs.nix`, delete `.direnv/` to force direnv re-evaluation. Otherwise `use flake` serves stale env vars.
## Expected Results
| Tier | Typical Time | Improvement |
|------|-------------|-------------|
| Baseline (nix develop, multiple inputs) | 7-18s | — |
| Tier 1 (fewer inputs) | 5-7s | ~1.5x |
| Tier 2 (nix-shell) | 2-3s | ~3x |
| Tier 3 (env cache) | 0.01-0.02s | ~500x |
Cache miss (first run after nix file changes) still costs Tier 2 time (~2.5s).
| | Cold eval cache | Warm (daemon cached) |
|---|---|---|
| Before (4+ inputs) | 7-18s | 3-7s |
| After (0 inputs) | ~2.6s | ~0.3s |