Feature flag companies sell you tools to manage your flags. But do they manage their own?
I built FlagShark, an open source CLI that scans codebases for stale feature flags. It works across 13 programming languages and auto-detects which flag provider you're using from your imports. No config, no setup. Just npx flagshark scan.
I pointed it at three feature flag companies' own open source repos. The results were not what I expected.
The results
Unleash — 0/100
Unleash is one of the most popular open source feature flag platforms. Their repo has 5,681 TypeScript files.
FlagShark found 4 feature flags. All 4 are stale.
🦈 FlagShark v1.0.1
Scanned 5,681 files across 1 language
Detected providers: unleash-client
Found 4 feature flags, 4 stale
Flag                 File                                                        Signal
disableMetrics       src/lib/features/frontend-api/frontend-api-controller.ts    Single file
impactMetrics        src/lib/features/frontend-api/frontend-api-service.ts       Single file
advancedPlayground   src/lib/features/playground/playground-service.ts           Single file
migrationLock        src/lib/server-impl.ts                                      Single file
Flag Health Score: 0/100
These aren't example files or test fixtures. disableMetrics is in their actual API controller. migrationLock is in server-impl.ts, their server startup file. Each flag is referenced in exactly one file, which typically means the rollout completed and the flag was never cleaned up.
One flag, advancedPlayground, even has a code comment: "used for runtime control, do not remove." That might be intentional. But the other three look like classic stale flags.
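The "Single file" signal is easy to picture in code. The following is a minimal sketch in Python of that idea, not FlagShark's actual implementation: the function name `single_file_flags`, the `(flag, path)` input shape, and `exampleFlag` are assumptions made for illustration.

```python
from collections import defaultdict


def single_file_flags(references):
    """Return flags that are referenced in exactly one file.

    `references` is an iterable of (flag_name, file_path) pairs, a
    hypothetical input shape chosen for this sketch.
    """
    files_by_flag = defaultdict(set)
    for flag, path in references:
        files_by_flag[flag].add(path)
    # A flag seen in only one file is a candidate for staleness, not proof.
    return sorted(flag for flag, files in files_by_flag.items() if len(files) == 1)


refs = [
    ("migrationLock", "src/lib/server-impl.ts"),
    ("exampleFlag", "src/a.ts"),  # hypothetical flag used in two files
    ("exampleFlag", "src/b.ts"),
]
print(single_file_flags(refs))
```

A heuristic like this only produces candidates. A flag such as advancedPlayground, which is deliberately kept for runtime control, would still show up and needs a human to rule it out.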
PostHog — 21/100
PostHog is an open source product analytics platform that includes feature flags as one of its core features. Their repo is massive: 14,799 files across 6 languages.
FlagShark found 71 flags. 54 are stale in production code (I excluded 8 that were in docs and example snippets).
🦈 FlagShark v1.0.1
Scanned 14,799 files across 6 languages
Detected providers: posthog-js, posthog
Found 71 feature flags, 54 stale in production code
Flag Health Score: 21/100
Some of the stale flags sit in critical code paths:
- cache-warming in posthog/caching/warming.py (their caching layer)
- hogql-access-control in posthog/hogql/database/database.py (their database access layer)
- batch-export-earliest-backfill in posthog/batch_exports/http.py (their batch export HTTP handler)
- stale-cache-invalidation-enabled in posthog/caching/utils.py (the irony writes itself)
There's also a cluster of AI-related flags in ee/hogai/utils/feature_flags.py: phai-tasks, phai-memory-tool, phai-plan-mode, phai-sandbox-mode. All referenced in a single file. These look like they were used for a gradual rollout of PostHog's AI features and never cleaned up.
GrowthBook — 50/100
GrowthBook is an open source feature flagging and experimentation platform. Smaller codebase: 2,101 files.
FlagShark found 4 flags. 2 are stale.
🦈 FlagShark v1.0.1
Scanned 2,101 files across 2 languages
Detected providers: @growthbook/growthbook
Found 4 feature flags, 2 stale
Flag Health Score: 50/100
The healthiest of the three. Two flags with single-file references, two that appear in multiple files and look actively used. A 50/100 is decent, but still means half the flags might be dead code.
Why this matters
Stale feature flags aren't just ugly code. They're a real risk.
The most famous example is Knight Capital, which lost more than $460 million in about 45 minutes on August 1, 2012, after a deployment reactivated dead code behind an old feature flag. The new release repurposed a flag that had once enabled Power Peg, a trading function that had not been used since 2003 but was never removed. One of the eight production servers did not receive the new code, so when the flag was switched on, that server started executing the obsolete algorithm.
Beyond catastrophic failures, stale flags create everyday drag:
- Dev velocity slows down. Every conditional branch is a branch an engineer has to reason about. A flag that will never be toggled is a branch that will never execute, but the next developer reading the code doesn't know that.
- Test surface grows. Each independent flag can double the number of code paths, so n flags allow up to 2^n combinations. Stale flags create paths that are tested but will never run in production.
- Bugs hide in dead paths. When someone refactors code around a stale flag, the dead branch might not get updated. Months later, if that flag somehow gets toggled, you're running code that's incompatible with the current state of the system.
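To make the growth in test surface concrete, here is a small Python sketch that enumerates every on/off combination for a set of flags. It assumes the flags are independent and all evaluated on the same code path, which is the worst case; the helper name `flag_combinations` is made up for this example.

```python
from itertools import product


def flag_combinations(flags):
    """Yield every on/off assignment for the given flag names."""
    for values in product([False, True], repeat=len(flags)):
        yield dict(zip(flags, values))


flags = ["disableMetrics", "impactMetrics", "advancedPlayground"]
combos = list(flag_combinations(flags))
print(len(combos))  # 2 ** 3 = 8
```

Removing a single stale flag from that list halves the number of combinations a test suite would have to cover.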
Why it happens
Teams know stale flags are a problem. Most schedule cleanup sprints. An engineer or tech lead periodically audits flags, files tickets, and the team burns days removing dead code.
But the cleanup never catches up. New flags get added faster than old ones get removed. The quarterly cleanup sprint becomes a ritual that makes everyone feel responsible without actually solving the problem.
The flag management platforms (LaunchDarkly, Unleash, PostHog, etc.) are starting to add cleanup features. But they only detect staleness for their own flags. If you use LaunchDarkly for some flags and a custom implementation for others, LaunchDarkly's cleanup tools won't help you with the custom ones.
I couldn't find a cross-provider tool that just looks at the code. That's why I built FlagShark.
Try it yourself
Run it on your repo. It takes about 2 seconds:
npx flagshark scan
No config needed. FlagShark auto-detects which flag provider you use by checking your imports. A function called isEnabled() in a file that doesn't import a flag SDK won't trigger a false positive.
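The import check is what keeps a generic name like isEnabled() from being miscounted. Below is a rough Python sketch of that idea, under stated assumptions: the `KNOWN_SDKS` list, both regular expressions, and the `find_flags` helper are hypothetical and much simpler than anything a real scanner would need. It is not FlagShark's detection logic.

```python
import re

# Hypothetical list of SDK module names; a real tool would cover many more.
KNOWN_SDKS = ("unleash-client", "posthog-js", "@growthbook/growthbook")

# Matches `from "module"` and `require("module")` in JS/TS source.
IMPORT_RE = re.compile(r"""(?:from|require\()\s*['"]([^'"]+)['"]""")
# Matches isEnabled("flag-name") calls with a string literal argument.
CALL_RE = re.compile(r"""isEnabled\(\s*['"]([^'"]+)['"]""")


def find_flags(source):
    """Return flag names from isEnabled() calls, but only if the file imports a known SDK."""
    imported = set(IMPORT_RE.findall(source))
    if not imported.intersection(KNOWN_SDKS):
        return []  # no flag SDK imported, so isEnabled() is someone else's function
    return CALL_RE.findall(source)


with_sdk = 'import { isEnabled } from "unleash-client";\nif (isEnabled("migrationLock")) { run(); }\n'
without_sdk = 'function isEnabled(name) { return false; }\nisEnabled("darkMode");\n'

print(find_flags(with_sdk))
print(find_flags(without_sdk))
```

Gating on imports trades some recall for precision: a file that reaches the SDK through a local wrapper module would be missed by this sketch.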
Want JSON output for piping to other tools?
npx flagshark scan --json
Only care about files changed in your current branch?
npx flagshark scan --diff main
Add it to your CI
FlagShark is also available as a GitHub Action. It posts a flag health report on every PR:
- uses: FlagShark/[email protected]
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
The tool is open source and free. If you want automated cleanup PRs and trend tracking over time, check out flagshark.com.
One more thing
If the companies building feature flag tools have stale flags in their own codebases, every team does. The question isn't whether you have stale flags. It's how many, and which ones are sitting in code paths that matter.
npx flagshark scan will tell you in 2 seconds.