The leaderboard scoring system turns GitHub activity into ranked lists. It normalizes events into signals, scores them against a configurable ruleset, and aggregates results by entity type — contributor, repository, or team.
Architecture
GitHub GraphQL API Signals DB Computed Results
───────────────────── ──────────────── ─────────────────
fetchOrgScoringDataGraphQL()
─── normalizeGitHubData()
──> upsertSignals() ──> signals (partitioned)
│
▼
computeScores() ────> Contributor Leaderboard
aggregateByRepository() ──> Repository Leaderboard
aggregateByTeam() ──> Team Leaderboard
│
▼
leaderboard_materializations
computed_scores (per preset × time_period)
Pipeline flow
The orchestration lives in lib/leaderboard/pipeline.ts:
- Resolve repos for the requested scopeType: org-wide, a single repo, or a team’s repos
- Fetch GitHub data via GraphQL with ETag caching (commits, PRs, issues, reviews, comments)
- Normalize raw data into Signal[] with content-hash dedup
- Upsert signals to the partitioned signals table (idempotent)
- Score via the entity-type-specific aggregation function
- Write to leaderboard_materializations + computed_scores for fast reads
- Cache the response in Redis + in-memory LRU for subsequent requests
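The normalize and upsert steps above can be sketched as follows. This is a minimal illustration, not the code in pipeline.ts: the RawEvent and Signal shapes, the fields that feed the hash, the FNV-1a hash itself, and the Map standing in for the signals table are all assumptions made for the example.

```typescript
// Hypothetical shapes; the real Signal type is described in the Reference page.
interface RawEvent {
  type: string; // e.g. "commit", "pr", "review"
  actor: string;
  repo: string;
  externalId: string; // e.g. a commit SHA or PR number
}

interface Signal extends RawEvent {
  contentHash: string;
}

// FNV-1a (32-bit) as a stand-in hash; the hash used by the real pipeline is not specified here.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16);
}

// Normalize raw events into signals, dropping events whose content hash was already seen.
function normalizeSketch(events: RawEvent[]): Signal[] {
  const seen = new Set<string>();
  const signals: Signal[] = [];
  for (const e of events) {
    const contentHash = fnv1a([e.type, e.actor, e.repo, e.externalId].join("|"));
    if (seen.has(contentHash)) continue;
    seen.add(contentHash);
    signals.push({ ...e, contentHash });
  }
  return signals;
}

// Idempotent upsert keyed by content hash; returns how many rows were newly inserted.
function upsertSketch(table: Map<string, Signal>, signals: Signal[]): number {
  let inserted = 0;
  for (const s of signals) {
    if (!table.has(s.contentHash)) inserted++;
    table.set(s.contentHash, s);
  }
  return inserted;
}
```

Because the key is derived from the signal's content, running the same sync twice leaves the table unchanged, which is what makes re-running the pipeline safe.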
Entity-type aggregation
Three entity types, each with its own aggregation strategy:
| Entity Type | Aggregation Function | Scoring Logic |
|---|---|---|
| contributor | computeScores() | Per-user: chronological, daily quotas, diminishing returns, multipliers |
| repository | aggregateByRepository() | Per-repo: base points + multipliers, no quotas or diminishing returns |
| team | aggregateByTeam() | Per-team: sum of member user scores |
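One way to read the table is as a lookup from entity type to aggregation behavior. The sketch below encodes it as an exhaustive switch; the EntityType union and the AggregationPlan shape are hypothetical, and only the three function names come from the table.

```typescript
type EntityType = "contributor" | "repository" | "team";

// Hypothetical descriptor, used here only to restate the table in code.
interface AggregationPlan {
  fn: "computeScores" | "aggregateByRepository" | "aggregateByTeam";
  groupBy: "user" | "repo" | "team";
  appliesQuotas: boolean;
}

function planFor(entityType: EntityType): AggregationPlan {
  switch (entityType) {
    case "contributor":
      return { fn: "computeScores", groupBy: "user", appliesQuotas: true };
    case "repository":
      return { fn: "aggregateByRepository", groupBy: "repo", appliesQuotas: false };
    case "team":
      // Team totals are sums of member scores that were computed with full per-user rules.
      return { fn: "aggregateByTeam", groupBy: "team", appliesQuotas: true };
    default: {
      // Compile-time exhaustiveness check: adding a new entity type fails to type-check here.
      const unreachable: never = entityType;
      throw new Error(`Unknown entity type: ${String(unreachable)}`);
    }
  }
}
```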
Contributor (default)
Each user’s signals are processed chronologically with full per-user scoring context. This is the most detailed mode and the default view.
Repository
Signals are grouped by signal.repo. Each repo gets a fresh scoring context with skipQuota: true, so daily quotas and diminishing returns don’t apply (those are per-user concepts). Zero-point conditions and penalties still apply.
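To make the difference between the contributor and repository modes concrete, here is a small sketch of a scoring context with and without skipQuota. The point values, the quota of two signals per type per day, and the halving decay are invented for illustration; the actual rules live in the Scoring Engine page, and multipliers, zero-point conditions, and penalties are left out.

```typescript
interface Signal {
  user: string;
  repo: string;
  type: string;
  occurredAt: string; // ISO 8601 timestamp, UTC
}

// Hypothetical rule values, chosen only for this example.
const BASE_POINTS: Record<string, number> = { commit: 10, review: 5 };
const DAILY_QUOTA = 2; // signals per type per day that earn full points
const DECAY = 0.5; // each signal past the quota earns half of the previous one

// Score one group of signals chronologically with a fresh context.
function scoreGroup(signals: Signal[], skipQuota: boolean): number {
  const sorted = signals.slice().sort((a, b) =>
    a.occurredAt < b.occurredAt ? -1 : a.occurredAt > b.occurredAt ? 1 : 0,
  );
  const seenPerDay = new Map<string, number>(); // "YYYY-MM-DD|type" -> count so far
  let total = 0;
  for (const s of sorted) {
    const base = BASE_POINTS[s.type] ?? 0;
    if (skipQuota) {
      total += base; // repository mode: base points only
      continue;
    }
    const key = `${s.occurredAt.slice(0, 10)}|${s.type}`;
    const seen = seenPerDay.get(key) ?? 0;
    seenPerDay.set(key, seen + 1);
    const overQuota = Math.max(0, seen + 1 - DAILY_QUOTA);
    total += base * Math.pow(DECAY, overQuota);
  }
  return total;
}

// Group signals by a key and score each group independently.
function scoreBy(
  signals: Signal[],
  keyOf: (s: Signal) => string,
  skipQuota: boolean,
): Map<string, number> {
  const groups = new Map<string, Signal[]>();
  for (const s of signals) {
    const key = keyOf(s);
    const bucket = groups.get(key);
    if (bucket) bucket.push(s);
    else groups.set(key, [s]);
  }
  const scores = new Map<string, number>();
  groups.forEach((group, key) => scores.set(key, scoreGroup(group, skipQuota)));
  return scores;
}

const sample: Signal[] = [
  { user: "alice", repo: "org/api", type: "commit", occurredAt: "2024-05-01T09:00:00Z" },
  { user: "alice", repo: "org/api", type: "commit", occurredAt: "2024-05-01T10:00:00Z" },
  { user: "alice", repo: "org/api", type: "commit", occurredAt: "2024-05-01T11:00:00Z" },
];
const perUser = scoreBy(sample, (s) => s.user, false); // alice: 10 + 10 + 5 = 25
const perRepo = scoreBy(sample, (s) => s.repo, true); // org/api: 10 + 10 + 10 = 30
```

Under these assumed rules, the third commit of the day is discounted for the contributor but counts in full for the repository, since the repository context never tracks a per-day count.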
Team
User scores are computed first via computeScores() (per-user with full rules). Then aggregateByTeam() maps users to teams via GitHub team membership and sums scores per team.
Key behaviors:
- A user in multiple teams contributes their full score to each team
- Users are deduplicated within a single team
- Team memberships are fetched from GitHub via fetchOrgTeamsDataGraphQL() (GraphQL, paginated)
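The behaviors listed above can be illustrated with a short sketch. It assumes user scores arrive as a map from login to score and memberships as a map from team slug to member logins; both shapes, and the function name, are assumptions rather than the signature of aggregateByTeam().

```typescript
type Scores = Map<string, number>;
// Team slug -> member logins; a paginated fetch might return the same login twice.
type TeamMembership = Map<string, string[]>;

function aggregateByTeamSketch(userScores: Scores, teams: TeamMembership): Scores {
  const totals: Scores = new Map();
  teams.forEach((members, team) => {
    const unique = new Set(members); // dedupe users within a single team
    let sum = 0;
    unique.forEach((login) => {
      sum += userScores.get(login) ?? 0; // members with no signals contribute 0
    });
    totals.set(team, sum);
  });
  return totals;
}

const userScores: Scores = new Map([
  ["alice", 30],
  ["bob", 12],
]);
const memberships: TeamMembership = new Map([
  ["platform", ["alice", "bob", "alice"]], // duplicate login counted once
  ["web", ["alice"]], // alice's full score also counts here
]);
const teamTotals = aggregateByTeamSketch(userScores, memberships);
// platform: 30 + 12 = 42, web: 30
```

Since a user's full score counts toward every team they belong to, team totals are not expected to add up to the org-wide total.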
Caching layers
Three caching layers work together to avoid recomputing leaderboards and refetching unchanged GitHub data:
- In-memory LRU cache (lib/leaderboard/request-cache.ts): max 64 entries. Evicts oldest entries to prevent unbounded growth. On read: if fresh → return immediately; if stale → return stale data + trigger background refresh; on cold start → full compute.
- Redis cache: leaderboard responses cached in Redis with configurable TTL. Falls back gracefully if Redis is unavailable.
- ETag-based conditional requests: per-endpoint ETags (commits, PRs, issues) stored in repository_sync_state. If-None-Match headers avoid payload transfer on unchanged data.
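The read path of the in-memory layer (fresh, stale, cold) can be sketched as below. This is not the contents of request-cache.ts: the class name, the TTL parameter, the injectable clock, and the readThrough() helper are all hypothetical. The sketch relies on Map preserving insertion order and re-inserts an entry on every read, so the first key is always the least recently used one.

```typescript
interface Entry<V> {
  value: V;
  storedAt: number;
}

type ReadResult<V> = { status: "fresh" | "stale"; value: V } | { status: "miss" };

class RequestCacheSketch<V> {
  private entries = new Map<string, Entry<V>>();
  private maxEntries: number;
  private ttlMs: number;
  private now: () => number;

  constructor(maxEntries: number, ttlMs: number, now: () => number = Date.now) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): ReadResult<V> {
    const entry = this.entries.get(key);
    if (!entry) return { status: "miss" };
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    const age = this.now() - entry.storedAt;
    return { status: age <= this.ttlMs ? "fresh" : "stale", value: entry.value };
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, { value, storedAt: this.now() });
    // Evict from the front of the Map until the size bound holds.
    while (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.entries.delete(oldest);
    }
  }
}

// Fresh: return immediately. Stale: return old data and refresh in the background.
// Miss (cold start): compute, store, and return.
async function readThrough<V>(
  cache: RequestCacheSketch<V>,
  key: string,
  compute: () => Promise<V>,
): Promise<V> {
  const hit = cache.get(key);
  if (hit.status === "fresh") return hit.value;
  if (hit.status === "stale") {
    // Background refresh; a failed refresh keeps serving the stale entry.
    void compute()
      .then((v) => cache.set(key, v))
      .catch(() => undefined);
    return hit.value;
  }
  const value = await compute();
  cache.set(key, value);
  return value;
}
```

A cache matching the documented bound would be constructed with a maximum of 64 entries; the TTL value is left to the caller because the document does not state one for the in-memory layer.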
Related
- Scoring Engine — signal types, algorithm, multipliers, quotas
- Configuration & Presets — default ruleset, custom presets, API
- Reference — DB schema, types, module map