Case study: F1 dashboard
Context
I wanted a single page that answers the questions I keep asking during a race weekend:
- What sessions are coming up?
- Who's leading the standings?
- What's the minimum I need to know right now?
Most F1 sites are feature-rich, but they're also noisy and heavy. Official F1 apps are slow, cluttered, and fight you with ads and auto-playing videos. Third-party sites have paywalls or require accounts. The goal here was the opposite: fast, compact, and readable.
I also wanted to see how much you can do by merging multiple open data sources: Ergast/Jolpica for historical races, OpenF1 for session details and telemetry, Wikipedia for circuit descriptions, and f1db for biographical data. Turns out: a lot.
Problem
Build a dashboard that:
- Loads quickly and stays responsive
- Is readable at a glance
- Combines multiple datasets into a cohesive view
- Doesn't require ten tabs (or a scrolling marathon)
- Works for both "what's happening now?" (race weekends) and "what happened in 1998?" (historical browsing)
Constraints
- Data comes from multiple sources with different formats and reliability
- Some datasets are "nice to have" (visualizations), others are "must have" (sessions/standings)
- The UI must degrade gracefully if a specific endpoint is down
- Historical data is stable (1950-2023 won't change), but current season updates frequently
- Canât rely on official F1 APIs (licensing, rate limits, breaking changes)
Approach
1) Start with the user flow
I designed the UI around a "race weekend glance" workflow:
- Home page: Next race countdown, recent results, current standings
- Season browser: Pick a year, see the calendar and championship outcome
- Race detail: Full results, qualifying, lap charts, pit stops
- Driver/constructor profiles: Career stats, teammate comparisons
- Deep dives: Historical analysis, cross-era rankings, chaos metrics
Navigation needed to feel instant: no loading spinners between pages if the data is already cached.
2) Normalize upstream data
The key step was building a normalized internal representation so the UI can stay simple even if upstream APIs are messy.
Data pipeline:
- Client requests /api/races/2024
- Server checks the SQLite cache_metatable
- If cached and season < currentYear: return from the database immediately
- If not cached, or it's the current season: fetch from the upstream API, normalize, persist, return
- OpenF1 enrichment runs automatically for 2023+ seasons (session times, team colors, driver headshots, lap data)
Why SQLite?
- Embedded, no separate database server needed
- Transactional, handles concurrent reads well
- Persistent cache survives container restarts
- Query performance is fine for this workload (mostly key lookups, small datasets per request)
Caching strategy:
- Historical seasons (< current year): fetch once, cache forever
- Current season: always refetch from upstream (standings change after each race)
- Enrichment data (OpenF1): cached with 1-hour TTL for live sessions, permanent for finished races
This means the first load of historical data is slow (it fetches all 70+ seasons from Jolpica), but subsequent loads are instant.
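The caching rules above can be sketched as a small policy function. This is an illustrative sketch, not the real implementation: the names (`policyFor`, `enrichmentTtlMs`) and shapes are made up to show the decision, the actual code sits in front of SQLite.

```typescript
// Cache policy for a requested season. Historical seasons never change, so
// they are cached forever; the current season is always refetched upstream.
type CachePolicy =
  | { kind: "cache-forever" }   // historical: fetch once, never invalidate
  | { kind: "always-refetch" }; // current season: standings change race to race

function policyFor(season: number, currentYear: number): CachePolicy {
  return season < currentYear
    ? { kind: "cache-forever" }
    : { kind: "always-refetch" };
}

// Enrichment data (OpenF1) gets a TTL instead: one hour while a session is
// live, effectively permanent once the race has finished.
function enrichmentTtlMs(raceFinished: boolean): number {
  return raceFinished ? Number.POSITIVE_INFINITY : 60 * 60 * 1000;
}
```

Keeping the policy in one pure function means the SQLite read/write code never has to reason about seasons or TTLs itself.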
3) Merge multiple data sources
Each API provides different pieces:
| Source | What it provides |
|---|---|
| Jolpica (Ergast-compatible) | Race calendar, results, standings, quali, sprint (all eras) |
| OpenF1 | Session times, lap data, stints, pit stops, team colors, driver headshots (2023+) |
| Wikipedia REST API | Circuit descriptions and metadata |
| f1db (bundled JSON) | Circuit info, driver biographies, family relationships, constructor chronology, Driver of the Day |
The server normalizes and joins these:
- Race results from Jolpica + session times from OpenF1 = complete race weekend view
- Driver profiles from Jolpica + bios from f1db + family tree from manually curated data
- Circuit data from Wikipedia + lap records from Jolpica + track maps from bundled assets
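One such join can be sketched like this. The types and field names below are simplified assumptions for illustration, not the real upstream payloads: races and sessions are keyed on (season, round), and races without enrichment still come through.

```typescript
// Simplified shapes; the real Jolpica and OpenF1 payloads are much richer.
interface JolpicaRace { season: number; round: number; raceName: string }
interface OpenF1Session { season: number; round: number; sessionStartUtc: string }

// Left-join OpenF1 session times onto Jolpica race results by (season, round).
// A missing session just yields null, so pre-2023 races degrade gracefully.
function mergeWeekends(races: JolpicaRace[], sessions: OpenF1Session[]) {
  const byKey = new Map(sessions.map(s => [`${s.season}-${s.round}`, s]));
  return races.map(r => ({
    ...r,
    sessionStartUtc: byKey.get(`${r.season}-${r.round}`)?.sessionStartUtc ?? null,
  }));
}
```

The left-join shape is the important part: the "must have" source (Jolpica) drives the output, and the "nice to have" source (OpenF1) can only add fields, never drop rows.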
Tradeoff: This creates a single point of failure (my server), but in practice it's more reliable than the upstream APIs individually. If Jolpica goes down, the entire historical dataset is still in the cache. If OpenF1 is slow, it just means current-season enrichment is delayed; everything else still works.
4) Performance as a feature
I treated performance as part of the product:
Client-side:
- TanStack Query for aggressive caching (stale data is better than loading spinners)
- Code-split routes (30+ pages, each lazy-loaded)
- Optimistic UI updates (ratings/favorites update instantly, sync in background)
- Minimal re-renders (React.memo on expensive components like lap charts)
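The "stale data over spinners" preference mostly comes down to TanStack Query defaults. A configuration sketch, assuming v5 of @tanstack/react-query; the specific times are illustrative, not the values the dashboard actually ships with:

```typescript
import { QueryClient } from "@tanstack/react-query";

// Prefer showing cached data over spinners: treat data as fresh for a while,
// keep unused entries in memory much longer, and don't refetch just because
// the window regained focus.
const queryClient = new QueryClient({
  defaultOptions: {
    queries: {
      staleTime: 5 * 60 * 1000,    // serve from cache without refetching for 5 min
      gcTime: 24 * 60 * 60 * 1000, // keep unused cache entries for a day
      refetchOnWindowFocus: false,
      retry: 1,
    },
  },
});
```

Individual queries can still override these (e.g. a longer `staleTime` for historical seasons), but sane defaults cover most pages.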
Server-side:
- SQLite WAL mode for concurrent reads without blocking
- Batch inserts for historical data seeding (<1 minute for all 70+ seasons)
- Gzip compression on API responses
- Static asset caching headers (1 year TTL for builds)
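The batch-insert seeding boils down to multi-row INSERT statements inside one transaction. Here is a dependency-free sketch of the SQL-building half; the table name and three-column shape are made up for illustration, and the transaction wrapping is left to the SQLite driver:

```typescript
// Split rows into fixed-size chunks so 70+ seasons of results land in a
// handful of prepared statements instead of thousands of single-row inserts.
function chunk<T>(rows: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < rows.length; i += size) out.push(rows.slice(i, i + size));
  return out;
}

// Build the placeholder SQL for one multi-row INSERT over a chunk.
function multiRowInsertSql(table: string, columns: string[], rowCount: number): string {
  const oneRow = `(${columns.map(() => "?").join(", ")})`;
  const values = Array.from({ length: rowCount }, () => oneRow).join(", ");
  return `INSERT INTO ${table} (${columns.join(", ")}) VALUES ${values}`;
}
```

Each chunk then runs as one prepared statement, and wrapping all chunks in a single transaction is what keeps the full historical seed under a minute.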
Network:
- Docker serves both API and client from one process (no CORS, no extra roundtrips)
- Thumbnails lazy-loaded, low-res placeholders first
- No external fonts or analytics (privacy + speed)
5) Handle edge cases and data quirks
F1 data is messier than you'd think:
- Driver IDs inconsistent across sources: "max_verstappen" in Jolpica, "VER" in OpenF1, "Max Emilian Verstappen" in f1db → needed manual mapping
- Constructor name changes: Toleman → Benetton → Renault → Alpine, but treated as separate entities in some APIs → built a lineage graph
- Sprint races introduced mid-season 2021: older APIs don't have dedicated fields → detect by round format and session type
- Missing data for old seasons: lap times didn't exist before the 1980s, qualifying formats changed every few years → UI degrades gracefully (show what's available, don't crash)
- Team color inconsistencies: OpenF1 provides hex codes, but only from 2023 onward → fall back to a hardcoded palette for older seasons
The normalization layer handles all of this, so the UI code can assume clean, consistent data.
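The driver-ID seam, for example, reduces to a manually curated alias table. A minimal sketch, where the two aliases shown come from the example above and everything else is illustrative:

```typescript
// Map each source's ID to one canonical ID. Unmapped IDs pass through
// unchanged, so a new driver doesn't crash the pipeline; their records
// just go un-merged until the table is updated.
const idAliases: Record<string, string> = {
  "VER": "max_verstappen",                    // OpenF1 three-letter code
  "Max Emilian Verstappen": "max_verstappen", // f1db full name
};

function canonicalDriverId(sourceId: string): string {
  return idAliases[sourceId] ?? sourceId;
}
```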
6) Build features incrementally
I started with the MVP: calendar, standings, race results. Then I added depth progressively:
- Iteration 1: Race calendar + standings
- Iteration 2: Full race results + qualifying
- Iteration 3: Driver profiles + career stats
- Iteration 4: Lap charts + pit stop timelines
- Iteration 5: Constructor profiles + lineage visualization
- Iteration 6: Cross-era rankings algorithm
- Iteration 7: Chaos Index (DNF rates, safety cars, lead changes)
- Iteration 8: Family tree graph
- Iteration 9: Daily challenge mode (pick a random historical race each day)
Each iteration added value without breaking existing features. The normalized data model made this easy: adding a new page often just meant writing a new query and rendering logic, not refactoring the entire backend.
Tradeoffs
Chose simplicity over comprehensiveness:
- Some "deep detail" views are intentionally missing: the dashboard is optimized for glanceability, not encyclopedic coverage
- No live timing during races (would require WebSocket infrastructure, adds complexity)
- No video highlights or audio commentary (licensing nightmare)
Chose self-hosting over cloud services:
- SQLite instead of PostgreSQL/MySQL (simpler deployment, fewer moving parts)
- Bundled data files instead of CDN (privacy, no external dependencies)
- Docker container instead of serverless (easier to reason about, no cold starts)
Chose read-heavy optimization over write-heavy:
- Historical data never updates, so cache forever and never invalidate
- Current season refetched on every load (inefficient, but simpler than WebSocket subscriptions or polling logic)
What I learned
Data normalization is 80% of the work. The UI is straightforward once the data is clean. Most of the effort went into mapping inconsistent APIs into a single coherent model.
Caching makes everything better. Historical F1 data is perfect for aggressive caching: it's stable, large, and frequently accessed. SQLite turned out to be ideal for this.
Users forgive missing features, not slow features. I spent more time on performance tuning than adding new pages. The result is a dashboard that feels instant even when browsing 70+ years of data.
Open data is powerful. By combining 4 different sources, I built something more comprehensive than any single API provides. The trick is handling the seams between them.
Results
Current state:
- 30+ pages: Home, standings, schedule, race results, driver profiles, constructor profiles, circuits, chaos rankings, cross-era analysis, family tree, lineage graph, and more
- 70+ seasons cached: Full race results back to 1950
- Sub-100ms responses for cached historical data
- Zero external dependencies at runtime (all data served from local SQLite)
- Docker deployment: Single container, <2 minutes to deploy from scratch
User outcomes (anecdotal, since I'm the primary user):
- Went from opening 5+ tabs during race weekends to just one
- Historical browsing went from "let me Google that" to "already have it open"
- Race-weekend prep time: ~10 minutes → ~30 seconds
What Iâd do next
- Better error boundaries per module: right now, if one API fails, the whole page can break. It would be better to show "sessions failed to load" but keep standings/results working
- Data freshness indicators: show "last updated 5 minutes ago" per data source, so users know if they're looking at stale info
- Integration test suite: detect upstream API shape changes early (Jolpica has broken twice; it took 2 weeks to notice)
- Export functionality: download race results as CSV, share driver comparisons as images
- Mobile app: PWA with offline support for cached data