Case study: F1 dashboard
Context
I wanted a single page that answers the questions I keep asking during a race weekend:
- What sessions are coming up?
- Who's leading the standings?
- What's the minimum I need to know right now?
Most F1 sites are feature-rich, but they're also noisy and heavy. Official F1 apps are slow, cluttered, and fight you with ads and auto-playing videos. Third-party sites have paywalls or require accounts. The goal here was the opposite: fast, compact, and readable.
I also wanted to see how much you can do by merging multiple open data sources: Ergast/Jolpica for historical races, OpenF1 for session details and telemetry, Wikipedia for circuit descriptions, and f1db for biographical data. Turns out: a lot.
Problem
Build a dashboard that:
- Loads quickly and stays responsive
- Is readable at a glance
- Combines multiple datasets into a cohesive view
- Doesn't require ten tabs (or a scrolling marathon)
- Works for both "what's happening now?" (race weekends) and "what happened in 1998?" (historical browsing)
Constraints
- Data comes from multiple sources with different formats and reliability
- Some datasets are "nice to have" (visualizations), others are "must have" (sessions/standings)
- The UI must degrade gracefully if a specific endpoint is down
- Historical data is stable (1950-2023 won't change), but current season updates frequently
- Canât rely on official F1 APIs (licensing, rate limits, breaking changes)
Approach
1) Start with the user flow
I designed the UI around a "race weekend glance" workflow:
- Home page: Next race countdown, recent results, current standings
- Season browser: Pick a year, see the calendar and championship outcome
- Race detail: Full results, qualifying, lap charts, pit stops
- Driver/constructor profiles: Career stats, teammate comparisons
- Deep dives: Historical analysis, cross-era rankings, chaos metrics
Navigation needed to feel instant: no loading spinners between pages if the data is already cached.
2) Normalize upstream data
The key step was building a normalized internal representation so the UI can stay simple even if upstream APIs are messy.
Data pipeline:
- Client requests /api/races/2024
- Server checks the SQLite cache_metatable
- If cached and season < currentYear: return from the database immediately
- If not cached, or it's the current season: fetch from the upstream API, normalize, persist, return
- OpenF1 enrichment runs automatically for 2023+ seasons (session times, team colors, driver headshots, lap data)
Why SQLite?
- Embedded, no separate database server needed
- Transactional, handles concurrent reads well
- Persistent cache survives container restarts
- Query performance is fine for this workload (mostly key lookups, small datasets per request)
Caching strategy:
- Historical seasons (< current year): fetch once, cache forever
- Current season: always refetch from upstream (standings change after each race)
- Enrichment data (OpenF1): cached with 1-hour TTL for live sessions, permanent for finished races
This means the first load of historical data is slow (it fetches all 70+ seasons from Jolpica), but subsequent loads are instant.
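The caching rules above can be sketched as a small policy function. This is an illustrative sketch, not the real implementation: the names (`policyFor`, `enrichmentTtlMs`) and shapes are made up to show the decision, the actual code sits in front of SQLite.

```typescript
// Cache policy for a requested season. Historical seasons never change, so
// they are cached forever; the current season is always refetched upstream.
type CachePolicy =
  | { kind: "cache-forever" }   // historical: fetch once, never invalidate
  | { kind: "always-refetch" }; // current season: standings change race to race

function policyFor(season: number, currentYear: number): CachePolicy {
  return season < currentYear
    ? { kind: "cache-forever" }
    : { kind: "always-refetch" };
}

// Enrichment data (OpenF1) gets a TTL instead: one hour while a session is
// live, effectively permanent once the race has finished.
function enrichmentTtlMs(raceFinished: boolean): number {
  return raceFinished ? Number.POSITIVE_INFINITY : 60 * 60 * 1000;
}
```

Keeping the policy in one pure function means the SQLite read/write code never has to reason about seasons or TTLs itself.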
3) Merge multiple data sources
Each API provides different pieces:
| Source | What it provides |
|---|---|
| Jolpica (Ergast-compatible) | Race calendar, results, standings, quali, sprint (all eras) |
| OpenF1 | Session times, lap data, stints, pit stops, team colors, driver headshots (2023+) |
| Wikipedia REST API | Circuit descriptions and metadata |
| f1db (bundled JSON) | Circuit info, driver biographies, family relationships, constructor chronology, Driver of the Day |
The server normalizes and joins these:
- Race results from Jolpica + session times from OpenF1 = complete race weekend view
- Driver profiles from Jolpica + bios from f1db + family tree from manually curated data
- Circuit data from Wikipedia + lap records from Jolpica + track maps from bundled assets
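One such join can be sketched like this. The types and field names below are simplified assumptions for illustration, not the real upstream payloads: races and sessions are keyed on (season, round), and races without enrichment still come through.

```typescript
// Simplified shapes; the real Jolpica and OpenF1 payloads are much richer.
interface JolpicaRace { season: number; round: number; raceName: string }
interface OpenF1Session { season: number; round: number; sessionStartUtc: string }

// Left-join OpenF1 session times onto Jolpica race results by (season, round).
// A missing session just yields null, so pre-2023 races degrade gracefully.
function mergeWeekends(races: JolpicaRace[], sessions: OpenF1Session[]) {
  const byKey = new Map(sessions.map(s => [`${s.season}-${s.round}`, s]));
  return races.map(r => ({
    ...r,
    sessionStartUtc: byKey.get(`${r.season}-${r.round}`)?.sessionStartUtc ?? null,
  }));
}
```

The left-join shape is the important part: the "must have" source (Jolpica) drives the output, and the "nice to have" source (OpenF1) can only add fields, never drop rows.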
Tradeoff: This creates a single point of failure (my server), but in practice it's more reliable than the upstream APIs individually. If Jolpica goes down, the entire historical dataset is still in the cache. If OpenF1 is slow, it just means current-season enrichment is delayed; everything else still works.
4) Performance as a feature
I treated performance as part of the product:
Client-side:
- TanStack Query for aggressive caching (stale data is better than loading spinners)
- Code-split routes (30+ pages, each lazy-loaded)
- Optimistic UI updates (ratings/favorites update instantly, sync in background)
- Minimal re-renders (React.memo on expensive components like lap charts)
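The "stale data over spinners" preference mostly comes down to TanStack Query defaults. A configuration sketch, assuming v5 of @tanstack/react-query; the specific times are illustrative, not the values the dashboard actually ships with:

```typescript
import { QueryClient } from "@tanstack/react-query";

// Prefer showing cached data over spinners: treat data as fresh for a while,
// keep unused entries in memory much longer, and don't refetch just because
// the window regained focus.
const queryClient = new QueryClient({
  defaultOptions: {
    queries: {
      staleTime: 5 * 60 * 1000,    // serve from cache without refetching for 5 min
      gcTime: 24 * 60 * 60 * 1000, // keep unused cache entries for a day
      refetchOnWindowFocus: false,
      retry: 1,
    },
  },
});
```

Individual queries can still override these (e.g. a longer `staleTime` for historical seasons), but sane defaults cover most pages.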
Server-side:
- SQLite WAL mode for concurrent reads without blocking
- Batch inserts for historical data seeding (<1 minute for all 70+ seasons)
- Gzip compression on API responses
- Static asset caching headers (1 year TTL for builds)
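The batch-insert seeding boils down to multi-row INSERT statements inside one transaction. Here is a dependency-free sketch of the SQL-building half; the table name and three-column shape are made up for illustration, and the transaction wrapping is left to the SQLite driver:

```typescript
// Split rows into fixed-size chunks so 70+ seasons of results land in a
// handful of prepared statements instead of thousands of single-row inserts.
function chunk<T>(rows: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < rows.length; i += size) out.push(rows.slice(i, i + size));
  return out;
}

// Build the placeholder SQL for one multi-row INSERT over a chunk.
function multiRowInsertSql(table: string, columns: string[], rowCount: number): string {
  const oneRow = `(${columns.map(() => "?").join(", ")})`;
  const values = Array.from({ length: rowCount }, () => oneRow).join(", ");
  return `INSERT INTO ${table} (${columns.join(", ")}) VALUES ${values}`;
}
```

Each chunk then runs as one prepared statement, and wrapping all chunks in a single transaction is what keeps the full historical seed under a minute.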
Network:
- Docker serves both API and client from one process (no CORS, no extra roundtrips)
- Thumbnails lazy-loaded, low-res placeholders first
- No external fonts or analytics (privacy + speed)
5) Handle edge cases and data quirks
F1 data is messier than you'd think:
- Driver IDs inconsistent across sources: "max_verstappen" in Jolpica, "VER" in OpenF1, "Max Emilian Verstappen" in f1db → needed manual mapping
- Constructor name changes: Toleman → Benetton → Renault → Alpine, but treated as separate entities in some APIs → built a lineage graph
- Sprint races introduced mid-season 2021: older APIs don't have dedicated fields → detect by round format and session type
- Missing data for old seasons: lap times didn't exist before the 1980s, qualifying formats changed every few years → UI degrades gracefully (show what's available, don't crash)
- Team color inconsistencies: OpenF1 provides hex codes, but only from 2023 onward → fall back to a hardcoded palette for older seasons
The normalization layer handles all of this, so the UI code can assume clean, consistent data.
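The driver-ID seam, for example, reduces to a manually curated alias table. A minimal sketch, where the two aliases shown come from the example above and everything else is illustrative:

```typescript
// Map each source's ID to one canonical ID. Unmapped IDs pass through
// unchanged, so a new driver doesn't crash the pipeline; their records
// just go un-merged until the table is updated.
const idAliases: Record<string, string> = {
  "VER": "max_verstappen",                    // OpenF1 three-letter code
  "Max Emilian Verstappen": "max_verstappen", // f1db full name
};

function canonicalDriverId(sourceId: string): string {
  return idAliases[sourceId] ?? sourceId;
}
```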
6) Build features incrementally
I started with the MVP: calendar, standings, race results. Then I added depth progressively:
- Iteration 1: Race calendar + standings
- Iteration 2: Full race results + qualifying
- Iteration 3: Driver profiles + career stats
- Iteration 4: Lap charts + pit stop timelines
- Iteration 5: Constructor profiles + lineage visualization
- Iteration 6: Cross-era rankings algorithm
- Iteration 7: Chaos Index (DNF rates, safety cars, lead changes)
- Iteration 8: Family tree graph
- Iteration 9: Daily challenge mode (pick a random historical race each day)
Each iteration added value without breaking existing features. The normalized data model made this easy: adding a new page often just meant writing a new query and rendering logic, not refactoring the entire backend.
Tradeoffs
Chose simplicity over comprehensiveness:
- Some "deep detail" views are intentionally missing: the dashboard is optimized for glanceability, not encyclopedic coverage
- No live timing during races (would require WebSocket infrastructure, adds complexity)
- No video highlights or audio commentary (licensing nightmare)
Chose self-hosting over cloud services:
- SQLite instead of PostgreSQL/MySQL (simpler deployment, fewer moving parts)
- Bundled data files instead of CDN (privacy, no external dependencies)
- Docker container instead of serverless (easier to reason about, no cold starts)
Chose read-heavy optimization over write-heavy:
- Historical data never updates, so cache forever and never invalidate
- Current season refetched on every load (inefficient, but simpler than WebSocket subscriptions or polling logic)
What I learned
Data normalization is 80% of the work. The UI is straightforward once the data is clean. Most of the effort went into mapping inconsistent APIs into a single coherent model.
Caching makes everything better. Historical F1 data is perfect for aggressive caching: it's stable, large, and frequently accessed. SQLite turned out to be ideal for this.
Users forgive missing features, not slow features. I spent more time on performance tuning than adding new pages. The result is a dashboard that feels instant even when browsing 70+ years of data.
Open data is powerful. By combining 4 different sources, I built something more comprehensive than any single API provides. The trick is handling the seams between them.
Results
Current state:
- 30+ pages: Home, standings, schedule, race results, driver profiles, constructor profiles, circuits, chaos rankings, cross-era analysis, family tree, lineage graph, and more
- 70+ seasons cached: Full race results back to 1950
- Sub-100ms responses for cached historical data
- Zero external dependencies at runtime (all data served from local SQLite)
- Docker deployment: Single container, <2 minutes to deploy from scratch
User outcomes (anecdotal, since I'm the primary user):
- Went from opening 5+ tabs during race weekends to just one
- Historical browsing went from "let me Google that" to "already have it open"
- Race-weekend prep time: ~10 minutes → ~30 seconds
What Iâd do next
- Better error boundaries per module: right now, if one API fails, the whole page can break. It would be better to show "sessions failed to load" but keep standings/results working
- Data freshness indicators: show "last updated 5 minutes ago" per data source, so users know if they're looking at stale info
- Integration test suite: detect upstream API shape changes early (Jolpica has broken twice; it took 2 weeks to notice)
- Export functionality: download race results as CSV, share driver comparisons as images
- Mobile app: PWA with offline support for cached data