1BRC Job Walkthrough
Build a production job to solve the One Billion Row Challenge
Overview
This walkthrough teaches you how to build a production-ready job application using Humus to solve the One Billion Row Challenge. You’ll learn how to:
- Build a job that processes large datasets
- Parse and aggregate 1 billion temperature measurements
- Start with local files for fast iteration
- Refactor to cloud storage (MinIO/S3)
- Test your job with minimal infrastructure before adding observability
- Retrofit OpenTelemetry instrumentation (traces, metrics, logs) to working code
- Monitor your job in Grafana with distributed tracing
- Use Humus patterns for configuration, error handling, and graceful shutdown
This walkthrough follows a practical development workflow: get your core algorithm working with local files first, then progressively add cloud storage and observability.
What You’ll Build
A complete job application that:
- Parses temperature measurement data in the format city;temperature
- Calculates min/mean/max statistics per city
- Writes formatted results to output
- Starts with local file I/O for fast development
- Refactors to MinIO (S3-compatible storage) for production
- Exports telemetry to Tempo (traces), Mimir (metrics), and Loki (logs)
Input: 1 billion lines like Tokyo;35.6\nJakarta;-6.2\n...
Output: One line per city: Jakarta=-10.0/26.5/45.3\nTokyo=-5.2/35.6/50.1\n...
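
The core of the challenge fits in a few dozen lines. Here is a minimal standard-library sketch of the parse-and-aggregate loop; the names stats and parseLine are illustrative rather than the walkthrough's actual code, and the real job streams from a file and sorts cities alphabetically instead of iterating a slice:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// stats accumulates per-city aggregates; mean is derived at output time.
type stats struct {
	min, max, sum float64
	count         int64
}

// parseLine splits a "city;temperature" record into its parts.
func parseLine(line string) (string, float64, error) {
	city, temp, ok := strings.Cut(line, ";")
	if !ok {
		return "", 0, fmt.Errorf("malformed line: %q", line)
	}
	v, err := strconv.ParseFloat(temp, 64)
	if err != nil {
		return "", 0, err
	}
	return city, v, nil
}

func main() {
	byCity := make(map[string]*stats)
	for _, line := range []string{"Tokyo;35.6", "Jakarta;-6.2", "Tokyo;-5.2"} {
		city, v, err := parseLine(line)
		if err != nil {
			continue // the real job surfaces this error instead of skipping
		}
		s, ok := byCity[city]
		if !ok {
			s = &stats{min: v, max: v}
			byCity[city] = s
		}
		s.min = min(s.min, v)
		s.max = max(s.max, v)
		s.sum += v
		s.count++
	}
	// Emit min/mean/max per city in the output format shown above.
	for city, s := range byCity {
		fmt.Printf("%s=%.1f/%.1f/%.1f\n", city, s.min, s.sum/float64(s.count), s.max)
	}
}
```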
Prerequisites
- Go 1.24+ installed
- Podman or Docker for running infrastructure
- Basic understanding of Go (contexts, interfaces, error handling)
- Familiarity with command-line tools
What You’ll Learn
Humus Framework Patterns
- Builder + Runner pattern: How Humus composes apps with middleware
- Config embedding: Using job.Config with custom configuration (sketched after this list)
- Automatic OTel: Zero-manual-setup observability
- Graceful shutdown: OS signal handling and resource cleanup
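
Of these, config embedding is plain Go struct embedding, so it can be previewed without any framework code. In the sketch below, BaseConfig merely stands in for a framework type such as job.Config, and every field name is hypothetical:

```go
package main

import "fmt"

// BaseConfig stands in for a framework-provided config such as job.Config;
// its field here is hypothetical.
type BaseConfig struct {
	OTelServiceName string
}

// Config embeds the framework config and adds job-specific settings, so a
// single struct can be populated from a single config source.
type Config struct {
	BaseConfig        // embedded: cfg.OTelServiceName is promoted
	InputPath  string
	OutputPath string
}

func main() {
	cfg := Config{
		BaseConfig: BaseConfig{OTelServiceName: "1brc"},
		InputPath:  "measurements.txt",
		OutputPath: "results.txt",
	}
	fmt.Println(cfg.OTelServiceName, cfg.InputPath)
}
```

The payoff is that your job reads one configuration while the framework still finds the fields it owns through the embedded type.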
OpenTelemetry Integration
- Creating manual spans for fine-grained tracing
- Recording custom metrics
- Structured logging with trace correlation
- Viewing distributed traces in Grafana
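
The first three of those pieces compose in a single function. Here is a compact sketch against the OpenTelemetry Go API; the tracer, meter, and metric names are illustrative, and it relies on the global providers that (per the Automatic OTel point above) Humus configures for you:

```go
package main

import (
	"context"
	"log/slog"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
	"go.opentelemetry.io/otel/trace"
)

// processChunk shows the instrumentation shape: a manual span around a unit
// of work, a custom counter metric, and a log line carrying the trace ID.
func processChunk(ctx context.Context, lines []string) {
	ctx, span := otel.Tracer("1brc").Start(ctx, "processChunk")
	defer span.End()

	counter, _ := otel.Meter("1brc").Int64Counter("lines.processed")
	counter.Add(ctx, int64(len(lines)),
		metric.WithAttributes(attribute.String("stage", "parse")))

	// Including the trace ID correlates this log line with the span in Grafana.
	slog.InfoContext(ctx, "chunk processed",
		slog.Int("lines", len(lines)),
		slog.String("trace_id", trace.SpanContextFromContext(ctx).TraceID().String()),
	)
}

func main() {
	processChunk(context.Background(), []string{"Tokyo;35.6"})
}
```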
Job Architecture
- Separating concerns (storage, parsing, calculation, orchestration)
- Streaming large files without loading into memory
- Error handling and context propagation
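
Streaming is what keeps a billion rows tractable: read line by line, never the whole file. A minimal sketch of the shape using only the standard library (streamLines and its signature are illustrative, not the walkthrough's API):

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"io"
	"strings"
)

// streamLines reads line by line so memory use stays flat regardless of
// input size, and checks the context so cancellation propagates mid-file.
func streamLines(ctx context.Context, r io.Reader, handle func(string) error) error {
	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 1024*1024), 1024*1024) // tolerate long lines
	for sc.Scan() {
		select {
		case <-ctx.Done():
			return ctx.Err()
		default:
		}
		if err := handle(sc.Text()); err != nil {
			return err
		}
	}
	return sc.Err()
}

func main() {
	input := strings.NewReader("Tokyo;35.6\nJakarta;-6.2\n")
	_ = streamLines(context.Background(), input, func(line string) error {
		fmt.Println(line)
		return nil
	})
}
```

Because the loop only depends on an io.Reader, the later refactor from local files to MinIO swaps the reader's source without touching this logic.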
Time Estimate
- Setup: 10 minutes
- Code walkthrough: 30-45 minutes
- Running and monitoring: 15 minutes
Walkthrough Sections
- Project Setup - Directory structure and dependencies
- Building a Basic Job - Core job structure with minimal config
- 1BRC Algorithm with Local Files - Parsing and calculating with local file I/O
- Running With Local Files - Test quickly with zero infrastructure
- Refactoring to MinIO Storage - Upgrade to S3-compatible cloud storage
- Infrastructure Setup - Adding the LGTM observability stack
- Adding Observability - Retrofitting traces, metrics, and logs
- Running and Monitoring - Execute and view telemetry in Grafana
Source Code
The complete working example is located at:
github.com/z5labs/humus/example/job/1brc-walkthrough
Next Steps
Begin with Project Setup to understand the code structure.