1BRC Job Walkthrough

Build a production job to solve the One Billion Row Challenge

Overview

This walkthrough teaches you how to build a production-ready job application using Humus to solve the One Billion Row Challenge. You’ll learn how to:

  • Build a job that processes large datasets
  • Parse and aggregate 1 billion temperature measurements
  • Start with local files for fast iteration
  • Refactor to cloud storage (MinIO/S3)
  • Test your job with minimal infrastructure before adding observability
  • Retrofit OpenTelemetry instrumentation (traces, metrics, logs) to working code
  • Monitor your job in Grafana with distributed tracing
  • Use Humus patterns for configuration, error handling, and graceful shutdown

This walkthrough follows a practical development workflow: get your core algorithm working with local files first, then progressively add cloud storage and observability.

What You’ll Build

A complete job application that:

  1. Parses temperature measurement data in the format city;temperature
  2. Calculates min/mean/max statistics per city
  3. Writes formatted results to output
  4. Starts with local file I/O for fast development
  5. Refactors to MinIO (S3-compatible storage) for production
  6. Exports telemetry to Tempo (traces), Mimir (metrics), and Loki (logs)

Input: 1 billion lines like Tokyo;35.6\nJakarta;-6.2\n...
Output: one line per city, formatted as min/mean/max: Jakarta=-10.0/26.5/45.3\nTokyo=-5.2/35.6/50.1\n...
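
To make the aggregation concrete, here is a minimal Go sketch of the per-city statistics and the output formatting. The type, field, and variable names are illustrative, not the walkthrough's actual code.

    package main

    import "fmt"

    // stats holds the running aggregate for one city.
    type stats struct {
        min, max, sum float64
        count         int64
    }

    func (s *stats) add(temp float64) {
        if s.count == 0 || temp < s.min {
            s.min = temp
        }
        if s.count == 0 || temp > s.max {
            s.max = temp
        }
        s.sum += temp
        s.count++
    }

    func main() {
        byCity := map[string]*stats{}

        // A handful of measurements stands in for the 1 billion input lines.
        measurements := []struct {
            city string
            temp float64
        }{{"Tokyo", 35.6}, {"Tokyo", -5.2}, {"Jakarta", -6.2}}

        for _, m := range measurements {
            s, ok := byCity[m.city]
            if !ok {
                s = &stats{}
                byCity[m.city] = s
            }
            s.add(m.temp)
        }

        // One output line per city: city=min/mean/max.
        for city, s := range byCity {
            fmt.Printf("%s=%.1f/%.1f/%.1f\n", city, s.min, s.sum/float64(s.count), s.max)
        }
    }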

Prerequisites

  • Go 1.24+ installed
  • Podman or Docker for running the local infrastructure (MinIO and the LGTM observability stack)
  • Basic understanding of Go (contexts, interfaces, error handling)
  • Familiarity with command-line tools

What You’ll Learn

Humus Framework Patterns

  • Builder + Runner pattern: How Humus composes apps with middleware
  • Config embedding: Using job.Config with custom configuration
  • Automatic OTel: Zero-manual-setup observability
  • Graceful shutdown: OS signal handling and resource cleanup
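
As a rough sketch of the config-embedding idea in plain Go: a hypothetical baseConfig stands in for the framework's job.Config here, since the real type and its fields come from the Humus documentation.

    package main

    import "fmt"

    // baseConfig is a hypothetical stand-in for the framework-provided
    // job.Config; the real fields come from the Humus job package.
    type baseConfig struct {
        Name string
    }

    // Config embeds the base type so framework settings and app-specific
    // settings live on one value, and framework code can still work with
    // the embedded part it knows about.
    type Config struct {
        baseConfig

        InputPath  string
        OutputPath string
    }

    func main() {
        cfg := Config{
            baseConfig: baseConfig{Name: "1brc"},
            InputPath:  "measurements.txt",
            OutputPath: "results.txt",
        }

        // Embedded fields are promoted: cfg.Name resolves to baseConfig.Name.
        fmt.Println(cfg.Name, cfg.InputPath, cfg.OutputPath)
    }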

OpenTelemetry Integration

  • Creating manual spans for fine-grained tracing
  • Recording custom metrics
  • Structured logging with trace correlation
  • Viewing distributed traces in Grafana
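
A minimal sketch of those three signals using the OpenTelemetry Go API (go.opentelemetry.io/otel). Span, instrument, and attribute names are illustrative; in the walkthrough, the SDK wiring behind the global providers is handled by Humus.

    package main

    import (
        "context"
        "log/slog"

        "go.opentelemetry.io/otel"
        "go.opentelemetry.io/otel/attribute"
        "go.opentelemetry.io/otel/metric"
    )

    func processCity(ctx context.Context, city string, count int64) {
        // Manual span around one unit of work. The global tracer is a
        // no-op until an SDK is installed.
        ctx, span := otel.Tracer("1brc").Start(ctx, "processCity")
        defer span.End()

        // Custom metric: count measurements as they are aggregated.
        counter, err := otel.Meter("1brc").Int64Counter("measurements.processed")
        if err == nil {
            counter.Add(ctx, count, metric.WithAttributes(attribute.String("city", city)))
        }

        // Structured log; a trace-aware slog handler can stamp the span's
        // trace ID onto the record for correlation in Grafana.
        slog.InfoContext(ctx, "processed city", slog.String("city", city), slog.Int64("count", count))
    }

    func main() {
        processCity(context.Background(), "Tokyo", 1_000_000)
    }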

Job Architecture

  • Separating concerns (storage, parsing, calculation, orchestration)
  • Streaming large files without loading into memory
  • Error handling and context propagation
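
For the streaming and context-propagation points in particular, here is a sketch of line-at-a-time processing with cancellation and wrapped errors; the function and parameter names are illustrative.

    package main

    import (
        "bufio"
        "context"
        "fmt"
        "io"
        "strconv"
        "strings"
    )

    // streamLines reads city;temperature records one line at a time, so
    // memory stays flat regardless of input size, and stops early when
    // the context is cancelled (e.g. during graceful shutdown).
    func streamLines(ctx context.Context, r io.Reader, handle func(city string, temp float64)) error {
        scanner := bufio.NewScanner(r)

        for scanner.Scan() {
            select {
            case <-ctx.Done():
                return ctx.Err()
            default:
            }

            city, tempStr, ok := strings.Cut(scanner.Text(), ";")
            if !ok {
                return fmt.Errorf("malformed line: %q", scanner.Text())
            }
            temp, err := strconv.ParseFloat(tempStr, 64)
            if err != nil {
                return fmt.Errorf("parse temperature: %w", err)
            }
            handle(city, temp)
        }
        return scanner.Err()
    }

    func main() {
        input := strings.NewReader("Tokyo;35.6\nJakarta;-6.2\n")

        err := streamLines(context.Background(), input, func(city string, temp float64) {
            fmt.Println(city, temp)
        })
        if err != nil {
            fmt.Println("error:", err)
        }
    }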

Time Estimate

  • Setup: 10 minutes
  • Code walkthrough: 30-45 minutes
  • Running and monitoring: 15 minutes

Walkthrough Sections

  1. Project Setup - Directory structure and dependencies
  2. Building a Basic Job - Core job structure with minimal config
  3. 1BRC Algorithm with Local Files - Parsing and calculating with local file I/O
  4. Running With Local Files - Test quickly with zero infrastructure
  5. Refactoring to MinIO Storage - Upgrade to S3-compatible cloud storage
  6. Infrastructure Setup - Adding the LGTM observability stack
  7. Adding Observability - Retrofitting traces, metrics, and logs
  8. Running and Monitoring - Execute and view telemetry in Grafana

Source Code

The complete working example is located at:

github.com/z5labs/humus/example/job/1brc-walkthrough

Next Steps

Begin with Project Setup to understand the code structure.

