Refactoring to MinIO Storage
Your job works with local files! Now let’s refactor it to use MinIO (S3-compatible storage) so you can process files in the cloud.
Why Cloud Storage?
Benefits of MinIO/S3:
- Scalable storage for large datasets
- Data durability and redundancy
- Separation of compute and storage
- Production-ready architecture
- Works with any S3-compatible service (AWS S3, MinIO, Backblaze B2, etc.)
Starting MinIO Locally
First, we need MinIO running. Create podman-compose.yaml:
services:
  minio:
    image: docker.io/minio/minio:latest
    command: server /data --console-address ":9001"
    environment:
      - MINIO_ROOT_USER=minioadmin
      - MINIO_ROOT_PASSWORD=minioadmin
    ports:
      - "9000:9000" # API
      - "9001:9001" # Console
    volumes:
      - minio-data:/data:z
volumes:
  minio-data:
Start MinIO:
podman-compose up -d
Verify it’s running:
podman ps
You should see the minio container running. Access the console at http://localhost:9001 (login: minioadmin/minioadmin).
Add Storage Package
We’ll create a new service package to abstract our storage operations.
First, create the service directory:
mkdir service
Then create service/minio.go:
package service

import (
	"context"
	"io"

	"github.com/minio/minio-go/v7"
	"github.com/minio/minio-go/v7/pkg/credentials"
)

// MinIOClient wraps the MinIO SDK client and exposes only what the job needs.
type MinIOClient struct {
	mc *minio.Client
}

func NewMinIOClient(endpoint, accessKey, secretKey string) (*MinIOClient, error) {
	mc, err := minio.New(endpoint, &minio.Options{
		Creds:  credentials.NewStaticV4(accessKey, secretKey, ""),
		Secure: false, // Use HTTP for local development
	})
	if err != nil {
		return nil, err
	}
	return &MinIOClient{mc: mc}, nil
}

func (c *MinIOClient) GetObject(ctx context.Context, bucket, key string) (io.ReadCloser, error) {
	return c.mc.GetObject(ctx, bucket, key, minio.GetObjectOptions{})
}

func (c *MinIOClient) PutObject(ctx context.Context, bucket, key string, reader io.Reader, size int64) error {
	_, err := c.mc.PutObject(ctx, bucket, key, reader, size, minio.PutObjectOptions{})
	return err
}
Key Design Decisions
Wrapper pattern:
- Hides MinIO-specific details
- Makes testing easier (mock the interface)
- Provides only needed methods
Streaming I/O:
- GetObject returns io.ReadCloser for streaming reads
- PutObject accepts io.Reader to stream uploads
- No buffering of entire files in memory
Context propagation:
- All methods accept context.Context
- Enables cancellation and timeout
- Ready for trace spans (we’ll add them later)
Update Dependencies
Update go.mod to add MinIO:
module 1brc-walkthrough

go 1.24.0

require (
	github.com/z5labs/humus v0.20.2
	github.com/minio/minio-go/v7 v7.0.97
)
Run go mod tidy to download the dependency.
Refactor Handler to Use Storage Interface
Update onebrc/handler.go to accept a storage interface instead of file paths:
package onebrc

import (
	"bufio"
	"bytes"
	"context"
	"fmt"
	"io"
	"log/slog"
	"os"
)

// Storage is the minimal object-store behavior the handler depends on.
type Storage interface {
	GetObject(ctx context.Context, bucket, key string) (io.ReadCloser, error)
	PutObject(ctx context.Context, bucket, key string, reader io.Reader, size int64) error
}

type Handler struct {
	storage   Storage
	bucket    string
	inputKey  string
	outputKey string
	log       *slog.Logger
}

func NewHandler(storage Storage, bucket, inputKey, outputKey string) *Handler {
	return &Handler{
		storage:   storage,
		bucket:    bucket,
		inputKey:  inputKey,
		outputKey: outputKey,
		log:       slog.New(slog.NewJSONHandler(os.Stdout, nil)),
	}
}

func (h *Handler) Handle(ctx context.Context) error {
	h.log.InfoContext(ctx, "starting 1BRC processing",
		slog.String("bucket", h.bucket),
		slog.String("input_key", h.inputKey),
		slog.String("output_key", h.outputKey),
	)

	// 1. Fetch from S3
	rc, err := h.storage.GetObject(ctx, h.bucket, h.inputKey)
	if err != nil {
		h.log.ErrorContext(ctx, "failed to fetch input object", slog.Any("error", err))
		return fmt.Errorf("get object: %w", err)
	}
	defer func() {
		if cerr := rc.Close(); cerr != nil {
			h.log.WarnContext(ctx, "failed to close input object", slog.Any("error", cerr))
		}
	}()

	// 2. Parse
	cityStats, err := Parse(bufio.NewReader(rc))
	if err != nil {
		h.log.ErrorContext(ctx, "failed to parse temperature data", slog.Any("error", err))
		return fmt.Errorf("parse: %w", err)
	}

	// 3. Calculate
	results := Calculate(cityStats)

	// 4. Write results
	output := FormatResults(results)
	outputBytes := []byte(output)
	err = h.storage.PutObject(ctx, h.bucket, h.outputKey,
		bytes.NewReader(outputBytes), int64(len(outputBytes)))
	if err != nil {
		h.log.ErrorContext(ctx, "failed to upload results", slog.Any("error", err))
		return fmt.Errorf("put object: %w", err)
	}

	h.log.InfoContext(ctx, "1BRC processing completed successfully",
		slog.Int("cities_processed", len(cityStats)),
	)
	return nil
}
Key changes:
- Changed from file paths to a Storage interface
- Uses GetObject instead of os.Open
- Uses PutObject instead of os.WriteFile
- Core parsing/calculation logic unchanged (that’s the beauty of interfaces!)
Update Configuration
Update config.yaml to add MinIO settings:
minio:
  endpoint: {{env "MINIO_ENDPOINT" | default "localhost:9000"}}
  access_key: {{env "MINIO_ACCESS_KEY" | default "minioadmin"}}
  secret_key: {{env "MINIO_SECRET_KEY" | default "minioadmin"}}
  bucket: {{env "MINIO_BUCKET" | default "onebrc"}}
onebrc:
  input_key: {{env "INPUT_KEY" | default "measurements.txt"}}
  output_key: {{env "OUTPUT_KEY" | default "results.txt"}}
Note: We changed input_file/output_file to input_key/output_key to reflect S3 terminology.
Update App Initialization
Update app/app.go to create the MinIO client:
package app

import (
	"context"

	"1brc-walkthrough/onebrc"
	"1brc-walkthrough/service"

	"github.com/z5labs/humus/job"
)

type Config struct {
	Minio struct {
		Endpoint  string `config:"endpoint"`
		AccessKey string `config:"access_key"`
		SecretKey string `config:"secret_key"`
		Bucket    string `config:"bucket"`
	} `config:"minio"`
	OneBRC struct {
		InputKey  string `config:"input_key"`
		OutputKey string `config:"output_key"`
	} `config:"onebrc"`
}

func Init(ctx context.Context, cfg Config) (*job.App, error) {
	// Create MinIO client
	minioClient, err := service.NewMinIOClient(
		cfg.Minio.Endpoint,
		cfg.Minio.AccessKey,
		cfg.Minio.SecretKey,
	)
	if err != nil {
		return nil, err
	}

	// Create handler with MinIO client
	handler := onebrc.NewHandler(
		minioClient,
		cfg.Minio.Bucket,
		cfg.OneBRC.InputKey,
		cfg.OneBRC.OutputKey,
	)
	return job.NewApp(handler), nil
}
Update Data Generation Tool
Update tool/main.go to upload directly to MinIO:
package main

import (
	"bytes"
	"context"
	"flag"
	"fmt"
	"log"
	"math/rand"

	"github.com/minio/minio-go/v7"
	"github.com/minio/minio-go/v7/pkg/credentials"
)

var cities = []string{
	"Tokyo", "Jakarta", "Delhi", "Manila", "Shanghai",
	"Sao Paulo", "Mumbai", "Beijing", "Cairo", "Mexico City",
	"New York", "London", "Paris", "Moscow", "Sydney",
}

func main() {
	count := flag.Int("count", 10000, "number of measurements to generate")
	flag.Parse()

	// Connect to MinIO
	mc, err := minio.New("localhost:9000", &minio.Options{
		Creds:  credentials.NewStaticV4("minioadmin", "minioadmin", ""),
		Secure: false,
	})
	if err != nil {
		log.Fatal(err)
	}

	// Create bucket if needed
	ctx := context.Background()
	exists, err := mc.BucketExists(ctx, "onebrc")
	if err != nil {
		log.Fatal(err)
	}
	if !exists {
		err = mc.MakeBucket(ctx, "onebrc", minio.MakeBucketOptions{})
		if err != nil {
			log.Fatal(err)
		}
		log.Println("Created bucket: onebrc")
	}

	// Generate data
	log.Printf("Generating %d measurements...", *count)
	var buf bytes.Buffer
	for i := 0; i < *count; i++ {
		city := cities[rand.Intn(len(cities))]
		temp := -20.0 + rand.Float64()*70.0 // -20 to 50°C
		fmt.Fprintf(&buf, "%s;%.1f\n", city, temp)
	}

	// Upload to MinIO
	data := buf.Bytes()
	_, err = mc.PutObject(ctx, "onebrc", "measurements.txt",
		bytes.NewReader(data), int64(len(data)), minio.PutObjectOptions{})
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("Uploaded %d bytes to onebrc/measurements.txt", len(data))
}
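The tool above builds the whole dataset in a bytes.Buffer before uploading, which is fine for 10,000 rows but grows with -count. As a sketch of the streaming alternative, the generator can write into an io.Pipe while the consumer reads from the other end. Everything here is an assumption for illustration: streamRows, generate and countLines are hypothetical helpers, the city list is shortened, and countLines merely stands in for the uploader. With the real client the reader would be handed to PutObject; minio-go documents passing -1 as the size for streams of unknown length, so check the library documentation for the memory trade-offs before relying on that.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"math/rand"
)

// A shortened city list, just for this sketch.
var cities = []string{"Tokyo", "Cairo", "London"}

// generate writes count rows to w one at a time.
func generate(w io.Writer, count int, rng *rand.Rand) error {
	bw := bufio.NewWriter(w)
	for i := 0; i < count; i++ {
		city := cities[rng.Intn(len(cities))]
		temp := -20.0 + rng.Float64()*70.0
		if _, err := fmt.Fprintf(bw, "%s;%.1f\n", city, temp); err != nil {
			return err
		}
	}
	return bw.Flush()
}

// streamRows returns a reader that yields rows as they are generated.
// Closing the reader early unblocks the generator goroutine.
func streamRows(count int, seed int64) io.ReadCloser {
	pr, pw := io.Pipe()
	go func() {
		err := generate(pw, count, rand.New(rand.NewSource(seed)))
		// A nil error makes the reader see a normal io.EOF.
		pw.CloseWithError(err)
	}()
	return pr
}

// countLines stands in for the uploader: it consumes the stream
// without ever holding more than one line in memory.
func countLines(r io.Reader) (int, error) {
	sc := bufio.NewScanner(r)
	n := 0
	for sc.Scan() {
		n++
	}
	return n, sc.Err()
}

func main() {
	rows := streamRows(1000, 1)
	defer rows.Close()

	n, err := countLines(rows)
	if err != nil {
		panic(err)
	}
	fmt.Println(n) // prints 1000
}
```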
Run the Refactored Job
# Make sure MinIO is running
podman ps
# Ensure dependencies are installed
go mod tidy
# Generate test data
go run tool/main.go -count 10000
# Run the job
go run .
You should see:
{"time":"...","level":"INFO","msg":"starting 1BRC processing","bucket":"onebrc","input_key":"measurements.txt","output_key":"results.txt"}
{"time":"...","level":"INFO","msg":"1BRC processing completed successfully","cities_processed":15}
Verify Results in MinIO Console
- Open http://localhost:9001
- Login with minioadmin/minioadmin
- Browse the onebrc bucket
- You should see both files:
  - measurements.txt (input)
  - results.txt (output)
- Download or preview results.txt
Expected format:
Beijing=-19.5/16.3/49.8
Cairo=-18.2/17.9/48.5
Delhi=-17.9/15.8/47.3
...
What We Refactored
Before (local files):
- Direct os.Open and os.WriteFile
- File paths in configuration
- Simple and fast for development
After (cloud storage):
- Storage interface with GetObject/PutObject
- Bucket and key configuration
- Production-ready architecture
- Same core business logic!
Benefits of This Refactoring
Testability:
- Can mock the Storage interface
- Unit tests don’t need MinIO running
- Easier to test error paths
Flexibility:
- Works with any S3-compatible service
- Easy to swap implementations
- Can add caching, retries, etc.
Production-ready:
- Scalable storage
- Cloud-native architecture
- Separation of concerns
What’s Next
Now let’s add the full LGTM observability stack so you can see traces, metrics, and logs in Grafana.