Stop Exporting CSVs: Stream PostgreSQL to Google Sheets in Go (Production‑Ready Guide)
#DevOps


Backend Reporter

Engineering teams often rely on manual CSV exports to get data into Google Sheets, but this approach breaks down at scale. This guide shows how to build a robust Go pipeline that streams millions of rows from PostgreSQL to Sheets using chunked batch updates, retry logic, and partitioning to handle API limits and memory constraints.

Google Sheets is often the last mile for business data. Product managers, analysts, and operations teams love it because it's:

- collaborative
- visual
- familiar

But engineering teams often end up maintaining manual CSV exports or fragile scripts to get data into Sheets. There's a better way. In this guide we'll build a production‑ready Google Sheets syncer in Go that:

- Streams millions of rows safely from PostgreSQL
- Avoids memory explosions
- Handles API quotas and retries
- Writes data efficiently with chunked batch updates
- Partitions datasets across sheets when they get large
- Uses secure service‑account credentials

This pattern works extremely well for:

- analytics exports
- operational reporting
- finance dashboards
- product metrics
- internal tooling

Let's walk through how to build it.

Why not just export CSV?

Most teams start with something like:

1. Run SQL query
2. Dump CSV
3. Upload to Google Sheets
4. Repeat via cron

This approach breaks down when:

- datasets reach hundreds of thousands of rows
- exports run frequently
- business teams need fresh data
- scripts start failing due to memory or timeouts

The better architecture:

```
PostgreSQL
    │ stream rows
    ▼
Go Worker
    │ chunk rows (2k)
    ▼
Retry + Backoff
    │
    ▼
Google Sheets API
    │
    ▼
Spreadsheet
```

Key idea: stream rows from the DB and write them in chunks to Sheets. This avoids loading the entire dataset into memory.

Google Sheets limits you must respect

Google Sheets is not a database. Key limits:

| Limit | Value |
| --- | --- |
| Maximum cells per spreadsheet | 10,000,000 |
| Maximum columns | 18,278 |
| Recommended API rate | ~60 writes/min |
| Payload size | ~10 MB/request |

Implications:

- Write in 1k–2k row chunks
- Partition large datasets across sheets
- Avoid extremely wide tables

Secure credentials (never commit keys)

Use a Google Cloud service account. Steps:

1. Enable the Google Sheets API
2. Create a service account
3. Generate a JSON key
4. Share the spreadsheet with the service account email
5. Store credentials in a secret manager

Environment variables:

```
GOOGLE_SHEETS_CREDENTIALS_PATH=/secrets/sheets.json
GOOGLE_SHEETS_SPREADSHEET_ID=
```

Never commit keys to Git.

Project structure

A clean layout:

```
go-sheets-sync/
├─ cmd/
│  └─ sync-job/
├─ pkg/
│  ├─ sheets/
│  │  ├─ service.go
│  │  └─ retry.go
│  └─ stream/
│     └─ streamer.go
└─ internal/
   └─ jobs/
```

This keeps Sheets logic reusable.

Minimal Google Sheets service

A small wrapper around the official API client.

```go
type SheetService struct {
	srv           *sheets.Service
	spreadsheetID string
}

func NewSheetService(ctx context.Context, credentialsPath, spreadsheetID string) (*SheetService, error) {
	srv, err := sheets.NewService(ctx, option.WithCredentialsFile(credentialsPath))
	if err != nil {
		return nil, fmt.Errorf("init sheets service: %w", err)
	}
	return &SheetService{srv: srv, spreadsheetID: spreadsheetID}, nil
}
```

Batch updates are the key to performance.

```go
func (s *SheetService) BatchUpdateValues(
	ctx context.Context,
	data map[string][][]interface{},
) error {
	var vr []*sheets.ValueRange
	for rng, vals := range data {
		vr = append(vr, &sheets.ValueRange{Range: rng, Values: vals})
	}

	req := &sheets.BatchUpdateValuesRequest{
		ValueInputOption: "USER_ENTERED",
		Data:             vr,
	}

	_, err := s.srv.Spreadsheets.Values.
		BatchUpdate(s.spreadsheetID, req).
		Context(ctx).
		Do()
	return err
}
```

One API call can write thousands of rows.
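To make the payload shape concrete, here's a small sketch of building the `map[string][][]interface{}` that `BatchUpdateValues` consumes (`buildPayload` is an illustrative helper, not part of the article's code):

```go
package main

import "fmt"

// buildPayload converts plain string rows into the range→values map
// that BatchUpdateValues consumes. startRow is the 1-based first row.
func buildPayload(sheetTitle string, startRow int, rows [][]string) map[string][][]interface{} {
	vals := make([][]interface{}, 0, len(rows))
	for _, r := range rows {
		row := make([]interface{}, len(r))
		for i, c := range r {
			row[i] = c
		}
		vals = append(vals, row)
	}
	rng := fmt.Sprintf("%s!A%d", sheetTitle, startRow)
	return map[string][][]interface{}{rng: vals}
}

func main() {
	p := buildPayload("metrics", 2, [][]string{{"2024-01-01", "42"}})
	for rng, vals := range p {
		fmt.Println(rng, len(vals))
	}
}
```

Starting at row 2 leaves row 1 free for a header.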

Retry with exponential backoff

Google APIs will occasionally return:

- 429 Too Many Requests
- 5xx server errors

Retries must be built in.

```go
func withRetry(ctx context.Context, op func() error) error {
	backoff := time.Second
	const maxAttempts = 6

	for attempt := 0; attempt < maxAttempts; attempt++ {
		err := op()
		if err == nil {
			return nil
		}

		// Only retry rate limits (429) and server errors (5xx).
		if gerr, ok := err.(*googleapi.Error); ok {
			if gerr.Code != 429 && gerr.Code < 500 {
				return err
			}
		}

		jitter := time.Duration(rand.Intn(500)) * time.Millisecond
		select {
		case <-time.After(backoff + jitter):
			backoff *= 2
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return fmt.Errorf("retry attempts exhausted")
}
```

This makes the pipeline resilient.

Streaming rows from PostgreSQL

Never load millions of rows into memory. Instead:

- iterate the DB cursor
- buffer small chunks
- flush to Sheets

```go
const chunkSize = 2000

func StreamRowsToSheet(
	ctx context.Context,
	svc *SheetService,
	sheetTitle string,
	rows *sql.Rows,
) error {
	defer rows.Close()

	cols, err := rows.Columns()
	if err != nil {
		return err
	}

	batch := make([][]interface{}, 0, chunkSize)
	startRow := 2 // row 1 is reserved for headers

	flush := func() error {
		if len(batch) == 0 {
			return nil
		}
		rng := fmt.Sprintf("%s!A%d", sheetTitle, startRow)
		err := withRetry(ctx, func() error {
			return svc.BatchUpdateValues(ctx, map[string][][]interface{}{rng: batch})
		})
		if err != nil {
			return err
		}
		startRow += len(batch)
		batch = batch[:0]
		return nil
	}

	for rows.Next() {
		vals := make([]interface{}, len(cols))
		ptrs := make([]interface{}, len(cols))
		for i := range vals {
			ptrs[i] = &vals[i]
		}
		if err := rows.Scan(ptrs...); err != nil {
			return err
		}

		batch = append(batch, vals)

		if len(batch) >= chunkSize {
			if err := flush(); err != nil {
				return err
			}
			time.Sleep(250 * time.Millisecond) // stay under the write quota
		}
	}
	if err := rows.Err(); err != nil {
		return err
	}
	// Flush the final partial chunk.
	return flush()
}
```

This pattern can stream millions of rows safely.
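The A1-range arithmetic in the loop above can be isolated into a tiny pure helper, sketched here for clarity (`batchRange` is an illustrative name, not part of the article's code):

```go
package main

import "fmt"

// batchRange returns the A1 range for a batch starting at startRow,
// plus the start row of the next batch.
func batchRange(sheetTitle string, startRow, batchLen int) (rng string, nextStart int) {
	return fmt.Sprintf("%s!A%d", sheetTitle, startRow), startRow + batchLen
}

func main() {
	rng, next := batchRange("fund_history_1", 2, 2000)
	fmt.Println(rng, next)
}
```

So the first full batch lands at `A2`, the second at `A2002`, and so on.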

Partition large datasets across sheets

Remember: one spreadsheet holds at most 10M cells. If you write 20 columns:

10,000,000 / 20 ≈ 500,000 rows per sheet

A safe strategy is numbered sheets:

- fund_history_1
- fund_history_2
- fund_history_3

Example logic:

```go
func nextSheetIndex(rowCount int) int {
	const maxRows = 500000
	return rowCount/maxRows + 1
}
```

Partitioning keeps Sheets responsive.

Concurrency control

Avoid overwhelming the API. A simple semaphore works well.

```go
var limiter = make(chan struct{}, 3)

func limitedWrite(fn func() error) error {
	limiter <- struct{}{}
	defer func() { <-limiter }()
	return fn()
}
```

Use this to cap concurrent writes.
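A quick way to convince yourself the semaphore works is to hammer it from many goroutines and record the peak in-flight count. A self-contained sketch (`peakConcurrency` is a test helper invented here, not part of the article's code):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

var limiter = make(chan struct{}, 3)

func limitedWrite(fn func() error) error {
	limiter <- struct{}{}
	defer func() { <-limiter }()
	return fn()
}

// peakConcurrency runs n no-op writes through limitedWrite and reports
// the highest number observed in flight at once.
func peakConcurrency(n int) int64 {
	var inFlight, peak int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			limitedWrite(func() error {
				cur := atomic.AddInt64(&inFlight, 1)
				// Record a new peak if this is the highest seen so far.
				for {
					p := atomic.LoadInt64(&peak)
					if cur <= p || atomic.CompareAndSwapInt64(&peak, p, cur) {
						break
					}
				}
				time.Sleep(time.Millisecond) // hold the slot briefly
				atomic.AddInt64(&inFlight, -1)
				return nil
			})
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	fmt.Println(peakConcurrency(20) <= 3)
}
```

However many goroutines you launch, at most three writes are ever in flight.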

Observability (important in production)

Add metrics like:

- sheets_rows_written_total
- sheets_api_requests_total
- sheets_retry_total
- sheets_write_latency_seconds

Also log:

- batch size
- retry attempts
- API failures

This helps detect quota issues early.

Running the job

Example main:

```go
func main() {
	ctx := context.Background()

	creds := os.Getenv("GOOGLE_SHEETS_CREDENTIALS_PATH")
	spreadsheetID := os.Getenv("GOOGLE_SHEETS_SPREADSHEET_ID")

	svc, err := sheets.NewSheetService(ctx, creds, spreadsheetID)
	if err != nil {
		log.Fatal(err)
	}

	job := jobs.NewSheetsUpdateJob(db)
	if err := job.Execute(ctx, svc); err != nil {
		log.Fatal(err)
	}
}
```

Run via:

- cron
- a Kubernetes Job
- a workflow scheduler

Production best practices

✓ Never commit credential JSON
✓ Use separate spreadsheets per environment
✓ Chunk writes (1k–2k rows)
✓ Partition large datasets
✓ Implement retries with backoff
✓ Add metrics and alerts

Final thoughts

Google Sheets works best as a human‑friendly reporting surface, not a database. By combining:

- streaming database reads
- chunked API writes
- exponential backoff retries
- sheet partitioning
- secure credential management

you can build a robust data pipeline that reliably syncs large datasets into Google Sheets. This approach scales surprisingly well and is used in many production internal tools.

Happy coding 🚀

