Unveiling Granola’s Hidden API

Granola, the collaboration platform that turns meetings into searchable documents, has long kept its API under wraps. A recent reverse‑engineering project by @getprobo has mapped out the entire authentication chain, the key endpoints, and the data structures that drive the service. The result is a ready‑to‑use toolkit that lets developers pull documents, transcripts, and workspace information programmatically.

Why Granola Matters for Developers

Granola’s core promise is to turn spoken conversation into structured, searchable content. For data‑centric teams, that means a rich source of meeting notes, action items, and context that can feed knowledge graphs, AI assistants, or compliance audits. With a documented API, developers can:

Automate archival of meeting notes.
Build custom dashboards that surface key insights.
Integrate transcripts into downstream NLP pipelines.
Sync Granola data with other collaboration tools.

The reverse‑engineering effort unlocks exactly those capabilities.

Authentication Mechanics: WorkOS + OAuth 2.0

Granola relies on WorkOS for identity management. The flow is a classic OAuth 2.0 refresh‑token exchange, but with a twist: refresh‑token rotation.

POST https://api.workos.com/user_management/authenticate
Content-Type: application/json

{
  "client_id": "<WORKOS_CLIENT_ID>",
  "grant_type": "refresh_token",
  "refresh_token": "<CURRENT_REFRESH_TOKEN>"
}

The response returns a short-lived access_token (1-hour TTL) and a new refresh_token. The old refresh token is invalidated immediately, so reusing it triggers an authentication failure. This design mitigates token replay attacks.

Security takeaway: Any automation that stores refresh tokens must persist the latest token after each exchange. Otherwise the next exchange fails, and API calls stop working once the current access token expires.
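The exchange-then-persist step can be sketched in Python as follows. This is a minimal illustration rather than the repository's token_manager.py: the function names, the `token_file` layout (a file holding only the current refresh token), and the injectable `send` parameter are all assumptions made for the example.

```python
import json
import urllib.request
from pathlib import Path

WORKOS_AUTH_URL = "https://api.workos.com/user_management/authenticate"


def build_refresh_request(client_id: str, refresh_token: str) -> urllib.request.Request:
    """Build the POST request for the refresh-token exchange described above."""
    body = json.dumps({
        "client_id": client_id,
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
    }).encode("utf-8")
    return urllib.request.Request(
        WORKOS_AUTH_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def rotate_tokens(client_id: str, token_file: Path, send=urllib.request.urlopen) -> str:
    """Exchange the stored refresh token and persist its replacement.

    `token_file` is a hypothetical file containing only the current refresh
    token. `send` is injectable so the flow can be exercised offline.
    """
    current = token_file.read_text().strip()
    with send(build_refresh_request(client_id, current)) as response:
        payload = json.load(response)
    # The old refresh token is now invalid, so save the new one right away.
    token_file.write_text(payload["refresh_token"])
    return payload["access_token"]
```

Writing the new refresh token before returning the access token keeps the stored state valid even if the caller crashes later in the run.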

Endpoints Walkthrough

| Endpoint | Purpose | Notes |
| --- | --- | --- |
| `POST /v2/get-documents` | Paginated list of user-owned documents | Does not return shared docs. Use `get-documents-batch` for folder-based queries. |
| `POST /v1/get-document-transcript` | Retrieve transcript for a single document | Returns 404 if no transcript exists. |
| `POST /v1/get-workspaces` | List all workspaces (organizations) the user belongs to | Each document is linked to a workspace via `workspace_id`. |
| `POST /v2/get-document-lists` (fallback to `/v1`) | List all folders (document lists) | V2 returns full document objects; V1 returns IDs only. |
| `POST /v1/get-documents-batch` | Fetch multiple documents by ID, including shared ones | Batch limit ~100 per request. |
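Because the batch endpoint accepts roughly 100 IDs per request, a longer ID list has to be split before calling it. The helper below is a hypothetical sketch of that splitting step only; the request body for `get-documents-batch` is not described here, so it is deliberately left out.

```python
from typing import Iterator, List


def chunk_ids(doc_ids: List[str], batch_size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive slices of `doc_ids`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(doc_ids), batch_size):
        yield doc_ids[start:start + batch_size]
```

Each yielded list can then be sent as one batch request. The default of 100 mirrors the approximate limit noted in the table and may need lowering if the server rejects a batch.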

Sample Request for Documents

POST https://api.granola.ai/v2/get-documents
Authorization: Bearer <ACCESS_TOKEN>
Content-Type: application/json

{
  "limit": 100,
  "offset": 0,
  "include_last_viewed_panel": true
}

The response contains an array of documents, each with id, title, timestamps, and an optional last_viewed_panel that holds ProseMirror content.
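Walking through every page with limit and offset might look like the sketch below. `fetch_page` is a hypothetical callable that sends the request body to `/v2/get-documents` and returns the list of documents from the response. The stop condition (a page shorter than `limit` means there is nothing more) is an assumption, not documented behavior.

```python
from typing import Any, Callable, Dict, Iterator, List

Document = Dict[str, Any]


def documents_request_body(limit: int, offset: int) -> Dict[str, Any]:
    """Build the JSON body used in the sample request above."""
    return {"limit": limit, "offset": offset, "include_last_viewed_panel": True}


def iter_documents(
    fetch_page: Callable[[Dict[str, Any]], List[Document]],
    limit: int = 100,
) -> Iterator[Document]:
    """Yield documents page by page until a short page signals the end."""
    offset = 0
    while True:
        page = fetch_page(documents_request_body(limit, offset))
        yield from page
        if len(page) < limit:
            return
        offset += limit
```

Keeping the HTTP call behind `fetch_page` isolates the guess about the response shape in one place and lets the paging logic be tested without network access.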

Practical Scripts and Tooling

The repository ships with a handful of Python scripts that orchestrate the API calls:

main.py – Pulls workspaces, folders, and documents, then stores each document in a dedicated folder with metadata, Markdown conversion, and transcript files.
list_workspaces.py / list_folders.py – Quick CLI utilities to enumerate workspaces and folders.
filter_by_workspace.py / filter_by_folder.py – Filter documents by workspace or folder, supporting name or ID queries.
token_manager.py – Handles the refresh‑token rotation logic.

Running python3 main.py /path/to/output produces a tidy directory tree:

output_directory/
├── workspaces.json
├── document_lists.json
├── granola_api_response.json
├── <doc_id>/
│   ├── document.json
│   ├── metadata.json
│   ├── resume.md
│   ├── transcript.json
│   └── transcript.md

The metadata.json file enriches each note with workspace and folder context, making downstream processing straightforward.

Data Structures: From ProseMirror to Markdown

Granola stores document content in ProseMirror JSON. The reverse‑engineering scripts convert this to Markdown with front‑matter:

---
granola_id: doc_123456
title: "My Meeting Notes"
created_at: 2025-01-15T10:30:00Z
updated_at: 2025-01-15T11:45:00Z
---

# Meeting Notes

…
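To give a feel for the conversion, here is a minimal sketch that turns a small ProseMirror document into Markdown. It is not the repository's converter. The node and mark names (`heading`, `paragraph`, `bullet_list` or `bulletList`, `strong` or `bold`, `em` or `italic`) follow common ProseMirror schemas and are assumptions about what Granola actually emits; unknown node types are skipped.

```python
from typing import Any, Dict, List

Node = Dict[str, Any]


def inline_text(node: Node) -> str:
    """Join the text children of a block node, applying bold and italic marks."""
    parts: List[str] = []
    for child in node.get("content", []):
        text = child.get("text", "")
        for mark in child.get("marks", []):
            if mark.get("type") in ("strong", "bold"):
                text = f"**{text}**"
            elif mark.get("type") in ("em", "italic"):
                text = f"*{text}*"
        parts.append(text)
    return "".join(parts)


def to_markdown(doc: Node) -> str:
    """Convert top-level headings, paragraphs, and bullet lists to Markdown."""
    blocks: List[str] = []
    for node in doc.get("content", []):
        kind = node.get("type")
        if kind == "heading":
            level = node.get("attrs", {}).get("level", 1)
            blocks.append("#" * level + " " + inline_text(node))
        elif kind == "paragraph":
            blocks.append(inline_text(node))
        elif kind in ("bullet_list", "bulletList"):
            items = []
            for item in node.get("content", []):
                # A list item wraps one or more paragraphs; flatten them.
                text = " ".join(inline_text(p) for p in item.get("content", []))
                items.append("- " + text)
            blocks.append("\n".join(items))
    return "\n\n".join(blocks) + "\n"
```

A production converter would also need nested lists, links, and ordered lists, which this sketch leaves out to stay short.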

The accompanying metadata.json captures additional context such as:

{
  "document_id": "doc_123456",
  "workspace_id": "wks_7890",
  "workspace_name": "Team Workspace",
  "folders": [ {"id": "fld_1", "name": "Sales Calls"} ],
  "meeting_date": "2025-01-15T10:30:00Z",
  "sources": ["microphone", "system"]
}
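One way to assemble the workspace and folder part of that file is sketched below. The input shapes are assumptions: workspaces as dicts with `id` and `name`, and folders as dicts with `id`, `name`, and a `documents` list of full document objects (as the V2 `get-document-lists` response is described). The `meeting_date` and `sources` fields are omitted because their origin is not covered here.

```python
from typing import Any, Dict, List


def build_metadata(
    document: Dict[str, Any],
    workspaces: List[Dict[str, Any]],
    folders: List[Dict[str, Any]],
) -> Dict[str, Any]:
    """Attach workspace and folder context to a single document."""
    workspace_names = {ws["id"]: ws["name"] for ws in workspaces}
    doc_id = document["id"]
    workspace_id = document.get("workspace_id")
    return {
        "document_id": doc_id,
        "workspace_id": workspace_id,
        # None when the workspace is not in the list the user can see.
        "workspace_name": workspace_names.get(workspace_id),
        "folders": [
            {"id": folder["id"], "name": folder["name"]}
            for folder in folders
            if any(d.get("id") == doc_id for d in folder.get("documents", []))
        ],
    }
```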

Security Implications

The most striking feature is refresh‑token rotation. While many OAuth providers offer this, Granola’s implementation is strict: a single-use token per exchange. This reduces the attack surface but also imposes a maintenance burden on automation scripts. Developers must:

Persist the latest refresh_token after each exchange.
Handle authentication failures gracefully, triggering a fresh rotation.
Store tokens securely (e.g., environment variables, secrets manager).

Skipping the first two practices leads to authentication failures that can disrupt data pipelines; skipping the third risks leaking credentials.
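For scripts that keep tokens on disk, the persistence step can be made crash-safe with an atomic write. The sketch below assumes a JSON token file at a path of your choosing; `save_tokens` and `load_tokens` are hypothetical helper names, not part of the repository.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Dict


def save_tokens(path: Path, tokens: Dict[str, str]) -> None:
    """Atomically replace the token file, readable by the owner only."""
    # Write to a temporary file in the same directory, then rename over the
    # target so a crash never leaves a half-written token file behind.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(tokens, handle)
        os.chmod(tmp_name, 0o600)
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def load_tokens(path: Path) -> Dict[str, str]:
    """Read the most recently saved tokens."""
    return json.loads(path.read_text())
```

Calling `save_tokens` immediately after every successful exchange keeps the stored refresh token in step with the rotation. A secrets manager serves the same purpose when a local file is not acceptable.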

Final Thoughts

By exposing Granola’s API, the reverse‑engineering effort turns a closed‑source collaboration tool into an open data platform. Developers can now build custom analytics, integrate meeting notes into knowledge graphs, or feed transcripts into AI models—all while respecting Granola’s security model. The scripts and documentation provided by @getprobo serve as a solid foundation for anyone looking to harness Granola’s rich conversational data.

Source: https://github.com/getprobo/reverse-engineering-granola-api

