
Working with timestamps in SQL seems straightforward until time zones enter the equation. DuckDB's documentation highlights several insidious pitfalls that can sabotage temporal queries, especially when combining TIMESTAMPTZ with implicit casting operations. These issues aren't just academic—they directly impact data integrity and system performance.

The Hidden Danger of Time Zone Casts

When filtering TIMESTAMPTZ columns against date literals, DuckDB's casting rules trigger subtle bugs:

SET timezone = 'America/Los_Angeles';
CREATE TABLE times AS
    FROM range('2025-08-30'::TIMESTAMPTZ, '2025-08-31'::TIMESTAMPTZ, INTERVAL 1 HOUR) tbl(t);

-- Returns 1 row in LA timezone
FROM times WHERE t <= '2025-08-30';

Change the session's time zone, and the results shift dramatically:

SET timezone = 'HST';
-- Now returns 4 rows!
FROM times WHERE t <= '2025-08-30';

The culprit? DuckDB promotes DATE to TIMESTAMPTZ by interpreting the date as midnight in the current time zone. This implicit conversion means identical queries yield different results based on session settings. The documentation bluntly advises: "Unless you need the current time zone for display, use plain TIMESTAMPs" to avoid these landmines.
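Following that advice sidesteps the session dependence entirely. A minimal sketch, assuming the naive wall-clock values are what you actually want to store (the times_plain table here is illustrative, not from the documentation):

SET timezone = 'HST';

-- Plain TIMESTAMP column: no time zone is involved in the cast
CREATE TABLE times_plain AS
    FROM range('2025-08-30'::TIMESTAMP, '2025-08-31'::TIMESTAMP, INTERVAL 1 HOUR) tbl(t);

-- The date literal promotes to midnight as a plain TIMESTAMP,
-- so this returns 1 row regardless of the session's timezone setting
FROM times_plain WHERE t <= '2025-08-30';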

When Correctness Cripples Performance

DuckDB relies on the ICU library for time zone handling—a robust solution that correctly handles daylight saving time transitions beyond 2037 (unlike Pandas). But this correctness comes at a cost: ICU operations are computationally expensive. For high-frequency temporal operations, the documentation recommends precomputing calendar tables:

CREATE OR REPLACE TABLE hourly AS
    SELECT
        ts,
        year::SMALLINT AS year,
        month::TINYINT AS month,
        day::TINYINT AS day,
        hour::TINYINT AS hour
    FROM (
        -- Extract all four parts in one call; unnest spreads the
        -- resulting struct into separate year/month/day/hour columns
        SELECT ts, unnest(date_part(['year', 'month', 'day', 'hour'], ts))
        FROM generate_series(
            '2020-01-01'::DATE::TIMESTAMPTZ,
            '2100-01-01'::DATE::TIMESTAMPTZ,
            INTERVAL 1 HOUR
        ) tbl(ts)
    ) parts;

This 700K-row table acts as a fast lookup for temporal bins. Note the optimization: extracting all the parts with a single date_part call (whose struct result is then unnested into columns) avoids repeated ICU computations. The casts to smaller integer types also reduce storage overhead.
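As a usage sketch only (the events table and its event_time column are assumptions for illustration, not part of the documentation), an ASOF join can map each event onto its hourly bin, so the calendar fields come from the precomputed table rather than per-row ICU calls:

-- Each event is paired with the latest hourly row at or before it,
-- i.e., its hour bin; "events" with a TIMESTAMPTZ event_time is assumed
SELECT h.year, h.month, h.day, h.hour, count(*) AS n_events
FROM events e
ASOF JOIN hourly h ON e.event_time >= h.ts
GROUP BY ALL
ORDER BY ALL;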

The BETWEEN Trap

Perhaps the most widespread pitfall lurks in interval queries:

-- Closed-closed interval (dangerous!)
SELECT * FROM events 
WHERE event_time BETWEEN '2023-01-01' AND '2023-01-02';

This seemingly innocent syntax creates overlapping bins because BETWEEN includes both endpoints—equivalent to event_time >= start AND event_time <= end. Temporal analytics almost always require half-open intervals (start <= event_time < end) to prevent double-counting edge cases. The documentation's prescription is unambiguous: "Be explicit about comparison boundaries instead of using BETWEEN."
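Applying that advice to the query above gives a half-open filter: the start is included, the end excluded, so a row landing exactly on midnight belongs to exactly one daily bin.

-- Half-open interval: start inclusive, end exclusive
SELECT * FROM events
WHERE event_time >= '2023-01-01'
  AND event_time <  '2023-01-02';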

Why This Matters

These timestamp behaviors aren't DuckDB-specific quirks—they reflect fundamental challenges in temporal data processing. Systems dealing with financial transactions, IoT sensor streams, or global event logs must handle these edge cases deliberately. DuckDB's clear documentation serves as a field guide for navigating temporal minefields, reminding us that:
1. Implicit conversions often betray us
2. Temporal correctness demands performance tradeoffs
3. SQL's convenient syntax sometimes introduces semantic errors

For temporal analytics, explicit is always safer than clever. As time zones continue to challenge distributed systems, DuckDB's lessons in defensive timestamp handling become increasingly vital.

Source: DuckDB Documentation - Timestamp Issues