FlowG’s latest v0.55.0 release introduces VRL Log Splitting, a capability that allows log transformers to emit multiple output records from a single input, simplifying pipeline design for batched logs and multi-destination routing. The release also overhauls the tool’s Go-Rust interoperability layer using MessagePack serialization to eliminate redundant data copies and clarify memory ownership, paving the way for future support of non-log observability signals.
The release of FlowG v0.55.0 advances the log processing pipeline tool on two fronts. It combines a user-facing workflow capability that resolves long-standing limitations in how log transformers handle complex input data with a foundational overhaul of the cross-language interoperability layer, one that eliminates performance bottlenecks and paves the way for future feature expansion.
Key Arguments
VRL Log Splitting: From One-to-One to One-to-Many Transformations
Previously, FlowG’s Vector Remap Language (VRL) transformers operated under a strict one-to-one input-output model, where a single log record fed into a transformer would produce exactly one log record as output. This design served well for common use cases like parsing unstructured log lines, enriching records with metadata, or reshaping payloads to fit downstream schema requirements. It failed, however, to account for real-world log complexity, where a single incoming log line may encode multiple discrete events, such as batched payloads from OpenTelemetry instrumentation that group dozens of individual log entries into a single transmission to reduce network overhead. It also created friction when a single log record contains data destined for multiple downstream systems, for example extracting metrics to send to a time-series database while routing user audit information to a dedicated compliance tool. Under the one-to-one model, every downstream node in the pipeline had to understand and parse the full nested payload to extract its relevant fields.
FlowG v0.55.0 removes this constraint by allowing VRL transformers to return arrays, where each element of the array is processed as an independent log record. The tool automatically normalizes and flattens all data types in the array to fit FlowG’s internal data model, ensuring compatibility with existing pipeline nodes. For example, a VRL script assigning an array of mixed types to the root of the record, written as . = [ 1, 2, "hello", { "foo": { "bar": "baz" } } ], will produce four separate log records: {"value": "1"}, {"value": "2"}, {"value": "hello"}, and {"foo.bar": "baz"}. Each generated record is passed to subsequent nodes in the pipeline as if it were a standalone input, eliminating the need to carry batch-shaped payloads through the entire pipeline when individual event processing is the end goal.
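The normalization and flattening behavior described above can be sketched in Go. This is a minimal illustration of the release-notes example, not FlowG's actual implementation; the `normalize` and `flatten` helpers are hypothetical names, and the assumed rules are simply that scalars become `{"value": "<scalar>"}` and nested objects are flattened with dot-separated keys.

```go
package main

import (
	"fmt"
	"strconv"
)

// normalize converts one element of a transformer's output array into a
// flat map[string]string record: scalars become {"value": "<scalar>"},
// nested objects are flattened with dot-separated keys.
// (Hypothetical sketch, not FlowG's real code.)
func normalize(elem any) map[string]string {
	record := map[string]string{}
	switch v := elem.(type) {
	case map[string]any:
		flatten("", v, record)
	case string:
		record["value"] = v
	case int:
		record["value"] = strconv.Itoa(v)
	default:
		record["value"] = fmt.Sprint(v)
	}
	return record
}

// flatten walks a nested object, joining keys with "." as it descends.
func flatten(prefix string, obj map[string]any, out map[string]string) {
	for k, v := range obj {
		key := k
		if prefix != "" {
			key = prefix + "." + k
		}
		if nested, ok := v.(map[string]any); ok {
			flatten(key, nested, out)
		} else {
			out[key] = fmt.Sprint(v)
		}
	}
}

func main() {
	// The array from the release-notes example:
	// . = [ 1, 2, "hello", { "foo": { "bar": "baz" } } ]
	input := []any{1, 2, "hello", map[string]any{"foo": map[string]any{"bar": "baz"}}}
	for _, elem := range input {
		fmt.Println(normalize(elem)) // four independent records
	}
}
```

Each map produced here would then flow through the pipeline as a standalone record.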
This change simplifies pipeline design significantly. Instead of building custom logic into every downstream filter, router, or output node to handle nested batch payloads, engineers can split records at the edge of the pipeline, allowing subsequent nodes to operate on simple, single-event records. Routing rules become easier to write, as they no longer need to account for nested batch structures. Storage outputs become more predictable, as each written record has a consistent shape. Forwarding to third-party systems also becomes more natural, as external tools often expect individual event payloads rather than batched arrays.
Foundational FFI Overhaul: Eliminating Redundant Allocations and Clarifying Memory Ownership
The VRL Log Splitting feature is accompanied by a less visible but equally impactful rework of the layer through which FlowG’s Go-based core interoperates with the Rust-implemented VRL library via the C Foreign Function Interface (FFI). The previous implementation relied on a series of redundant data copies and an unsafe memory ownership model that created performance overhead and limited future flexibility.
In the pre-v0.55.0 flow, data moved through six discrete conversion steps, each requiring a full copy:

1. A Go map[string]string was converted to a custom C hash map (C hmap), copying all key-value pairs.
2. The C hmap was converted to a Rust HashMap<String, String>, another full copy.
3. An unnecessary intermediate conversion produced a vrl::value::ObjectMap wrapped in a vrl::value::Value to serve as input to the VRL program.
4. After the VRL program executed, the resulting vrl::value::Value was converted back to a Rust HashMap<String, String>, another copy.
5. That HashMap was converted to a C hmap, another copy, with the added risk of Rust-allocated memory being freed by Go’s allocator, a cross-language memory safety hazard.
6. The C hmap was converted back to a Go map[string]string, one last copy.

This process not only wasted CPU cycles and memory on redundant allocations, but also locked the API into handling only log-shaped map[string]string payloads, making it impossible to support non-log data types like metrics or traces in future VRL program invocations.
The v0.55.0 release reworks this flow by leveraging two key properties of the Rust VRL implementation: first, that vrl::value::Value implements serialization and deserialization via the serde framework, and second, that MessagePack provides a lightweight, binary serialization format ideal for cross-language data transfer. Instead of passing structured hash maps across the FFI boundary, the new implementation serializes data to byte arrays that are passed as pointers with no copies required.
The updated flow reduces conversions to only the necessary serialization and deserialization steps, with cached buffers to avoid per-log allocations:

1. The Go map[string]string is serialized into a []byte buffer (cached for reuse across logs) and passed to C as a uint8_t* pointer and size_t length, with no copy; memory ownership stays with Go.
2. The C pointer is converted to a Rust &[u8] slice with no copy, still owned by Go.
3. The slice is deserialized directly into a vrl::value::Value, the only unavoidable allocation in this step.
4. After the VRL program runs, the resulting vrl::value::Value is serialized into a Rust Vec<u8> buffer (also cached for reuse) and passed to C as a uint8_t* pointer, with Rust retaining ownership.
5. On the Go side, the pointer is wrapped in a non-owning []byte and deserialized into the target Go type.

This eliminates all redundant intermediate copies, clarifies memory ownership (each language owns and frees its own buffers, avoiding cross-allocator errors), and makes the VRL program runner fully agnostic to the shape of input data, paving the way for future support of non-log data types.
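The Go side of the new flow can be sketched as follows. FlowG presumably relies on an existing MessagePack library; the hand-rolled subset below (hypothetical `encodeMap`/`decodeMap` helpers, limited to small maps and short strings) only illustrates the compact byte layout that crosses the FFI boundary as a pointer and a length.

```go
package main

import "fmt"

// encodeMap serializes a map[string]string as a MessagePack fixmap.
// Minimal sketch: supports at most 15 entries and strings up to 255 bytes.
func encodeMap(m map[string]string) []byte {
	if len(m) > 15 {
		panic("sketch supports at most 15 entries")
	}
	buf := []byte{0x80 | byte(len(m))} // fixmap header
	for k, v := range m {
		buf = appendStr(buf, k)
		buf = appendStr(buf, v)
	}
	return buf
}

func appendStr(buf []byte, s string) []byte {
	switch {
	case len(s) <= 31:
		buf = append(buf, 0xa0|byte(len(s))) // fixstr
	case len(s) <= 255:
		buf = append(buf, 0xd9, byte(len(s))) // str 8
	default:
		panic("sketch supports strings up to 255 bytes")
	}
	return append(buf, s...)
}

// decodeMap parses the same subset back into a map[string]string.
func decodeMap(buf []byte) map[string]string {
	n := int(buf[0] & 0x0f) // fixmap entry count
	buf = buf[1:]
	m := make(map[string]string, n)
	for i := 0; i < n; i++ {
		var k, v string
		k, buf = readStr(buf)
		v, buf = readStr(buf)
		m[k] = v
	}
	return m
}

func readStr(buf []byte) (string, []byte) {
	var n int
	if buf[0] == 0xd9 { // str 8
		n = int(buf[1])
		buf = buf[2:]
	} else { // fixstr
		n = int(buf[0] & 0x1f)
		buf = buf[1:]
	}
	return string(buf[:n]), buf[n:]
}

func main() {
	log := map[string]string{"level": "info", "message": "user logged in"}
	payload := encodeMap(log) // this []byte crosses the FFI boundary as ptr+len
	fmt.Println(decodeMap(payload))
}
```

Because the boundary only sees opaque bytes, the same channel can later carry metrics or traces without changing the FFI signatures.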
Implications
For end users working with FlowG pipelines, the VRL Log Splitting feature unblocks several use cases that were previously difficult or impossible to implement without custom pre-processing steps. Teams processing OpenTelemetry batched logs can now split batches into individual events directly in the transformer node, rather than building custom batch-parsing logic into every downstream node. Organizations that need to route different subsets of data from a single log to multiple downstream systems can now perform that splitting at the pipeline edge, reducing complexity and potential errors in routing rules. The flattening and normalization of split records also ensures that all generated logs fit FlowG’s existing data model, so no updates to downstream nodes are required to handle split records beyond adjusting routing rules to account for the new individual event shape.
For the FlowG project itself, the FFI rework removes a major piece of technical debt that was blocking long-term roadmap items. The previous API’s reliance on log-shaped map[string]string payloads meant that supporting metrics, traces, or other observability data types in VRL programs would have required a full rewrite of the interoperability layer. The new serialization-based approach is data-type agnostic, as MessagePack and serde can handle any serializable value, not just string-keyed maps. This positions FlowG to expand VRL support to all observability signal types in future releases, aligning with the broader industry shift toward unified observability pipelines that handle logs, metrics, and traces in a single tool.
Performance improvements are another key implication of the FFI rework. The previous implementation’s six copy steps added significant latency and memory pressure for high-throughput pipelines, where every log record processed required multiple full copies of its payload. The new approach eliminates all redundant copies and caches serialization buffers to avoid per-log allocations, reducing CPU usage and memory churn for busy pipelines. While MessagePack adds serialization and deserialization overhead of its own, this is offset by the elimination of multiple full data copies, and buffer caching amortizes allocation overhead across thousands of logs.
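The buffer-caching idea can be sketched in Go with sync.Pool. This is a minimal illustration of the amortization pattern, not FlowG's actual internals; the `serialize` helper is hypothetical and uses a throwaway key=value; encoding in place of MessagePack to keep the example short.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool caches serialization buffers so the hot path does not allocate
// a fresh buffer per log. (Illustrative names, not FlowG's internals.)
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// serialize writes a log record into a pooled buffer and returns the
// bytes. A copy is returned because the pooled buffer's memory is reused
// by the next caller; FFI code could instead hand the bytes to C before
// releasing the buffer back to the pool.
func serialize(record map[string]string) []byte {
	buf := bufPool.Get().(*bytes.Buffer)
	defer bufPool.Put(buf)
	buf.Reset() // reuse the allocation, discard old contents
	for k, v := range record {
		buf.WriteString(k)
		buf.WriteByte('=')
		buf.WriteString(v)
		buf.WriteByte(';')
	}
	out := make([]byte, buf.Len())
	copy(out, buf.Bytes())
	return out
}

func main() {
	// Across many logs, the underlying buffer is allocated once and reused.
	for i := 0; i < 3; i++ {
		fmt.Printf("%s\n", serialize(map[string]string{"n": fmt.Sprint(i)}))
	}
}
```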
Counter-Perspectives and Limitations
While the VRL Log Splitting feature addresses clear user pain points, it introduces potential edge cases that users should account for. Transformers that return very large arrays, whether intentionally or due to misconfigured scripts, could generate thousands of individual log records from a single input, leading to unexpected spikes in pipeline load and downstream storage costs. FlowG’s current release notes do not mention limits on array size for split records, so users processing untrusted or high-volume batched inputs should implement their own safeguards to avoid accidental over-generation of records. Additionally, the automatic flattening of nested objects in split records may lead to loss of structural context for complex payloads: the example in the release notes flattens {"foo": {"bar": "baz"}} to {"foo.bar": "baz"}, but it is unclear how the tool handles nested arrays, repeated keys, or other complex data structures, which could create friction for users with highly nested input data.
The FFI rework also carries minor trade-offs. The reliance on MessagePack serialization and serde adds lightweight dependencies to both the Go and Rust components of FlowG, though both are widely used, well-maintained libraries with minimal overhead. For extremely small log payloads, the overhead of MessagePack serialization and deserialization may outweigh the gains from eliminated copies, though the caching of serialization buffers mitigates this for high-throughput pipelines. There is also a small risk of serialization incompatibility if the vrl::value::Value serde implementation changes in future VRL releases, though the use of a stable serialization format like MessagePack reduces this risk compared to custom binary formats.