- The paper proposes VectraFlow, a unified architecture that combines LLM-driven semantic extraction with CEP-based temporal reasoning for unstructured data streams.
- It introduces continuous semantic operators like sem_window and sem_groupby, enabling dynamic adjustments and optimized throughput-token tradeoffs.
- Empirical results demonstrate enhanced F1 scores and reduced token consumption by decoupling event extraction from temporal matching.
VectraFlow: Long-Horizon Semantic Processing over Data and Event Streams with LLMs
Motivation and Problem Statement
Many real-world use cases demand actionable intelligence from continuous streams of unstructured data, which requires both near-real-time semantic understanding and sophisticated temporal reasoning. Conventional LLM-powered systems operate statelessly: they evaluate each input independently and lack both memory of prior context and the ability to reason across sequences. Classical Complex Event Processing (CEP) systems support efficient stateful processing and temporal pattern detection, but they are limited to highly structured, typed data streams, which makes them unsuitable for raw text analytics. "VectraFlow: Long-Horizon Semantic Processing over Data and Event Streams with LLMs" (2604.03855) proposes and implements a unified architecture that addresses both gaps: a continuous streaming engine for complex, stateful analysis of unstructured text streams that treats LLMs as first-class operators.
System Architecture and Execution Model
VectraFlow extends traditional streaming dataflow with a hierarchical architecture (Figure 1) consisting of three tightly coupled layers:
Figure 1: Layered architecture of VectraFlow, integrating a streaming DAG engine with LLM-backed semantic operators and natural language pipeline authoring.
- Streaming Dataflow DAG: The base layer represents computation as a DAG of operators, enabling unbounded streaming and pipelined parallelism.
- Semantic Operator Layer: This novel layer introduces LLM-augmented analogs of relational operators (sem_filter, sem_groupby, sem_aggregate, sem_window, etc.), allowing semantics-driven operations on unstructured streams. Configurable execution strategies (LLM-only, embedding-only, hybrid) expose a fundamental throughput-accuracy-token cost spectrum.
- Natural Language Interaction: An agentic component compiles user NL descriptions into executable operator graphs, incorporating structured critique, auto-repair, and targeted clarification loops to enable rapid, iterative development of analytics pipelines.
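To make the hybrid execution strategy in the semantic operator layer concrete, here is a minimal sketch of a filter that scores each item cheaply and consults an LLM only when the cheap score is inconclusive. The names hybrid_sem_filter, keyword_score, and fake_llm_judge, as well as the two thresholds, are hypothetical and are not taken from VectraFlow; the keyword scorer stands in for an embedding similarity and the judge stands in for a real model call.

```python
from typing import Callable, Iterable, Iterator, List


def hybrid_sem_filter(
    stream: Iterable[str],
    cheap_score: Callable[[str], float],
    llm_judge: Callable[[str], bool],
    low: float = 0.3,
    high: float = 0.7,
) -> Iterator[str]:
    """Yield items that pass the filter, calling the LLM only in the uncertain band."""
    for text in stream:
        score = cheap_score(text)
        if score >= high:
            # Confident accept: no LLM tokens spent.
            yield text
        elif score > low and llm_judge(text):
            # Uncertain band: defer to the (expensive) LLM judgment.
            yield text
        # Scores at or below `low` are rejected without an LLM call.


# Toy stand-ins (assumptions for illustration only).
KEYWORDS = {"fever", "sepsis", "infection"}


def keyword_score(text: str) -> float:
    hits = sum(word in KEYWORDS for word in text.lower().split())
    return min(1.0, hits / 2)


llm_calls: List[str] = []


def fake_llm_judge(text: str) -> bool:
    llm_calls.append(text)  # track how often the "LLM" is consulted
    return "infection" in text.lower()


docs = ["fever and sepsis noted", "possible infection", "routine checkup", "fever only"]
kept = list(hybrid_sem_filter(docs, keyword_score, fake_llm_judge))
```

In this toy run only two of the four documents reach the judge, which is the kind of token saving a hybrid strategy aims for; moving the thresholds toward each other shifts work from the LLM to the cheap scorer.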
Continuous Semantic Operators
VectraFlow’s semantic operator suite generalizes classical query operators to unstructured domains, enabling continuous, streaming semantic interpretation:
- sem_window: Dynamically adjusts window boundaries based on topical or affective shifts, using semantic similarity or embedding clustering.
- sem_groupby: Maintains online clusters corresponding to evolving semantic categories, supporting on-the-fly event class emergence and dissolution.
- cont_rag: Facilitates retrieval-augmented generation with evolving retrieval context (continuous RAG).
- sem_pattern: Formalizes temporal pattern detection over extracted semantic events, detailed below.
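The idea behind sem_window, closing a window when the topic shifts, can be sketched with embedding similarity. The sketch below is an assumption about one plausible realization, not VectraFlow's implementation: it keeps a running centroid of the current window and starts a new window when the cosine similarity of the next item falls below a threshold. The function name, the threshold value, and the two-dimensional toy embeddings are all hypothetical.

```python
import math
from typing import Iterable, Iterator, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def sem_window(
    stream: Iterable[Tuple[str, Vector]], threshold: float = 0.5
) -> Iterator[List[str]]:
    """Group consecutive items into windows, cutting when similarity to the centroid drops."""
    window: List[str] = []
    centroid: List[float] = []
    for text, vec in stream:
        if window and cosine(centroid, vec) < threshold:
            # Semantic shift detected: emit the current window and start a new one.
            yield window
            window, centroid = [], []
        window.append(text)
        n = len(window)
        # Incremental mean keeps the centroid in O(d) per item.
        centroid = list(vec) if n == 1 else [c + (v - c) / n for c, v in zip(centroid, vec)]
    if window:
        yield window  # flush the last open window at end of input


items = [
    ("a", [1.0, 0.0]),
    ("b", [0.9, 0.1]),
    ("c", [0.0, 1.0]),
    ("d", [0.1, 0.9]),
]
windows = list(sem_window(items))
```

Here the first two items share a direction in embedding space and the last two share another, so the stream splits into two windows. In a real deployment the vectors would come from an embedding model, and the threshold would need tuning per domain.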
A critical contribution is the explicit tradeoff between token cost, throughput, and output fidelity. VectraFlow provides multiple operator implementations, as illustrated for sem_groupby in Figure 2.
Figure 2: Comparative accuracy (F1, ARI, Purity) and throughput (tuples/s) of semantic group-by implementations on MiDe22, demonstrating the LLM/embedding-based tradeoff envelope.
Empirical results on MiDe22 show that pure embedding-based grouping (M3) achieves high F1 and throughput but suffers from excessive fragmentation, while LLM-based methods (M1, M2) improve cluster coherence (higher ARI/Purity) at the expense of increased latency and token usage. LLM with periodic refinement (M2) further enhances cluster quality, supporting use cases where label consistency and event aggregation are critical.
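The fragmentation risk of embedding-only grouping can be illustrated with a small online clustering sketch. This is not the paper's M3 method; it is a generic nearest-centroid scheme, with the hypothetical name online_groupby and invented toy vectors, that shows how a stricter similarity threshold splits the same stream into more groups.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def online_groupby(vectors: Sequence[Vector], threshold: float) -> List[int]:
    """Assign each vector to the most similar centroid, or open a new cluster."""
    centroids: List[List[float]] = []
    counts: List[int] = []
    labels: List[int] = []
    for vec in vectors:
        best, best_sim = -1, threshold
        for i, centroid in enumerate(centroids):
            sim = cosine(centroid, vec)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best < 0:
            # Nothing is similar enough: a new semantic category emerges.
            centroids.append(list(vec))
            counts.append(1)
            best = len(centroids) - 1
        else:
            # Update the running mean of the chosen cluster.
            counts[best] += 1
            n = counts[best]
            centroids[best] = [c + (v - c) / n for c, v in zip(centroids[best], vec)]
        labels.append(best)
    return labels


vectors = [[1.0, 0.0], [0.9, 0.44], [0.0, 1.0], [0.44, 0.9]]
loose = online_groupby(vectors, threshold=0.7)
strict = online_groupby(vectors, threshold=0.95)
```

With the loose threshold the four items fall into two groups; with the strict one each item becomes its own cluster. An LLM-backed variant would replace or periodically revise these assignments, trading throughput and tokens for more consistent labels.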
Semantic Pattern Operator: Lifting CEP to Unstructured Streams
VectraFlow's sem_pattern operator explicitly fuses LLM-based event extraction with NFA-based pattern detection, overcoming the limitations of both monolithic LLM approaches and classical CEP. The operator consists of two decoupled stages:
- Event Extraction: LLMs parse unstructured text to produce semantic events, each represented as a timestamped, typed tuple, suitable for downstream analysis.
- NFA-Based Temporal Matching: A compiled NFA evaluates user-defined CEP rules (sequences, conjunctions, disjunctions, negations, bounded windows) over the extracted event streams, per entity.
This modular design supports expressive temporal patterns without incurring the quadratic context explosion characteristic of stateless LLM prompting. Patterns such as “sequence with bounded negation within a window” are naturally captured.
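A pattern of the form "sequence with bounded negation within a window" can be sketched as follows, assuming the extraction stage has already produced typed, timestamped events. The code is a hand-written matcher that tracks active partial matches per entity, which is what an NFA run does, rather than a compiled NFA; the Event fields, the function name, and the clinical event types are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Event:
    entity: str
    etype: str
    ts: float


def match_seq_without(
    events: Iterable[Event], first: str, forbidden: str, last: str, window: float
) -> List[Tuple[str, float, float]]:
    """Find `first` followed by `last` with no `forbidden` in between, within `window`.

    Events are assumed to arrive in timestamp order. Each open run is the
    timestamp of a `first` event still waiting for its `last`.
    """
    open_runs: Dict[str, List[float]] = {}
    matches: List[Tuple[str, float, float]] = []
    for ev in events:
        # Expire partial matches that have fallen outside the time window.
        runs = [t for t in open_runs.get(ev.entity, []) if ev.ts - t <= window]
        if ev.etype == forbidden:
            runs = []  # negation: discard every partial match for this entity
        elif ev.etype == last:
            matches.extend((ev.entity, t, ev.ts) for t in runs)
            runs = []
        if ev.etype == first:
            runs.append(ev.ts)  # start a new partial match
        open_runs[ev.entity] = runs
    return matches


events = [
    Event("p1", "fever", 0),
    Event("p2", "fever", 1),
    Event("p2", "antibiotic", 5),
    Event("p1", "sepsis", 10),
    Event("p2", "sepsis", 12),
    Event("p3", "fever", 20),
    Event("p3", "sepsis", 100),
]
found = match_seq_without(events, "fever", "antibiotic", "sepsis", window=48)
```

Only entity p1 matches: p2 is ruled out by the intervening antibiotic event and p3 by the window bound. The matcher never sees raw text, so its cost depends on the number of events rather than on document length.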
Evaluations on five event extraction tasks over clinical documents (MIMIC-IV) demonstrate strong improvements: sem_pattern consistently achieves higher F1 and lower token requirements than baselines, which repeatedly prompt LLMs with full or RAG-augmented context. Notably, the stateless full-context baseline exceeds GPU VRAM for Qwen deployments (OOM) and yields poor F1 (0.675), while sem_pattern (+RAG) attains F1 up to 0.862 with a nearly 5x reduction in token consumption. This evidence supports the core claim: decoupling extraction from temporal logic enables efficient, scalable long-horizon semantic pattern detection.
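The intuition behind the token savings can be checked with simple arithmetic. The sketch below assumes a simplified cost model that is not taken from the paper: every document has the same token length, a stateless baseline re-sends the entire history at each step, and the decoupled design sends each document to the LLM once. The numbers are invented and do not reproduce the reported measurements.

```python
def full_context_tokens(n_docs: int, tokens_per_doc: int) -> int:
    """Step i re-sends documents 1..i, so the total grows quadratically in n_docs."""
    return sum(i * tokens_per_doc for i in range(1, n_docs + 1))


def decoupled_tokens(n_docs: int, tokens_per_doc: int) -> int:
    """Each document is read by the extractor exactly once, so the total is linear."""
    return n_docs * tokens_per_doc


# Hypothetical workload: 20 notes of 400 tokens each.
baseline = full_context_tokens(20, 400)
decoupled = decoupled_tokens(20, 400)
ratio = baseline / decoupled
```

Under this model the baseline costs t * n(n+1)/2 tokens against n * t for the decoupled design, so the ratio is (n+1)/2 and keeps growing with stream length. Real prompts add instruction and output tokens, and a RAG baseline truncates context, so measured ratios will differ.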
Interactive Pipeline Synthesis and Profiling
VectraFlow's interface exposes the complete semantic pipeline lifecycle, from natural language-based authoring to real-time debugging and profiling (Figure 3):


Figure 3: VectraFlow UI: (a) NL→Config pipeline authoring, (b) live per-operator inspection, (c) execution profiling with metrics and LLM analysis.
- NL→Config View: Users express intended analytics as natural language; the system synthesizes, annotates, and exposes the resultant operator graph and prompts, supporting in-place refinement.
- Query Processing View: Every operator’s transformation (including intermediate semantic event streams and rule detections) is inspectable, facilitating transparent model and logic debugging.
- Report View: Detailed per-operator metrics, including wall time, throughput, token counts, and (where possible) accuracy, support fine-grained profiling for optimization.
This capability facilitates pipeline-level prompt engineering, systematic exploration of tradeoffs, and robust debugging of both extraction and reasoning errors, none of which is available in monolithic LLM agent frameworks.
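The kind of per-operator metrics shown in the Report View (wall time, throughput, token counts) can be gathered with a thin wrapper around each operator. The following is a minimal sketch under assumed interfaces: OperatorReport, profile_operator, and the convention that an operator returns its output together with the tokens it consumed are hypothetical and are not VectraFlow's API.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class OperatorReport:
    name: str
    tuples: int
    tokens: int
    wall_seconds: float

    @property
    def throughput(self) -> float:
        """Tuples per second; infinite if the timer registered no elapsed time."""
        return self.tuples / self.wall_seconds if self.wall_seconds > 0 else float("inf")


def profile_operator(
    name: str,
    op: Callable[[str], Tuple[str, int]],
    inputs: Iterable[str],
) -> Tuple[List[str], OperatorReport]:
    """Run `op` over `inputs`, collecting outputs, token usage, and wall time."""
    outputs: List[str] = []
    tokens = 0
    start = time.perf_counter()
    for item in inputs:
        out, used = op(item)
        outputs.append(out)
        tokens += used
    elapsed = time.perf_counter() - start
    return outputs, OperatorReport(name, len(outputs), tokens, elapsed)


def fake_extract(text: str) -> Tuple[str, int]:
    # Stand-in for an LLM call; whitespace word count is a crude token proxy.
    return text.upper(), len(text.split())


outputs, report = profile_operator(
    "sem_extract", fake_extract, ["fever noted", "no acute distress"]
)
```

Collecting one report per operator makes it possible to see which stage dominates token spend or latency before changing its execution strategy.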
Implications and Future Directions
VectraFlow generalizes LLM-based stream analytics along several dimensions:
- Unified Operator Abstractions: Merges relational and CEP-style operators over unstructured inputs with stateful, temporally aware reasoning.
- Systematic Throughput/Accuracy Tuning: Offers configurable tradeoff envelopes, enabling context-sensitive allocation of LLM and embedding resources to maximize throughput under cost or quality constraints.
- Transparent Pipeline Instrumentation: Promotes full-system observability, a prerequisite for enterprise adoption in regulated or mission-critical domains.
- Foundations for Agentic Reasoning: The separation of semantic extraction and temporal logic can serve as a substrate for future neuro-symbolic or memory-augmented agent architectures.
The architecture’s modularity and composability make it amenable to further advances, such as:
- Incorporating context-adaptive prompt optimization or cost-based semantic operator selection.
- Extending to multimodal or multi-entity event detection.
- Integrating learned pattern discovery atop programmable rule-based temporal logic.
Conclusion
VectraFlow (2604.03855) advances the state of the art in LLM-powered analytics by introducing continuous, stateful, and semantically expressive stream processing over unstructured data. By unifying LLM-based semantic extraction with efficient, automaton-driven temporal reasoning in an inspectable, tunable dataflow system, it significantly broadens the applicability of AI-driven event analytics. This paradigm may inform the design of future real-time, robust, and accountable unstructured stream processing systems in clinical, financial, and compliance-critical domains.