Architecture

Video Intelligence Agent is a cloud-native agent deployed on Google Cloud Run, built to handle large numbers of concurrent video analysis requests without performance degradation.

System Overview

graph TB
    User(["User / Client"])

    subgraph VideoIntelligenceAgent["Video Intelligence Agent  (Google Cloud Run)"]
        Server["HTTP/2 Server"]
        Protocol["A2A Protocol Layer"]
        TaskMgr["Task Manager"]
        Executor["BDD Agent Executor"]
        CoreAgent["BDD Generator Agent"]
    end

    GeminiAPI(["Google Gemini AI"])

    User -->|"Video + optional text context"| Server
    Server --> Protocol
    Protocol --> TaskMgr
    Protocol --> Executor
    TaskMgr -->|"Tracks task state"| Executor
    Executor --> CoreAgent
    CoreAgent -->|"Uploads video & requests BDD generation"| GeminiAPI
    GeminiAPI -->|"Structured BDD JSON"| CoreAgent
    CoreAgent -->|"Feature files + summary"| Executor
    Executor -->|"Real-time streaming updates"| User

Components

HTTP/2 Server

Video Intelligence Agent runs on an HTTP/2-native web server. HTTP/2 multiplexes multiple streams over a single connection, so concurrent requests do not block one another at the HTTP level. This matters for real-time SSE (Server-Sent Events) delivery, where the agent sends one event per status update and one per generated feature file.
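
To make the event delivery concrete, here is a minimal sketch of how a single status update could be framed as a Server-Sent Event. The `format_sse` helper, the event name, and the payload fields are illustrative assumptions rather than the agent's actual code; only the `event:` / `data:` line format and the terminating blank line come from the SSE wire format.

```python
import json


def format_sse(event: str, payload: dict) -> str:
    """Frame one payload as a Server-Sent Event (hypothetical helper)."""
    # json.dumps emits no raw newlines, so a single data: line is enough.
    data = json.dumps(payload)
    # A blank line terminates the event.
    return f"event: {event}\ndata: {data}\n\n"


frame = format_sse("status", {"state": "working", "message": "Analyzing video"})
print(frame, end="")
```

Each call produces one self-contained frame, so status updates and artifacts can be written to the response as soon as they are available.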

A2A Protocol Layer

This layer implements the Agent-to-Agent (A2A) Protocol v1.0. It is responsible for:

Advertising the agent's identity and capabilities via the Agent Card
Accepting structured requests from any A2A-compatible client
Routing each request to the appropriate handler
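
As an illustration of the first responsibility, the sketch below shows what an Agent Card for this agent might look like. Every value is an assumption made for the example, and the field names follow the general shape of published A2A Agent Cards; they should be checked against the exact protocol version in use.

```python
import json

# Hypothetical Agent Card; every value here is an illustrative assumption.
AGENT_CARD = {
    "name": "Video Intelligence Agent",
    "description": "Generates BDD feature files from a video.",
    "version": "1.0.0",
    "capabilities": {"streaming": True},
    "defaultInputModes": ["video/mp4", "text/plain"],
    "defaultOutputModes": ["text/plain", "application/json"],
    "skills": [
        {
            "id": "generate_bdd",
            "name": "Generate BDD scenarios",
            "description": "Analyzes a video and returns feature files.",
            "tags": ["bdd", "video"],
        }
    ],
}


def supports_streaming(card: dict) -> bool:
    """Return True if the card advertises streaming support."""
    return bool(card.get("capabilities", {}).get("streaming", False))


print(json.dumps(AGENT_CARD, indent=2))
```

A client can inspect a card like this before sending a request, for example to confirm that streaming is advertised.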

Task Manager

Every request is tracked as an independent task. The task manager maintains the lifecycle state of each request — from the moment it is received until a result is returned or an error is reported. Because each video analysis is completely self-contained, tasks are ephemeral and do not persist across requests.
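
The lifecycle described above can be sketched as a small in-memory state machine. The class name, the set of states, and the allowed transitions are assumptions chosen for illustration; the real task manager may track more states or store them differently.

```python
from enum import Enum


class TaskState(Enum):
    SUBMITTED = "submitted"
    WORKING = "working"
    COMPLETED = "completed"
    FAILED = "failed"


# Allowed transitions; terminal states have no successors.
TRANSITIONS = {
    TaskState.SUBMITTED: {TaskState.WORKING, TaskState.FAILED},
    TaskState.WORKING: {TaskState.COMPLETED, TaskState.FAILED},
    TaskState.COMPLETED: set(),
    TaskState.FAILED: set(),
}


class InMemoryTaskManager:
    """Tracks task state in process memory only (hypothetical class)."""

    def __init__(self) -> None:
        self._states = {}

    def create(self, task_id: str) -> None:
        self._states[task_id] = TaskState.SUBMITTED

    def advance(self, task_id: str, new_state: TaskState) -> None:
        current = self._states[task_id]
        if new_state not in TRANSITIONS[current]:
            raise ValueError(
                f"illegal transition {current.value} -> {new_state.value}"
            )
        self._states[task_id] = new_state

    def state(self, task_id: str) -> TaskState:
        return self._states[task_id]


manager = InMemoryTaskManager()
manager.create("task-1")
manager.advance("task-1", TaskState.WORKING)
manager.advance("task-1", TaskState.COMPLETED)
```

Keeping state in a plain dictionary matches the ephemeral design: nothing survives beyond the process, and a finished task cannot be moved to another state.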

BDD Agent Executor

The executor bridges the A2A protocol to the actual generation logic. It:

Extracts the video and any optional text from the incoming request
Emits real-time status updates as the task progresses
Orchestrates the call to the core agent
Packages each generated feature file as a deliverable artifact
Handles failure scenarios gracefully
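
The first step in the list above, separating the video from the optional text, might look like the following sketch. The part layout (`kind`, `file`, `mimeType`, `text`) is an assumption modelled on common A2A message shapes, and `split_parts` is a hypothetical helper, not the executor's real API.

```python
from typing import List, Optional, Tuple


def split_parts(parts: List[dict]) -> Tuple[Optional[dict], str]:
    """Return (first video part or None, concatenated text context)."""
    video = None
    texts = []
    for part in parts:
        if part.get("kind") == "file":
            mime = part.get("file", {}).get("mimeType", "")
            # Keep only the first part whose MIME type marks it as video.
            if video is None and mime.startswith("video/"):
                video = part
        elif part.get("kind") == "text":
            texts.append(part.get("text", ""))
    return video, "\n".join(texts)


request_parts = [
    {"kind": "text", "text": "Focus on the checkout flow."},
    {"kind": "file", "file": {"mimeType": "video/mp4", "name": "demo.mp4"}},
]
video, context = split_parts(request_parts)
```

Returning `None` for a missing video lets the caller fail the task early with a clear message instead of calling the model.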

BDD Generator Agent

The BDD Generator Agent is the core intelligence of Video Intelligence Agent. It:

Uploads your video to Gemini's File API for processing
Asks Gemini to analyze the video and produce structured BDD output
Validates and parses the response
Cleans up the uploaded video once processing is complete
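
The four steps above can be sketched as a single function that guarantees cleanup. `VideoModelClient` is a hypothetical interface standing in for whatever wrapper the agent uses around the Gemini SDK, and the `features` key is an assumed response shape; neither is taken from the actual implementation.

```python
import json
from typing import Protocol


class VideoModelClient(Protocol):
    """Hypothetical interface standing in for a Gemini SDK wrapper."""

    def upload(self, path: str) -> str: ...
    def generate(self, file_id: str, prompt: str) -> str: ...
    def delete(self, file_id: str) -> None: ...


def generate_bdd(client: VideoModelClient, path: str, context: str = "") -> dict:
    """Upload, generate, validate, and always clean up the uploaded file."""
    file_id = client.upload(path)
    try:
        prompt = f"Produce BDD features as JSON. {context}".strip()
        raw = client.generate(file_id, prompt)
        parsed = json.loads(raw)  # json.JSONDecodeError is a ValueError
        if not isinstance(parsed, dict) or "features" not in parsed:
            raise ValueError("response is missing the 'features' key")
        return parsed
    finally:
        # Runs on success and on failure, so uploads never linger.
        client.delete(file_id)


class FakeClient:
    """In-memory stand-in used only to exercise the sketch."""

    def __init__(self, response: str) -> None:
        self.response = response
        self.deleted = []

    def upload(self, path: str) -> str:
        return f"files/{path}"

    def generate(self, file_id: str, prompt: str) -> str:
        return self.response

    def delete(self, file_id: str) -> None:
        self.deleted.append(file_id)


fake = FakeClient('{"features": [{"path": "checkout/payment.feature"}]}')
result = generate_bdd(fake, "demo.mp4")
```

Placing the delete call in a `finally` block is the design point: the uploaded video is removed even when parsing or validation fails.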

Streaming Flow

Video Intelligence Agent streams results back to the caller in real time via Server-Sent Events (SSE). You receive updates progressively rather than waiting for the entire generation to complete.

sequenceDiagram
    participant User
    participant VideoIntelligenceAgent as Video Intelligence Agent
    participant Gemini as Gemini AI

    User->>VideoIntelligenceAgent: Request (video + optional context)
    VideoIntelligenceAgent-->>User: Task received
    VideoIntelligenceAgent-->>User: Status - Analyzing video…
    VideoIntelligenceAgent->>Gemini: Upload video
    VideoIntelligenceAgent->>Gemini: Generate BDD test cases
    Gemini-->>VideoIntelligenceAgent: Structured output
    VideoIntelligenceAgent-->>User: Status - Generating feature files…
    VideoIntelligenceAgent-->>User: Artifact - authentication/login.feature
    VideoIntelligenceAgent-->>User: Artifact - checkout/payment.feature
    VideoIntelligenceAgent-->>User: Artifact - summary.json
    VideoIntelligenceAgent-->>User: Completed
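
On the client side, consuming such a stream amounts to reading SSE frames as they arrive. The sketch below parses frames from any iterable of lines; the event names (`status`, `artifact`, `completed`) and payload fields are assumptions for illustration, and a real A2A client may receive differently shaped payloads.

```python
import json
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event name, decoded JSON payload) for each complete SSE event."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\n")
        if line == "":
            # A blank line ends the event; dispatch only if data was seen.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip(" "))
        # Any other line (for example a ":" comment) is ignored.


stream = [
    "event: status\n", 'data: {"message": "Analyzing video"}\n', "\n",
    "event: artifact\n", 'data: {"name": "checkout/payment.feature"}\n', "\n",
    "event: completed\n", "data: {}\n", "\n",
]
events = list(parse_sse(stream))
```

Because `parse_sse` is a generator, a caller can act on each status update or artifact immediately instead of buffering the whole response.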

Request Lifecycle

flowchart LR
    A(["Incoming Request"]) --> B["A2A Protocol Layer"]
    B --> C["BDD Agent Executor"]
    C --> D{"Video present?"}
    D -- No --> E(["Failed: No video found"])
    D -- Yes --> F["Upload to Gemini"]
    F --> G["Generate BDD content"]
    G --> H["Emit feature artifacts"]
    H --> I["Emit summary"]
    I --> J(["Completed"])
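
The flowchart above can be read as a generator that yields events in order. This is a simplified sketch: `run_request` and its `generate` callback are hypothetical names, the callback stands in for both the upload and the generation step, and the `features` key is an assumed result shape.

```python
from typing import Callable, Iterator, Optional, Tuple


def run_request(
    video: Optional[str],
    generate: Callable[[str], dict],
) -> Iterator[Tuple[str, str]]:
    """Yield (event kind, detail) pairs in the order the flowchart describes."""
    if video is None:
        yield ("failed", "No video found")
        return
    yield ("status", "Analyzing video")
    result = generate(video)  # stands in for upload + BDD generation
    yield ("status", "Generating feature files")
    for feature in result["features"]:
        yield ("artifact", feature)
    yield ("artifact", "summary.json")
    yield ("completed", "")


def fake_generate(video: str) -> dict:
    """Stub generator used only to exercise the sketch."""
    return {"features": ["authentication/login.feature"]}


ok_events = list(run_request("demo.mp4", fake_generate))
missing_events = list(run_request(None, fake_generate))
```

The early return on a missing video mirrors the "No" branch of the decision node: no upload or model call happens in that case.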

Error Handling

For each failure mode below, Video Intelligence Agent marks the task as failed and returns a clear status message to the caller:

Failure	User-facing Message
No video in request	Prompt to provide a video file
Gemini API error	Suggestion to check credentials or quota
Unexpected AI response	Suggestion to retry the request
Network / IO error	Suggestion to check connectivity and retry
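
One way to implement a mapping like the table above is a single function that translates exceptions into user-facing messages. The exception classes and the message wording below are hypothetical; the source specifies only the kind of guidance each failure produces.

```python
class MissingVideoError(Exception):
    """Hypothetical: the request contained no video part."""


class ModelAPIError(Exception):
    """Hypothetical: the Gemini call failed (credentials, quota, ...)."""


class UnexpectedResponseError(Exception):
    """Hypothetical: the model output could not be parsed or validated."""


def user_message(error: Exception) -> str:
    """Translate an internal failure into a message safe to show callers."""
    if isinstance(error, MissingVideoError):
        return "No video found. Please attach a video file."
    if isinstance(error, ModelAPIError):
        return "Gemini API error. Check your credentials or quota."
    if isinstance(error, UnexpectedResponseError):
        return "Unexpected AI response. Please retry the request."
    if isinstance(error, OSError):  # also covers ConnectionError, TimeoutError
        return "Network or I/O error. Check connectivity and retry."
    return "Unexpected error. Please retry the request."


message = user_message(ConnectionError("connection reset"))
```

Centralizing the translation keeps internal details such as stack traces out of the status message while still giving the caller an actionable next step.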