A2A Protocol

Video Intelligence Agent is built on the Agent-to-Agent (A2A) Protocol v1.0, an open standard for interoperable communication between AI agents and the clients or platforms that call them.

What is the A2A Protocol?

The A2A Protocol defines a standardized way for AI agents to:

- Advertise their identity and capabilities (via an Agent Card)
- Accept requests from any compatible client or platform
- Manage tasks through a well-defined lifecycle (Submitted → Working → Completed, Failed, or Canceled)
- Return results as structured artifacts
- Stream updates in real time as work progresses

Because Video Intelligence Agent is A2A-compliant, any client, orchestrator, or platform that implements the A2A standard can invoke it without custom integration code.

Agent Card

Every A2A agent publishes an Agent Card — a machine-readable document that describes what the agent can do. Video Intelligence Agent's Agent Card is automatically served and contains:

| Field | Value |
| --- | --- |
| Name | Video Intelligence Agent |
| Version | 1.0.0 |
| Protocol Version | 1.0 |
| Streaming | Enabled |
| Provider | LTIMindtree |
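
A client would typically read these fields before deciding how to call the agent. The following is a minimal sketch, assuming the card is served as JSON with the field names `name`, `version`, `protocolVersion`, `capabilities.streaming`, and `provider.organization`. Those names and the sample payload are illustrative assumptions, not a copy of the card the agent actually serves.

```python
import json

# Hypothetical Agent Card payload mirroring the table above.
# The JSON field names are assumptions; check the card the agent actually serves.
SAMPLE_CARD = """
{
  "name": "Video Intelligence Agent",
  "version": "1.0.0",
  "protocolVersion": "1.0",
  "capabilities": {"streaming": true},
  "provider": {"organization": "LTIMindtree"}
}
"""


def supports_streaming(card: dict) -> bool:
    """Return True if the card advertises streaming support."""
    return bool(card.get("capabilities", {}).get("streaming", False))


def summarize_card(card: dict) -> str:
    """Build a one-line description from the assumed card fields."""
    provider = card.get("provider", {}).get("organization", "unknown provider")
    return f'{card["name"]} v{card["version"]} (A2A {card["protocolVersion"]}) by {provider}'


card = json.loads(SAMPLE_CARD)
print(summarize_card(card))
print("streaming:", supports_streaming(card))
```

Checking `supports_streaming` first lets a client fall back to the non-streaming mode when an agent does not advertise streaming.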

Accepted Input

| Format | Description |
| --- | --- |
| MP4 video | Standard video format |
| WebM video | Open web video format |
| QuickTime video | Apple video format |
| Plain text | Optional context alongside the video |
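
Before uploading, a client may want to reject unsupported files locally. The sketch below maps file extensions to MIME types for the three video formats listed above. The MIME strings (`video/mp4`, `video/webm`, `video/quicktime`) and the helper name `video_mime_type` are assumptions for illustration; the source names the formats but not the exact media types the agent advertises.

```python
from pathlib import Path

# Assumed MIME types for the video formats in the table above.
ACCEPTED_VIDEO_TYPES = {
    ".mp4": "video/mp4",
    ".webm": "video/webm",
    ".mov": "video/quicktime",
}


def video_mime_type(filename: str) -> str:
    """Return the assumed MIME type for a video file, or raise ValueError."""
    suffix = Path(filename).suffix.lower()
    try:
        return ACCEPTED_VIDEO_TYPES[suffix]
    except KeyError:
        raise ValueError(f"unsupported video format: {filename!r}") from None
```

For example, `video_mime_type("demo.MP4")` resolves case-insensitively, while a file such as `clip.avi` raises `ValueError` before any network call is made.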

Produced Output

| Format | Description |
| --- | --- |
| Gherkin `.feature` files | One per identified business domain |
| `summary.json` | Metadata: total features, scenarios, and flows identified |

Skill

Video Intelligence Agent exposes a single skill:

BDD Test Case Generation from Video — Accepts a video demonstrating application workflows and generates comprehensive Gherkin feature files covering all visible user flows.

Example prompts recognized by this skill:

- “Generate BDD test cases from this application demo video”
- “Analyze this screen recording and create Gherkin feature files”
- “Watch this video and write comprehensive test scenarios”

Task Lifecycle

Every request to Video Intelligence Agent is tracked as a task with a defined lifecycle:

stateDiagram-v2
    [*] --> Submitted : Request received
    Submitted --> Working : Analysis begins
    Working --> Working : Incremental status updates
    Working --> Completed : All artifacts delivered
    Working --> Failed : An error occurred
    Working --> Canceled : Client canceled the request
    Completed --> [*]
    Failed --> [*]
    Canceled --> [*]
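
The transitions in the diagram can be sketched as a small state table that a client uses to validate incoming status updates. This is a minimal sketch: the `TaskState` enum, its string values, and the helper names are hypothetical and only mirror the diagram; they are not the agent's wire format.

```python
from enum import Enum


class TaskState(Enum):
    SUBMITTED = "submitted"
    WORKING = "working"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELED = "canceled"


# Allowed transitions, copied from the state diagram above.
TRANSITIONS = {
    TaskState.SUBMITTED: {TaskState.WORKING},
    TaskState.WORKING: {
        TaskState.WORKING,    # incremental status updates
        TaskState.COMPLETED,
        TaskState.FAILED,
        TaskState.CANCELED,
    },
    TaskState.COMPLETED: set(),
    TaskState.FAILED: set(),
    TaskState.CANCELED: set(),
}


def can_transition(current: TaskState, target: TaskState) -> bool:
    """Return True if the diagram allows moving from current to target."""
    return target in TRANSITIONS[current]


def is_terminal(state: TaskState) -> bool:
    """A state is terminal when it has no outgoing transitions."""
    return not TRANSITIONS[state]
```

A client polling or streaming a task can stop listening as soon as `is_terminal` returns `True` for the latest reported state.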

What Each State Means

| State | What is happening |
| --- | --- |
| Submitted | The request has been received and the task has been created |
| Working (analyzing video) | The video has been forwarded to Gemini AI for multimodal analysis |
| Working (generating feature files) | Gemini has responded; Video Intelligence Agent is assembling the feature files |
| Completed | All feature files and metadata have been delivered |
| Failed | An unrecoverable error occurred; a descriptive message is returned |
| Canceled | The client requested cancellation before completion |

Communication Modes

Video Intelligence Agent supports two modes of communication:

| Mode | Description | Best For |
| --- | --- | --- |
| Send (non-streaming) | Single request, single response after full completion | Programmatic clients |
| Stream (SSE) | Real-time event stream; each status update and artifact arrives as soon as it is ready | Interactive clients, dashboards |
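
In Stream mode the client reads Server-Sent Events, where each event is one or more `data:` lines terminated by a blank line. The sketch below shows that framing with the standard library only. It assumes each event body is a JSON object; the `state` and `message` keys in the sample are hypothetical and do not describe the agent's real event schema.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per SSE event.

    `data:` lines accumulate until a blank line ends the event. Other
    fields (`event:`, `id:`, comments) are ignored in this sketch, and an
    unterminated final event is dropped, as in the SSE specification.
    """
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # SSE strips a single leading space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data_lines:
            yield json.loads("\n".join(data_lines))
            data_lines = []


# Hypothetical stream content, for illustration only.
SAMPLE_STREAM = [
    'data: {"state": "working", "message": "Analyzing video"}\n',
    "\n",
    'data: {"state": "completed"}\n',
    "\n",
]

events = list(iter_sse_events(SAMPLE_STREAM))
```

Because `iter_sse_events` accepts any iterable of text lines, the same function can be fed from a file, a test fixture, or the line iterator of an HTTP response.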

Security

Video Intelligence Agent uses standard Google identity-based authentication:

| Scheme | Description |
| --- | --- |
| Google OpenID Connect | Authenticate with your Google account. Obtain a token using `gcloud auth print-identity-token` and pass it as a Bearer token in the `Authorization` header. |
| HTTP Bearer (JWT) | Google-issued JWT for Cloud Run IAM or Cloud Marketplace authentication. |
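
As a minimal sketch of the OpenID Connect flow described above, the helpers below obtain a token from the `gcloud` CLI and build the request header. The function names are hypothetical, and `fetch_identity_token` assumes `gcloud` is installed and already logged in on the machine running the client.

```python
import subprocess


def fetch_identity_token() -> str:
    """Ask the gcloud CLI for an identity token for the active account."""
    result = subprocess.run(
        ["gcloud", "auth", "print-identity-token"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def bearer_headers(token: str) -> dict:
    """Build HTTP headers carrying the token as a Bearer credential."""
    if not token:
        raise ValueError("token must not be empty")
    return {"Authorization": f"Bearer {token}"}


# Typical use (not executed here because it needs gcloud):
#   headers = bearer_headers(fetch_identity_token())
```

Keeping token retrieval separate from header construction makes it straightforward to swap in a JWT obtained some other way, such as from a service account.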

Health Checks

Two health-check endpoints are available outside the A2A protocol for infrastructure monitoring:

| Endpoint | Response |
| --- | --- |
| `GET /health` | `{"status": "ok"}` |
| `GET /healthz` | `{"status": "ok"}` |
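
A monitoring probe for these endpoints could look like the sketch below, which uses only the standard library. The helper names, the 5-second timeout, and the expectation of an HTTP 200 status are assumptions; the source documents only the response body.

```python
import json
import urllib.request


def is_ok_body(body: bytes) -> bool:
    """Return True if the body is the JSON object {"status": "ok"}."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check_health(base_url: str, path: str = "/health", timeout: float = 5.0) -> bool:
    """Call a health endpoint and report whether the service looks healthy."""
    url = base_url.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200 and is_ok_body(response.read())
    except OSError:
        # URLError, HTTPError, and socket timeouts all derive from OSError.
        return False
```

Passing `path="/healthz"` probes the second endpoint with the same logic.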