
A2A Protocol

Video Intelligence Agent is built on the Agent-to-Agent (A2A) Protocol v1.0, an open standard for interoperable communication between AI agents and the clients or platforms that call them.


What is the A2A Protocol?

The A2A Protocol defines a standardized way for AI agents to:

  • Advertise their identity and capabilities (via an Agent Card)
  • Accept requests from any compatible client or platform
  • Manage tasks through a well-defined lifecycle (Submitted → Working → Completed)
  • Return results as structured artifacts
  • Stream updates in real time as work progresses

Because Video Intelligence Agent is A2A-compliant, it can be invoked by any client, orchestrator, or platform that understands the A2A standard — no custom integration code required.


Agent Card

Every A2A agent publishes an Agent Card — a machine-readable document that describes what the agent can do. Video Intelligence Agent's Agent Card is automatically served and contains:

| Field | Value |
|---|---|
| Name | Video Intelligence Agent |
| Version | 1.0.0 |
| Protocol Version | 1.0 |
| Streaming | Enabled |
| Provider | LTIMindtree |
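
As a rough sketch, a client might read the fields above out of an Agent Card it has already fetched. The JSON key names used here (`name`, `version`, `protocolVersion`, `capabilities.streaming`, `provider.organization`) and the sample payload are assumptions for illustration; this page lists only the human-readable fields, not the card's exact schema.

```python
import json

# Hypothetical card payload: key names are assumed, values come from the table above.
SAMPLE_CARD = """
{
  "name": "Video Intelligence Agent",
  "version": "1.0.0",
  "protocolVersion": "1.0",
  "capabilities": {"streaming": true},
  "provider": {"organization": "LTIMindtree"}
}
"""


def summarize_card(raw: str) -> dict:
    """Extract the fields a client would typically check before calling an agent."""
    card = json.loads(raw)
    return {
        "name": card.get("name"),
        "version": card.get("version"),
        "protocol_version": card.get("protocolVersion"),
        # Treat a missing capabilities block as "no streaming".
        "streaming": bool(card.get("capabilities", {}).get("streaming", False)),
        "provider": card.get("provider", {}).get("organization"),
    }


summary = summarize_card(SAMPLE_CARD)
```

A client could use the `streaming` flag to decide between the two communication modes the agent supports.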

Accepted Input

| Format | Description |
|---|---|
| MP4 video | Standard video format |
| WebM video | Open web video format |
| QuickTime video | Apple video format |
| Plain text | Optional context alongside the video |
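
A client may want to reject unsupported files before uploading. The sketch below maps file extensions to the standard MIME types for the three video formats above. The extension mapping (for example `.mov` for QuickTime) and the helper name `video_mime_type` are assumptions; the agent's own validation rules are not documented here.

```python
from pathlib import Path

# Standard MIME types for the three accepted video formats.
# The extension keys are an assumption about how files are usually named.
ACCEPTED_VIDEO_TYPES = {
    ".mp4": "video/mp4",
    ".webm": "video/webm",
    ".mov": "video/quicktime",
}


def video_mime_type(filename: str) -> str:
    """Return the MIME type for an accepted video file, or raise ValueError."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ACCEPTED_VIDEO_TYPES:
        raise ValueError(f"unsupported video format: {filename!r}")
    return ACCEPTED_VIDEO_TYPES[suffix]
```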

Produced Output

| Format | Description |
|---|---|
| Gherkin .feature files | One per identified business domain |
| summary.json | Metadata: total features, scenarios, flows identified |
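
On the receiving side, a client will usually separate the feature files from the metadata file. This is a minimal sketch that works on artifact names only. The example names such as `checkout.feature` are hypothetical, and the internal structure of `summary.json` is deliberately not parsed because its keys are not specified here.

```python
from typing import Iterable, Optional, Tuple


def split_artifacts(names: Iterable[str]) -> Tuple[list, Optional[str]]:
    """Separate Gherkin feature files from the summary.json metadata file."""
    names = list(names)
    feature_files = sorted(n for n in names if n.endswith(".feature"))
    summary = "summary.json" if "summary.json" in names else None
    return feature_files, summary


# Hypothetical artifact names, for illustration only.
features, summary_name = split_artifacts(
    ["summary.json", "checkout.feature", "login.feature"]
)
```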

Skill

Video Intelligence Agent exposes a single skill:

BDD Test Case Generation from Video — Accepts a video demonstrating application workflows and generates comprehensive Gherkin feature files covering all visible user flows.

Example prompts recognized by this skill:

  • “Generate BDD test cases from this application demo video”
  • “Analyze this screen recording and create Gherkin feature files”
  • “Watch this video and write comprehensive test scenarios”

Task Lifecycle

Every request to Video Intelligence Agent is tracked as a task with a defined lifecycle:

stateDiagram-v2
    [*] --> Submitted : Request received
    Submitted --> Working : Analysis begins
    Working --> Working : Incremental status updates
    Working --> Completed : All artifacts delivered
    Working --> Failed : An error occurred
    Working --> Canceled : Client canceled the request
    Completed --> [*]
    Failed --> [*]
    Canceled --> [*]

What Each State Means

| State | What is happening |
|---|---|
| Submitted | The request has been received and the task has been created |
| Working: Analyzing video | The video has been forwarded to Gemini AI for multimodal analysis |
| Working: Generating feature files | Gemini has responded; Video Intelligence Agent is assembling the feature files |
| Completed | All feature files and metadata have been delivered |
| Failed | An unrecoverable error occurred; a descriptive message is returned |
| Canceled | The client requested cancellation before completion |
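
The lifecycle above can be modeled on the client side as a small transition table, which is useful for catching out-of-order events. This is a sketch of the diagram as drawn on this page. The enum's string values are assumptions and may not match the identifiers used on the wire.

```python
from enum import Enum


class TaskState(Enum):
    # String values are illustrative; the wire format may differ.
    SUBMITTED = "submitted"
    WORKING = "working"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELED = "canceled"


# Transitions exactly as shown in the state diagram above.
ALLOWED_TRANSITIONS = {
    TaskState.SUBMITTED: {TaskState.WORKING},
    TaskState.WORKING: {
        TaskState.WORKING,  # incremental status updates
        TaskState.COMPLETED,
        TaskState.FAILED,
        TaskState.CANCELED,
    },
    TaskState.COMPLETED: set(),
    TaskState.FAILED: set(),
    TaskState.CANCELED: set(),
}


def can_transition(current: TaskState, nxt: TaskState) -> bool:
    """Return True if the diagram allows moving from current to nxt."""
    return nxt in ALLOWED_TRANSITIONS[current]


def is_terminal(state: TaskState) -> bool:
    """A state is terminal when no further transitions leave it."""
    return not ALLOWED_TRANSITIONS[state]
```

A client that polls or streams can stop waiting as soon as `is_terminal` returns true for the latest reported state.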

Communication Modes

Video Intelligence Agent supports two modes of communication:

| Mode | Description | Best For |
|---|---|---|
| Send (non-streaming) | Single request, single response after full completion | Programmatic clients |
| Stream (SSE) | Real-time event stream; each status update and artifact arrives as soon as it is ready | Interactive clients, dashboards |
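
In Stream mode the client has to decode Server-Sent Events. The sketch below parses `data:` lines from an already-open text stream and yields one decoded JSON object per event. It assumes each event carries a JSON payload; the payload keys in the sample (`state`, `message`) are hypothetical and are not taken from the agent's actual schema.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON payload per SSE event (events end at a blank line)."""
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # SSE strips a single leading space after the colon.
            if value.startswith(" "):
                value = value[1:]
            data_lines.append(value)
        elif line == "" and data_lines:
            yield json.loads("\n".join(data_lines))
            data_lines = []
        # Comment lines (starting with ":") and other fields are ignored here.


# Hypothetical stream content, for illustration only.
SAMPLE_STREAM = [
    ": keep-alive comment\n",
    'data: {"state": "working", "message": "Analyzing video"}\n',
    "\n",
    'data: {"state": "completed"}\n',
    "\n",
]

events = list(iter_sse_events(SAMPLE_STREAM))
```

Because the function accepts any iterable of lines, the same parser can be fed from an HTTP response object or from a recorded stream in tests.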

Security

Video Intelligence Agent uses standard Google identity-based authentication:

| Scheme | Description |
|---|---|
| Google OpenID Connect | Authenticate via your Google account. Obtain a token using gcloud auth print-identity-token. Pass as a Bearer token. |
| HTTP Bearer (JWT) | Google-issued JWT for Cloud Run IAM or Cloud Marketplace authentication. |
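
For the OpenID Connect scheme, a script can shell out to the gcloud command named above and attach the result as a Bearer token. This is a minimal sketch that assumes the gcloud CLI is installed and already logged in. The helper names `fetch_identity_token` and `bearer_headers` are illustrative.

```python
import subprocess


def fetch_identity_token() -> str:
    """Run `gcloud auth print-identity-token` and return the printed token."""
    result = subprocess.run(
        ["gcloud", "auth", "print-identity-token"],
        check=True,           # raise CalledProcessError if gcloud fails
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def bearer_headers(token: str) -> dict:
    """Build the Authorization header for a Bearer token."""
    if not token:
        raise ValueError("token must not be empty")
    return {"Authorization": f"Bearer {token}"}
```

A caller would combine the two as `bearer_headers(fetch_identity_token())` and pass the result with each request to the agent.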

Health Checks

Two health-check endpoints are available outside the A2A protocol for infrastructure monitoring:

| Endpoint | Response |
|---|---|
| GET /health | {"status": "ok"} |
| GET /healthz | {"status": "ok"} |
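
A monitoring script could probe both endpoints as sketched below. The base URL is supplied by the caller, and treating HTTP 200 plus the body shown above as "healthy" is an assumption; the page documents only the response body, not the status code.

```python
import json
import urllib.request

HEALTH_PATHS = ("/health", "/healthz")


def is_healthy_body(body: bytes) -> bool:
    """Return True if the body is the JSON object {"status": "ok"}."""
    try:
        return json.loads(body).get("status") == "ok"
    except (ValueError, AttributeError):
        # Not valid JSON, or valid JSON that is not an object.
        return False


def check_health(base_url: str, timeout: float = 5.0) -> dict:
    """Probe each health endpoint and map its path to True or False."""
    results = {}
    for path in HEALTH_PATHS:
        url = base_url.rstrip("/") + path
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                results[path] = resp.status == 200 and is_healthy_body(resp.read())
        except OSError:
            # Covers connection errors, timeouts, and HTTP error responses.
            results[path] = False
    return results
```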