Part

A Part is the atomic unit of content that makes up a Message or Artifact in the A2A protocol. It allows different types of data to be combined and exchanged within a single message or artifact.

Role of Parts

  • Content Carrier: Each Part contains a specific piece of content.
  • Type Identification: Each Part declares its content type in a type field, typically a MIME type (e.g., text/plain, image/png, application/json) or another predefined type.
  • Metadata: Each Part can carry metadata that applies only to that Part.
  • Flexibility: By combining different Parts, Messages and Artifacts can flexibly represent complex information structures, such as a message containing text instructions, JSON data, and file references simultaneously.
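
As a rough illustration of the combination described above, the sketch below builds one message out of a text Part, a JSON Part, and a file reference. The field names follow the illustrative interfaces in this document rather than a normative schema; the SketchMessage container, the BasePart helper, and the metadata field name are assumptions made for this example only.

```typescript
// Part shapes modeled on this document's illustrative interfaces.
// They are assumptions for this sketch, not the normative A2A schema.
interface BasePart {
  metadata?: Record<string, unknown>; // hypothetical name for per-Part metadata
}
interface TextPart extends BasePart {
  type: "text";
  text: string;
}
interface JsonPart extends BasePart {
  type: "json";
  json: unknown;
}
interface FilePart extends BasePart {
  type: "file";
  uri?: string;
  data?: string;
  filename?: string;
}
type Part = TextPart | JsonPart | FilePart;

// Hypothetical minimal container standing in for a Message or Artifact.
interface SketchMessage {
  parts: Part[];
}

const message: SketchMessage = {
  parts: [
    { type: "text", text: "Resize the attached image to the given dimensions." },
    { type: "json", json: { width: 800, height: 600 } },
    {
      type: "file",
      uri: "https://example.com/input.png",
      filename: "input.png",
      metadata: { source: "upload" },
    },
  ],
};

// Collect the declared type of each Part, in order.
const partTypes: string[] = message.parts.map((part) => part.type);
console.log(partTypes.join(", "));
```

Each element of parts is self-describing, so a receiver can process the instruction text, the structured parameters, and the file reference independently.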

Different Types of Parts

The A2A protocol defines several Part types to cover different interaction needs. The examples below are illustrative, based on source documentation and common patterns:

  • TextPart: Used for transmitting plain text content.
    interface TextPart {
      type: "text" | "text/plain"; // Or other text-related MIME types
      text: string;
    }
  • FilePart: Used for referencing or transmitting files. Specific implementations might involve file URIs, inline data (Base64), etc.
    // Example structure (specific definition may vary)
    interface FilePart {
      type: string; // "file" or a MIME type, e.g., "image/jpeg", "application/pdf"
      uri?: string; // URI pointing to the file
      data?: string; // Base64-encoded file content
      filename?: string;
    }
  • JsonPart: Used for transmitting structured JSON data.
    interface JsonPart {
      type: "json" | "application/json";
      json: any; // The actual JSON object or array
    }
  • FormPart: Used for presenting and submitting forms.
  • IFramePart: Used for embedding web content.
  • Other Specific Types: The protocol can also define other specific Part types to support particular application scenarios (e.g., video streams, audio streams).
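
One way a receiver might handle these types is to treat the type field as a discriminant and branch on it. The sketch below assumes the simplified shapes shown above, narrowed to a single literal per Part type; summarizePart is a hypothetical helper and not part of the protocol.

```typescript
// Simplified Part shapes, assumed for this sketch only.
interface TextPart {
  type: "text";
  text: string;
}
interface FilePart {
  type: "file";
  uri?: string;
  data?: string; // Base64-encoded content
  filename?: string;
}
interface JsonPart {
  type: "json";
  json: unknown;
}
type Part = TextPart | FilePart | JsonPart;

// Hypothetical helper: returns a one-line description of a Part.
function summarizePart(part: Part): string {
  switch (part.type) {
    case "text":
      return `text (${part.text.length} chars)`;
    case "file": {
      const name = part.filename ?? "unnamed";
      if (part.uri !== undefined) {
        return `file ${name} at ${part.uri}`;
      }
      if (part.data !== undefined) {
        return `file ${name} inline (${part.data.length} Base64 chars)`;
      }
      return `file ${name} with no content`;
    }
    case "json":
      return `json ${JSON.stringify(part.json)}`;
    default: {
      // Compile-time exhaustiveness check: adding a new Part type to the
      // union without handling it here produces a type error.
      const unreachable: never = part;
      throw new Error(`Unknown part: ${JSON.stringify(unreachable)}`);
    }
  }
}

console.log(summarizePart({ type: "text", text: "hello" }));
console.log(summarizePart({ type: "json", json: { a: 1 } }));
```

Branching on a single discriminant keeps the handler easy to extend: supporting a new Part type means adding one interface to the union and one case to the switch.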

Because each Part declares its own type, the A2A protocol can be extended with new Part types and can carry multi-modal content within a single Message or Artifact.