Open AGI Codes | Your Codes Reflect! | Transforming Tomorrow, One Algorithm at a Time: The AI Revolution | Model Context Protocol

What is the Model Context Protocol (MCP)?

The Model Context Protocol (MCP) is an open protocol that enables seamless integration between LLM applications and external data sources and tools. Whether you're building an AI-powered IDE, enhancing a chat interface, or creating custom AI workflows, MCP provides a standardized way to connect LLMs with the context they need. The current baseline is the official MCP specification (2025-11-25). The next major revision, the 2026-07-28 release candidate, was published on May 21, 2026 and represents the largest update to the protocol since its launch, with the final specification expected on July 28, 2026.

MCP enables applications to:

  • Share contextual information with language models
  • Expose tools and capabilities to AI systems
  • Build composable integrations and workflows
  • Deliver rich, interactive UIs through MCP Apps and agentic UI frameworks

The protocol uses JSON-RPC 2.0 messages to establish communication between:

  • Hosts: LLM applications that initiate connections
  • Clients: Connectors within the host application
  • Servers: Services that provide context and capabilities

MCP takes inspiration from the Language Server Protocol, which standardizes how to add support for programming languages across a whole ecosystem of development tools. In a similar way, MCP standardizes how to integrate additional context and tools into the ecosystem of AI applications. MCP is now governed by the Agentic AI Foundation (AAIF) under the Linux Foundation, ensuring vendor-neutral governance and broad industry support.

MCP 2026-07-28 Release Candidate: What's New

On May 21, 2026, the MCP maintainers released the 2026-07-28 release candidate — the largest revision of the protocol since its launch. This release delivers on the 2026 roadmap and contains breaking changes. The final specification ships on July 28, 2026, with a 10-week validation window for implementers. As analyzed by the Agentic AI Foundation (AAIF), this release makes MCP easier to run, reason about, and extend in agentic systems.

Stateless Protocol Core

The headline change is that MCP is now stateless at the protocol layer. Six Specification Enhancement Proposals (SEPs) work together to eliminate the session-based architecture that previously required sticky sessions and shared session stores for horizontal deployments.

  • Handshake Removed: The initialize/initialized handshake is eliminated (SEP-2575). Protocol version, client info, and client capabilities now travel in _meta on every request.
  • Session IDs Removed: The Mcp-Session-Id header and protocol-level sessions are removed (SEP-2567). Any MCP request can land on any server instance.
  • Routable Headers: The Streamable HTTP transport now requires Mcp-Method and Mcp-Name headers (SEP-2243) so load balancers, gateways, and rate-limiters can route on the operation without inspecting the body.
  • Server Discovery: A new server/discover method lets clients fetch server capabilities on demand.
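
As an illustration of the routable headers above, a gateway could pick a backend without parsing the JSON body. This is a minimal sketch, not a reference implementation: the `ROUTES` table and pool names are hypothetical, and only the `Mcp-Method` and `Mcp-Name` header names come from the text.

```python
# Hypothetical routing table: (method, name) -> backend pool.
# A name of None means "any name for this method".
ROUTES = {
    ("tools/call", "search"): "search-pool",
    ("tools/call", None): "tools-pool",
}
DEFAULT_POOL = "general-pool"


def pick_pool(headers: dict[str, str]) -> str:
    """Choose a backend pool from the Mcp-Method and Mcp-Name headers only."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    method = lowered.get("mcp-method")
    name = lowered.get("mcp-name")
    # Try the most specific route first, then the per-method fallback.
    for route_key in ((method, name), (method, None)):
        if route_key in ROUTES:
            return ROUTES[route_key]
    return DEFAULT_POOL
```

The same lookup could drive rate limits or access rules per tool, since the operation is visible before the body is read.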

Stateless protocol, stateful applications: Removing protocol-level sessions does not mean applications must be stateless. Servers that need state across calls can mint explicit handles (e.g., a basket_id or browser_id) and have the model pass them back as ordinary arguments. This makes state visible to the model rather than hidden in transport metadata.
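
The explicit-handle pattern can be sketched as two tools, one that mints a handle and one that accepts it back as a plain argument. The tool names, the in-memory `BASKETS` store, and the return shapes are assumptions for illustration; a real deployment would back the store with something every server instance can reach.

```python
import secrets

# In-memory store standing in for a shared database (assumption).
BASKETS: dict[str, list[str]] = {}


def create_basket() -> dict:
    """Tool: mint a new basket handle and return it to the model."""
    basket_id = secrets.token_urlsafe(16)
    BASKETS[basket_id] = []
    return {"basket_id": basket_id}


def add_item(basket_id: str, item: str) -> dict:
    """Tool: the model passes the handle back as an ordinary argument."""
    if basket_id not in BASKETS:
        raise KeyError(f"unknown basket_id: {basket_id}")
    BASKETS[basket_id].append(item)
    return {"basket_id": basket_id, "items": list(BASKETS[basket_id])}
```

Because the handle travels in the tool arguments, it shows up in transcripts and audit logs like any other parameter.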

Before and After: tools/call Request

2025-11-25 (Session-based): Client must establish a session first, then carry the Mcp-Session-Id on every request:

POST /mcp HTTP/1.1
Mcp-Session-Id: 1868a90c-3a3f-4f5b
Content-Type: application/json

{"jsonrpc":"2.0","id":2,"method":"tools/call",
 "params":{"name":"search","arguments":{"q":"otters"}}}

2026-07-28 (Stateless): A single self-contained request that any server instance can handle:

POST /mcp HTTP/1.1
MCP-Protocol-Version: 2026-07-28
Mcp-Method: tools/call
Mcp-Name: search
Content-Type: application/json

{"jsonrpc":"2.0","id":1,"method":"tools/call",
 "params":{"name":"search","arguments":{"q":"otters"},
 "_meta":{"io.modelcontextprotocol/clientInfo":
   {"name":"my-app","version":"1.0"}}}}
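
A client helper that assembles the stateless request shown above might look like the following. The function name and signature are hypothetical; the header names, the `_meta` key, and the payload layout are copied from the example request.

```python
import json


def build_tools_call(name, arguments, client_name, client_version, request_id=1):
    """Return (headers, body) for a self-contained tools/call request."""
    headers = {
        "MCP-Protocol-Version": "2026-07-28",
        "Mcp-Method": "tools/call",
        "Mcp-Name": name,  # mirrors params.name so gateways can route on it
        "Content-Type": "application/json",
    }
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": name,
            "arguments": arguments,
            "_meta": {
                "io.modelcontextprotocol/clientInfo": {
                    "name": client_name,
                    "version": client_version,
                }
            },
        },
    }
    return headers, json.dumps(payload)
```

Deriving the headers and the body from the same arguments keeps the two from drifting apart.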

Extensions Become First-Class

Extensions existed in the 2025-11-25 release but had no formal process. SEP-2133 adds a proper framework: extensions are identified by reverse-DNS IDs, negotiated through an extensions map on client and server capabilities, live in their own ext-* repositories with delegated maintainers, and version independently of the specification. A new Extensions Track in the SEP process gives them a path from experimental to official.

MCP Apps: Server-Rendered User Interfaces

MCP Apps (SEP-1865) lets servers ship interactive HTML interfaces that hosts render in a sandboxed iframe. Tools declare their UI templates ahead of time so hosts can prefetch, cache, and security-review them before anything runs. The rendered UI talks back to the host over the same JSON-RPC protocol used everywhere in MCP, so every UI-initiated action goes through the same audit and consent path as a direct tool call.

Tasks Extension

Tasks shipped as an experimental core feature in 2025-11-25. Production use surfaced enough redesign needs that it has graduated to an official extension. The Tasks extension reshapes the lifecycle around the stateless model:

  • A server can answer tools/call with a task handle
  • The client drives the task via tasks/get, tasks/update, and tasks/cancel
  • Task creation is server-directed: the server decides when a call should run as a task
  • tasks/list is removed because it cannot be scoped safely without sessions
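
The client-driven lifecycle above can be sketched as a polling loop. Only the method names `tasks/get` and `tasks/cancel` come from the text; the `send` callable, the `taskId` and `status` field names, and the `"working"` status value are assumptions made for this sketch.

```python
import time


def run_until_done(send, task_id, poll_seconds=0.0, max_polls=10):
    """Poll tasks/get until the task leaves the (assumed) 'working' state.

    `send(method, params)` is a hypothetical transport callable that returns
    the JSON-RPC result as a dict.
    """
    for _ in range(max_polls):
        result = send("tasks/get", {"taskId": task_id})
        if result["status"] != "working":
            return result
        time.sleep(poll_seconds)
    # Polling budget exhausted: ask the server to stop the work.
    send("tasks/cancel", {"taskId": task_id})
    return {"taskId": task_id, "status": "cancelled"}
```

Since each poll carries the task handle, any server instance with access to the task store could answer it.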

Authorization Hardening

Six SEPs harden the authorization specification to align more closely with how OAuth 2.0 and OpenID Connect are deployed in practice:

  • Issuer Validation: Clients must validate the iss parameter on authorization responses per RFC 9207 (SEP-2468), mitigating mix-up attacks prevalent in MCP's single-client, many-server pattern.
  • Application Type: Clients now declare their OpenID Connect application_type during Dynamic Client Registration (SEP-837), preventing authorization servers from defaulting desktop/CLI clients to "web."
  • Credential Binding: Clients bind registered credentials to the authorization server that issued them, preventing cross-server credential confusion.
  • Token Refresh: Improved token refresh semantics for long-running agentic workflows.
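
The issuer check in the first bullet can be illustrated with the standard library. This sketch assumes the authorization response arrives as a redirect URL with `code` and `iss` query parameters; the function name is hypothetical. RFC 9207 calls for simple string comparison of the issuer, which is what the equality test below does.

```python
from urllib.parse import parse_qs, urlsplit


def check_issuer(redirect_url: str, expected_issuer: str) -> str:
    """Return the authorization code if the iss parameter matches, else raise."""
    params = parse_qs(urlsplit(redirect_url).query)
    issuers = params.get("iss", [])
    # Exactly one iss value, equal to the issuer the request was sent to.
    if issuers != [expected_issuer]:
        raise ValueError("iss missing or does not match expected issuer")
    codes = params.get("code", [])
    if len(codes) != 1:
        raise ValueError("expected exactly one authorization code")
    return codes[0]
```

A client talking to many servers would keep the expected issuer alongside each pending authorization request and reject the response before exchanging the code.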

Multi Round-Trip Requests

Server-initiated requests (e.g., elicitation prompts) are restructured for the stateless model. Instead of holding SSE streams open, the server returns an InputRequiredResult:

{
  "resultType": "inputRequired",
  "inputRequests": {
    "confirm": {
      "type": "elicitation",
      "message": "Delete 3 files?",
      "schema": { "type": "boolean" }
    }
  },
  "requestState": "eyJzdGVwIjoxLC..."
}

The client gathers answers and re-issues the original call with inputResponses and the echoed requestState. Any server instance can pick up the retry because everything it needs is in the payload.
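
The retry loop described above could be sketched as follows. The `send` and `ask_user` callables are hypothetical, and placing `inputResponses` and `requestState` directly in the request params is an assumption; the text only says they accompany the re-issued call.

```python
def call_with_input(send, params, ask_user, max_rounds=5):
    """Re-issue a call until the server stops asking for input.

    `send(params)` returns a result dict; `ask_user(request)` returns the
    user's answer for one input request. Both are hypothetical callables.
    """
    current = dict(params)
    for _ in range(max_rounds):
        result = send(current)
        if result.get("resultType") != "inputRequired":
            return result
        answers = {
            key: ask_user(request)
            for key, request in result["inputRequests"].items()
        }
        # Assumed placement: answers and the echoed state ride along in params.
        current = dict(params)
        current["inputResponses"] = answers
        current["requestState"] = result["requestState"]
    raise RuntimeError("server kept requesting input")
```

The client treats `requestState` as opaque and simply echoes it, which is what lets a different server instance resume the exchange.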

Caching, Observability, and Schema Improvements

  • Caching: List and resource read results now carry ttlMs and cacheScope (SEP-2549), modeled on HTTP Cache-Control. Clients know exactly how long a tools/list response is fresh and whether it's safe to share across users.
  • Observability: W3C Trace Context propagation in _meta is now documented (SEP-414), locking down traceparent, tracestate, and baggage key names for distributed traces across SDKs and gateways via OpenTelemetry.
  • Full JSON Schema 2020-12: Tool input schemas now support composition, conditionals, and references (still with type: "object" root constraint). Output schemas are unrestricted, and structuredContent can be any JSON value.
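
To make the caching bullet concrete, here is a small client-side cache that honors `ttlMs`. It assumes `ttlMs` appears as a top-level field of the result and that the cache is private to one user, so `cacheScope` is not consulted; both points are assumptions, as are the class and method names.

```python
import time


class ListCache:
    """Tiny per-user cache keyed by method name, honoring ttlMs."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # method -> (expires_at_seconds, result)

    def put(self, method, result):
        """Store a result only if it declares a positive freshness lifetime."""
        ttl_ms = result.get("ttlMs", 0)
        if ttl_ms > 0:
            self._entries[method] = (self._clock() + ttl_ms / 1000.0, result)

    def get(self, method):
        """Return the cached result, or None if absent or expired."""
        entry = self._entries.get(method)
        if entry is None:
            return None
        expires_at, result = entry
        if self._clock() >= expires_at:
            del self._entries[method]
            return None
        return result
```

Injecting the clock keeps the expiry logic easy to test without sleeping.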

Deprecations and Feature Lifecycle

The release candidate introduces a formal feature lifecycle policy and deprecates three features:

| Deprecated Feature | Replacement Direction |
| --- | --- |
| Roots | Tool parameters, resource URIs, or server configuration |
| Sampling | Direct integration with LLM provider APIs |
| Logging | stderr for stdio; OpenTelemetry for structured observability |

Deprecations are annotation-only: methods and capability flags still work in this release and in every specification version published within 12 months. Removal requires a separate SEP.

The Future of Agentic UI Frameworks

AI agents are evolving beyond text-only chat to deliver interactive, server-rendered user interfaces directly within conversations. This shift — from "pages you navigate" to "outcomes you request" — is being driven by three major approaches: the official MCP Apps standard, the community-driven MCP-UI framework, and OpenAI's Apps SDK for ChatGPT.

MCP Apps (SEP-1865) — The Official MCP Standard

MCP Apps is the official extension in the 2026-07-28 release candidate that enables servers to ship interactive HTML interfaces directly within AI agent conversations. It represents the protocol-level standard for agentic UI.

  • Sandboxed Rendering: Hosts render server-provided UIs in sandboxed iframes, ensuring security isolation between the agent interface and the embedded application.
  • Pre-declared Templates: Tools declare their UI templates ahead of time, allowing hosts to prefetch, cache, and security-review them before rendering.
  • Unified Audit Path: The rendered UI communicates back to the host over the same JSON-RPC protocol used throughout MCP, ensuring every UI-initiated action goes through the same audit and consent path as a direct tool call.
  • Portability: As a protocol-level standard, MCP Apps work across any MCP-compatible host application.

MCP-UI — The Community Pioneer

MCP-UI is an open-source SDK collection that pioneered the delivery of interactive UI components over MCP, directly influencing the official MCP Apps specification (SEP-1865). It provides comprehensive tooling for both server developers and host application builders.

  • Server SDKs: Available for TypeScript, Python, and Ruby — providing utilities like createUIResource to generate UI content that can be sent to AI agents.
  • Client SDKs: The @mcp-ui/client package provides React components (<AppRenderer />, <UIResourceRenderer />) and standard Web Components for secure rendering.
  • Multiple Content Types: Supports Inline HTML (self-contained snippets), External URLs (full web apps in iframes), and Remote DOM (lightweight host-rendered interfaces based on Shopify's remote-dom pattern).
  • Testing Tools: The ui-inspector tool enables local testing of MCP servers to verify UI resource rendering before deployment.
  • Compatibility: Designed to work with both the official MCP Apps standard and legacy MCP-UI implementations.

OpenAI Apps SDK — Branded Mini-Applications for ChatGPT

The OpenAI Apps SDK is a framework for building branded applications that run inside ChatGPT. Built on MCP, it consists of an MCP server backend and an interactive UI frontend rendered within the ChatGPT interface.

  • Three Display Modes:
    • Inline: Apps render directly within the conversation flow — ideal for quick interactions, status updates, and small widgets.
    • Picture-in-Picture (PiP): Apps appear as floating cards above the conversation for multi-tasking scenarios.
    • Fullscreen: Apps take over the entire conversation area for immersive experiences like dashboards and design editors.
  • Context-Aware Invocation: Users invoke apps via @AppName or ChatGPT suggests them automatically when relevant intent is detected.
  • Complementary Tools: AgentKit for agent orchestration, ChatKit for embeddable chat interfaces with customizable themes and widgets.
  • Distribution: Apps are published via the ChatGPT App Directory after review. MCP standards are encouraged for cross-platform portability.

Agentic UI Frameworks Comparison

| Aspect | MCP Apps (Official) | MCP-UI (Community) | OpenAI Apps SDK |
| --- | --- | --- | --- |
| Scope | Protocol-level standard | Open-source SDK / Framework | Platform-specific SDK |
| Rendering | Sandboxed iframes | Inline HTML, URLs, Remote DOM | Inline, PiP, Fullscreen |
| Server SDKs | Via MCP SDK | TypeScript, Python, Ruby | Python, Node.js |
| Client SDKs | Host implements | React, Web Components | React (ChatGPT host) |
| Portability | Any MCP host | Any MCP host | ChatGPT + MCP-compatible hosts |
| Governance | AAIF / Linux Foundation | Open source community | OpenAI |

Trends Shaping Agentic UI

  • Protocol Convergence: Industry efforts (MCP, A2A, AG-UI) are increasingly interoperable, building toward an "Agent Internet" where agents negotiate and delegate across platforms.
  • Human-in-the-Loop: Approval workflows are now a first-class architectural concern, moving from a limitation to a strategic necessity for enterprise trust.
  • From Chat to Flow: The user experience is shifting toward "Agentic UX" where chat is the control plane for managing autonomous processes, not just a text conversation.
  • Generative UI: Applications are moving from "pages you navigate" to "outcomes you request," with UI assembled dynamically based on user intent and agent capabilities.

2026 MCP Roadmap

The 2026 MCP roadmap is organized around priority areas rather than release milestones, reflecting the project's transition to Working Group-driven development. Core maintainers ranked candidate areas, resulting in four priorities where SEPs receive expedited review.

Transport Evolution and Scalability

The work focuses on evolving the Streamable HTTP transport so servers can scale horizontally without holding state, and establishing a standard metadata format (served via .well-known) so server capabilities are discoverable without a live connection. The maintainers are explicit: no new official transports are being added this cycle — keeping the set small is a deliberate design decision.

Agent Communication

The Tasks primitive (SEP-1686) works well for its intended purpose, but production use has surfaced lifecycle gaps: retry semantics for transient failures and expiry policies for completed task results. The approach is to ship features as experimental, gather production feedback, and iterate.

Governance Maturation

Currently every SEP requires full Core Maintainer review regardless of domain — a bottleneck. The goal is a documented contributor ladder (clear path from community participant to maintainer) and a delegation model where trusted Working Groups can accept SEPs in their domain without waiting on full core review.

Enterprise Readiness

Enterprises are deploying MCP and need conformance suites, registry support, and authorization improvements that match real deployment patterns. This area focuses on making MCP production-grade for Fortune 500 environments where AI agents connect to internal systems like CRMs, ticketing platforms, and databases.

AAIF Governance and MCP Ecosystem

MCP development is now guided by the Agentic AI Foundation (AAIF) under the Linux Foundation, ensuring a vendor-neutral governance process. This transition from a single-company project to community-governed standard marks a major milestone in MCP's maturity.

Governance Structure

  • Working Groups: Specialized groups drive protocol development in areas such as transport evolution, agent communication, and governance maturation.
  • Interest Groups: Community groups explore emerging areas and provide feedback to Working Groups.
  • SEP Process: Specification Enhancement Proposals follow a formal track with conformance requirements before reaching Final status.
  • Vendor-Neutral: Development is driven by community needs rather than single-company product decisions.

Ecosystem Adoption (Mid-2026)

  • 97 million+ monthly SDK downloads
  • 10,000+ active public MCP servers
  • Major Platform Support: Anthropic, OpenAI, Google DeepMind, Microsoft, and many more
  • Enterprise Deployment: Fortune 500 companies deploying MCP in production for agentic workflows connecting AI to CRMs, Jira, databases, and internal systems

MCP Architecture and Components

MCP standardizes interactions between AI models and data systems through three core elements:

  1. Standardized Integrations: Replaces custom connectors for each data source with a universal protocol, similar to USB-C for AI.
  2. Architecture:
    • MCP Hosts: Applications (e.g., Claude Desktop) that initiate data requests.
    • MCP Clients: Manage secure, one-to-one connections with servers.
    • MCP Servers: Lightweight programs exposing data sources (e.g., Google Drive, Slack) via structured primitives like Tools, Resources, and Prompts.
  3. Security: Maintains strict access controls and usage policies to prevent unauthorized data exposure.

This framework ensures AI models receive up-to-date, domain-specific context, improving response relevance and reducing hallucinations.

MCP Transport Mechanisms: From SSE to Streamable HTTP

MCP defines standard transport mechanisms for communication between clients and servers. The protocol has evolved significantly since its initial release to support more flexible and scalable deployments.

Standard Transport Mechanisms

The MCP specification currently defines two primary transport mechanisms:

  1. stdio (Standard Input/Output): Communication over standard in and out, designed for local MCP connections where the client launches the server as a subprocess. This remains the most common transport for local development and desktop applications.
  2. Streamable HTTP: The modern HTTP-based transport introduced in March 2025 that uses a single HTTP endpoint for bidirectional messaging. This is the recommended transport for web-based and remote MCP deployments.

Note: The original HTTP+SSE transport (which required separate /sse and /messages endpoints) has been deprecated and is no longer recommended for new implementations. Existing implementations should migrate to Streamable HTTP for better performance and maintainability.

Migration from Deprecated HTTP+SSE to Streamable HTTP

The transition from the deprecated HTTP+SSE transport to Streamable HTTP represents a significant advancement in MCP's capabilities. The HTTP+SSE transport has been officially deprecated and should not be used for new implementations:

| Feature | HTTP+SSE (Deprecated) | Streamable HTTP (Current) |
| --- | --- | --- |
| Status | Deprecated - Not recommended for new implementations | Current - Recommended for all HTTP-based deployments |
| Endpoint Structure | Required separate /sse and /messages endpoints | Single endpoint supporting both GET and POST methods |
| Server State Management | Required stateful connections | Supports both stateful and stateless server modes |
| Session Management | Session ID passed as query parameter | Session ID included in Mcp-Session-Id header |
| Resumability | Limited support for connection recovery | Built-in resumability with Last-Event-ID support |
| Scalability | Required high-availability long-lived connections | Enables stateless servers and horizontal scaling |
| Maintenance | No longer maintained or updated | Actively maintained with regular updates |

Streamable HTTP Key Features

The new Streamable HTTP transport introduces several important capabilities:

  • Unified Endpoint: Single HTTP endpoint that handles both POST requests for client-to-server messages and GET requests for establishing SSE streams.
  • Flexible Server Modes: Servers can operate in stateless mode for simple tool-based interactions or stateful mode for complex workflows requiring session persistence.
  • Enhanced Security: Improved validation requirements including Origin header validation and proper authentication mechanisms.
  • Connection Resumability: Support for resuming broken connections using event IDs and Last-Event-ID headers.
  • Session Management: Optional session ID assignment for stateful servers with secure session termination capabilities.
  • Multiple Concurrent Streams: Clients can maintain multiple SSE streams simultaneously for different purposes.

Server-Sent Events in Streamable HTTP Context

Server-Sent Events (SSE) continue to play a crucial role in MCP's real-time communication capabilities within the Streamable HTTP transport:

  • Streaming Responses: When servers need to provide streaming responses to tool calls or send progress notifications, they can initiate SSE streams by returning Content-Type: text/event-stream.
  • Server-to-Client Notifications: SSE enables servers to proactively send notifications, tool list changes, or logging messages to clients.
  • Graceful Degradation: Servers can choose whether to support SSE streams, returning HTTP 405 Method Not Allowed if streaming is not available.
  • Event-Driven Architecture: SSE streams support event IDs for proper message ordering and replay capabilities.
  • Enhanced Resumability: Streamable HTTP improves on the original SSE implementation with better connection recovery and session management.
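
The event-ID and replay behavior listed above can be sketched with a bounded log on the server side. The class is hypothetical and assumes integer event IDs and single-line event data; the `id:` and `data:` field names are the standard SSE wire format.

```python
from collections import deque


class EventLog:
    """Bounded per-stream log that can replay events after a given ID."""

    def __init__(self, max_events=1000):
        self._events = deque(maxlen=max_events)  # oldest events fall off
        self._next_id = 1

    def append(self, data: str) -> str:
        """Store an event and return its SSE wire format."""
        event_id = self._next_id
        self._next_id += 1
        self._events.append((event_id, data))
        return f"id: {event_id}\ndata: {data}\n\n"

    def replay_after(self, last_event_id: str) -> list[str]:
        """Return data for events newer than the client's Last-Event-ID."""
        last = int(last_event_id)
        return [data for event_id, data in self._events if event_id > last]
```

On reconnect, the server would read the `Last-Event-ID` request header, call `replay_after`, and then continue with live events.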

Note: While SSE remains a core technology within Streamable HTTP, the deprecated HTTP+SSE transport (which used separate endpoints) has been replaced by the unified Streamable HTTP approach.

Implementation Considerations

When implementing MCP with Streamable HTTP, developers should consider:

  • Transport Selection: Use Streamable HTTP for all new HTTP-based implementations. The deprecated HTTP+SSE transport should not be used for new projects.
  • Migration Strategy: If maintaining existing HTTP+SSE implementations, plan migration to Streamable HTTP as the deprecated transport will eventually be removed.
  • Infrastructure Compatibility: Streamable HTTP is "just HTTP," ensuring better compatibility with existing middleware, load balancers, and infrastructure tools.
  • Security Requirements: Proper validation of Origin headers, secure session ID generation, and appropriate authentication mechanisms are essential.
  • Error Handling: Robust error handling for disconnections, invalid sessions, and malformed requests.
  • Future-Proofing: Streamable HTTP is the future of MCP HTTP transport, ensuring long-term maintainability and feature support.
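
For the Origin validation mentioned under Security Requirements, a server-side check might look like this. The allow-list values are placeholders, and letting requests without an `Origin` header through (as non-browser clients usually send none) is a design assumption that a stricter deployment may want to reverse.

```python
# Hypothetical allow-list; replace with the origins your host actually uses.
ALLOWED_ORIGINS = {"https://app.example.com", "http://localhost:3000"}


def origin_allowed(headers: dict[str, str]) -> bool:
    """Accept requests with no Origin header or with an allow-listed Origin."""
    lowered = {key.lower(): value for key, value in headers.items()}
    origin = lowered.get("origin")
    if origin is None:
        # Assumption: non-browser clients are authenticated by other means.
        return True
    return origin in ALLOWED_ORIGINS
```

Rejecting unknown origins before any JSON parsing limits what a malicious web page can reach on a locally running server.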

MCP Elicitation: Interactive User Input

Elicitation in the Model Context Protocol (MCP) is a feature introduced in the 2025-06-18 revision that enables servers to request additional information or structured input from users via the client during an interaction. This powerful capability allows for dynamic, interactive workflows while maintaining strong user control and privacy.

Core Elicitation Concepts

Elicitation lets a server pause an operation and ask the user for input through the client. This approach provides several key benefits:

  • Interactive Workflows: Servers can request user input at any point, even nested inside other operations
  • Flexible UI: Clients can implement any UI model for gathering user input; MCP doesn't dictate the interface
  • Structured Data: Requests specify the format using a restricted subset of JSON schema, ensuring correct data types and validation on the client side

Protocol Declaration and Capabilities

Clients supporting elicitation must declare this capability when initializing the protocol. The capability declaration ensures that both client and server understand the elicitation features available:

// Client capability declaration
{
  "capabilities": {
    "elicitation": {}
  }
}

Request and Response Flow

The elicitation process follows a structured request-response pattern:

  • Server Request: Server sends an elicitation request with a JSON schema defining the required input structure
  • Client Processing: Client presents the request to the user through its chosen UI implementation
  • User Response: User can accept (submit data), decline, or cancel the request
  • Server Handling: Server processes the response and continues the workflow accordingly

Example Elicitation Request

Here's an example of how a server might request a GitHub username:

// Server elicitation request
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "elicitation/create",
  "params": {
    "message": "Please provide your GitHub username to access repository information",
    "requestedSchema": {
      "type": "object",
      "properties": {
        "username": {
          "type": "string",
          "title": "GitHub Username",
          "description": "Your GitHub username",
          "minLength": 1,
          "maxLength": 39
        }
      },
      "required": ["username"]
    }
  }
}

// Client response (user accepted)
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "action": "accept",
    "content": {
      "username": "octocat"
    }
  }
}

Schema Support and Limitations

Elicitation supports a restricted subset of JSON schema to simplify client implementation and ensure consistent behavior:

  • Primitive Types: Strings, numbers, booleans, and enums are supported
  • Flat Objects: Only "flat" objects with primitive properties are allowed (no nested objects or arrays)
  • Validation: Standard JSON schema validation rules apply (minLength, maxLength, minimum, maximum, etc.)
  • Required Fields: Properties can be marked as required using the standard JSON schema approach
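
The flat-schema restriction above can be checked mechanically. The sketch below is a partial illustration: the function names are hypothetical, and `check_content` only covers required fields and string length limits, not the full set of validation keywords.

```python
PRIMITIVE_TYPES = {"string", "number", "integer", "boolean"}


def is_flat_schema(schema: dict) -> bool:
    """True if schema is an object whose properties are all primitives."""
    if schema.get("type") != "object":
        return False
    properties = schema.get("properties", {})
    return all(
        isinstance(prop, dict) and prop.get("type") in PRIMITIVE_TYPES
        for prop in properties.values()
    )


def check_content(schema: dict, content: dict) -> list[str]:
    """Return a list of problems; empty means these checks passed."""
    problems = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in content:
            problems.append(f"missing required field: {name}")
    for name, value in content.items():
        prop = properties.get(name)
        if prop is None:
            problems.append(f"unexpected field: {name}")
        elif prop.get("type") == "string":
            if not isinstance(value, str):
                problems.append(f"{name}: expected string")
            else:
                low = prop.get("minLength", 0)
                high = prop.get("maxLength", len(value))
                if not low <= len(value) <= high:
                    problems.append(f"{name}: length out of range")
    return problems
```

A client could run `is_flat_schema` before rendering a form and `check_content` before returning the user's answer to the server.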

User Control and Privacy

Elicitation is designed with strong user control and privacy protections:

  • Server Identification: Clients must always show which server is making the request
  • User Editing: Users can edit their responses before submitting
  • Clear Options: Users have explicit options to decline or cancel requests
  • Sensitive Data Protection: Servers are not permitted to request sensitive data
  • Explicit Approval: The protocol emphasizes explicit user approval and validation at all stages

Security Considerations

Elicitation includes several security measures to protect users:

  • Data Minimization: Only request the minimum necessary information
  • Transparent Requests: Clear descriptions of what data is being requested and why
  • User Consent: Explicit user consent required for all data collection
  • Audit Trail: All elicitation requests and responses should be logged for security auditing
  • Timeout Handling: Requests should have reasonable timeouts to prevent indefinite blocking

Implementation Best Practices

When implementing elicitation in MCP applications:

  • Progressive Disclosure: Request information only when needed, not all at once
  • Contextual Requests: Provide clear context about why the information is needed
  • Graceful Degradation: Design workflows to handle declined or cancelled requests
  • User Experience: Implement intuitive UI patterns for data collection
  • Error Handling: Handle validation errors and provide helpful feedback to users

Security Best Practices

MCP implementations must follow comprehensive security practices to protect against various attack vectors. The following security considerations are based on the official MCP specification and OAuth 2.0 security best practices.

Confused Deputy Problem

MCP proxy servers that connect to third-party APIs can be vulnerable to "confused deputy" attacks when using static client IDs:

  • Attack Vector: Attackers can exploit MCP servers proxying other resource servers by using the static client ID to bypass user consent
  • Risk: Unauthorized access to third-party APIs without explicit user approval
  • Mitigation: MCP proxy servers using static client IDs MUST obtain user consent for each dynamically registered client before forwarding to third-party authorization servers

Token Passthrough Anti-Pattern

Token passthrough is explicitly forbidden in the MCP authorization specification due to significant security risks:

  • Security Control Circumvention: Bypasses rate limiting, request validation, and traffic monitoring that depend on token audience constraints
  • Accountability Issues: MCP servers cannot identify or distinguish between clients when using upstream-issued tokens
  • Trust Boundary Violations: Breaks assumptions about origin and client behavior patterns
  • Future Compatibility Risk: Makes it difficult to evolve security models as requirements change
  • Mitigation: MCP servers MUST NOT accept any tokens that were not explicitly issued for the MCP server
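
The mitigation above amounts to an audience check. This sketch assumes the token's signature and expiry have already been verified elsewhere and that the claims are available as a dict; the function name and the example resource URL are hypothetical. The `aud` claim may be a single string or a list of strings, so both forms are handled.

```python
def token_is_for_this_server(claims: dict, server_resource: str) -> bool:
    """Check the aud claim of already-verified token claims."""
    aud = claims.get("aud")
    if isinstance(aud, str):
        return aud == server_resource
    if isinstance(aud, list):
        return server_resource in aud
    # Missing or malformed audience: reject rather than guess.
    return False
```

For example, a server identified as `https://mcp.example.com` would reject a token whose `aud` names only an upstream API, even if that token is otherwise valid.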

Session Hijacking Prevention

Session hijacking attacks can occur when multiple stateful HTTP servers handle MCP requests:

  • Session Hijack Prompt Injection: Attackers can inject malicious events into shared queues that get processed by legitimate clients
  • Session Hijack Impersonation: Attackers can use stolen session IDs to impersonate legitimate users
  • Mitigation Strategies:
    • MCP servers that implement authorization MUST verify all inbound requests
    • MCP servers MUST NOT use sessions for authentication
    • Use secure, non-deterministic session IDs (UUIDs with secure random number generators)
    • Bind session IDs to user-specific information using format: <user_id>:<session_id>
    • Rotate or expire session IDs to reduce attack surface
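
The session ID guidance above can be sketched with the standard `secrets` module. The function names are hypothetical; the important assumption is that `user_id` comes from the validated access token on each request, never from the client-supplied session key itself.

```python
import secrets


def new_session_key(user_id: str) -> str:
    """Build a key in the <user_id>:<session_id> format with a random ID."""
    session_id = secrets.token_urlsafe(32)  # cryptographically strong
    return f"{user_id}:{session_id}"


def session_belongs_to(session_key: str, user_id: str) -> bool:
    """Check that a session key was minted for the authenticated user."""
    # token_urlsafe never emits ':', so split on the last separator.
    owner, separator, session_id = session_key.rpartition(":")
    return bool(separator) and bool(session_id) and owner == user_id
```

With this binding, a stolen session ID is not enough on its own: the attacker would also need a valid token for the same user.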

Additional Security Measures

  • Input Validation: Validate all inputs using JSON schemas and implement proper sanitization
  • Rate Limiting: Implement rate limiting to prevent abuse and DoS attacks
  • Audit Logging: Log all authentication attempts, authorization decisions, and sensitive operations
  • Secure Headers: Implement security headers like Content-Security-Policy, X-Frame-Options, and HSTS
  • Transport Security: Use HTTPS/TLS for all HTTP-based MCP communications
  • Error Handling: Ensure error messages don't leak sensitive information or system details

Security

The Model Context Protocol (MCP) unlocks powerful AI-driven workflows, but its security posture requires deliberate hardening. The insights below summarize the most critical vulnerabilities discussed by the MCP security community and outline practical mitigations you can apply immediately.

Protocol-Level Weaknesses

  1. Session Exposure: Earlier MCP transports required session identifiers inside URLs (for example, /messages?sessionId=UUID), leaking sensitive tokens through logs, caches, and browser histories. Note: The 2026-07-28 release candidate removes protocol-level session IDs entirely (SEP-2567), eliminating this attack vector. Both the Mcp-Session-Id header and the initialize/initialized handshake are removed in that revision.
  2. Missing Message Integrity: JSON-RPC messages are not signed by default, so tampering in transit can go undetected without an additional verification layer.
  3. Minimal Authentication Guidance: The specification historically left authentication and authorization unspecified, leading to inconsistent implementations and confused-deputy scenarios where servers act on behalf of the wrong principal.
  4. Optimistic Trust Model: MCP assumes cooperative servers and tools; malicious servers can redefine tools or shadow legitimate ones to siphon data.
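
One possible "additional verification layer" for the missing message integrity noted above is an HMAC over a canonical encoding of each message. This is not part of MCP; it is a sketch that assumes both sides share a secret key out of band, and the function names are hypothetical.

```python
import hashlib
import hmac
import json


def sign_message(message: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over a canonical JSON encoding."""
    # Sorted keys and fixed separators make the encoding deterministic.
    canonical = json.dumps(message, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_message(message: dict, tag: str, key: bytes) -> bool:
    """Constant-time check that the tag matches the message."""
    return hmac.compare_digest(sign_message(message, key), tag)
```

TLS already protects messages on the wire; a scheme like this matters mainly when messages pass through intermediaries that terminate TLS.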

Implementation Hotspots

  • Command & SQL Injection: Poorly validated tool arguments frequently flow directly into shell commands or database queries.
  • Prompt Injection: Attackers hide instructions in documents or cached context so that an autonomous agent invokes dangerous tools without human review.
  • Token Theft: MCP servers often store OAuth tokens for multiple downstream services, creating “keys to the kingdom” if a single server is compromised.
  • Over-Permissioned Tools: Tools commonly request broad scopes such as unrestricted filesystem or network access, amplifying the blast radius of any exploit.
  • Session Hijacking: Weak or shared session identifiers enable attackers to impersonate legitimate users, especially in horizontally scaled HTTP deployments. Note: The 2026-07-28 release candidate makes MCP stateless at the protocol layer, removing session IDs and, with them, protocol-level session hijacking. Servers that need state use explicit application-level handles instead, and those handles still need to be unguessable and tied to the authenticated user.
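  • Known CVE: CVE-2025-49596 in Anthropic’s MCP Inspector (CVSS 9.4) allows unauthenticated remote code execution through the Inspector proxy in versions before 0.14.1, particularly when the proxy is exposed without additional controls. Upgrade the Inspector and keep the proxy bound to localhost.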

Supply Chain & Cross-Server Risks

MCP’s ecosystem encourages installing third-party servers straight from package registries. Backdoored binaries or configuration files can silently exfiltrate data once deployed. When multiple servers are chained together, a malicious server can override tool definitions from another server, log payloads, or forward traffic to untrusted destinations. Treat every MCP server as an untrusted component until it is audited, signed, and sandboxed.
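One way to notice a server silently redefining or shadowing a tool is to pin a fingerprint of each tool definition at review time and compare on every later listing. The sketch below assumes tool definitions arrive as plain dictionaries (as in a tools/list result); the function names and the sample tool are hypothetical.

```python
import hashlib
import json


def fingerprint(tool: dict) -> str:
    """Hash a tool definition in a canonical JSON form."""
    canonical = json.dumps(tool, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def pin_tools(tools: list[dict]) -> dict[str, str]:
    """Record fingerprints for the tool definitions a human has reviewed."""
    return {tool["name"]: fingerprint(tool) for tool in tools}


def changed_tools(pinned: dict[str, str], current: list[dict]) -> list[str]:
    """Return names of tools that are new or differ from their pinned definition."""
    return [t["name"] for t in current if pinned.get(t["name"]) != fingerprint(t)]


# Hypothetical tool definition reviewed at install time.
reviewed = [{"name": "read_file", "description": "Read a file", "inputSchema": {"type": "object"}}]
pins = pin_tools(reviewed)

# Later, the server advertises a different description for the same tool name.
later = [{"name": "read_file", "description": "Read a file and upload it", "inputSchema": {"type": "object"}}]
print(changed_tools(pins, later))
```

Any name returned by changed_tools would be held back until a person re-approves the new definition.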

MCP vs. UTCP Security Comparison

The Universal Tool Calling Protocol (UTCP) positions itself as a security-first alternative. The table summarizes the most relevant differences.

Aspect | MCP | UTCP
Architecture | Proxy-based servers that broker every tool call, concentrating risk. | Direct client-to-tool calls (HTTP, gRPC, CLI) remove the middle layer.
Authentication | Dynamic OAuth client registration with frequent “token passthrough” anti-patterns. | Uses each tool’s native auth (API keys, OAuth, Basic Auth) with automatic refresh.
Attack Surface | High risk of local compromise, token theft, and cross-tool prompt abuse. | Reduced credential aggregation but increased exposure to browser-based spoofing.
Compliance | Hard to prove audience restrictions when tokens are proxied through servers. | Clear attribution because credentials stay with their originating APIs.
Bidirectional Features | Supports server-initiated calls, which attackers can abuse for token farming. | No bidirectional callbacks; eliminates that attack class but also limits workflows.

In short, MCP enables richer orchestration but carries higher local-system risk. UTCP lowers operational overhead but shifts attention to capability spoofing and browser-origin protections.

Hardening Checklist

  • Least Privilege: Scope every tool credential and filesystem permission to the minimum necessary.
  • Zero-Trust Networking: Place MCP servers behind gateways, restrict outbound traffic, and log every tool invocation.
  • Input Validation: Treat LLM output as untrusted—sanitize arguments before they touch shells, SQL engines, or cloud APIs.
  • Credential Hygiene: Store secrets in dedicated secret managers, rotate regularly, and avoid token passthrough.
  • Sandboxing: Run servers in containers or VMs with seccomp/AppArmor profiles and read-only filesystems where practical.
  • Auditable Supply Chain: Pin versions, verify signatures, and review changelogs before promoting new MCP servers.
  • User Oversight: Require explicit approval for high-impact operations and expose clear provenance for every tool call.
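The Input Validation item can be illustrated with a tool handler that checks model-supplied arguments and then passes them to the database only as bound parameters. This is a minimal sketch using the standard-library sqlite3 module; the orders table, the length limits, and the function names are assumptions for the example.

```python
import sqlite3

MAX_LIMIT = 100  # hypothetical upper bound on result size


def validate_search_args(arguments: dict) -> tuple[str, int]:
    """Reject arguments that do not match the expected types and ranges."""
    query = arguments.get("query")
    limit = arguments.get("limit", 10)
    if not isinstance(query, str) or not 0 < len(query) <= 200:
        raise ValueError("query must be a non-empty string of at most 200 characters")
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be an integer between 1 and {MAX_LIMIT}")
    return query, limit


def search_orders(conn: sqlite3.Connection, arguments: dict) -> list[tuple]:
    """Run the search with bound parameters; user text never becomes SQL."""
    query, limit = validate_search_args(arguments)
    cursor = conn.execute(
        "SELECT id, description FROM orders WHERE description LIKE ? LIMIT ?",
        (f"%{query}%", limit),
    )
    return cursor.fetchall()


# Hypothetical in-memory table for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, description TEXT)")
conn.executemany("INSERT INTO orders (description) VALUES (?)", [("laptop order",), ("phone order",)])

print(search_orders(conn, {"query": "laptop", "limit": 5}))
print(search_orders(conn, {"query": "x'; DROP TABLE orders; --"}))
```

The second call returns no rows and leaves the table intact, because the injected text is treated as a search string rather than as SQL.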

Sources: Red Hat, Pillar Security, Strobes Security, Practical DevSecOps, Bitdefender, Qualys ThreatProtect, Upwind, UTCP project documentation.

2026-07-28 Authorization Hardening

The 2026-07-28 release candidate includes six SEPs that significantly harden the authorization specification, aligning it with production OAuth 2.0 and OpenID Connect deployments. The main changes are:

  • Issuer Validation (SEP-2468): Clients must validate the iss parameter on authorization responses per RFC 9207, mitigating mix-up attacks in MCP's single-client, many-server pattern.
  • Application Type (SEP-837): Clients declare their OIDC application_type during Dynamic Client Registration, preventing misclassification of desktop/CLI clients as "web."
  • Credential Binding: Registered credentials are bound to their issuing authorization server, preventing cross-server credential confusion.
  • Token Refresh: Improved refresh semantics for long-running agentic workflows that may span hours or days.
  • Stateless Model: With session IDs removed, there are no session tokens to steal — reducing the attack surface for horizontal deployments behind load balancers.

These changes address many of the "Minimal Authentication Guidance" and "Token Theft" concerns listed above, making MCP significantly more secure for enterprise deployments.
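The issuer check from RFC 9207 can be sketched as follows: before using an authorization code, the client compares the iss parameter on the redirect with the issuer it sent the request to. The function name, the redirect URL, and the issuer values are hypothetical, and the sketch leaves out the state check a real client also performs.

```python
from urllib.parse import parse_qs, urlparse


def validate_issuer(redirect_url: str, expected_issuer: str) -> str:
    """Return the authorization code only if `iss` matches the expected issuer."""
    params = parse_qs(urlparse(redirect_url).query)
    issuers = params.get("iss", [])
    # RFC 9207 calls for a simple string comparison of issuer identifiers.
    if len(issuers) != 1 or issuers[0] != expected_issuer:
        raise ValueError("issuer mismatch: possible mix-up attack")
    codes = params.get("code", [])
    if len(codes) != 1:
        raise ValueError("missing or duplicated authorization code")
    return codes[0]


# Hypothetical redirect received by the client after user consent.
redirect = "https://client.example/callback?code=abc123&state=xyz&iss=https%3A%2F%2Fauth.example.com"
print(validate_issuer(redirect, "https://auth.example.com"))
```

A client that talks to many MCP servers would store the expected issuer alongside each pending authorization request and pass it in here.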

JSON-RPC Basics

JSON-RPC is a lightweight, stateless remote procedure call (RPC) protocol encoded in JSON, often used for communication between client and server applications. Below is an explanation and a basic example of using JSON-RPC in Python.

What is JSON-RPC?

  • JSON-RPC sends requests as JSON objects describing the method to call, its parameters, and an ID for tracking the response.
  • The server responds with a JSON object containing either the result or an error, along with the same ID for correlation.
  • It is transport-agnostic—can run over HTTP, WebSocket, etc.—and is commonly found in blockchain and API integrations.

Example: JSON-RPC in Python

Server Example

The following Python code creates a simple JSON-RPC server using the json-rpc library and Werkzeug:

from werkzeug.wrappers import Request, Response
from werkzeug.serving import run_simple
from jsonrpc import JSONRPCResponseManager, dispatcher

# Register methods once at import time rather than on every request.
@dispatcher.add_method
def foobar(**kwargs):
    """Called with named params, e.g. {"foo": 1, "bar": 2}."""
    return kwargs["foo"] + kwargs["bar"]

dispatcher["echo"] = lambda s: s
dispatcher["add"] = lambda a, b: a + b

@Request.application
def application(request):
    # Parse the JSON-RPC payload, dispatch it, and serialize the response.
    response = JSONRPCResponseManager.handle(
        request.data, dispatcher)
    return Response(response.json, mimetype='application/json')

if __name__ == '__main__':
    run_simple('localhost', 4000, application)

This server can handle "add", "echo", and "foobar" methods via JSON-RPC.

Client Example

A simple client using the requests library:

import requests

def main():
    url = "http://localhost:4000/jsonrpc"
    payload = {
        "method": "echo",
        "params": ["echome!"],
        "jsonrpc": "2.0",
        "id": 0,
    }
    # json= serializes the payload and sets the Content-Type header.
    response = requests.post(url, json=payload, timeout=5)
    response.raise_for_status()
    print(response.json())

if __name__ == "__main__":
    main()

This client sends an "echo" call and prints the server's response.

Typical JSON-RPC Message Structure

  • Request:
    {
      "jsonrpc": "2.0",
      "method": "add",
      "params": [3, 4],
      "id": 1
    }
  • Response:
    {
      "jsonrpc": "2.0",
      "result": 7,
      "id": 1
    }

The server executes the requested method and returns the result in this format.
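To make the request and response shapes concrete without any library, the following sketch dispatches a single JSON-RPC 2.0 request and builds either a result or an error object. The METHODS table and the handle function are illustrative names; batches and notifications are deliberately left out.

```python
import json

# Hypothetical method table for the example.
METHODS = {
    "add": lambda a, b: a + b,
    "echo": lambda s: s,
}


def error(code: int, message: str, request_id) -> str:
    """Build a JSON-RPC 2.0 error response."""
    return json.dumps({"jsonrpc": "2.0", "error": {"code": code, "message": message}, "id": request_id})


def handle(raw: str) -> str:
    """Handle one request object and return the serialized response."""
    try:
        request = json.loads(raw)
    except json.JSONDecodeError:
        return error(-32700, "Parse error", None)
    if not isinstance(request, dict) or request.get("jsonrpc") != "2.0":
        return error(-32600, "Invalid Request", None)
    request_id = request.get("id")
    method = METHODS.get(request.get("method"))
    if method is None:
        return error(-32601, "Method not found", request_id)
    params = request.get("params", [])
    # Params may be positional (list) or named (object).
    result = method(**params) if isinstance(params, dict) else method(*params)
    return json.dumps({"jsonrpc": "2.0", "result": result, "id": request_id})


print(handle('{"jsonrpc": "2.0", "method": "add", "params": [3, 4], "id": 1}'))
print(handle('{"jsonrpc": "2.0", "method": "missing", "id": 2}'))
```

The error codes used here (-32700, -32600, -32601) are the ones reserved by the JSON-RPC 2.0 specification for parse errors, invalid requests, and unknown methods.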

Model Context Protocol (MCP): JSON-RPC Foundation for LLM-Tool Integration

The Model Context Protocol (MCP) represents a standardized approach to connecting Large Language Models (LLMs) with external tools and resources through JSON-RPC 2.0. Developed by Anthropic, MCP enables secure, efficient communication between AI models and various data sources, APIs, and computational tools, creating a foundation for more capable and context-aware AI applications.

MCP Architecture and Core Concepts

MCP operates on a client-server model in which a host application (an IDE, chat interface, or agent runtime) runs MCP clients that discover and call the tools and resources MCP servers provide, typically on behalf of an LLM. The protocol is built on JSON-RPC 2.0, ensuring standardized communication patterns across different implementations.

Key Components
  • MCP Servers expose tools, resources, and prompts to clients
  • MCP Clients (connectors inside the host application) discover and use available capabilities on the model's behalf
  • Tools are executable functions that perform specific actions
  • Resources are data sources that can be read and, where the server supports it, subscribed to for change notifications
  • Prompts are reusable templates for common interactions

JSON-RPC Implementation in MCP

MCP leverages JSON-RPC 2.0 as its communication foundation, providing a robust, language-agnostic protocol for LLM-tool interactions. The protocol structure follows standard JSON-RPC patterns:

{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "search_database",
    "arguments": {
      "query": "customer orders",
      "limit": 10
    }
  },
  "id": 1
}

MCP extends JSON-RPC with specialized methods for tool discovery, resource management, and prompt handling, creating a comprehensive framework for LLM-tool integration.

MCP Communication Flow

The typical MCP interaction demonstrates how JSON-RPC enables seamless LLM-tool communication:

Initialization Phase

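MCP clients establish connections with servers through a standardized handshake, exchanging protocol versions and capability information. Authentication is handled at the transport layer (for example, OAuth for HTTP transports) rather than inside the handshake itself. Note that the 2026-07-28 release candidate deprecates the initialize/initialized handshake.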

Discovery Phase

Clients query servers to discover available tools, resources, and prompts using JSON-RPC methods like tools/list, resources/list, and prompts/list.

Tool Execution
  1. Tool Invocation: Client sends JSON-RPC request with tool name and parameters
  2. Server Processing: MCP server executes the tool and processes the request
  3. Result Return: Server returns results through JSON-RPC response format
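The three steps above can be sketched from the server's side. The handler below looks up the named tool, runs it, and wraps the outcome in a tools/call result with a content list and an isError flag. The TOOLS registry and the search_database stub are hypothetical stand-ins; a real server would normally rely on an SDK rather than hand-rolled dispatch.

```python
def search_database(query: str, limit: int = 10) -> str:
    """Hypothetical stand-in for a real database search."""
    return f"{limit} rows requested for {query!r}"


# Hypothetical registry mapping tool names to callables.
TOOLS = {"search_database": search_database}


def handle_tools_call(request: dict) -> dict:
    """Execute a tools/call request and build the JSON-RPC response."""
    params = request["params"]
    tool = TOOLS.get(params["name"])
    if tool is None:
        # Unknown tools are reported as protocol-level errors.
        return {
            "jsonrpc": "2.0",
            "id": request["id"],
            "error": {"code": -32602, "message": f"Unknown tool: {params['name']}"},
        }
    try:
        text = tool(**params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    except Exception as exc:
        # Failures inside the tool are reported in the result, not as protocol errors.
        result = {"content": [{"type": "text", "text": str(exc)}], "isError": True}
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}


request = {
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {"name": "search_database", "arguments": {"query": "customer orders", "limit": 10}},
    "id": 1,
}
print(handle_tools_call(request))
```

Keeping tool failures inside the result lets the model see the error text and decide how to recover.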

Resource Management

MCP supports dynamic resource access through methods like resources/list, resources/read, and resources/subscribe for real-time data monitoring. The specification defines no resources/write method; changes to data go through tools instead:

{
  "jsonrpc": "2.0",
  "method": "resources/read",
  "params": {
    "uri": "file:///data/customer_analytics.json"
  },
  "id": 2
}

MCP vs. Other AI Integration Protocols

MCP distinguishes itself from other AI agent protocols through its specific focus and implementation approach:

Protocol | Primary Focus | Communication Method | Use Case
MCP | LLM-tool integration | JSON-RPC 2.0 client-server | AI model resource access
A2A | Agent-to-agent collaboration | JSON-RPC 2.0 over HTTP/SSE | Enterprise multi-agent workflows
OpenAI Function Calling | Tool integration | Proprietary API format | OpenAI model tool access
LangChain Tools | Framework-specific tools | Python object methods | LangChain ecosystem integration

MCP Implementation Benefits

MCP's JSON-RPC foundation provides several advantages for LLM-tool integration:

  • Standardized tool discovery enabling LLMs to dynamically find and use available capabilities
  • Secure resource access with fine-grained permissions and authentication
  • Language-agnostic implementation supporting tools written in any programming language
  • Real-time resource monitoring through subscription-based updates
  • Vendor-neutral architecture reducing dependency on specific AI model providers

Python Implementation Example

A basic MCP server sketch demonstrating tool and resource exposure. It uses the FastMCP helper from the official Python SDK (the mcp package); the database search is a hypothetical stub, and the file is assumed to be saved as server.py:

from mcp.server.fastmcp import FastMCP

# Create MCP server instance
mcp = FastMCP("example-mcp-server")

async def perform_database_search(query: str, limit: int) -> list[dict]:
    """Hypothetical stand-in for a real database search."""
    return [{"id": i, "match": query} for i in range(limit)]

# Define a tool for database queries
@mcp.tool()
async def search_database(query: str, limit: int = 10) -> dict:
    """Search the customer database for matching records."""
    results = await perform_database_search(query, limit)
    return {
        "results": results,
        "count": len(results),
        "query": query
    }

# Define a resource for customer analytics, addressed by URI
@mcp.resource("analytics://customers")
def get_customer_analytics() -> dict:
    """Get current customer analytics data."""
    return {
        "total_customers": 1250,
        "active_orders": 89,
        "revenue_today": 15420.50
    }

if __name__ == "__main__":
    # Serve over stdio; the client launches this script as a subprocess.
    # Avoid printing to stdout here, since stdout carries the protocol messages.
    mcp.run()

MCP Client Integration

Clients inside a host application can connect to MCP servers to access tools and resources. This sketch launches the server above over stdio:

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from pydantic import AnyUrl

async def main():
    # Launch the server as a subprocess and connect over stdio
    params = StdioServerParameters(command="python", args=["server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Discover available tools
            listing = await session.list_tools()
            print(f"Available tools: {[tool.name for tool in listing.tools]}")

            # Use a tool
            result = await session.call_tool(
                "search_database",
                {"query": "recent orders", "limit": 5}
            )
            print(f"Search results: {result.content}")

            # Access a resource by its URI
            analytics = await session.read_resource(AnyUrl("analytics://customers"))
            print(f"Customer analytics: {analytics.contents}")

if __name__ == "__main__":
    asyncio.run(main())

Advanced MCP Features

MCP supports several advanced features that enhance LLM-tool integration:

Resource Subscriptions

Clients can subscribe to resource changes for real-time updates:

{
  "jsonrpc": "2.0",
  "method": "resources/subscribe",
  "params": {
    "uri": "file:///data/live_metrics.json"
  },
  "id": 3
}

Prompt Templates

MCP supports reusable prompt templates for common interactions:

{
  "jsonrpc": "2.0",
  "method": "prompts/get",
  "params": {
    "name": "data_analysis_prompt",
    "arguments": {
      "dataset": "customer_behavior",
      "analysis_type": "trend"
    }
  },
  "id": 4
}

Security and Authentication

MCP deployments should layer several security measures on top of the protocol, since the specification leaves much of this to implementers:

  • Token-based authentication for secure server access
  • Capability-based permissions controlling tool and resource access
  • Input validation ensuring safe parameter handling
  • Rate limiting preventing abuse and ensuring fair resource usage
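Rate limiting, the last item above, is often implemented as a token bucket. The sketch below is a generic single-process version, not something the MCP specification prescribes; the class name, the parameters, and the injectable clock are choices made for the example.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock
        self.updated = clock()

    def allow(self) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# One bucket per client; here a single hypothetical client with a burst of 3.
bucket = TokenBucket(rate=1.0, capacity=3)
print([bucket.allow() for _ in range(5)])
```

A server would typically keep one bucket per authenticated client and reject or delay tool calls when allow() returns False.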

Future of LLM-Tool Integration

The Model Context Protocol represents a significant advancement in standardized LLM-tool integration. As AI models become more capable and organizations deploy increasingly sophisticated AI applications, MCP enables:

  • Seamless tool discovery allowing LLMs to dynamically find and use available capabilities
  • Secure resource access with fine-grained control over data and functionality exposure
  • Vendor-neutral tool ecosystems reducing lock-in and increasing flexibility
  • Scalable AI architectures supporting complex workflows with multiple tools and resources

The adoption of JSON-RPC as the foundation for MCP demonstrates how established web standards can be effectively adapted to meet the unique requirements of LLM-tool integration, providing a solid technical foundation for the next generation of context-aware AI applications.

Practical Implementation Resources

For comprehensive Python-based examples and implementations of MCP, including working code samples, server configurations, and detailed documentation, visit the AI Agents Basics repository. This resource provides production-ready implementations that demonstrate best practices for building MCP servers and integrating them with LLM clients.

Production-Ready RAG Architecture for Model Context Protocol Servers

Main Takeaway:
An optimized Retrieval-Augmented Generation (RAG) pipeline tailored for Model Context Protocol (MCP) servers can deliver 98.5% cost savings and 13× lower latency by separating offline preprocessing from runtime execution, employing efficient semantic chunking, lightweight embeddings, and adaptive retrieval strategies.

Architecture Overview

This production-ready RAG system is specifically optimized for Model Context Protocol servers, emphasizing extreme cost efficiency (98.5% savings) and minimal latency while maintaining high-quality context retrieval.

The architecture separates the heavy lifting into two distinct phases:

  1. Offline Preprocessing: Ingest, segment, and embed documentation into a vector index.
  2. Runtime Query Execution: Embed incoming queries, retrieve top-k context chunks, and assemble a concise prompt.

Such separation ensures production readiness—minimal latency, tiny compute footprint, and dramatically reduced LLM token costs.

Offline Processing: The Foundation

Documentation Sources and Ingestion

The system begins by fetching documentation from various sources—GitHub repositories, API documentation, and user guides. This one-time setup phase is critical for establishing the knowledge base that the MCP server will expose to AI clients.

MCP servers typically focus on specific integration points (GitHub, PostgreSQL, file systems), and each requires tailored documentation to enable semantic understanding. The offline processing ensures that when an MCP client (embedded in applications like Claude Desktop or Cursor) connects, the server can immediately provide contextually relevant information.

  • Sources: GitHub repos, API references, user manuals, SQL schemas.
  • MCP Integration Points: Each connector (GitHub, PostgreSQL, filesystem) uses a tailored ingestion script to normalize metadata (function signatures, code examples, config snippets).

Chunking Engine: Semantic Segmentation

The chunking strategy uses 512-1024 token chunks with 20% overlap. This configuration represents industry best practices for RAG systems based on recent research.

Why 512-1024 tokens? This range balances semantic coherence with retrieval granularity. Chunks that are too small (under 200 tokens) fragment context and lose meaning, while overly large chunks (over 2048 tokens) dilute relevance and increase noise. For technical documentation typical in MCP use cases, 600-1000 tokens captures complete concepts—function definitions, usage examples, or configuration patterns—without splitting critical information mid-thought.

Why 20% overlap? Overlap prevents information loss at chunk boundaries, particularly important for technical content where a concept might span multiple sentences. A 20% overlap (roughly 100-200 tokens) ensures that:

  • Context continuity is preserved across chunks
  • Key phrases appearing near boundaries are captured in multiple chunks, improving retrieval recall by 15-30%
  • The model can reconstruct coherent narratives when multiple chunks are retrieved

Higher overlap (30%+) increases storage costs and redundancy without proportional retrieval gains, while lower overlap (under 10%) risks losing critical transitional information.

  • Chunk Size: 600–1,000 tokens, balancing complete technical concepts with retrieval granularity.
  • Overlap: 20% (≈120–200 tokens) to preserve context across splits and boost recall by 15–30%.
  • Advanced Option: Document-aware chunking that respects code blocks, headers, and tables—up to +40% retrieval accuracy for structured docs.
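Fixed-size chunking with overlap can be sketched in a few lines. The example assumes the document has already been tokenized into a list; whitespace-free placeholder tokens stand in for a real tokenizer, and the function name and defaults are illustrative.

```python
def chunk_tokens(tokens: list[str], chunk_size: int = 512, overlap_ratio: float = 0.2) -> list[list[str]]:
    """Split tokens into chunks of `chunk_size` that share `overlap_ratio` of their length."""
    if chunk_size <= 0 or not 0 <= overlap_ratio < 1:
        raise ValueError("chunk_size must be positive and overlap_ratio in [0, 1)")
    # Each new chunk starts `step` tokens after the previous one.
    step = max(1, chunk_size - int(chunk_size * overlap_ratio))
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end of the document
    return chunks


# Placeholder "tokens"; a real pipeline would use the embedding model's tokenizer.
tokens = [f"tok{i}" for i in range(1200)]
chunks = chunk_tokens(tokens, chunk_size=512, overlap_ratio=0.2)
print(len(chunks), [len(c) for c in chunks])
```

With a 512-token chunk and 20% overlap, consecutive chunks share 102 tokens, so a sentence that straddles a boundary appears whole in at least one of them.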

Embedding Model: all-MiniLM-L6-v2

The architecture specifies all-MiniLM-L6-v2 with 384-dimensional embeddings. This is an excellent choice for MCP server implementations due to several factors:

Efficiency: With only 22.7 million parameters and a 91MB model size, it's lightweight enough for edge deployment and local MCP server hosting. This aligns with MCP's design philosophy where servers often run as local processes via stdio transport.

Performance: Despite its compact size, all-MiniLM-L6-v2 was trained on over 1 billion sentence pairs using contrastive learning. It produces semantically rich embeddings suitable for technical documentation retrieval, capturing nuanced relationships between API methods, configuration parameters, and usage patterns.

Semantic Similarity Matching: The model was specifically trained using cosine similarity as its distance metric. This is crucial because cosine similarity measures directional alignment between vectors rather than absolute magnitude, making it ideal for semantic search where conceptual similarity matters more than exact phrasing.

The 384-dimensional output balances expressiveness with computational efficiency. Lower dimensions (128-256) might miss subtle semantic distinctions in technical content, while higher dimensions (768+, as in all-mpnet-base-v2) increase storage and compute costs without proportional gains for most MCP use cases.

  • Model: all-MiniLM-L6-v2 (384-dim)
    • Lightweight (22.7M parameters, 91 MB) for edge deployment.
    • Trained via contrastive learning on 1B+ sentence pairs, yielding semantically rich vectors.
    • Cosine similarity–optimized metric ensures robust semantic ranking.
  • Batch Processing: Executes on CPU or GPU cluster, writing embeddings to FAISS or SQLite-vec index.

Vector Database: Storage Options

The system indexes 4,250 chunks in-memory or SQLite. This dual-option approach reflects practical deployment considerations:

In-Memory Storage: Offers microsecond-level query latency and is ideal for MCP servers handling frequent, real-time queries. With 4,250 chunks at 384 dimensions (float32), the memory footprint is approximately 6.5MB for vectors alone—trivial for modern systems. In-memory databases like FAISS or Chroma can handle 10,000+ operations per second.

SQLite Storage: Provides persistence without requiring separate database infrastructure. SQLite's on-disk approach keeps the application's memory overhead small (~250-400KB), with data stored on the filesystem. Modern SQLite supports vector similarity search through extensions such as sqlite-vec, enabling nearest-neighbor queries directly in SQL.

For MCP servers that need to persist across sessions or run in resource-constrained environments (edge devices, mobile apps), SQLite offers the perfect balance. The stdio transport pattern where MCP clients spawn servers as subprocesses makes SQLite particularly attractive—each server instance can quickly load its vector index from disk rather than recomputing embeddings.

  • In-Memory (FAISS/Chroma): < 10 ms retrieval, ideal for high-throughput local MCP deployments (e.g., VS Code, Claude Desktop).
  • SQLite with a vector extension (e.g., sqlite-vec): Durable on-disk storage, ~0.4 MB overhead, well suited to edge or subprocess-spawned servers.
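The in-memory footprint quoted above follows directly from the index dimensions. As a quick check, assuming 4 bytes per float32 value and ignoring index overhead and chunk text:

```python
def vector_index_bytes(num_chunks: int, dimensions: int, bytes_per_value: int = 4) -> int:
    """Raw size of the embedding matrix, excluding index overhead and chunk text."""
    return num_chunks * dimensions * bytes_per_value


size = vector_index_bytes(num_chunks=4250, dimensions=384)
print(f"{size:,} bytes = {size / 1e6:.2f} MB")
```

This gives 6,528,000 bytes, or about 6.5 MB, for the vectors alone.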

Runtime Query Retrieval Flow

User Query Processing

When a user issues a query like "How do I create an agent?" (~50 tokens), the MCP client sends it to the server via JSON-RPC 2.0 messages. MCP mandates JSON-RPC 2.0 for all client-server communication, ensuring standardized request/response patterns.

The query travels through the MCP transport layer: either stdio (for local integrations) or Streamable HTTP (for remote connections; the older HTTP+SSE transport is deprecated). Stdio is preferred for local MCP servers because it offers sub-millisecond latency by eliminating network stack overhead. When the server runs locally (common for Claude Desktop, Cursor, VS Code integrations), stdio achieves 10,000+ operations per second versus HTTP's 100-1,000 ops/sec.

Embedding and Semantic Search

The user query is embedded using the same all-MiniLM-L6-v2 model used during offline processing. Consistency between indexing and query embedding models is critical—using different models would map queries and documents into incompatible vector spaces, degrading retrieval accuracy.

The embedded query vector is then compared against the indexed chunks using cosine similarity. Cosine similarity computes the cosine of the angle between vectors, ranging from -1 (opposite directions) to +1 (identical directions). For normalized embeddings (as produced by all-MiniLM-L6-v2), cosine similarity is equivalent to the dot product, enabling highly optimized computation.

The system retrieves the top-5 most semantically similar chunks. This top-k parameter balances context richness with token efficiency. With chunks averaging 512 tokens, 5 chunks provide ~2,560 tokens of context—sufficient to answer most queries while leaving headroom for the query itself and model's response within typical context windows.

Context Assembly and LLM Integration

The retrieved chunks (totaling ~3,200 tokens including metadata) are assembled into a coherent context block. The MCP server then sends this context to the LLM client (Claude, GPT, etc.) along with the original user query (~50 tokens), resulting in approximately 3,250 total input tokens.

This is where the architecture's efficiency shines. By retrieving only the most relevant 3,250 tokens instead of naively passing all 217,600 tokens of documentation, the system achieves dramatic improvements across multiple dimensions.

  • Protocol: JSON-RPC 2.0 over stdio (local) or Streamable HTTP (remote). stdio eliminates network overhead, achieving < 1 ms transport latency.
  • Embed Query: Same all-MiniLM-L6-v2 model for index consistency.
  • Top-k Retrieval: k = 5 chosen to provide ~2,560 tokens of context—maximizing relevance while conserving context window.
  • Similarity Metric: Cosine similarity (dot product on normalized vectors), enabling optimized ANN search.
  • Concatenate top chunks (≈3,200 tokens incl. metadata) with the user query (≈50 tokens).
  • Total Input: ≈3,250 tokens, leaving >98% spare capacity in a 200 K token window for multi-turn dialogues, tool responses, or code snippets.
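The retrieval step summarized above can be sketched in pure Python. The three-dimensional vectors and chunk IDs below are toy stand-ins for 384-dimensional embeddings, and a production system would use a vector library rather than a linear scan.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], index: dict[str, list[float]], k: int = 5) -> list[tuple[float, str]]:
    """Score every chunk against the query and keep the k best."""
    scored = [(cosine_similarity(query_vec, vec), chunk_id) for chunk_id, vec in index.items()]
    return sorted(scored, reverse=True)[:k]


# Hypothetical toy index: chunk ID -> embedding.
index = {
    "create-agent": [0.9, 0.1, 0.0],
    "configure-db": [0.0, 1.0, 0.0],
    "agent-tools": [0.7, 0.3, 0.1],
}
query = [1.0, 0.0, 0.0]

for score, chunk_id in top_k(query, index, k=2):
    print(f"{chunk_id}: {score:.3f}")
```

Because the embeddings are normalized, the same ranking could be produced with a plain dot product, which is what optimized vector indexes exploit.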

Performance Comparison: Anti-Pattern vs Smart RAG

The Anti-Pattern: Full Context Injection

The "anti-pattern" approach dumps all 217,600 tokens into the LLM context. While this ensures nothing is missed, it creates severe problems:

Cost: At roughly $0.65 per call, this approach quickly becomes prohibitively expensive. For Claude 3.5 Sonnet priced at $3 per million input tokens, 217,600 tokens cost (217,600 / 1,000,000) × $3 ≈ $0.65. Over 1,000 daily queries, that's about $650/day or $237,000 annually just for input tokens.

Context Window Utilization: The 217,600 tokens represent 108% of a typical context window. Claude 3.5 Sonnet supports 200,000 token contexts, meaning this approach literally exceeds the model's capacity without aggressive truncation. Even models with 200K+ windows suffer quality degradation when contexts approach their limits.

Latency: Processing 217,600 tokens introduces 2.6 seconds of latency. LLM inference scales roughly linearly with input token count, as each token must pass through attention mechanisms. For real-time MCP interactions (code completion, live documentation lookup), 2.6-second delays destroy user experience.

The Smart RAG Approach

By contrast, the optimized RAG system achieves remarkable efficiency:

Cost: $0.01 per call—a 98.5% reduction. With 3,250 input tokens at Claude 3.5 Sonnet's $3/MTok rate: (3,250 / 1,000,000) × $3 = $0.00975 ≈ $0.01. Over 1,000 daily queries, that's $10/day or $3,650 annually—a $233,000 savings compared to the anti-pattern.

Context Window Utilization: Only 1.6% of the context window. This leaves massive headroom for multi-turn conversations, code snippets, or additional tool outputs—essential for agentic MCP workflows where multiple servers contribute context.

Latency: 0.2 seconds, a 13x improvement over the 2.6-second baseline. Response times around 200 ms enable real-time interactions where MCP servers feel instantaneous to users.

Metric | Naïve Full-Dump | Optimized RAG
Input Tokens per Query | 217,600 | 3,250
Cost per Call | $0.65 | $0.01
Cost Savings | baseline | 98.5%
Latency | ~2.6 s | ~0.2 s
Context Window Usage | 108% (exceeds limit) | 1.6%
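The figures in the table can be reproduced from the token counts. The sketch below assumes the $3 per million input tokens and 200,000-token window used in this section; the constant and function names are illustrative.

```python
PRICE_PER_MTOK = 3.00      # USD per million input tokens (assumed rate from this section)
CONTEXT_WINDOW = 200_000   # tokens


def cost_per_call(tokens: int) -> float:
    """Input-token cost of a single call in USD."""
    return tokens / 1_000_000 * PRICE_PER_MTOK


def window_usage(tokens: int) -> float:
    """Share of the context window consumed, as a percentage."""
    return tokens / CONTEXT_WINDOW * 100


full_dump, rag = 217_600, 3_250
savings = 1 - cost_per_call(rag) / cost_per_call(full_dump)

print(f"Full dump: ${cost_per_call(full_dump):.4f} per call, {window_usage(full_dump):.1f}% of window")
print(f"RAG:       ${cost_per_call(rag):.5f} per call, {window_usage(rag):.2f}% of window")
print(f"Savings:   {savings:.1%}")
```

The savings figure depends only on the ratio of the two token counts, so it holds for any per-token price.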

Advanced Considerations

Context Window Utilization as a Hyper-Parameter

Recent research introduces Context Window Utilization as a formal RAG hyper-parameter. The optimal chunk size balances providing sufficient context against minimizing irrelevant information. The 512-1024 token range with top-5 retrieval represents a sweet spot: enough context to answer complex queries without overwhelming the model.

For MCP servers handling diverse query types (quick lookups vs. multi-step reasoning), dynamic adjustment of top-k based on query complexity can further optimize this balance.

Semantic Chunking Enhancements

While the basic approach uses fixed-size chunking with overlap, advanced implementations might incorporate semantic chunking—splitting documents based on meaning rather than token counts. For highly structured MCP documentation (API references, code examples), document-aware chunking that respects headers, code blocks, and tables can improve retrieval accuracy by 40%+.

RAG-MCP Integration Pattern

The architecture embodies principles from the RAG-MCP paper, which proposes using retrieval to dynamically select relevant tools/documentation rather than overwhelming the LLM with everything upfront. This is particularly powerful for MCP ecosystems where dozens of servers might be available—retrieving tool schemas on-demand prevents "prompt bloat" and scales gracefully.

Deployment Patterns

Two deployment patterns follow from this design, along with a rough cost estimate:

Local MCP Servers (stdio transport): all-MiniLM-L6-v2 + SQLite enables fully self-contained servers that bundle documentation, embeddings, and retrieval logic in a single process. Startup time is under 1 second with persistent SQLite storage.

Remote MCP Servers (Streamable HTTP transport): For shared documentation services or enterprise deployments, the same architecture scales to handle multiple concurrent clients. A single RAG backend can serve hundreds of MCP clients, with retrieval costs amortized across users.

Cost Analysis: For a 1,000 request/day MCP server, the Smart RAG approach costs ~$3,650/year for LLM inference. Embedding adds no marginal cost because all-MiniLM-L6-v2 runs locally, vector storage is small (about 6.5 MB of float32 vectors for 4,250 chunks, plus the chunk text), and semantic search needs minimal compute, so total cost of ownership stays under $5,000/year, which is small compared to the productivity gains.

  • Dynamic Top-k: Adjust k based on query complexity—smaller for boolean lookups, larger for multi-step reasoning.
  • Adaptive Chunk Sizing: Automatically tune chunk length per document type (e.g., 800 tokens for prose, 512 tokens for code).
  • Relevance Thresholding: Discard chunks below a similarity cutoff to reduce noise.
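Dynamic top-k and relevance thresholding can be combined in a small post-processing step. The word-count heuristic and the 0.3 cutoff below are arbitrary assumptions for illustration, not values recommended by this architecture; a real system would tune both on its own queries.

```python
def choose_k(query: str, base_k: int = 5) -> int:
    """Hypothetical heuristic: short lookups get fewer chunks, long questions more."""
    words = len(query.split())
    if words <= 4:
        return max(1, base_k - 2)
    if words >= 15:
        return base_k + 3
    return base_k


def filter_by_threshold(scored: list[tuple[float, str]], k: int, min_score: float = 0.3) -> list[tuple[float, str]]:
    """Drop chunks below the similarity cutoff, then keep at most k of the rest."""
    kept = [(score, chunk_id) for score, chunk_id in scored if score >= min_score]
    return sorted(kept, reverse=True)[:k]


# Hypothetical (similarity, chunk ID) pairs from a retrieval step.
scored = [(0.82, "create-agent"), (0.41, "agent-tools"), (0.12, "billing-faq"), (0.35, "agent-config")]
query = "create agent"
print(filter_by_threshold(scored, k=choose_k(query)))
```

The threshold can leave fewer than k chunks, which is the intended effect: weakly related text is omitted rather than padded into the prompt.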

Conclusion

This RAG architecture represents a mature, production-ready pattern for MCP server implementations. By combining efficient chunking strategies, lightweight embedding models, and intelligent retrieval, it achieves 98.5% cost savings and 13x latency improvements over naive approaches while maintaining high-quality responses.

The design aligns perfectly with MCP's philosophy of modular, standardized context provision. Whether you're building MCP servers for GitHub integration, database queries, or custom documentation systems, this architecture provides a proven blueprint for scalable, cost-effective semantic search.

Key Benefits:
By combining efficient chunking, lightweight embeddings, and adaptive retrieval, this RAG-MCP blueprint delivers production readiness, extreme cost efficiency, and latency of roughly 200 ms, supporting AI agent interactions across local and cloud environments.

MCP Inspector: Visual Testing and Debugging Tool

The MCP Inspector is a powerful, open-source developer tool that provides an interactive visual interface for testing, debugging, and exploring MCP servers. It bridges the gap between server development and deployment by offering real-time insights into server behavior and capabilities.

Architecture Overview

The MCP Inspector consists of two main components working in tandem:

  • MCP Inspector Client (MCPI): A React-based web UI that provides an intuitive interface for interacting with MCP servers. It runs on port 6274 by default (derived from the T9 dialpad mapping of MCPI as a mnemonic).
  • MCP Proxy (MCPP): A Node.js server that acts as a protocol bridge, functioning as both an MCP client (connecting to your server) and an HTTP server (serving the web UI). It runs on port 6277 by default (T9 mapping of MCPP).
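The port mnemonic can be checked by mapping each letter to its key on a standard phone keypad. The helper name below is made up for the example:

```python
# Standard phone keypad letter groups.
T9_KEYS = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}
LETTER_TO_DIGIT = {letter: digit for digit, letters in T9_KEYS.items() for letter in letters}


def t9_port(word: str) -> int:
    """Translate a word into the digits a T9 keypad would produce."""
    return int("".join(LETTER_TO_DIGIT[char] for char in word.upper()))


print(t9_port("MCPI"), t9_port("MCPP"))
```

This yields 6274 for MCPI and 6277 for MCPP, matching the two default ports.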

The proxy enables browser-based interaction with MCP servers using different transport protocols (stdio, SSE, streamable-http) without requiring direct browser support for these protocols.

Getting Started with MCP Inspector

The Inspector can be run directly through npx without requiring installation, making it immediately accessible for any MCP server development workflow.

Basic Usage

# Quick start with UI mode
npx @modelcontextprotocol/inspector

# Inspect a specific server
npx @modelcontextprotocol/inspector node build/index.js

# With arguments and environment variables
npx @modelcontextprotocol/inspector -e API_KEY=value node build/index.js arg1 arg2

Server Types Support

Server Type | Command Example | Use Case | Transport
Local Development | npx @modelcontextprotocol/inspector node build/index.js | Testing locally developed servers | stdio
NPM Packages | npx @modelcontextprotocol/inspector npx server-postgres postgres://127.0.0.1/testdb | Testing published NPM packages | stdio
Python Packages | npx @modelcontextprotocol/inspector uvx mcp-server-git --repository ~/code/repo.git | Testing Python MCP servers | stdio
Remote Servers | npx @modelcontextprotocol/inspector https://my-server.com | Testing remote HTTP-based servers | Streamable HTTP (default)
Legacy Servers | npx @modelcontextprotocol/inspector --transport http-sse https://legacy-server.com | Testing deprecated HTTP+SSE servers | HTTP+SSE (deprecated)

Core Features and Capabilities

Server Connection Management

  • Transport Selection: Supports current MCP transport mechanisms (stdio, Streamable HTTP) and deprecated HTTP+SSE for legacy compatibility
  • Connection Status Monitoring: Real-time connection status and capability negotiation visualization
  • Environment Configuration: Customizable command-line arguments and environment variables
  • Authentication Support: Bearer token authentication for HTTP-based connections with customizable header names
  • Transport Recommendations: Automatically recommends Streamable HTTP for new HTTP-based server configurations
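
To make the authentication option concrete, here is a minimal sketch of how a client might build the header for a token-protected server. It assumes the token is sent as `Bearer <token>`; the function name `buildAuthHeaders` and the custom header `X-MCP-Auth` are hypothetical and only illustrate the "customizable header name" idea.

```typescript
// Build request headers for a bearer-token-protected MCP server.
// headerName is configurable because some gateways expect a non-standard header.
function buildAuthHeaders(token: string, headerName = "Authorization"): Record<string, string> {
  if (token.trim() === "") {
    throw new Error("Bearer token must not be empty");
  }
  return { [headerName]: `Bearer ${token}` };
}

// Default header name
const standard = buildAuthHeaders("example-token");
// Hypothetical custom header name
const custom = buildAuthHeaders("example-token", "X-MCP-Auth");
console.log(standard, custom);
```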

Resources Tab

  • Resource Discovery: Lists all available resources with metadata (MIME types, descriptions)
  • Content Inspection: Allows detailed examination of resource content
  • Subscription Testing: Enables testing of resource change notifications and subscriptions
  • URI Validation: Helps verify proper resource URI formatting

Tools Tab

  • Tool Schema Visualization: Displays complete tool schemas and descriptions
  • Interactive Testing: Form-based parameter input with JSON editor support
  • Execution Results: Real-time display of tool execution results and errors
  • Parameter Validation: Helps ensure tools receive properly formatted inputs
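
The kind of check that parameter validation performs can be sketched in a few lines. The validator below is an illustration only: it handles a small subset of JSON Schema (flat objects, primitive types, `required`, `enum`), and the `forecastSchema` tool schema is hypothetical. A real server would normally use a full JSON Schema library instead.

```typescript
// A small subset of JSON Schema, enough for flat tool inputs.
type PropertySchema = {
  type: "string" | "number" | "integer" | "boolean";
  enum?: readonly unknown[];
};
type InputSchema = {
  type: "object";
  properties: Record<string, PropertySchema>;
  required?: readonly string[];
};

// Return a list of human-readable problems; an empty list means the input passed.
function validateToolInput(schema: InputSchema, input: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const name of schema.required ?? []) {
    if (input[name] === undefined) problems.push(`missing required property "${name}"`);
  }
  for (const [name, value] of Object.entries(input)) {
    const prop = schema.properties[name];
    if (!prop) continue; // unknown properties are allowed in this sketch
    const typeOk =
      prop.type === "integer"
        ? typeof value === "number" && Number.isInteger(value)
        : typeof value === prop.type;
    if (!typeOk) {
      problems.push(`"${name}" should be of type ${prop.type}`);
    } else if (prop.enum && !prop.enum.includes(value)) {
      problems.push(`"${name}" must be one of ${prop.enum.join(", ")}`);
    }
  }
  return problems;
}

// Hypothetical tool schema used only for this example.
const forecastSchema: InputSchema = {
  type: "object",
  properties: {
    city: { type: "string" },
    days: { type: "integer" },
    units: { type: "string", enum: ["metric", "imperial"] },
  },
  required: ["city"],
};

console.log(validateToolInput(forecastSchema, { city: "Oslo", days: 3 }));
console.log(validateToolInput(forecastSchema, { days: 1.5 }));
```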

Prompts Tab

  • Template Management: View and test available prompt templates
  • Argument Testing: Test prompts with custom arguments and different parameter combinations
  • Message Preview: Preview generated messages before execution
  • Sampling Support: Interactive prompt sampling with streaming responses

Notifications Pane

  • Comprehensive Logging: Displays all logs recorded from the server
  • Server Notifications: Shows real-time notifications received from the server
  • Error Tracking: Detailed error information with stack traces
  • Performance Monitoring: Request timing and performance metrics

Advanced Features

Configuration Export

The Inspector provides convenient export functionality for integrating tested configurations with production clients:

  • Server Entry Export: Copies individual server configurations for integration into existing mcp.json files
  • Complete Configuration Export: Generates full configuration files ready for use with clients like Cursor, Claude Code, or other MCP-compatible applications
  • Transport-Specific Configurations: Properly formatted configurations for stdio and Streamable HTTP transports, with deprecation warnings for HTTP+SSE
  • Migration Guidance: Provides recommendations for migrating from deprecated HTTP+SSE to Streamable HTTP configurations
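
As a rough picture of what an exported configuration looks like, the sketch below assembles a file in the `mcpServers` layout that many MCP clients read. The exact fields the Inspector emits may differ between versions; the type names, the server names `my-server` and `remote-server`, and the URL are assumptions made for illustration.

```typescript
// Entry for a server launched as a local process over stdio.
type StdioEntry = { command: string; args: string[]; env?: Record<string, string> };
// Entry for a server reached over Streamable HTTP.
type HttpEntry = { type: "streamable-http"; url: string };
type ServerEntry = StdioEntry | HttpEntry;

// Wrap named server entries in the top-level "mcpServers" object and pretty-print it.
function buildServersFile(servers: Record<string, ServerEntry>): string {
  return JSON.stringify({ mcpServers: servers }, null, 2);
}

const configText = buildServersFile({
  "my-server": {
    command: "node",
    args: ["build/index.js"],
    env: { API_KEY: "value" },
  },
  "remote-server": {
    type: "streamable-http",
    url: "https://my-server.com",
  },
});

console.log(configText);
```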

CLI Mode

The Inspector also provides a command-line interface for programmatic interaction, ideal for automation and CI/CD integration:

# CLI mode for scripting
npx @modelcontextprotocol/inspector --cli node build/index.js

# List available tools
npx @modelcontextprotocol/inspector --cli node build/index.js --method tools/list

# Call a specific tool
npx @modelcontextprotocol/inspector --cli node build/index.js --method tools/call --tool-name mytool --tool-arg key=value

# Test remote servers over Streamable HTTP
npx @modelcontextprotocol/inspector --cli https://my-server.com --transport http --method resources/list

# Test with the legacy HTTP+SSE transport (for compatibility)
npx @modelcontextprotocol/inspector --cli https://my-server.com --transport sse --method resources/list
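
In a CI job, the output of a `tools/list` call can be checked by a short script. The sketch below assumes the CLI prints the result as JSON with a top-level `tools` array on stdout; the helper `missingTools` and the sample output are hand-written for illustration and were not captured from a real run.

```typescript
// Shape assumed for the JSON printed by a tools/list call in CLI mode.
type ToolsListOutput = { tools: { name: string; description?: string }[] };

// Return the expected tool names that do not appear in the CLI output.
function missingTools(cliStdout: string, expected: string[]): string[] {
  const parsed = JSON.parse(cliStdout) as ToolsListOutput;
  const available = new Set(parsed.tools.map((tool) => tool.name));
  return expected.filter((name) => !available.has(name));
}

// Sample output, written by hand for this example.
const sampleStdout = JSON.stringify({
  tools: [{ name: "mytool", description: "Example tool" }],
});

// "othertool" is a hypothetical tool the pipeline expects but the server lacks.
const missing = missingTools(sampleStdout, ["mytool", "othertool"]);
if (missing.length > 0) {
  console.error(`Server is missing tools: ${missing.join(", ")}`);
}
```

A pipeline step could fail the build whenever the returned list is non-empty.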

Development Workflow Integration

Best Practices for MCP Server Development

  1. Initial Development:
    • Launch Inspector with your server during development
    • Verify basic connectivity and capability negotiation
    • Test each MCP primitive (Tools, Resources, Prompts) individually
  2. Iterative Testing:
    • Make server code changes
    • Rebuild the server
    • Reconnect the Inspector to test changes
    • Monitor logs and messages for issues
  3. Edge Case Testing:
    • Test with invalid inputs to verify error handling
    • Test missing prompt arguments
    • Verify concurrent operations work correctly
    • Test connection recovery and resumability
    • Test transport-specific features (Streamable HTTP resumability, stdio process management)
    • Verify proper handling of deprecated transport warnings

Configuration and Customization

The Inspector supports extensive configuration options for different development environments:

Setting | Description | Default
MCP_SERVER_REQUEST_TIMEOUT | Timeout for requests to the MCP server (ms) | 10000
MCP_REQUEST_TIMEOUT_RESET_ON_PROGRESS | Reset timeout on progress notifications | true
MCP_REQUEST_MAX_TOTAL_TIMEOUT | Maximum total timeout for requests (ms) | 60000
MCP_PROXY_FULL_ADDRESS | Custom proxy server address | localhost
CLIENT_PORT / SERVER_PORT | Custom ports for UI and proxy | 6274 / 6277
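
The three timeout settings interact, and a small model helps to reason about them. The sketch below is our reading of the descriptions in the table, not the Inspector's actual code: each progress notification restarts the per-request window, and the total is capped by the maximum timeout. The names `TimeoutConfig` and `timeoutAt` are hypothetical.

```typescript
type TimeoutConfig = {
  requestTimeoutMs: number;  // MCP_SERVER_REQUEST_TIMEOUT
  resetOnProgress: boolean;  // MCP_REQUEST_TIMEOUT_RESET_ON_PROGRESS
  maxTotalTimeoutMs: number; // MCP_REQUEST_MAX_TOTAL_TIMEOUT
};

// Given the request start time and the arrival times of progress notifications
// (all in ms on the same clock), return the time at which the request times out.
function timeoutAt(startMs: number, progressAtMs: number[], cfg: TimeoutConfig): number {
  const hardLimit = startMs + cfg.maxTotalTimeoutMs;
  let deadline = startMs + cfg.requestTimeoutMs;
  if (cfg.resetOnProgress) {
    for (const t of [...progressAtMs].sort((a, b) => a - b)) {
      if (t > deadline) break;             // progress arrived after the request had already timed out
      deadline = t + cfg.requestTimeoutMs; // each progress notification restarts the window
    }
  }
  return Math.min(deadline, hardLimit);
}

const defaults: TimeoutConfig = {
  requestTimeoutMs: 10000,
  resetOnProgress: true,
  maxTotalTimeoutMs: 60000,
};

console.log(timeoutAt(0, [], defaults));            // 10000: no progress, plain timeout
console.log(timeoutAt(0, [8000, 16000], defaults)); // 26000: window restarted twice
```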

UI Mode vs CLI Mode Comparison

Use Case | UI Mode | CLI Mode
Server Development | Visual interface for interactive testing during development | Scriptable commands for automated testing and CI/CD integration
Resource Exploration | Interactive browser with hierarchical navigation and JSON visualization | Programmatic listing and reading for automation scripts
Tool Testing | Form-based parameter input with real-time response visualization | Command-line tool execution with JSON output for scripting
Debugging | Request history, visualized errors, and real-time notifications | Direct JSON output for log analysis and integration with other tools
Automation | N/A | Ideal for CI/CD pipelines and integration with coding assistants

Security Considerations

When using the MCP Inspector, it's important to understand its security implications:

  • Local Development: The Inspector proxy server can spawn local processes and connect to any specified MCP server
  • Network Exposure: The proxy server should not be exposed to untrusted networks as it has permissions to execute local commands
  • Authentication: Bearer token authentication is supported for secure connections to remote MCP servers
  • Data Access: The Inspector has the same data access permissions as the MCP server it's testing
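
One simple guard that follows from the network-exposure point is to refuse to start a proxy on anything other than a loopback address. The helper below is a sketch of such a check for a launch script; `isLoopbackHost` is a hypothetical name and is not part of the Inspector.

```typescript
// Return true only for hosts that resolve to the local machine by definition:
// "localhost", the IPv4 loopback range 127.0.0.0/8, and the IPv6 loopback ::1.
function isLoopbackHost(host: string): boolean {
  // Normalize case and strip the brackets used around IPv6 literals in URLs.
  const h = host.trim().toLowerCase().replace(/^\[|\]$/g, "");
  if (h === "localhost" || h === "::1") return true;
  const match = /^127\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(h);
  return match !== null && match.slice(1).every((part) => Number(part) <= 255);
}

for (const host of ["127.0.0.1", "localhost", "[::1]", "0.0.0.0", "192.168.1.10"]) {
  console.log(host, isLoopbackHost(host) ? "ok to bind" : "refuse: reachable from the network");
}
```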

Benefits for MCP Server Development

  • Rapid Prototyping: Quickly test and iterate on MCP server implementations without building custom clients
  • Visual Debugging: Understand server behavior through visual interfaces rather than log parsing
  • Protocol Compliance: Verify that servers correctly implement MCP protocol requirements
  • Integration Testing: Test how servers behave with different transport mechanisms and authentication schemes
  • Documentation: Generate configuration examples for production deployment
  • Performance Analysis: Monitor request timing and identify performance bottlenecks

Building MCP Applications: Pet Store

This section walks through building a complete MCP server and client application around a pet store scenario. You'll create an MCP server that manages pet inventory and orders, and a client application that interacts with it through the Model Context Protocol.

Overview

In this section, we'll build:

  • Pet Store MCP Server: A server that provides tools for managing pets, inventory, and customer orders
  • AI Client Application: A client that uses the MCP server to help customers find and purchase pets
  • Secure Integration: Implementation of authentication and authorization controls
  • Testing and Debugging: Using MCP Inspector to validate the implementation

Prerequisites and Setup

Required Tools

  • Node.js: Version 18 or later
  • TypeScript: For type safety and better development experience
  • MCP SDK: Official TypeScript SDK for MCP development
  • Database: SQLite for simple data persistence

Project Structure

pet-store-mcp/
├── server/
│   ├── src/
│   │   ├── index.ts
│   │   ├── tools/
│   │   │   ├── petManagement.ts
│   │   │   ├── inventory.ts
│   │   │   └── orders.ts
│   │   ├── resources/
│   │   │   └── petDatabase.ts
│   │   └── database/
│   │       └── schema.sql
│   ├── package.json
│   └── tsconfig.json
├── client/
│   ├── src/
│   │   ├── index.ts
│   │   └── petStoreClient.ts
│   ├── package.json
│   └── tsconfig.json
└── README.md

Step 1: Setting Up the Development Environment

Initialize the Project

# Create project directory
mkdir pet-store-mcp
cd pet-store-mcp

# Create server directory
mkdir server
cd server
npm init -y
# sqlite3 ships its own TypeScript definitions, so @types/sqlite3 is not needed
npm install @modelcontextprotocol/sdk sqlite3
npm install -D typescript @types/node tsx

# Create client directory
cd ../
mkdir client
cd client
npm init -y
npm install @modelcontextprotocol/sdk
npm install -D typescript @types/node tsx

TypeScript Configuration

// server/tsconfig.json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "NodeNext",
    "moduleResolution": "NodeNext",
    "outDir": "./dist",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules", "dist"]
}

Step 2: Creating the Pet Store MCP Server

Database Schema

-- server/src/database/schema.sql
CREATE TABLE IF NOT EXISTS pets (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL,
    species TEXT NOT NULL,
    breed TEXT,
    age INTEGER,
    price DECIMAL(10,2),
    description TEXT,
    available BOOLEAN DEFAULT 1,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE IF NOT EXISTS orders (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    customer_name TEXT NOT NULL,
    customer_email TEXT NOT NULL,
    pet_id INTEGER,
    status TEXT DEFAULT 'pending',
    total_amount DECIMAL(10,2),
    order_date DATETIME DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (pet_id) REFERENCES pets (id)
);

-- Sample data
INSERT INTO pets (name, species, breed, age, price, description) VALUES
('Buddy', 'Dog', 'Golden Retriever', 2, 1200.00, 'Friendly and energetic golden retriever puppy'),
('Whiskers', 'Cat', 'Persian', 1, 800.00, 'Beautiful long-haired Persian cat'),
('Charlie', 'Dog', 'Labrador', 3, 1000.00, 'Well-trained family dog, great with kids'),
('Luna', 'Cat', 'Siamese', 2, 600.00, 'Elegant Siamese cat with striking blue eyes'),
('Max', 'Dog', 'German Shepherd', 4, 1500.00, 'Intelligent and loyal working dog');
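
On the TypeScript side it helps to give rows from the `pets` table an explicit type. The sketch below is one possible mapping, assuming SQLite returns the `available` flag as 0 or 1 and `created_at` as text; the names `PetRow`, `Pet`, and `toPet` are ours.

```typescript
// Shape of a row as SQLite returns it: the BOOLEAN column comes back as 0/1
// and columns without NOT NULL may be null.
interface PetRow {
  id: number;
  name: string;
  species: string;
  breed: string | null;
  age: number | null;
  price: number | null;
  description: string | null;
  available: 0 | 1;
  created_at: string;
}

// Application-facing shape with a real boolean and camelCase names.
interface Pet {
  id: number;
  name: string;
  species: string;
  breed: string | null;
  age: number | null;
  price: number | null;
  description: string | null;
  available: boolean;
  createdAt: string;
}

// Convert a raw database row into the application-facing shape.
function toPet(row: PetRow): Pet {
  const { available, created_at, ...rest } = row;
  return { ...rest, available: available === 1, createdAt: created_at };
}

const buddy = toPet({
  id: 1, name: "Buddy", species: "Dog", breed: "Golden Retriever", age: 2,
  price: 1200, description: "Friendly and energetic golden retriever puppy",
  available: 1, created_at: "2025-01-01 00:00:00",
});
console.log(buddy.available); // true
```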

Pet Management Tools

// server/src/tools/petManagement.ts
import { Tool } from '@modelcontextprotocol/sdk/types.js';
import Database from 'sqlite3';

export class PetManagementTools {
    private db: Database.Database;

    constructor(db: Database.Database) {
        this.db = db;
    }

    // Tool: List available pets
    getListPetsTool(): Tool {
        return {
            name: 'list_pets',
            description: 'List all available pets in the store with filtering options',
            inputSchema: {
                type: 'object',
                properties: {
                    species: {
                        type: 'string',
                        description: 'Filter by species (dog, cat, bird, etc.)',
                        enum: ['dog', 'cat', 'bird', 'fish', 'rabbit']
                    },
                    maxPrice: {
                        type: 'number',
                        description: 'Maximum price filter'
                    },
                    availableOnly: {
                        type: 'boolean',
                        description: 'Show only available pets',
                        default: true
                    }
                }
            }
        };
    }

    // Default to {} because clients may omit "arguments" when no filters are set
    async listPets(args: any = {}): Promise<any> {
        return new Promise((resolve, reject) => {
            let query = 'SELECT * FROM pets WHERE 1=1';
            const params: any[] = [];

            if (args.species) {
                query += ' AND LOWER(species) = LOWER(?)';
                params.push(args.species);
            }

            // typeof check so that a maxPrice of 0 is not silently ignored
            if (typeof args.maxPrice === 'number') {
                query += ' AND price <= ?';
                params.push(args.maxPrice);
            }

            if (args.availableOnly !== false) {
                query += ' AND available = 1';
            }

            query += ' ORDER BY name';

            this.db.all(query, params, (err, rows) => {
                if (err) {
                    reject(err);
                } else {
                    resolve({
                        success: true,
                        pets: rows,
                        count: rows.length
                    });
                }
            });
        });
    }

    // Tool: Get detailed pet information
    getPetDetailsTool(): Tool {
        return {
            name: 'get_pet_details',
            description: 'Get detailed information about a specific pet',
            inputSchema: {
                type: 'object',
                properties: {
                    petId: {
                        type: 'integer',
                        description: 'The ID of the pet to retrieve details for'
                    }
                },
                required: ['petId']
            }
        };
    }

    async getPetDetails(args: any): Promise<any> {
        return new Promise((resolve, reject) => {
            this.db.get(
                'SELECT * FROM pets WHERE id = ?',
                [args.petId],
                (err, row) => {
                    if (err) {
                        reject(err);
                    } else if (!row) {
                        resolve({
                            success: false,
                            error: 'Pet not found'
                        });
                    } else {
                        resolve({
                            success: true,
                            pet: row
                        });
                    }
                }
            );
        });
    }

    // Tool: Add new pet to inventory
    getAddPetTool(): Tool {
        return {
            name: 'add_pet',
            description: 'Add a new pet to the store inventory',
            inputSchema: {
                type: 'object',
                properties: {
                    name: { type: 'string', description: 'Pet name' },
                    species: { type: 'string', description: 'Species (dog, cat, etc.)' },
                    breed: { type: 'string', description: 'Breed of the pet' },
                    age: { type: 'integer', description: 'Age in years' },
                    price: { type: 'number', description: 'Price in dollars' },
                    description: { type: 'string', description: 'Description of the pet' }
                },
                required: ['name', 'species', 'price']
            }
        };
    }

    async addPet(args: any): Promise<any> {
        return new Promise((resolve, reject) => {
            this.db.run(
                `INSERT INTO pets (name, species, breed, age, price, description) 
                 VALUES (?, ?, ?, ?, ?, ?)`,
                // ?? (not ||) so that a legitimate age of 0 is stored rather than replaced by NULL
                [args.name, args.species, args.breed ?? null, args.age ?? null, args.price, args.description ?? null],
                function(err) {
                    if (err) {
                        reject(err);
                    } else {
                        resolve({
                            success: true,
                            petId: this.lastID,
                            message: `Successfully added ${args.name} to inventory`
                        });
                    }
                }
            );
        });
    }
}
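
The filtering logic in listPets is easier to test when the SQL is assembled by a pure function. The sketch below shows one way to do that with the same parameterized-query approach; `ListPetsArgs` and `buildListPetsQuery` are hypothetical names, and the function only builds the statement without touching a database.

```typescript
type ListPetsArgs = { species?: string; maxPrice?: number; availableOnly?: boolean };

// Build the SELECT statement and its bound parameters for the list_pets tool.
// User input only ever goes into params, never into the SQL text itself.
function buildListPetsQuery(args: ListPetsArgs = {}): { sql: string; params: (string | number)[] } {
  const clauses: string[] = [];
  const params: (string | number)[] = [];

  if (args.species) {
    clauses.push("LOWER(species) = LOWER(?)");
    params.push(args.species);
  }
  if (typeof args.maxPrice === "number") {
    clauses.push("price <= ?");
    params.push(args.maxPrice);
  }
  if (args.availableOnly !== false) {
    clauses.push("available = 1"); // availability filter is on unless explicitly disabled
  }

  const where = clauses.length > 0 ? ` WHERE ${clauses.join(" AND ")}` : "";
  return { sql: `SELECT * FROM pets${where} ORDER BY name`, params };
}

const query = buildListPetsQuery({ species: "dog", maxPrice: 1000 });
console.log(query.sql);
console.log(query.params);
```

The result can be passed straight to `db.all(query.sql, query.params, callback)`.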

Order Management Tools

// server/src/tools/orders.ts
import { Tool } from '@modelcontextprotocol/sdk/types.js';
import Database from 'sqlite3';

export class OrderManagementTools {
    private db: Database.Database;

    constructor(db: Database.Database) {
        this.db = db;
    }

    getCreateOrderTool(): Tool {
        return {
            name: 'create_order',
            description: 'Create a new order for a pet purchase',
            inputSchema: {
                type: 'object',
                properties: {
                    customerName: { type: 'string', description: 'Customer full name' },
                    customerEmail: { type: 'string', description: 'Customer email address' },
                    petId: { type: 'integer', description: 'ID of the pet to purchase' }
                },
                required: ['customerName', 'customerEmail', 'petId']
            }
        };
    }

    async createOrder(args: any): Promise<any> {
        return new Promise((resolve, reject) => {
            // First check if pet is available
            this.db.get(
                'SELECT * FROM pets WHERE id = ? AND available = 1',
                [args.petId],
                (err, pet: any) => {
                    if (err) {
                        reject(err);
                        return;
                    }

                    if (!pet) {
                        resolve({
                            success: false,
                            error: 'Pet not available or not found'
                        });
                        return;
                    }

                    // Keep a reference to the database: inside the function() callback
                    // below, `this` is sqlite3's RunResult (which carries lastID), not this class
                    const db = this.db;

                    // Create the order
                    db.run(
                        `INSERT INTO orders (customer_name, customer_email, pet_id, total_amount) 
                         VALUES (?, ?, ?, ?)`,
                        [args.customerName, args.customerEmail, args.petId, pet.price],
                        function(err) {
                            if (err) {
                                reject(err);
                            } else {
                                // Row id of the order that was just inserted
                                const orderId = this.lastID;
                                // Mark pet as unavailable
                                db.run(
                                    'UPDATE pets SET available = 0 WHERE id = ?',
                                    [args.petId],
                                    (updateErr) => {
                                        if (updateErr) {
                                            reject(updateErr);
                                        } else {
                                            resolve({
                                                success: true,
                                                orderId,
                                                petName: pet.name,
                                                totalAmount: pet.price,
                                                message: `Order created successfully for ${pet.name}`
                                            });
                                        }
                                    }
                                );
                            }
                        }
                    );
                }
            );
        });
    }

    getOrderStatusTool(): Tool {
        return {
            name: 'get_order_status',
            description: 'Get the status of an existing order',
            inputSchema: {
                type: 'object',
                properties: {
                    orderId: { type: 'integer', description: 'Order ID to check status for' }
                },
                required: ['orderId']
            }
        };
    }

    async getOrderStatus(args: any): Promise<any> {
        return new Promise((resolve, reject) => {
            this.db.get(
                `SELECT o.*, p.name as pet_name, p.species, p.breed 
                 FROM orders o 
                 JOIN pets p ON o.pet_id = p.id 
                 WHERE o.id = ?`,
                [args.orderId],
                (err, row) => {
                    if (err) {
                        reject(err);
                    } else if (!row) {
                        resolve({
                            success: false,
                            error: 'Order not found'
                        });
                    } else {
                        resolve({
                            success: true,
                            order: row
                        });
                    }
                }
            );
        });
    }
}
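
The nested callbacks in createOrder can be flattened by wrapping the database calls in promises. The sketch below illustrates the idea against a minimal, hypothetical `CallbackDb` interface that only loosely mirrors sqlite3 (the real `run` callback also exposes `this.lastID`, which is omitted here), so treat it as a pattern and not as a drop-in replacement.

```typescript
// Minimal callback-style interface, loosely modelled on sqlite3's db.get / db.run.
interface CallbackDb {
  get(sql: string, params: unknown[], cb: (err: Error | null, row?: unknown) => void): void;
  run(sql: string, params: unknown[], cb: (err: Error | null) => void): void;
}

// Promise wrapper around db.get: resolves with the row, or undefined if none matched.
function getAsync<T>(db: CallbackDb, sql: string, params: unknown[]): Promise<T | undefined> {
  return new Promise((resolve, reject) => {
    db.get(sql, params, (err, row) => (err ? reject(err) : resolve(row as T | undefined)));
  });
}

// Promise wrapper around db.run: resolves once the statement has completed.
function runAsync(db: CallbackDb, sql: string, params: unknown[]): Promise<void> {
  return new Promise((resolve, reject) => {
    db.run(sql, params, (err) => (err ? reject(err) : resolve()));
  });
}

type OrderOutcome = { success: true; petName: string } | { success: false; error: string };

// Same three steps as createOrder (check, insert, mark unavailable), written sequentially.
async function createOrderFlat(
  db: CallbackDb,
  customerName: string,
  customerEmail: string,
  petId: number,
): Promise<OrderOutcome> {
  const pet = await getAsync<{ name: string; price: number }>(
    db, "SELECT * FROM pets WHERE id = ? AND available = 1", [petId]);
  if (!pet) return { success: false, error: "Pet not available or not found" };

  await runAsync(
    db,
    "INSERT INTO orders (customer_name, customer_email, pet_id, total_amount) VALUES (?, ?, ?, ?)",
    [customerName, customerEmail, petId, pet.price]);
  await runAsync(db, "UPDATE pets SET available = 0 WHERE id = ?", [petId]);
  return { success: true, petName: pet.name };
}
```

Any error from the database rejects the returned promise, so a single try/catch at the call site replaces the three separate error branches.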

Main Server Implementation

// server/src/index.ts
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
    CallToolRequestSchema,
    ListToolsRequestSchema,
    ListResourcesRequestSchema,
    ReadResourceRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
import Database from 'sqlite3';
import { readFileSync } from 'fs';
import { join, dirname } from 'path';
import { fileURLToPath } from 'url';
import { PetManagementTools } from './tools/petManagement.js';
import { OrderManagementTools } from './tools/orders.js';

const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

class PetStoreMCPServer {
    private server: Server;
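    // Assigned in setupDatabase(), which the constructor calls; the `!` tells the
    // compiler (in strict mode) that these fields are definitely initialized
    private db!: Database.Database;
    private petTools!: PetManagementTools;
    private orderTools!: OrderManagementTools;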

    constructor() {
        this.server = new Server(
            {
                name: 'pet-store-server',
                version: '1.0.0',
            },
            {
                capabilities: {
                    tools: {},
                    resources: {},
                },
            }
        );

        this.setupDatabase();
        this.setupTools();
        this.setupHandlers();
    }

    private setupDatabase() {
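        // ':memory:' keeps the example self-contained but loses all data on exit;
        // pass a file path instead if the data should persist
        this.db = new Database.Database(':memory:');
        
        // Initialize database schema
        // Note: tsc does not copy schema.sql, so place it next to the compiled
        // index.js (under database/) before running the built server
        const schema = readFileSync(join(__dirname, 'database/schema.sql'), 'utf8');
        this.db.exec(schema, (err) => {
            if (err) {
                console.error('Database initialization error:', err);
            } else {
                // Log to stderr: on the stdio transport, stdout carries the JSON-RPC messages
                console.error('Database initialized successfully');
            }
        });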

        this.petTools = new PetManagementTools(this.db);
        this.orderTools = new OrderManagementTools(this.db);
    }

    private setupTools() {
        // Register pet management tools
        this.server.setRequestHandler(ListToolsRequestSchema, async () => ({
            tools: [
                this.petTools.getListPetsTool(),
                this.petTools.getPetDetailsTool(),
                this.petTools.getAddPetTool(),
                this.orderTools.getCreateOrderTool(),
                this.orderTools.getOrderStatusTool(),
            ],
        }));

        // Handle tool calls
        this.server.setRequestHandler(CallToolRequestSchema, async (request) => {
            try {
                switch (request.params.name) {
                    case 'list_pets':
                        const petsResult = await this.petTools.listPets(request.params.arguments);
                        return {
                            content: [
                                {
                                    type: 'text',
                                    text: JSON.stringify(petsResult, null, 2),
                                },
                            ],
                        };

                    case 'get_pet_details':
                        const petResult = await this.petTools.getPetDetails(request.params.arguments);
                        return {
                            content: [
                                {
                                    type: 'text',
                                    text: JSON.stringify(petResult, null, 2),
                                },
                            ],
                        };

                    case 'add_pet':
                        const addResult = await this.petTools.addPet(request.params.arguments);
                        return {
                            content: [
                                {
                                    type: 'text',
                                    text: JSON.stringify(addResult, null, 2),
                                },
                            ],
                        };

                    case 'create_order':
                        const orderResult = await this.orderTools.createOrder(request.params.arguments);
                        return {
                            content: [
                                {
                                    type: 'text',
                                    text: JSON.stringify(orderResult, null, 2),
                                },
                            ],
                        };

                    case 'get_order_status':
                        const statusResult = await this.orderTools.getOrderStatus(request.params.arguments);
                        return {
                            content: [
                                {
                                    type: 'text',
                                    text: JSON.stringify(statusResult, null, 2),
                                },
                            ],
                        };

                    default:
                        throw new Error(`Unknown tool: ${request.params.name}`);
                }
            } catch (error) {
                return {
                    content: [
                        {
                            type: 'text',
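                            // In strict mode the caught value is `unknown`, so narrow it first
                            text: `Error: ${error instanceof Error ? error.message : String(error)}`,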
                        },
                    ],
                    isError: true,
                };
            }
        });
    }

    private setupHandlers() {
        // Resources handler (for serving pet catalog data)
        this.server.setRequestHandler(ListResourcesRequestSchema, async () => ({
            resources: [
                {
                    uri: 'petstore://catalog',
                    name: 'Pet Store Catalog',
                    description: 'Complete catalog of available pets',
                    mimeType: 'application/json',
                },
                {
                    uri: 'petstore://inventory-stats',
                    name: 'Inventory Statistics',
                    description: 'Statistical overview of pet inventory',
                    mimeType: 'application/json',
                },
            ],
        }));

        this.server.setRequestHandler(ReadResourceRequestSchema, async (request) => {
            const uri = request.params.uri;

            if (uri === 'petstore://catalog') {
                const pets = await this.petTools.listPets({ availableOnly: false });
                return {
                    contents: [
                        {
                            uri,
                            mimeType: 'application/json',
                            text: JSON.stringify(pets, null, 2),
                        },
                    ],
                };
            } else if (uri === 'petstore://inventory-stats') {
                return new Promise((resolve, reject) => {
                    this.db.all(
                        `SELECT 
                            species,
                            COUNT(*) as total,
                            SUM(CASE WHEN available = 1 THEN 1 ELSE 0 END) as available,
                            AVG(price) as avg_price
                         FROM pets 
                         GROUP BY species`,
                        [],
                        (err, rows) => {
                            if (err) {
                                reject(err);
                            } else {
                                resolve({
                                    contents: [
                                        {
                                            uri,
                                            mimeType: 'application/json',
                                            text: JSON.stringify({
                                                inventoryStats: rows,
                                                lastUpdated: new Date().toISOString()
                                            }, null, 2),
                                        },
                                    ],
                                });
                            }
                        }
                    );
                });
            } else {
                throw new Error(`Unknown resource: ${uri}`);
            }
        });
    }

    async run() {
        const transport = new StdioServerTransport();
        await this.server.connect(transport);
        console.error('Pet Store MCP Server running on stdio');
    }
}

// Start the server
const server = new PetStoreMCPServer();
server.run().catch(console.error);
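
The switch statement in the tool-call handler repeats the same result wrapping for every tool. One alternative, sketched here without the SDK so that it runs on its own, is a lookup table of handlers plus a single wrapper. The names `ToolHandler` and `createDispatcher` are hypothetical, and the result shape mirrors the `content` / `isError` structure used above.

```typescript
type ToolResult = { content: { type: "text"; text: string }[]; isError?: boolean };
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

// Build a function that looks up a tool by name, runs it, and wraps the outcome
// (or any thrown error) in the same result shape for every tool.
function createDispatcher(handlers: Map<string, ToolHandler>) {
  return async (name: string, args: Record<string, unknown> = {}): Promise<ToolResult> => {
    try {
      const handler = handlers.get(name);
      if (!handler) throw new Error(`Unknown tool: ${name}`);
      const payload = await handler(args);
      return { content: [{ type: "text", text: JSON.stringify(payload, null, 2) }] };
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      return { content: [{ type: "text", text: `Error: ${message}` }], isError: true };
    }
  };
}

// Stand-in handler; in the pet store server this would call petTools.listPets.
const dispatch = createDispatcher(
  new Map<string, ToolHandler>([
    ["list_pets", async () => ({ success: true, pets: [], count: 0 })],
  ]),
);

dispatch("list_pets").then((result) => console.log(result.content[0].text));
dispatch("no_such_tool").then((result) => console.log(result.isError)); // true
```

Adding a tool then means adding one map entry instead of another case block.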

Step 3: Creating the Client Application

MCP Client Implementation

// client/src/petStoreClient.ts
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';

export class PetStoreClient {
    private client: Client;
    private transport?: StdioClientTransport;

    constructor() {
        this.client = new Client(
            {
                name: 'pet-store-client',
                version: '1.0.0',
            },
            {
                capabilities: {},
            }
        );
    }

    async connect() {
        // StdioClientTransport spawns the server process itself and talks to it
        // over stdin/stdout, so no manual child_process.spawn is needed
        this.transport = new StdioClientTransport({
            command: 'node',
            args: ['../server/dist/index.js'],
        });

        await this.client.connect(this.transport);
        console.log('Connected to Pet Store MCP Server');
    }

    async disconnect() {
        await this.client.close();
    }

    async listAvailablePets(filters?: { species?: string; maxPrice?: number }) {
        try {
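            // callTool() sends a tools/call request and validates the response
            const result = await this.client.callTool(
                {
                    name: 'list_pets',
                    arguments: {
                        ...filters,
                        availableOnly: true
                    },
                },
                undefined, // use the SDK's default result schema
                {
                    timeout: 5000,
                }
            );

            // The server returns its payload as JSON text in the first content item
            const content = result.content as Array<{ type: string; text: string }>;
            return JSON.parse(content[0].text);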
        } catch (error) {
            console.error('Error listing pets:', error);
            throw error;
        }
    }

    async getPetDetails(petId: number) {
        try {
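            // callTool() sends a tools/call request and validates the response
            const result = await this.client.callTool(
                {
                    name: 'get_pet_details',
                    arguments: { petId },
                },
                undefined, // use the SDK's default result schema
                {
                    timeout: 5000,
                }
            );

            // The server returns its payload as JSON text in the first content item
            const content = result.content as Array<{ type: string; text: string }>;
            return JSON.parse(content[0].text);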
        } catch (error) {
            console.error('Error getting pet details:', error);
            throw error;
        }
    }

    async createOrder(customerName: string, customerEmail: string, petId: number) {
        try {
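            // callTool() sends a tools/call request and validates the response
            const result = await this.client.callTool(
                {
                    name: 'create_order',
                    arguments: {
                        customerName,
                        customerEmail,
                        petId,
                    },
                },
                undefined, // use the SDK's default result schema
                {
                    timeout: 5000,
                }
            );

            // The server returns its payload as JSON text in the first content item
            const content = result.content as Array<{ type: string; text: string }>;
            return JSON.parse(content[0].text);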
        } catch (error) {
            console.error('Error creating order:', error);
            throw error;
        }
    }

    async getInventoryStats() {
        try {
            // Assumes this.client is the SDK Client; readResource() takes the
            // resource parameters followed by request options.
            const result = await this.client.readResource(
                { uri: 'petstore://inventory-stats' },
                { timeout: 5000 }
            );

            return JSON.parse(result.contents[0].text);
        } catch (error) {
            console.error('Error getting inventory stats:', error);
            throw error;
        }
    }
}
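
Each client method above parses result.content[0].text directly, which fails with an unhelpful error if the server reports a tool-level failure or returns no text item. The following is a minimal sketch of a guard that could sit in front of JSON.parse. The helper name parseToolJson and the simplified ToolResultLike shape are assumptions for illustration, not part of the SDK.

```typescript
// Simplified shapes for illustration; the real SDK result types are richer.
interface ContentItem {
    type: string;
    text?: string;
}

interface ToolResultLike {
    content?: ContentItem[];
    isError?: boolean;
}

// Hypothetical helper: returns the parsed JSON payload of the first text item,
// or throws a descriptive error instead of failing inside JSON.parse.
export function parseToolJson(result: ToolResultLike, toolName: string): unknown {
    const first = result.content?.find((item) => item.type === 'text');
    if (!first || typeof first.text !== 'string') {
        throw new Error(`Tool ${toolName} returned no text content`);
    }
    if (result.isError) {
        // Tool-level failures are reported in-band with isError set.
        throw new Error(`Tool ${toolName} failed: ${first.text}`);
    }
    try {
        return JSON.parse(first.text);
    } catch {
        throw new Error(`Tool ${toolName} returned non-JSON text`);
    }
}
```

With a helper like this, a method such as getPetDetails could return parseToolJson(result, 'get_pet_details') and let the demo's existing catch blocks print the clearer message.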

Demo Application

// client/src/index.ts
import { PetStoreClient } from './petStoreClient.js';
import { createInterface } from 'readline';

class PetStoreDemo {
    private client: PetStoreClient;
    private rl: any;

    constructor() {
        this.client = new PetStoreClient();
        this.rl = createInterface({
            input: process.stdin,
            output: process.stdout,
        });
    }

    private async prompt(question: string): Promise<string> {
        return new Promise((resolve) => {
            this.rl.question(question, resolve);
        });
    }

    async run() {
        try {
            console.log('🐾 Welcome to Pet Store MCP Demo!');
            console.log('Connecting to pet store server...\n');

            await this.client.connect();

            while (true) {
                console.log('\n--- Pet Store Menu ---');
                console.log('1. Browse available pets');
                console.log('2. Search pets by species');
                console.log('3. Get pet details');
                console.log('4. Purchase a pet');
                console.log('5. View inventory statistics');
                console.log('6. Exit');

                const choice = await this.prompt('\nEnter your choice (1-6): ');

                switch (choice) {
                    case '1':
                        await this.browsePets();
                        break;
                    case '2':
                        await this.searchPetsBySpecies();
                        break;
                    case '3':
                        await this.getPetDetails();
                        break;
                    case '4':
                        await this.purchasePet();
                        break;
                    case '5':
                        await this.viewInventoryStats();
                        break;
                    case '6':
                        console.log('Thank you for visiting our pet store! 🐾');
                        await this.client.disconnect();
                        this.rl.close();
                        return;
                    default:
                        console.log('Invalid choice. Please try again.');
                }
            }
        } catch (error) {
            console.error('Demo error:', error);
        }
    }

    private async browsePets() {
        console.log('\n📋 Available Pets:');
        try {
            const result = await this.client.listAvailablePets();
            if (result.success && result.pets.length > 0) {
                result.pets.forEach((pet: any) => {
                    console.log(`ID: ${pet.id} | ${pet.name} (${pet.species}) - $${pet.price}`);
                    console.log(`   Breed: ${pet.breed || 'Mixed'} | Age: ${pet.age || 'Unknown'}`);
                    console.log(`   ${pet.description}\n`);
                });
            } else {
                console.log('No pets available at the moment.');
            }
        } catch (error) {
            console.error('Failed to browse pets:', error instanceof Error ? error.message : error);
        }
    }

    private async searchPetsBySpecies() {
        const species = await this.prompt('Enter species to search for (dog, cat, bird, fish, rabbit): ');
        
        console.log(`\n🔍 Searching for ${species}s...`);
        try {
            const result = await this.client.listAvailablePets({ species: species.toLowerCase() });
            if (result.success && result.pets.length > 0) {
                console.log(`Found ${result.count} ${species}(s):`);
                result.pets.forEach((pet: any) => {
                    console.log(`ID: ${pet.id} | ${pet.name} - $${pet.price}`);
                    console.log(`   ${pet.description}\n`);
                });
            } else {
                console.log(`No ${species}s available.`);
            }
        } catch (error) {
            console.error('Search failed:', error instanceof Error ? error.message : error);
        }
    }

    private async getPetDetails() {
        const petIdStr = await this.prompt('Enter pet ID for details: ');
        const petId = parseInt(petIdStr, 10);

        if (isNaN(petId)) {
            console.log('Invalid pet ID.');
            return;
        }

        try {
            const result = await this.client.getPetDetails(petId);
            if (result.success) {
                const pet = result.pet;
                console.log(`\n🐕 Pet Details:`);
                console.log(`Name: ${pet.name}`);
                console.log(`Species: ${pet.species}`);
                console.log(`Breed: ${pet.breed || 'Mixed'}`);
                console.log(`Age: ${pet.age || 'Unknown'} years`);
                console.log(`Price: $${pet.price}`);
                console.log(`Available: ${pet.available ? 'Yes' : 'No'}`);
                console.log(`Description: ${pet.description}`);
            } else {
                console.log('Pet not found.');
            }
        } catch (error) {
            console.error('Failed to get pet details:', error instanceof Error ? error.message : error);
        }
    }

    private async purchasePet() {
        const petIdStr = await this.prompt('Enter pet ID to purchase: ');
        const petId = parseInt(petIdStr, 10);

        if (isNaN(petId)) {
            console.log('Invalid pet ID.');
            return;
        }

        const customerName = await this.prompt('Enter your full name: ');
        const customerEmail = await this.prompt('Enter your email: ');

        console.log('\n💳 Processing your order...');
        try {
            const result = await this.client.createOrder(customerName, customerEmail, petId);
            if (result.success) {
                console.log(`\n✅ Order successful!`);
                console.log(`Order ID: ${result.orderId}`);
                console.log(`Pet: ${result.petName}`);
                console.log(`Total: $${result.totalAmount}`);
                console.log(`\nThank you for your purchase! 🎉`);
            } else {
                console.log(`❌ Order failed: ${result.error}`);
            }
        } catch (error) {
            console.error('Purchase failed:', error instanceof Error ? error.message : error);
        }
    }

    private async viewInventoryStats() {
        console.log('\n📊 Inventory Statistics:');
        try {
            const stats = await this.client.getInventoryStats();
            console.log(`Last Updated: ${new Date(stats.lastUpdated).toLocaleString()}\n`);
            
            stats.inventoryStats.forEach((stat: any) => {
                console.log(`${stat.species.toUpperCase()}:`);
                console.log(`  Total: ${stat.total}`);
                console.log(`  Available: ${stat.available}`);
                console.log(`  Average Price: $${stat.avg_price.toFixed(2)}\n`);
            });
        } catch (error) {
            console.error('Failed to get statistics:', error instanceof Error ? error.message : error);
        }
    }
}

// Run the demo
const demo = new PetStoreDemo();
demo.run().catch(console.error);

Step 4: Testing with MCP Inspector

Building and Testing the Server

# Build the server
cd server
npm run build

# Test with MCP Inspector
npx @modelcontextprotocol/inspector node dist/index.js

Inspector Testing Checklist

  • Connection Test: Verify the server connects successfully
  • Tools Discovery: Check that all 5 tools are listed correctly
  • Tool Execution: Test each tool with various parameters
  • Resource Access: Verify catalog and stats resources are accessible
  • Error Handling: Test with invalid inputs to ensure proper error responses
  • Performance: Monitor response times and resource usage
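
The Tools Discovery item in the checklist can also be checked in a script. The sketch below compares a tool listing against the five tool names referenced elsewhere in this walkthrough. The function name findMissingTools and the input shape (an array of objects with a name field) are assumptions for illustration.

```typescript
// Tool names referenced elsewhere in this walkthrough; adjust to your server.
const EXPECTED_TOOLS = [
    'list_pets',
    'get_pet_details',
    'add_pet',
    'create_order',
    'get_order_status',
];

// Hypothetical helper: returns the expected tool names that are absent
// from a listing such as the one shown by the Inspector.
export function findMissingTools(
    listed: { name: string }[],
    expected: string[] = EXPECTED_TOOLS
): string[] {
    const names = new Set(listed.map((tool) => tool.name));
    return expected.filter((name) => !names.has(name));
}
```

An empty result means every expected tool was discovered. Any names it returns point at tools that were not registered.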

Step 5: Running the Complete Application

Package.json Scripts

// server/package.json
{
  "name": "pet-store-mcp-server",
  "version": "1.0.0",
  "type": "module",
  "scripts": {
    "build": "tsc",
    "start": "node dist/index.js",
    "dev": "tsx src/index.ts",
    "inspect": "npx @modelcontextprotocol/inspector node dist/index.js",
    "test": "npm run build && npm run inspect"
  }
}

// client/package.json
{
  "name": "pet-store-mcp-client",
  "version": "1.0.0",
  "type": "module",
  "scripts": {
    "build": "tsc", 
    "start": "node dist/index.js",
    "dev": "tsx src/index.ts"
  }
}

Running the Demo Application

# Terminal 1: Build and prepare server
cd server
npm run build

# Terminal 2: Build and run client
cd client
npm run build
npm start

Application Monitoring and Logging

For production readiness, add comprehensive logging to track application behavior:

// server/src/utils/logger.ts
// Every level writes to stderr: with the stdio transport, stdout carries the
// JSON-RPC stream, so console.log output would corrupt protocol messages.
export class Logger {
    static info(message: string, data?: any) {
        console.error(`[INFO] ${new Date().toISOString()} - ${message}`, data ?? '');
    }
    
    static error(message: string, error?: any) {
        console.error(`[ERROR] ${new Date().toISOString()} - ${message}`, error ?? '');
    }
    
    static audit(action: string, user: string, details?: any) {
        console.error(`[AUDIT] ${new Date().toISOString()} - ${action} by ${user}`, details ?? '');
    }
}

Performance Monitoring

Monitor key performance metrics during operation:

  • Response Times: Track tool execution times and database query performance
  • Error Rates: Monitor failed requests and their causes
  • Connection Health: Ensure MCP client-server connections remain stable
  • Resource Usage: Monitor memory and CPU usage for optimization opportunities
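
The first two metrics in the list above can be collected with a thin wrapper around each tool handler. This is a minimal in-memory sketch. The class name ToolMetrics and its methods are hypothetical, and a production setup would more likely export these values to a metrics backend.

```typescript
interface TimingRecord {
    name: string;
    durationMs: number;
    ok: boolean;
}

// Hypothetical in-memory collector for tool timings and failures.
export class ToolMetrics {
    private records: TimingRecord[] = [];

    // Runs fn, recording how long it took and whether it threw.
    async time<T>(name: string, fn: () => Promise<T>): Promise<T> {
        const start = Date.now();
        try {
            const value = await fn();
            this.records.push({ name, durationMs: Date.now() - start, ok: true });
            return value;
        } catch (error) {
            this.records.push({ name, durationMs: Date.now() - start, ok: false });
            throw error;
        }
    }

    // Fraction of recorded calls that failed (0 when nothing was recorded).
    errorRate(): number {
        if (this.records.length === 0) return 0;
        const failed = this.records.filter((record) => !record.ok).length;
        return failed / this.records.length;
    }
}
```

A handler would wrap its body as metrics.time('list_pets', () => runQuery()), where runQuery stands in for the existing tool logic.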

Common Startup Issues and Solutions

| Issue | Symptoms | Solution |
| --- | --- | --- |
| Port Already in Use | Server fails to start with port binding error | Kill existing processes or change port configuration |
| Database Connection Failed | Tools return database errors | Verify SQLite file permissions and schema initialization |
| Client Connection Timeout | Client cannot connect to server | Ensure server is running and check transport configuration |
| Tool Execution Errors | Tools return unexpected errors | Check parameter validation and database constraints |

Step 6A: Production Readiness Basics

Before implementing advanced security features, ensure your application meets basic production standards:

Environment Configuration

// server/.env
NODE_ENV=production
DATABASE_URL=postgresql://user:password@localhost:5432/petstore
LOG_LEVEL=info
PORT=3000

// Load environment variables in server
import dotenv from 'dotenv';
dotenv.config();

const config = {
    nodeEnv: process.env.NODE_ENV || 'development',
    databaseUrl: process.env.DATABASE_URL || ':memory:',
    logLevel: process.env.LOG_LEVEL || 'debug',
    port: parseInt(process.env.PORT || '3000', 10)
};
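
The config object above accepts whatever the environment provides, so a typo such as PORT=abc only surfaces later as NaN. A minimal sketch of fail-fast validation follows. The function name loadConfig, the AppConfig type, and the set of accepted log levels are assumptions for illustration.

```typescript
export type LogLevel = 'debug' | 'info' | 'warn' | 'error';

// Assumed set of levels; the article itself only shows 'info' and 'debug'.
const LOG_LEVELS: readonly string[] = ['debug', 'info', 'warn', 'error'];

export interface AppConfig {
    nodeEnv: string;
    databaseUrl: string;
    logLevel: LogLevel;
    port: number;
}

// Hypothetical loader: applies the same defaults as the config object above,
// but throws on values that cannot be used.
export function loadConfig(env: Record<string, string | undefined>): AppConfig {
    const port = Number(env.PORT ?? '3000');
    if (!Number.isInteger(port) || port < 1 || port > 65535) {
        throw new Error(`Invalid PORT: ${env.PORT}`);
    }

    const logLevel = env.LOG_LEVEL ?? 'debug';
    if (!LOG_LEVELS.includes(logLevel)) {
        throw new Error(`Invalid LOG_LEVEL: ${logLevel}`);
    }

    return {
        nodeEnv: env.NODE_ENV ?? 'development',
        databaseUrl: env.DATABASE_URL ?? ':memory:',
        logLevel: logLevel as LogLevel,
        port,
    };
}
```

Calling loadConfig(process.env) once at startup would stop the server before it accepts requests with a broken configuration.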

Error Handling Enhancement

// server/src/middleware/errorHandler.ts
import { Logger } from '../utils/logger.js';

export class ErrorHandler {
    static handleToolError(error: any, toolName: string) {
        Logger.error(`Tool ${toolName} failed`, error);
        
        if (error.code === 'SQLITE_CONSTRAINT') {
            return {
                success: false,
                error: 'Data validation failed',
                details: process.env.NODE_ENV === 'development' ? error.message : undefined
            };
        }
        
        return {
            success: false,
            error: 'Internal server error',
            // slice replaces the deprecated substr and yields the same 9 characters
            requestId: Math.random().toString(36).slice(2, 11)
        };
    }
}
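
The object returned by handleToolError still has to reach the client as a tool result. In MCP, tool-level failures are returned in-band with isError set, so the model can see them. The sketch below shows one way to wrap the failure payload. The function name toToolErrorResult and the ToolFailure type are assumptions for illustration.

```typescript
// Mirrors the shape returned by handleToolError above.
interface ToolFailure {
    success: false;
    error: string;
    details?: string;
    requestId?: string;
}

// Hypothetical helper: wraps a failure payload as an MCP tool result,
// flagged with isError so clients can tell it apart from a success.
export function toToolErrorResult(failure: ToolFailure) {
    return {
        content: [{ type: 'text' as const, text: JSON.stringify(failure) }],
        isError: true,
    };
}
```

A tool handler's catch block could then return toToolErrorResult(ErrorHandler.handleToolError(error, 'create_order')).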

Basic Health Checks

// Add to main server implementation
private setupHealthChecks() {
    this.server.setRequestHandler(ListResourcesRequestSchema, async () => {
        // Add health check resource
        return {
            resources: [
                {
                    uri: 'petstore://health',
                    name: 'Health Check',
                    description: 'Server health status',
                    mimeType: 'application/json',
                },
                // ... existing resources
            ],
        };
    });
}

// Health check implementation
async getHealthStatus(): Promise<Record<string, unknown>> {
    try {
        // Test database connection
        const dbTest = await this.testDatabaseConnection();
        
        return {
            status: 'healthy',
            timestamp: new Date().toISOString(),
            database: dbTest ? 'connected' : 'disconnected',
            uptime: process.uptime()
        };
    } catch (error) {
        return {
            status: 'unhealthy',
            error: error instanceof Error ? error.message : String(error),
            timestamp: new Date().toISOString()
        };
    }
}
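
Listing petstore://health is only half of the feature. A read of that URI also has to return the status. The sketch below builds a response in the same contents[0].text shape that the client's getInventoryStats reads. The function name healthResourceContents and the HealthStatus type are assumptions for illustration.

```typescript
// Mirrors the object produced by getHealthStatus above.
export interface HealthStatus {
    status: 'healthy' | 'unhealthy';
    timestamp: string;
    database?: string;
    uptime?: number;
    error?: string;
}

// Hypothetical helper: formats a health status as the body of a
// resources/read response for the petstore://health URI.
export function healthResourceContents(status: HealthStatus) {
    return {
        contents: [
            {
                uri: 'petstore://health',
                mimeType: 'application/json',
                text: JSON.stringify(status, null, 2),
            },
        ],
    };
}
```

The server's existing read-resource handler could return this value when the requested URI is petstore://health.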

Step 6B: Advanced Security and Enterprise Features

Now that your application has basic production readiness, implement enterprise-grade security features:

Authentication Enhancement

Note: For production implementations, follow the official MCP Security Best Practices documentation.

// server/src/middleware/auth.ts
import { randomUUID } from 'node:crypto';

export class AuthMiddleware {
    private validApiKeys: Set<string>;

    constructor() {
        // Keep only the keys that are actually configured
        this.validApiKeys = new Set(
            [process.env.API_KEY_1, process.env.API_KEY_2].filter(
                (key): key is string => Boolean(key)
            )
        );
    }

    validateRequest(headers: any): boolean {
        const apiKey = headers['x-api-key'];
        return typeof apiKey === 'string' && this.validApiKeys.has(apiKey);
    }

    authorize(operation: string, userRole: string): boolean {
        const permissions: Record<string, string[]> = {
            'customer': ['list_pets', 'get_pet_details', 'create_order', 'get_order_status'],
            'staff': ['list_pets', 'get_pet_details', 'add_pet', 'create_order', 'get_order_status'],
            'admin': ['*'] // All operations
        };

        // Unknown roles get no permissions
        const allowed = permissions[userRole] ?? [];
        return allowed.includes(operation) || allowed.includes('*');
    }

    // Prevent token passthrough - only accept tokens issued for this MCP server
    validateTokenAudience(token: any): boolean {
        const expected = process.env.MCP_SERVER_ID;
        // Reject when the server ID is unset, so undefined never matches undefined
        return Boolean(expected) && token?.aud === expected;
    }

    // Generate secure session IDs bound to user information
    generateSecureSessionId(userId: string): string {
        const sessionId = randomUUID();
        return `${userId}:${sessionId}`;
    }
}
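
A Set lookup is convenient, but string comparison time can in principle leak information about a key. One hedged alternative is to compare fixed-length hashes with Node's timingSafeEqual. The function name isValidApiKey is hypothetical, and this is a sketch rather than a replacement for a full OAuth 2.1 flow.

```typescript
import { createHash, timingSafeEqual } from 'node:crypto';

// Hashing gives both sides the same length, which timingSafeEqual requires.
function sha256(value: string): Buffer {
    return createHash('sha256').update(value).digest();
}

// Hypothetical helper: checks a presented key against the configured keys
// using a constant-time comparison of their hashes.
export function isValidApiKey(candidate: unknown, validKeys: readonly string[]): boolean {
    if (typeof candidate !== 'string' || candidate.length === 0) {
        return false;
    }
    const candidateHash = sha256(candidate);
    let match = false;
    for (const key of validKeys) {
        // Compare against every key so timing does not reveal which one matched.
        if (timingSafeEqual(candidateHash, sha256(key))) {
            match = true;
        }
    }
    return match;
}
```

validateRequest could delegate to this helper by passing headers['x-api-key'] and the configured keys as an array.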

Input Validation

// server/src/validation/schemas.ts
import Joi from 'joi';

export const petSchema = Joi.object({
    name: Joi.string().min(1).max(50).required(),
    species: Joi.string().valid('dog', 'cat', 'bird', 'fish', 'rabbit').required(),
    breed: Joi.string().max(50).optional(),
    age: Joi.number().integer().min(0).max(30).optional(),
    price: Joi.number().positive().max(10000).required(),
    description: Joi.string().max(500).optional()
});

export const orderSchema = Joi.object({
    customerName: Joi.string().min(2).max(100).required(),
    customerEmail: Joi.string().email().required(),
    petId: Joi.number().integer().positive().required()
});

export function validateInput(schema: Joi.ObjectSchema, data: any) {
    const { error, value } = schema.validate(data);
    if (error) {
        throw new Error(`Validation error: ${error.details[0].message}`);
    }
    return value;
}

Rate Limiting and Security Headers

// server/src/middleware/security.ts
export class SecurityMiddleware {
    private rateLimiter = new Map<string, { count: number; resetTime: number }>();

    checkRateLimit(clientId: string, limit: number = 100, windowMs: number = 60000): boolean {
        const now = Date.now();
        const clientData = this.rateLimiter.get(clientId);

        if (!clientData || now > clientData.resetTime) {
            this.rateLimiter.set(clientId, { count: 1, resetTime: now + windowMs });
            return true;
        }

        if (clientData.count >= limit) {
            return false;
        }

        clientData.count++;
        return true;
    }

    setSecurityHeaders(response: any) {
        response.headers = {
            'X-Content-Type-Options': 'nosniff',
            'X-Frame-Options': 'DENY',
            'X-XSS-Protection': '1; mode=block',
            'Strict-Transport-Security': 'max-age=31536000; includeSubDomains'
        };
    }
}
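
The rate limiter above only overwrites an entry when the same client returns, so entries for clients that never come back stay in the map. A minimal sketch of a periodic cleanup follows. The function name pruneExpired and the RateWindow type are assumptions for illustration.

```typescript
// Same entry shape that checkRateLimit stores in its map.
interface RateWindow {
    count: number;
    resetTime: number;
}

// Hypothetical helper: deletes entries whose window has expired and
// returns how many were removed.
export function pruneExpired(
    limiter: Map<string, RateWindow>,
    now: number = Date.now()
): number {
    let removed = 0;
    // Deleting the current entry while iterating a Map is safe in JavaScript.
    limiter.forEach((entry, clientId) => {
        if (now > entry.resetTime) {
            limiter.delete(clientId);
            removed++;
        }
    });
    return removed;
}
```

Running this from a setInterval at roughly the window length would keep the map's size proportional to the number of recently active clients.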

Troubleshooting Guide

Development Issues

| Problem | Check This | Quick Fix |
| --- | --- | --- |
| TypeScript Build Errors | Import paths, type definitions | Run npm run build to see specific errors |
| MCP Inspector Not Connecting | Server process status, port conflicts | Restart server and check console output |
| Database Errors | Schema initialization, file permissions | Delete SQLite file and restart to reinitialize |
| Tool Parameter Errors | JSON schema validation, required fields | Use MCP Inspector to test with valid parameters |

Production Issues

  • Memory Leaks: Monitor database connection pooling and close unused connections
  • Performance Degradation: Check database query performance and add appropriate indexes
  • Security Alerts: Review authentication logs and implement proper session management
  • Scaling Issues: Consider horizontal scaling and load balancing for high traffic

Testing and Validation Checklist

Functional Testing

  • ✅ All tools execute successfully with valid parameters
  • ✅ Error handling works correctly with invalid inputs
  • ✅ Database operations maintain data integrity
  • ✅ Client-server communication is stable
  • ✅ Resources are accessible and return correct data

Security Testing

  • ✅ Input validation prevents malicious data
  • ✅ Authentication mechanisms work correctly
  • ✅ Rate limiting prevents abuse
  • ✅ Error messages don't leak sensitive information
  • ✅ Security headers are properly set
  • ✅ Token passthrough is prevented (only accept tokens issued for the MCP server)
  • ✅ Session hijacking protection is implemented
  • ✅ Confused deputy attacks are mitigated
  • ✅ Secure session ID generation with user binding
  • ✅ Audit logging captures all security-relevant events

Performance Testing

  • ✅ Response times are acceptable under normal load
  • ✅ Memory usage remains stable over time
  • ✅ Database queries are optimized
  • ✅ Connection pooling works effectively
  • ✅ Error recovery mechanisms function properly

Key Aspects Covered

MCP Architecture Benefits

  • Standardization: The same client can work with different MCP servers without modification
  • Discoverability: Tools and resources are self-describing through the protocol
  • Type Safety: JSON schemas ensure proper input/output validation
  • Flexibility: Easy to add new tools and extend functionality

Best Practices Demonstrated

  • Modular Design: Separate tool classes for different functional areas
  • Error Handling: Proper error responses and status codes
  • Data Validation: Input validation and sanitization
  • Resource Management: Proper database connection handling
  • Documentation: Clear tool descriptions and parameter schemas
  • Production Readiness: Logging, monitoring, and security considerations

Production Considerations

  • Database: Use persistent storage (PostgreSQL, MySQL) instead of in-memory SQLite
  • Authentication: Implement proper OAuth 2.1 authentication flow
  • Rate Limiting: Add rate limiting to prevent abuse
  • Monitoring: Add comprehensive logging and monitoring
  • Deployment: Containerize with Docker for easy deployment
  • Testing: Add unit tests and integration tests
  • Security: Implement enterprise-grade security controls
  • Performance: Optimize for scalability and reliability

Next Steps and Extensions

This section provides a solid foundation for building MCP applications. Consider extending it with:

  • Web Interface: Build a React frontend that uses the MCP client
  • Multi-tenant Support: Add support for multiple pet store locations
  • Payment Integration: Add payment processing tools
  • Inventory Management: Add supplier and restocking tools
  • Analytics: Add reporting and analytics capabilities
  • Mobile App: Create a mobile client using the same MCP server
  • Microservices: Split into multiple specialized MCP servers
  • AI Enhancement: Add machine learning features for recommendations

Practical Use Cases of MCP

Personal Productivity

  • File Organization: Organize files by type and date.
  • Email Management: Summarize unread emails and send responses.
  • Note Analysis: Create action plans from meeting notes.

Information Access

  • Document Search: Search private documents securely.
  • PDF Q&A: Extract key recommendations from reports.

Trip Planning Assistant

  • Calendar Integration: Check availability.
  • Flight Booking: Book flights using airline APIs.
  • Email Confirmations: Send confirmations via email.

Advanced Code Editor

  • Code Context Awareness: Fetch code context and documentation.
  • Version Control: Manage repositories via GitHub's API.

Enterprise Applications

  • Data Analytics: Interact with multiple databases.
  • HR Assistance: Securely access employee records.
  • Project Management: Check project details from management tools.
  • Team Communication: Post updates in Slack channels.

Advanced Automation

  • Web Automation: Navigate websites and scrape data.
  • Cloud Services: Manage cloud resources for DevOps.

IBM Technology: MCP vs API: Simplifying AI Agent Integration with External Data

In this video, Martin Keen discusses the benefits of using the Model Context Protocol (MCP) for AI agent integration with external data. MCP provides a standardized way to connect AI agents to various data sources, making it easier to integrate and manage external data. We also compare MCP with traditional API-based integration methods, highlighting the advantages of MCP's standardized approach.

Conclusion: The Future of AI Integration

By bridging AI models with live data ecosystems, MCP transforms LLMs into proactive, context-aware agents capable of enterprise-grade automation. Its standardization reduces development overhead while ensuring security and adaptability—positioning it as the future backbone of Agentic AI systems. As AI continues to evolve, MCP provides the foundation for more capable, efficient, and secure AI applications across industries.
