Configure vMCP servers
This guide covers common configuration patterns for vMCP using the VirtualMCPServer resource. For a complete field reference, see the VirtualMCPServer CRD specification.
Create an MCPGroup
Before creating a VirtualMCPServer, you need an MCPGroup to organize the backend MCP servers. An MCPGroup is a logical container that groups related MCPServer and MCPRemoteProxy resources together.
Create a basic MCPGroup:
```yaml
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPGroup
metadata:
  name: my-group
  namespace: toolhive-system
spec:
  description: Group of backend MCP servers for vMCP aggregation
```
The MCPGroup must exist in the same namespace as your VirtualMCPServer and be in
a Ready state before the VirtualMCPServer can start. Backend resources
reference this group using the groupRef field in their spec.
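Before moving on, you can confirm the group exists and has reached the Ready state (the exact output columns depend on the CRD's printer columns):

```shell
# List the MCPGroup in the namespace used above and check its state
kubectl get mcpgroup my-group -n toolhive-system
```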
Add backends to a group
vMCP supports two types of backends that can be added to an MCPGroup:
MCPServer (local containers)
MCPServer resources run container-based MCP servers in your cluster:
```yaml
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPServer
metadata:
  name: fetch
  namespace: toolhive-system
spec:
  groupRef: my-group # Reference to the MCPGroup
  image: ghcr.io/stackloklabs/gofetch/server
  transport: streamable-http
```
MCPRemoteProxy (remote servers)
MCPRemoteProxy resources proxy external remote MCP servers. They can be added to an MCPGroup for discovery by vMCP:
```yaml
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPRemoteProxy
metadata:
  name: context7-proxy
  namespace: toolhive-system
spec:
  groupRef: my-group # Reference to the MCPGroup
  remoteURL: https://mcp.context7.com/mcp
  transport: streamable-http
  port: 8080

  # Validate incoming requests
  oidcConfig:
    type: inline
    inline:
      issuer: https://auth.company.com
      audience: context7-proxy
```
vMCP can discover MCPRemoteProxy backends in a group, but authentication between vMCP and MCPRemoteProxy is not yet fully implemented. This limitation will be addressed in a future release. See Proxy remote MCP servers for details.
For complete MCPRemoteProxy configuration options, see Proxy remote MCP servers.
Create a VirtualMCPServer
At minimum, a VirtualMCPServer requires a reference to an MCPGroup (via
config.groupRef) and an authentication type:
```yaml
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: VirtualMCPServer
metadata:
  name: my-vmcp
  namespace: toolhive-system
spec:
  config:
    groupRef: my-group
  incomingAuth:
    type: anonymous # Disables authentication; do not use in production
```
The MCPGroup must exist in the same namespace and be in a Ready state before the VirtualMCPServer can start. By default, vMCP automatically discovers and aggregates all MCPServer and MCPRemoteProxy resources in the referenced group. You can also define backends explicitly in the configuration (inline mode). See Backend discovery modes for details on both approaches.
Configure authentication
vMCP uses a two-boundary authentication model: client-to-vMCP (incoming) and vMCP-to-backends (outgoing). See the Authentication guide for complete configuration options including anonymous, OIDC, and Kubernetes service account authentication.
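As an illustrative sketch only, incoming OIDC authentication might be configured as below. The `oidcConfig` shape is modeled on the MCPRemoteProxy example earlier in this guide, and the `type: oidc` value, issuer, and audience are placeholder assumptions; confirm the exact field names against the Authentication guide and the VirtualMCPServer CRD specification.

```yaml
spec:
  config:
    groupRef: my-group
  incomingAuth:
    type: oidc # assumed value; 'anonymous' (shown earlier) disables auth
    oidcConfig: # assumed field, modeled on the MCPRemoteProxy example
      type: inline
      inline:
        issuer: https://auth.company.com # placeholder issuer
        audience: my-vmcp # placeholder audience
```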
Expose the service
Choose how to expose the vMCP endpoint. The Service resource is created automatically on port 4483.
```yaml
spec:
  serviceType: ClusterIP # Default: cluster-internal (can be exposed via Ingress/Gateway)
  # serviceType: LoadBalancer # Direct external access via cloud load balancer
  # serviceType: NodePort # Direct external access via node ports
```
Service types:
- ClusterIP (default): For production, use with Ingress or Gateway API for controlled external access with TLS termination
- LoadBalancer: Direct external access via cloud provider's load balancer (simpler but less control)
- NodePort: Direct access via node ports (typically for development/testing)
The Service is named vmcp-<NAME>, where <NAME> is from metadata.name in
the VirtualMCPServer resource.
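With the default ClusterIP service type, external access typically goes through an Ingress. A minimal sketch, assuming an NGINX ingress controller and a placeholder hostname (the Service name and port follow the convention above for a VirtualMCPServer named my-vmcp):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: vmcp-ingress
  namespace: toolhive-system
spec:
  ingressClassName: nginx # assumes an NGINX ingress controller is installed
  rules:
    - host: vmcp.example.com # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: vmcp-my-vmcp # vmcp-<NAME> naming convention
                port:
                  number: 4483 # port created automatically by the operator
```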
Monitor status
Check the VirtualMCPServer status to verify it's ready:
```shell
kubectl get virtualmcpserver my-vmcp
```
Key status fields:
| Field | Description |
|---|---|
| phase | Current state (Pending, Ready, Degraded, Failed) |
| url | Service URL for client connections |
| backendCount | Number of discovered backend MCP servers |
| discoveredBackends | Details about each backend and its auth type |
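To read individual fields from the table above, kubectl's JSONPath output is convenient (these paths assume the fields are published under .status with the names shown):

```shell
# Current phase (Pending, Ready, Degraded, Failed)
kubectl get virtualmcpserver my-vmcp -n toolhive-system \
  -o jsonpath='{.status.phase}'

# Number of discovered backends
kubectl get virtualmcpserver my-vmcp -n toolhive-system \
  -o jsonpath='{.status.backendCount}'
```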
Operational configuration
Health checks
vMCP continuously monitors backend health to detect failures and route requests appropriately. Health check behavior is configurable via the VirtualMCPServer resource.
Health check configuration
Configure health monitoring in spec.config.operational.failureHandling:
```yaml
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: VirtualMCPServer
metadata:
  name: my-vmcp
  namespace: toolhive-system
spec:
  config:
    groupRef: my-group
    operational:
      failureHandling:
        # Health check interval (how often to check each backend)
        # Default: 30s
        healthCheckInterval: 30s
        # Health check timeout (max duration for a single check)
        # Should be less than healthCheckInterval
        # Default: 10s
        healthCheckTimeout: 10s
        # Number of consecutive failures before marking unhealthy
        # Default: 3
        unhealthyThreshold: 3
        # How often to report status updates to Kubernetes
        # Default: 30s
        statusReportingInterval: 30s
  incomingAuth:
    type: anonymous
```
Circuit breaker configuration
Circuit breakers prevent cascading failures by temporarily stopping requests to consistently failing backends. For detailed configuration, behavior, and troubleshooting, see Failure handling.
To enable circuit breaker:
```yaml
spec:
  config:
    operational:
      failureHandling:
        circuitBreaker:
          enabled: true
          failureThreshold: 5 # Number of failures before opening circuit
          timeout: 60s # How long to wait before attempting recovery
```
Timeouts
Configure timeouts for backend requests:
```yaml
spec:
  config:
    operational:
      timeouts:
        # Default timeout for all backend requests (default: 30s)
        default: 30s
        # Per-workload timeout overrides
        perWorkload:
          slow-backend: 60s
          fast-backend: 10s
```
Health check timeouts are configured separately via
failureHandling.healthCheckTimeout (default: 10s), not via the timeouts
section.
Remote workload health checks
By default, health checks are:
- Always enabled for local backends (MCPServer)
- Disabled by default for remote backends (MCPRemoteProxy)
To enable health checks for remote workloads, set the
TOOLHIVE_REMOTE_HEALTHCHECKS environment variable in the vMCP pod:
```yaml
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: VirtualMCPServer
metadata:
  name: my-vmcp
spec:
  podTemplateSpec:
    spec:
      containers:
        - name: vmcp
          env:
            - name: TOOLHIVE_REMOTE_HEALTHCHECKS
              value: 'true'
```
For detailed backend health monitoring, see Verify backend status in the Backend discovery guide.
Next steps
- Review performance and sizing guidance for resource planning
- Discover your deployed MCP servers automatically using the Kubernetes registry feature in the ToolHive Registry Server