LangGraph Cloud: Managed Hosting Explained

Updated May 2026

LangGraph Cloud, rebranded to LangSmith Deployment in late 2025, is a managed hosting platform for running LangGraph agent applications in production. It provides horizontally scalable infrastructure, built-in task queues, persistent state management, and zero-maintenance updates, allowing teams to deploy agents without building their own infrastructure from scratch.

What LangGraph Cloud Provides

At its core, LangGraph Cloud solves the operational challenge of running stateful AI agents at scale. A production agent system needs more than just the LangGraph framework running on a server. It needs task queues to handle incoming requests, horizontal scaling to manage traffic spikes, persistent storage for checkpoints and memory, monitoring for health and performance, and deployment tooling for shipping updates without downtime.

LangGraph Cloud bundles all of these operational requirements into a managed platform. You push your LangGraph code, and the platform handles everything needed to run it reliably in production. This includes provisioning compute, managing the task queue, persisting checkpoints, scaling up during high traffic, and rolling back if a deployment fails.

The Rebranding: LangGraph Cloud to LangSmith Deployment

In October 2025, alongside the LangGraph 1.0 release, LangChain restructured its product naming. LangGraph Cloud became LangSmith Deployment, reflecting the fact that the deployment infrastructure is part of the broader LangSmith platform rather than a standalone product. The underlying technology and capabilities remained the same. In practice, many developers still refer to it as LangGraph Cloud, and both names appear in documentation and community discussions.

Deployment Options

Cloud SaaS

The fully managed option runs your agents on LangChain's infrastructure. You deploy through the CLI or API, and LangChain handles all server management, scaling, and updates. This is the fastest path to production because there is no infrastructure to configure. The trade-off is that your data, including agent state, conversation history, and tool call results, passes through LangChain's servers. For many applications this is acceptable, but teams with strict data residency or compliance requirements may need one of the other options.

Bring Your Own Cloud (BYOC)

BYOC runs the LangGraph platform within your own cloud account on AWS, GCP, or Azure. LangChain manages the software layer, deploying updates and managing the platform components, but all compute and storage resources live in your VPC. Your data never leaves your cloud environment. This satisfies most compliance requirements while still offloading operational management to LangChain.

BYOC requires granting LangChain limited access to your cloud account for managing the deployment. The access is scoped to the specific resources used by the platform and can be audited through your cloud provider's access logs.

Self-Hosted Enterprise

The fully self-hosted option deploys the complete LangGraph platform on your own infrastructure with no LangChain involvement in operations. You receive the platform software and manage installation, updates, scaling, and monitoring yourself. This provides maximum control and data isolation but requires significant internal DevOps expertise.

Self-Hosted Lite (Free)

Self-Hosted Lite is a free, limited version of the platform suitable for development and small-scale production workloads. It supports up to 1 million node executions and provides the core deployment capabilities without the advanced features of the paid tiers. This option lets teams evaluate the platform before committing to a paid plan.

Infrastructure Features

Horizontal Scaling

The platform automatically scales compute resources based on incoming request volume. During traffic spikes, additional instances are provisioned to handle the load. During quiet periods, resources scale back down to reduce costs. This elasticity is essential for agent applications with variable traffic patterns, such as customer support systems that see peak usage during business hours.
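The scaling behavior described above can be pictured with a small sketch. This is not the platform's actual autoscaling algorithm, which the article does not specify; it is a minimal illustration that assumes a simple rule (backlog divided by per-instance capacity, clamped to a range). The function name, parameters, and limits are all hypothetical.

```python
import math


def desired_instances(pending_requests: int,
                      per_instance_capacity: int,
                      min_instances: int = 1,
                      max_instances: int = 10) -> int:
    """Return how many instances to run for the current backlog.

    Hypothetical rule: enough instances to cover the backlog,
    never fewer than min_instances or more than max_instances.
    """
    if per_instance_capacity <= 0:
        raise ValueError("per_instance_capacity must be positive")
    needed = math.ceil(pending_requests / per_instance_capacity)
    return max(min_instances, min(max_instances, needed))


# Quiet period, moderate load, and a spike that hits the ceiling.
for load in (0, 25, 120, 5000):
    print(load, "pending ->", desired_instances(load, per_instance_capacity=20), "instances")
```

The clamp is the part worth noticing: a floor keeps the service responsive when traffic returns, and a ceiling bounds cost during a spike.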

Task Queues

Incoming agent requests are managed through a built-in task queue that provides reliable delivery, retry handling, and priority scheduling. The queue ensures that requests are not lost during traffic spikes and that long-running agent tasks do not block other requests. Tasks that fail are automatically retried according to configurable policies.
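To make the queue behavior concrete, here is a minimal in-memory sketch of priority scheduling with bounded retries. It is an illustration of the general technique, not the platform's queue implementation; the class names, the max_attempts setting, and the dead_letter list are assumptions introduced for the example.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Callable


@dataclass(order=True)
class Task:
    priority: int                                  # lower number is served first
    seq: int                                       # tie-breaker: FIFO within a priority
    name: str = field(compare=False)
    run: Callable[[], str] = field(compare=False)
    attempts: int = field(default=0, compare=False)


class RetryingQueue:
    """Hypothetical priority queue that retries failed tasks a fixed number of times."""

    def __init__(self, max_attempts: int = 3):
        self._heap: list[Task] = []
        self._counter = itertools.count()
        self.max_attempts = max_attempts
        self.results: dict[str, str] = {}
        self.dead_letter: list[str] = []           # tasks that exhausted their retries

    def submit(self, name: str, run: Callable[[], str], priority: int = 10) -> None:
        heapq.heappush(self._heap, Task(priority, next(self._counter), name, run))

    def drain(self) -> None:
        while self._heap:
            task = heapq.heappop(self._heap)
            task.attempts += 1
            try:
                self.results[task.name] = task.run()
            except Exception:
                if task.attempts < self.max_attempts:
                    task.seq = next(self._counter)  # requeue behind same-priority peers
                    heapq.heappush(self._heap, task)
                else:
                    self.dead_letter.append(task.name)


def make_flaky(fail_times: int) -> Callable[[], str]:
    """Build a task that raises on its first fail_times calls, then succeeds."""
    state = {"calls": 0}

    def run() -> str:
        state["calls"] += 1
        if state["calls"] <= fail_times:
            raise RuntimeError("transient failure")
        return f"ok after {state['calls']} calls"

    return run


q = RetryingQueue(max_attempts=3)
q.submit("report", make_flaky(2), priority=10)   # succeeds on the third attempt
q.submit("urgent", make_flaky(0), priority=1)    # highest priority, succeeds at once
q.submit("broken", make_flaky(5), priority=5)    # never succeeds within three attempts
q.drain()
print(q.results)
print(q.dead_letter)
```

A real queue would also add a delay between retries and persist tasks so they survive a restart; both are left out here to keep the sketch short.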

Long-Running Agent Support

Unlike typical web request handlers, which often time out after roughly 30 seconds, LangGraph Cloud is designed to support agents that run for minutes or even hours. The platform maintains persistent connections and checkpoints agent state continuously, so even agents with long thinking and tool-calling sequences complete reliably.

Double Texting Handling

The platform handles the scenario where a user sends a follow-up message while the agent is still processing the previous one. Configurable policies let you choose whether to queue the new message, interrupt the current execution, or reject the new input until the agent finishes.
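The three policies named above can be modeled in a few lines. This is a simplified simulation rather than the platform's implementation or its configuration syntax; the Policy enum, the Thread class, and the returned status strings are hypothetical names chosen for the example.

```python
from collections import deque
from enum import Enum
from typing import Optional


class Policy(Enum):
    ENQUEUE = "enqueue"        # hold the new message until the current run finishes
    INTERRUPT = "interrupt"    # cancel the current run and start on the new message
    REJECT = "reject"          # refuse new input while a run is active


class Thread:
    """Hypothetical conversation thread that applies one double-texting policy."""

    def __init__(self, policy: Policy):
        self.policy = policy
        self.active: Optional[str] = None   # message currently being processed
        self.pending: deque[str] = deque()  # messages waiting their turn
        self.cancelled: list[str] = []      # runs that were interrupted

    def receive(self, message: str) -> str:
        if self.active is None:
            self.active = message
            return "started"
        if self.policy is Policy.ENQUEUE:
            self.pending.append(message)
            return "queued"
        if self.policy is Policy.INTERRUPT:
            self.cancelled.append(self.active)
            self.active = message
            return "interrupted previous run"
        return "rejected"

    def finish(self) -> None:
        """Mark the active run complete and start the next queued message, if any."""
        self.active = self.pending.popleft() if self.pending else None


outcomes = {}
for policy in Policy:
    thread = Thread(policy)
    thread.receive("What is my order status?")
    outcomes[policy.value] = thread.receive("Actually, cancel the order.")
print(outcomes)
```

Which policy fits depends on the product: a support bot usually wants the latest message to win, while a workflow that triggers side effects may be safer rejecting or queueing new input.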

Deployment Workflow

Deploying to LangGraph Cloud follows a straightforward workflow. You define your graph in code as you would for any LangGraph application. You create a langgraph.json configuration file that specifies the graph entry point, environment variables, and dependencies. You then deploy using the langgraph deploy CLI command (which superseded langgraph up in March 2026) or through the LangSmith web interface.
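As an illustration of the configuration step, the sketch below builds a minimal langgraph.json and runs a basic sanity check on it before deployment. The dependencies, graphs, and env keys follow the commonly documented layout of that file, but the graph name, module path, and .env filename are placeholders, and the validate function is a hypothetical pre-flight helper, not part of the LangGraph CLI.

```python
import json

# Placeholder values: adjust the path and variable name to match your project.
config = {
    "dependencies": ["."],                              # install the current package
    "graphs": {"agent": "./my_agent/graph.py:graph"},   # "<file path>:<variable name>"
    "env": ".env",                                      # file holding environment variables
}


def validate(cfg: dict) -> list[str]:
    """Return a list of problems found in the config (empty means it looks sane)."""
    problems = []
    if not cfg.get("dependencies"):
        problems.append("missing 'dependencies'")
    graphs = cfg.get("graphs") or {}
    if not graphs:
        problems.append("missing 'graphs'")
    for name, target in graphs.items():
        path, sep, attr = target.rpartition(":")
        if not sep or not path or not attr:
            problems.append(f"graph '{name}' should look like 'path/to/file.py:variable'")
    return problems


print(json.dumps(config, indent=2))   # the text you would save as langgraph.json
print(validate(config))
```

Catching a malformed entry point locally is cheaper than discovering it after a container build fails.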

The platform builds a container from your code, runs health checks, and promotes it to production. If the deployment fails or the health checks do not pass, the platform automatically rolls back to the previous working version. Zero-downtime deployments are the default behavior.
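The promote-or-roll-back behavior described above follows a familiar pattern: a new revision only takes over once it passes its health check, and otherwise the previous revision keeps serving. The sketch below illustrates that pattern in isolation; the Deployer class and its health_check callback are hypothetical and do not represent the platform's internals.

```python
from typing import Callable, Optional


class Deployer:
    """Hypothetical deployer: a revision goes live only if its health check passes."""

    def __init__(self) -> None:
        self.live: Optional[str] = None     # revision currently serving traffic
        self.history: list[str] = []

    def deploy(self, revision: str, health_check: Callable[[str], bool]) -> bool:
        previous = self.live
        try:
            healthy = health_check(revision)
        except Exception:
            healthy = False                 # a crashing check counts as unhealthy
        if healthy:
            self.live = revision            # switch traffic to the new revision
            self.history.append(f"promoted {revision}")
            return True
        # The previous revision was never taken down, so there is no downtime.
        self.history.append(f"rejected {revision}, still serving {previous}")
        return False


deployer = Deployer()
deployer.deploy("v1", health_check=lambda rev: True)
deployer.deploy("v2", health_check=lambda rev: False)   # fails its health check
print(deployer.live)
print(deployer.history)
```

Keeping the old revision running until the new one is verified is what makes the rollback automatic: nothing needs to be restored because nothing was removed.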

Integration with LangGraph Studio

LangGraph Studio connects directly to deployed agents on the platform, providing the same visual debugging and time-travel capabilities available during local development. You can inspect production agent runs, replay them from any checkpoint, and test changes before deploying them. Because the same tooling works in both development and production, the path from diagnosing a problem to shipping a fix is shorter.

When to Use LangGraph Cloud vs Self-Hosting

LangGraph Cloud makes sense when your team wants to focus on agent development rather than infrastructure management, when you need reliable scaling without building it yourself, or when you want the convenience of managed deployments and monitoring. For many teams, the platform cost is modest compared with the engineering time needed to build and operate equivalent infrastructure.

Self-hosting makes sense when you have strict data sovereignty requirements, when your team already has strong DevOps capabilities, when your usage scale makes the per-execution pricing uneconomical, or when you need deep customization of the infrastructure layer that the managed platform does not support.

Key Takeaway

LangGraph Cloud (LangSmith Deployment) provides managed infrastructure for production agent systems with horizontal scaling, task queues, and zero-downtime deployments. Choose between fully managed, BYOC, and self-hosted options based on your data requirements and operational maturity.