TL;DR
LangGraph handles the workflow orchestration (nodes, edges, state, control flow).
LangChain/LCEL handles individual steps (prompt composition, model calls, output parsing).
Events are runtime signals (tool_failed, human_approved, etc.) that edges use to decide what happens next.
Use LangGraph when you need loops, retries, human approvals, or state persistence. Use LCEL alone for simple, one-shot prompt→parse flows.
Why LangGraph Exists
Modern AI apps aren't single prompts; they're workflows: plan → call tools → validate → branch → maybe ask a human → retry → finish. LangGraph treats this as a graph with explicit nodes and edges, plus a durable state you can pause/resume, audit, and recover. That's hard to bolt onto simple chains.
Think of your agent as a workflow with multiple decision points, not just a linear chain. You need explicit control flow, state management, and the ability to handle complex scenarios like retries, approvals, and recovery.
The Four Primitives (and How Events Fit)
1. Graph
Your app's topology—named nodes and edges that form linear paths, branches, and loops. The graph makes control flow explicit and testable.
The graph structure allows you to visualize and reason about your agent's flow. You can see exactly where decisions are made and how state flows through the system.
2. State
A persisted, typed dictionary that flows through the graph (e.g., messages, plan, observations, tool_calls). Each node reads state and returns partial updates; reducers merge those updates deterministically, and checkpointing persists the result for durability.
State is the memory of your agent. It accumulates information as the workflow progresses, and can be checkpointed at any point for recovery or auditing.
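As a minimal sketch (the field names mirror the examples above, and the choice of `operator.add` as the reducer is an illustrative assumption, not a requirement), a state schema might look like this:

```python
import operator
from typing import Annotated, TypedDict

class AgentState(TypedDict):
    # Annotated reducers tell LangGraph how to merge each node's partial update:
    # here, list fields are appended to instead of being overwritten.
    messages: Annotated[list, operator.add]
    observations: Annotated[list, operator.add]
    # Fields without a reducer are simply replaced by the latest update.
    plan: str | None
```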
3. Node
A unit of work: planner, tool call, validator, router, human-approval gate, etc. Internally, a node can happily use LangChain/LCEL (prompt → model → parser) while LangGraph provides the orchestration and persistence around it.
This is where LCEL shines—inside each node, you can use LangChain's expressive language for prompt composition, model calling, and output parsing. LangGraph handles the orchestration around it.
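For illustration, a planner-style node might wrap an LCEL chain and return only the keys it changes; the prompt text and model name below are placeholders, not prescribed values:

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# LCEL covers the prompt -> model -> parser step inside the node.
plan_chain = (
    ChatPromptTemplate.from_template("Write a step-by-step plan for: {request}")
    | ChatOpenAI(model="gpt-4o-mini")
    | StrOutputParser()
)

def planner_node(state: dict) -> dict:
    # Read what you need from state and return only a partial update;
    # LangGraph merges it into the shared state.
    plan = plan_chain.invoke({"request": state["messages"][-1]})
    return {"plan": plan}
```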
4. Edges
Explicit control rules: direct edges (A→B), conditional edges (guards), and loop edges. Edges examine state (and recent events) to decide where to go next—this is how you get predictable routing, retries, and recovery.
Edges are your decision logic. They look at the current state and recent events to determine the next step. This makes your agent's behavior predictable and debuggable.
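Here is a small, self-contained sketch of a conditional edge plus a loop; the node names and the `draft`/`valid` fields are invented for illustration:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

class State(TypedDict, total=False):
    draft: str
    valid: bool

def validate(state: State) -> dict:
    # A node records what happened; downstream edges read it from state.
    return {"valid": bool(state.get("draft"))}

def route_after_validate(state: State) -> str:
    # The conditional edge inspects state and names the next hop.
    return END if state["valid"] else "reviser"

workflow = StateGraph(State)
workflow.add_node("validate", validate)
workflow.add_node("reviser", lambda s: {"draft": "revised draft"})
workflow.set_entry_point("validate")
workflow.add_conditional_edges("validate", route_after_validate)
workflow.add_edge("reviser", "validate")   # loop back for another check
app = workflow.compile()

app.invoke({"draft": ""})   # routes: validate -> reviser -> validate -> END
```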
Where "Events" Fit
Events are runtime signals—tool_succeeded, validation_failed, human_approved, timeout_fired. They're not a separate modeling primitive; they're inputs that nodes emit and edges can consult (with state) to branch, loop, or pause/resume the run.
Example: When a tool fails, an edge routes to a retry node with backoff based on the tool_failed event.
Example: When approval is requested, the runner pauses until a human_approved event arrives.
Why Not "Just LangChain" (or LCEL) for This
Deterministic Control
Complex branching/loops and guarded routes are much clearer in a graph than nested callbacks. You can visually see the flow and reason about edge cases.
Durability & Observability
You need checkpoints, resume after failure, and auditable steps/events; that's first-class in LangGraph's stateful model. LCEL chains are stateless by design.
Human-in-the-Loop
Clean pauses on events and resumes with updated state are fundamental—not an add-on. This is built into LangGraph's architecture.
LCEL (LangChain Expression Language) shines for single-pass, mostly linear flows (prompt → model → parser). It gets unwieldy for multi-turn agents, tool loops, timeouts/backoff, and partial progress with resume—i.e., the exact problems a stateful graph solves.
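For contrast, a single-pass LCEL pipeline needs none of that machinery. This sketch assumes an OpenAI API key is configured and uses a placeholder model name:

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# One pass: prompt -> model -> parser. No state, no routing, no resume.
chain = (
    ChatPromptTemplate.from_template("Summarize in one sentence:\n\n{text}")
    | ChatOpenAI(model="gpt-4o-mini")
    | StrOutputParser()
)
summary = chain.invoke({"text": "LangGraph orchestrates; LCEL composes the steps."})
```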
When to Use Which
Choose LangGraph when you need:
- Multi-step agents (plan→act→observe→reflect) with guarded routing and loops
- Durability (checkpoints), resumability, and human gates tied to events
- Retries/backoff, idempotent recovery, and time-boxed tasks
Choose LangChain/LCEL alone when you need:
- A straight, stateless pipeline (RAG answer with a parser, one-shot transforms)
- A fast build with no branching required
- Simple prompt → model → parse workflows
Best Practice—Combine Them:
Use LangGraph for the Graph + State + Edges (the workflow brain) and to react to events.
Implement each Node with LCEL/LangChain for ergonomic prompt/model/parsing logic.
You get step-level developer speed with workflow-level reliability.
Putting It Together: A Blueprint You Can Reuse
1. Initialize State
Seed messages, plan, and context (e.g., user query, retrieved docs). Checkpoint.
state = {"messages": [user_query], "plan": None, "observations": [], "context": retrieved_docs}
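A minimal sketch of checkpointing and thread-scoped state, using the in-memory saver that ships with LangGraph; the single `echo` node and the field names are placeholders:

```python
from typing import TypedDict
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, END

class State(TypedDict):
    messages: list

workflow = StateGraph(State)
workflow.add_node("echo", lambda s: {"messages": s["messages"] + ["ack"]})
workflow.set_entry_point("echo")
workflow.add_edge("echo", END)

# In-memory checkpointer for development; production setups typically swap in
# a database-backed saver so runs survive process restarts.
app = workflow.compile(checkpointer=MemorySaver())

# Every call under the same thread_id reads and persists the same state.
config = {"configurable": {"thread_id": "run-42"}}
app.invoke({"messages": ["What changed in Q3 revenue?"]}, config)
```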
2. Planner Node
Produces a plan (LCEL inside), emits a plan_ready event, and updates state. Edge guard: if a risky action is detected → route to the approval node.
Inside the planner node, use LCEL to compose prompts, call the model, and parse the plan. LangGraph handles routing based on the plan content.
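One way to express the guard is a router function over state; the keyword heuristic and the node names below are purely illustrative:

```python
RISKY_KEYWORDS = ("delete", "drop table", "wire transfer")

def route_after_planner(state: dict) -> str:
    # Edge guard: risky plans detour through the human gate; safe ones act directly.
    plan = (state.get("plan") or "").lower()
    return "approval" if any(word in plan for word in RISKY_KEYWORDS) else "act"

# Wiring, assuming nodes named "planner", "approval", and "act" exist:
# workflow.add_conditional_edges("planner", route_after_planner)
```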
3. Approval Node (Human-in-the-Loop)
Pauses on approval_requested; resumes on human_approved/human_rejected event. Edge chooses proceed/cancel.
This is where LangGraph's event-driven pause/resume shines. The workflow can pause indefinitely, waiting for human input, then resume exactly where it left off.
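One way to wire the pause is LangGraph's interrupt-before-node option plus `update_state` to record the human's decision. This is a compressed sketch with throwaway lambda nodes; real planners and actors would be full LCEL-backed nodes:

```python
from typing import TypedDict
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, END

class State(TypedDict, total=False):
    messages: list
    approved: bool

workflow = StateGraph(State)
workflow.add_node("planner", lambda s: {"messages": s["messages"] + ["plan: archive old rows"]})
workflow.add_node("approval", lambda s: {})   # the decision is written in from outside
workflow.add_node("act", lambda s: {"messages": s["messages"] + ["done"]})
workflow.set_entry_point("planner")
workflow.add_edge("planner", "approval")
workflow.add_conditional_edges("approval", lambda s: "act" if s.get("approved") else END)
workflow.add_edge("act", END)

# interrupt_before pauses the run just before the approval node;
# the checkpoint preserves state while we wait for a human.
app = workflow.compile(checkpointer=MemorySaver(), interrupt_before=["approval"])
config = {"configurable": {"thread_id": "ticket-7"}}

app.invoke({"messages": ["archive last year's records?"]}, config)   # runs planner, then pauses

# Later, when the human decides, write the decision into state and resume.
app.update_state(config, {"approved": True})
app.invoke(None, config)   # passing None resumes the paused run from its checkpoint
```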
4. Act Node (Tools)
Executes a tool call (LCEL/tool wrapper). On tool_failed, edge loops back with exponential backoff; on tool_succeeded, continue. State accumulates observations.
The retry logic is handled by edges examining events and state. The node just executes the tool and emits an event. LangGraph handles the retry orchestration.
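A sketch of that split, with a stand-in `run_tool` function and an `attempts` counter kept in state (both are assumptions for illustration):

```python
import time

MAX_ATTEMPTS = 3

def run_tool(plan: str) -> str:
    # Stand-in for a real tool wrapper (HTTP call, SQL query, search, ...).
    return f"result for: {plan}"

def act_node(state: dict) -> dict:
    attempts = state.get("attempts", 0)
    try:
        result = run_tool(state["plan"])
        return {"observations": state.get("observations", []) + [result],
                "last_event": "tool_succeeded"}
    except Exception as exc:
        # Back off before the edge loops us around for another try.
        time.sleep(2 ** attempts)
        return {"attempts": attempts + 1, "last_event": "tool_failed", "error": str(exc)}

def route_after_act(state: dict) -> str:
    # Retry on failure until the budget is spent; otherwise move on to validation.
    if state["last_event"] == "tool_failed":
        return "act" if state["attempts"] < MAX_ATTEMPTS else "finish"
    return "validate"

# Wiring: workflow.add_conditional_edges("act", route_after_act)
```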
5. Validate Node
Checks outputs (schema/guardrails). If validation_failed, edge loops to planner or act; else finish. Checkpoint.
Validation can be code-based (schema checks) or LLM-based (semantic validation using LCEL). The edge decides what to do next based on the validation result.
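A code-based variant might check the latest observation against a schema; the `Report` model, its fields, and the event names are invented for illustration:

```python
from pydantic import BaseModel, ValidationError

class Report(BaseModel):
    title: str
    total: float

def validate_node(state: dict) -> dict:
    try:
        Report.model_validate(state["observations"][-1])
        return {"last_event": "validation_passed"}
    except (ValidationError, IndexError, TypeError) as exc:
        return {"last_event": "validation_failed", "error": str(exc)}

def route_after_validate(state: dict) -> str:
    # Loop back to the planner on failure; otherwise proceed to the finish node.
    return "planner" if state["last_event"] == "validation_failed" else "finish"
```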
6. Finish Node
Collates final messages/result from state and returns.
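A finish node can be as simple as folding the accumulated observations into a final message (field names as in the sketches above):

```python
def finish_node(state: dict) -> dict:
    # Collate the final answer from whatever the run accumulated.
    summary = "\n".join(str(obs) for obs in state.get("observations", []))
    return {"messages": state.get("messages", []) + [summary]}
```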
This pattern handles routing, retries, human approvals, and resume cleanly—because each concern is represented explicitly as nodes, edges, and state, with events driving the transitions.
A Quick Decision Table
| Situation | Pick | Why |
|---|---|---|
| One-shot prompt → model → parse | LCEL | Minimal plumbing; no branching needed. |
| Agent with tools, retries, and approvals | LangGraph | Graph-level guards, checkpoints, events, and loops. |
| Complex workflow, ergonomic step logic | Both (LangGraph + LCEL) | LCEL inside nodes; LangGraph for orchestration. |
Takeaway
1. Model the workflow (Graph/Nodes/Edges) and persist the state; let events be the signals that move you through the graph.
2. Keep step internals ergonomic with LCEL, but avoid using it to simulate orchestration.
3. For reliability, recovery, and human-in-the-loop, LangGraph is the right abstraction, with LCEL as the implementation detail inside nodes.
Essential Tools for Building Agents
LangGraph
The orchestration framework for building stateful, reliable agent workflows with explicit control flow.
- Graph-based workflow definition
- State management with checkpointing
- Event-driven control flow
- Human-in-the-loop support
- Built-in retry and recovery mechanisms
LangChain & LCEL
Use inside nodes for prompt composition, model calling, and output parsing.
- Expressive prompt chaining
- Model abstraction (OpenAI, Anthropic, etc.)
- Output parsers and structured extraction
- Tool integration
- Memory and context management
Additional Tools
Observability:
- Langfuse for trace logging
- LangSmith for debugging
- Custom dashboards
State Storage:
- In-memory (development)
- PostgreSQL (production)
- Redis for caching
Quick Start Example
Basic Agent Structure:
from typing import TypedDict

from langgraph.graph import StateGraph, END
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Define state
class AgentState(TypedDict):
    messages: list
    plan: str | None
    observations: list

# Create graph
workflow = StateGraph(AgentState)

# Add nodes (use LCEL inside)
def planner_node(state: AgentState):
    prompt = ChatPromptTemplate.from_template("{input}")
    llm = ChatOpenAI()
    chain = prompt | llm
    plan = chain.invoke({"input": state["messages"][-1]})
    return {"plan": plan.content}

workflow.add_node("planner", planner_node)
workflow.add_edge("planner", END)
workflow.set_entry_point("planner")

# Compile and run
app = workflow.compile()

This is a minimal example. In production, you'd add more nodes, conditional edges, error handling, and state checkpointing.