Your development environment is the foundation of everything you build. A poorly configured AI dev environment means slow iteration cycles, dependency conflicts, and constantly fighting your tools. This guide sets up everything right the first time.
Python Environment Management
The single biggest source of AI development pain is Python environment conflicts. The solution: uv (by Astral) replaces pip, virtualenv, and pyenv with a single fast tool.
# Install uv (single binary; replaces pip, virtualenv, and pyenv)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Create a new AI project
uv init my-ai-project
cd my-ai-project
# Install core AI dependencies
uv add langchain langchain-community openai anthropic
uv add chromadb sentence-transformers
uv add fastapi uvicorn pydantic
uv add python-dotenv rich loguru
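A quick smoke test confirms the environment resolves. This is a minimal sketch; the check_env.py filename is arbitrary:
# check_env.py: run with `uv run python check_env.py`
import langchain, chromadb, fastapi
print("langchain", langchain.__version__)
print("chromadb", chromadb.__version__)
print("fastapi", fastapi.__version__)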
Core Framework Choices
LLM Orchestration
- LangGraph — for stateful agent workflows with branching and human-in-the-loop. Best for production agents.
- LangChain — for chains, RAG pipelines, and tool use. More mature ecosystem.
- Direct API (Anthropic/OpenAI SDK) — when your workflow is simple enough that a framework adds complexity without benefit.
from typing import TypedDict
from langgraph.graph import StateGraph, END
from langchain_anthropic import ChatAnthropic
llm = ChatAnthropic(model="claude-sonnet-4-6")
# LangGraph expects an annotated state schema, so use a TypedDict rather than a bare dict
class AgentState(TypedDict):
    messages: list
def agent_node(state: AgentState) -> AgentState:
    # Call the model with the conversation so far and append its reply
    response = llm.invoke(state["messages"])
    return {"messages": state["messages"] + [response]}
graph = StateGraph(AgentState)
graph.add_node("agent", agent_node)
graph.set_entry_point("agent")
graph.add_edge("agent", END)
app = graph.compile()
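Invoking the compiled graph is an ordinary function call. A minimal usage sketch (the prompt text is just an illustration):
from langchain_core.messages import HumanMessage
result = app.invoke({"messages": [HumanMessage(content="Give me three test ideas for a RAG pipeline.")]})
print(result["messages"][-1].content)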
Vector Database
- Chroma — local, in-process, zero config. Best for development and small-to-medium datasets (sketch after this list).
- Weaviate — self-hosted or cloud, excellent for production. Hybrid search (vector + keyword) out of the box.
- Pinecone — managed cloud vector DB. Use when you don't want to manage infrastructure.
- pgvector — if you're already on Postgres, adding vector search is one extension away.
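For the Chroma option above, the zero-config workflow looks roughly like this. A sketch with made-up collection and document contents; note the default embedding function downloads a small local model on first use:
import chromadb
client = chromadb.Client()  # in-process and ephemeral; nothing to run or configure
collection = client.create_collection("docs")
collection.add(
    ids=["doc-1", "doc-2"],
    documents=["uv replaces pip and virtualenv.", "LangSmith traces every chain call."],
)
results = collection.query(query_texts=["how do I trace calls?"], n_results=1)
print(results["documents"])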
Observability (Non-Negotiable)
uv add langsmith
# .env file
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=your_key_here
LANGCHAIN_PROJECT=my-ai-project
# Now every LangChain call is automatically traced
# View at: smith.langchain.com
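LangChain calls are traced automatically. For code outside the framework, the langsmith decorator does the same job; a minimal sketch with a hypothetical summarize helper:
from langsmith import traceable
from langchain_anthropic import ChatAnthropic
llm = ChatAnthropic(model="claude-sonnet-4-6")
@traceable  # inputs, outputs, and latency land in the LANGCHAIN_PROJECT set above
def summarize(text: str) -> str:
    return llm.invoke(f"Summarize in one sentence: {text}").content
print(summarize("LangSmith records every model call as a trace."))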
IDE Setup
VS Code Extensions (Must-Have)
- GitHub Copilot — baseline AI code completion
- Python (Microsoft) — language server, debugging, Jupyter
- Even Better TOML — pyproject.toml editing
- REST Client — test API endpoints directly in the editor
- GitLens — enhanced git history and blame
Alternative: Cursor IDE — a VS Code fork with deeper AI integration: tab completions, agent mode, and inline chat. Currently one of the fastest AI-assisted coding experiences available.
Environment Variables Pattern
# .env file (add it to .gitignore; never commit keys)
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
LANGCHAIN_API_KEY=ls__...
POSTGRES_URL=postgresql://...
# Load in Python
from dotenv import load_dotenv
load_dotenv()  # reads .env from the working directory into os.environ
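One habit worth adding: read required keys through os.environ so a missing key fails at startup rather than mid-request. A small sketch:
import os
from dotenv import load_dotenv
load_dotenv()
# KeyError at startup is easier to debug than an auth failure deep inside a chain
anthropic_key = os.environ["ANTHROPIC_API_KEY"]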
Testing AI Applications
Testing LLM applications requires a different approach from traditional software testing:
- Unit tests: Mock LLM calls for fast CI. Test tool logic, routing logic, and output parsing independently (see the sketch after this list).
- Eval suite: 20–50 real examples with expected outputs. Run against every version before deploying.
- LangSmith evals: Run evaluations at scale using LLM-as-judge for subjective quality metrics.
- Adversarial testing: Inputs designed to break the agent — test edge cases, jailbreaks, and unexpected inputs.
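As a concrete version of the unit-test bullet above, here is a pytest sketch that exercises agent_node from the LangGraph example without a network call; the my_agent module name is hypothetical:
from unittest.mock import MagicMock
import my_agent  # hypothetical module holding the `llm` and `agent_node` defined earlier
def test_agent_node_appends_response(monkeypatch):
    # Stand in for ChatAnthropic: the node only needs an object with .invoke()
    fake_llm = MagicMock()
    fake_llm.invoke.return_value = "stubbed answer"
    monkeypatch.setattr(my_agent, "llm", fake_llm)
    result = my_agent.agent_node({"messages": ["What is our refund policy?"]})
    assert result["messages"][-1] == "stubbed answer"
    fake_llm.invoke.assert_called_once()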
The fastest way to improve an agent: Add tracing (LangSmith) on day 1. Review the traces every morning for the first two weeks. The traces will show you exactly where the agent fails and what to fix. This beats any amount of offline testing.