Your development environment is the foundation of everything you build. A poorly configured AI dev environment means slow iteration cycles, dependency conflicts, and constantly fighting your tools. This guide sets up everything right the first time.
Python Environment Management
The single biggest source of AI development pain is Python environment conflicts. The solution: uv (by Astral) replaces pip, virtualenv, and pyenv with a single fast tool.
# Install uv (single binary; replaces pip, virtualenv, and pyenv)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Create a new AI project
uv init my-ai-project
cd my-ai-project
# Install core AI dependencies
uv add langchain langchain-community openai anthropic
uv add chromadb sentence-transformers
uv add fastapi uvicorn pydantic
uv add python-dotenv rich loguru
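A quick smoke test confirms the environment resolves. This is a minimal sketch; the check_env.py filename is arbitrary:
# check_env.py: run with `uv run python check_env.py`
import langchain, chromadb, fastapi
print("langchain", langchain.__version__)
print("chromadb", chromadb.__version__)
print("fastapi", fastapi.__version__)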
Core Framework Choices
LLM Orchestration
- LangGraph — for stateful agent workflows with branching and human-in-the-loop. Best for production agents.
- LangChain — for chains, RAG pipelines, and tool use. More mature ecosystem.
- Direct API (Anthropic/OpenAI SDK) — when your workflow is simple enough that a framework adds complexity without benefit.
from typing import TypedDict
from langgraph.graph import StateGraph, END
from langchain_anthropic import ChatAnthropic
llm = ChatAnthropic(model="claude-sonnet-4-6")
# LangGraph expects an annotated state schema, so use a TypedDict rather than a bare dict
class AgentState(TypedDict):
    messages: list
def agent_node(state: AgentState) -> AgentState:
    # Call the model with the conversation so far and append its reply
    response = llm.invoke(state["messages"])
    return {"messages": state["messages"] + [response]}
graph = StateGraph(AgentState)
graph.add_node("agent", agent_node)
graph.set_entry_point("agent")
graph.add_edge("agent", END)
app = graph.compile()
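Invoking the compiled graph is an ordinary function call. A minimal usage sketch (the prompt text is just an illustration):
from langchain_core.messages import HumanMessage
result = app.invoke({"messages": [HumanMessage(content="Give me three test ideas for a RAG pipeline.")]})
print(result["messages"][-1].content)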
Vector Database
- Chroma — local, in-process, zero config. Best for development and small-to-medium datasets (sketch after this list).
- Weaviate — self-hosted or cloud, excellent for production. Hybrid search (vector + keyword) out of the box.
- Pinecone — managed cloud vector DB. Use when you don't want to manage infrastructure.
- pgvector — if you're already on Postgres, adding vector search is one extension away.
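For the Chroma option above, the zero-config workflow looks roughly like this. A sketch with made-up collection and document contents; note the default embedding function downloads a small local model on first use:
import chromadb
client = chromadb.Client()  # in-process and ephemeral; nothing to run or configure
collection = client.create_collection("docs")
collection.add(
    ids=["doc-1", "doc-2"],
    documents=["uv replaces pip and virtualenv.", "LangSmith traces every chain call."],
)
results = collection.query(query_texts=["how do I trace calls?"], n_results=1)
print(results["documents"])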
Observability (Non-Negotiable)
uv add langsmith
# .env file
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=your_key_here
LANGCHAIN_PROJECT=my-ai-project
# Now every LangChain call is automatically traced
# View at: smith.langchain.com
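LangChain calls are traced automatically. For code outside the framework, the langsmith decorator does the same job; a minimal sketch with a hypothetical summarize helper:
from langsmith import traceable
from langchain_anthropic import ChatAnthropic
llm = ChatAnthropic(model="claude-sonnet-4-6")
@traceable  # inputs, outputs, and latency land in the LANGCHAIN_PROJECT set above
def summarize(text: str) -> str:
    return llm.invoke(f"Summarize in one sentence: {text}").content
print(summarize("LangSmith records every model call as a trace."))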
IDE Setup
VS Code Extensions (Must-Have)
- GitHub Copilot — baseline AI code completion
- Python (Microsoft) — language server, debugging, Jupyter
- Even Better TOML — pyproject.toml editing
- REST Client — test API endpoints directly in the editor
- GitLens — enhanced git history and blame
Alternative: Cursor IDE — a VS Code fork with deeper AI integration: tab completions, agent mode, and inline chat. Currently one of the fastest AI-assisted coding experiences available.
Environment Variables Pattern
# .env file (add it to .gitignore; never commit keys)
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
LANGCHAIN_API_KEY=ls__...
POSTGRES_URL=postgresql://...
# Load in Python
from dotenv import load_dotenv
load_dotenv()  # reads .env from the working directory into os.environ
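One habit worth adding: read required keys through os.environ so a missing key fails at startup rather than mid-request. A small sketch:
import os
from dotenv import load_dotenv
load_dotenv()
# KeyError at startup is easier to debug than an auth failure deep inside a chain
anthropic_key = os.environ["ANTHROPIC_API_KEY"]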
Testing AI Applications
Testing LLM applications requires a different approach from traditional software testing:
- Unit tests: Mock LLM calls for fast CI. Test tool logic, routing logic, and output parsing independently (see the sketch after this list).
- Eval suite: 20–50 real examples with expected outputs. Run against every version before deploying.
- LangSmith evals: Run evaluations at scale using LLM-as-judge for subjective quality metrics.
- Adversarial testing: Inputs designed to break the agent — test edge cases, jailbreaks, and unexpected inputs.
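As a concrete version of the unit-test bullet above, here is a pytest sketch that exercises agent_node from the LangGraph example without a network call; the my_agent module name is hypothetical:
from unittest.mock import MagicMock
import my_agent  # hypothetical module holding the `llm` and `agent_node` defined earlier
def test_agent_node_appends_response(monkeypatch):
    # Stand in for ChatAnthropic: the node only needs an object with .invoke()
    fake_llm = MagicMock()
    fake_llm.invoke.return_value = "stubbed answer"
    monkeypatch.setattr(my_agent, "llm", fake_llm)
    result = my_agent.agent_node({"messages": ["What is our refund policy?"]})
    assert result["messages"][-1] == "stubbed answer"
    fake_llm.invoke.assert_called_once()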
The fastest way to improve an agent: Add tracing (LangSmith) on day 1. Review the traces every morning for the first two weeks. The traces will show you exactly where the agent fails and what to fix. This beats any amount of offline testing.