Andrej Karpathy's Four Principles for Better LLM Coding
... PLUS: Give Your AI Agent Full Web Access
In today’s newsletter:
Bright Data MCP: One Server to Give Your AI Agent Full Web Access
Andrej Karpathy: Four Principles for Better LLM Coding
Hermes: The AI Agent That Actually Gets Better at Helping You
Reading time: 5 minutes.
AI agents can reason, plan, and execute. But the moment they need to pull live information from the web, things fall apart fast.
Rate limits kick in. IP addresses get flagged. CAPTCHAs interrupt execution mid-task. Geo-restrictions block entire regions of content. The agent stalls, and the workflow breaks.
Bright Data solves this by giving AI agents reliable access to the web. It handles rate limits, CAPTCHAs, IP blocks, and geo-restrictions behind the scenes, so agents can consistently access live websites and data without workflows breaking mid-task.
The Tools Inside the Bright Data MCP Server
The Bright Data MCP Server exposes its capabilities as named tools that agents call directly through the MCP interface. These tools are organized into groups based on the type of data or task they support.

How It Fits Into an Agent Workflow
An agent monitoring competitor pricing on Amazon can call the web_data_amazon_product tool directly. The Bright Data MCP Server handles rate limits, CAPTCHAs, IP blocks, and returns structured product data.
The same pattern applies to any web-dependent task. The agent requests the data, and the server handles access and retrieval behind the scenes.
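This request/response pattern can be sketched in a few lines. Here `call_tool` stands in for your MCP client's tool-call method; the tool name comes from the article, but the response fields (`title`, `price`) are illustrative assumptions, not Bright Data's documented schema:

```python
# Sketch of the pricing-monitor pattern: the agent asks for data by tool name
# and gets structured output back, with access handled server-side.
def price_dropped(call_tool, product_url: str, threshold: float) -> bool:
    """Return True if the competitor's listed price is below our threshold."""
    product = call_tool("web_data_amazon_product", {"url": product_url})
    return product["price"] < threshold

# Stubbed client for illustration; a real agent would pass its MCP session's
# tool-call method instead.
fake_call_tool = lambda name, args: {"title": "Widget", "price": 18.99}
print(price_dropped(fake_call_tool, "https://www.amazon.com/dp/B000TEST00", 20.0))  # True
```

The agent's own logic stays this simple because everything hard (proxies, CAPTCHAs, rendering) lives behind the tool call.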
Why this matters for agent builders
Most real-world agent workflows touch the web at some point:
Research agents need live search results.
RAG pipelines need fresh source material.
Monitoring agents need consistent access to sites that actively block bots.
All of these break down without infrastructure that can reliably handle the open web at scale.
Bright Data MCP Server sits between your agent and the web and makes that layer reliable, so you can focus on building agent logic instead of debugging why a scraper failed on a JavaScript-rendered page at 2am.
Get started on Claude Desktop
Go to Settings, then Connectors, then Add Custom Connector. Name it Bright Data Web and paste this as the URL:

https://mcp.brightdata.com/mcp?token=YOUR_API_TOKEN

Click Add, and Claude now has full web access.
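For MCP clients that read a JSON config file instead of a settings UI, a roughly equivalent entry might look like the sketch below. The exact schema varies by client, so treat the field names as an assumption and check your client's docs:

```json
{
  "mcpServers": {
    "bright-data-web": {
      "type": "http",
      "url": "https://mcp.brightdata.com/mcp?token=YOUR_API_TOKEN"
    }
  }
}
```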
Andrej Karpathy spent time documenting the ways LLMs fail when writing code. This repo turns those observations into a single CLAUDE.md file that changes how tools like Claude Code behave.
The Problem Karpathy Identified
Karpathy's diagnosis is simple: LLMs are often overconfident. They make assumptions silently, overengineer solutions, edit code they do not understand, and keep going even when they are confused.
He highlighted four recurring failure modes:
The model guesses instead of asking clarifying questions
It writes bloated abstractions and unnecessary features
It edits unrelated code, comments, or formatting as side effects
It works toward vague instructions rather than verifiable goals
Those problems show up in real workflows all the time. Ask an LLM to "fix the bug" and it may rewrite half the file. Ask it to "add flexibility" and it may create a new configuration system no one asked for. Karpathy's point was that the issue is not capability, but lack of constraints.
Four Principles-Based Solutions
Think Before Coding: Do not assume. State assumptions explicitly. Present multiple interpretations when ambiguity exists. Push back when a simpler approach exists. Stop when confused and ask for clarification.
Simplicity First: Minimum code that solves the problem. No features beyond what was asked. No abstractions for single-use code. If 200 lines could be 50, rewrite it.
Surgical Changes: Touch only what you must. Do not improve adjacent code or refactor things that are not broken. Match existing style. Every changed line should trace directly to the request.
Goal-Driven Execution: Define success criteria and loop until verified. Transform "fix the bug" into "write a test that reproduces it, then make it pass." Strong success criteria let Claude loop independently.
The key insight from Karpathy: LLMs are exceptionally good at looping until they meet specific goals. Do not tell it what to do. Give it success criteria and watch it go.
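To make that concrete, here is one way to turn a vague request into a verifiable goal. The `slugify` function and its spec are hypothetical; the point is that the test, not the prose, defines success:

```python
import re

def slugify(title: str) -> str:
    # Collapse any run of whitespace into a single hyphen.
    return re.sub(r"\s+", "-", title.strip().lower())

def test_slugify_collapses_repeated_spaces():
    # A success criterion the model can loop against until it passes.
    assert slugify("Hello   World") == "hello-world"

test_slugify_collapses_repeated_spaces()
```

"Make slugify handle repeated spaces" is debatable; a passing test is not, which is exactly what lets the model iterate on its own.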
You can add the CLAUDE.md file directly to your project root or install the full repo as a Claude Code plugin.
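If you would rather write your own version than install the repo, a minimal CLAUDE.md along these lines captures the four principles. This is a sketch, not the repo's actual file:

```markdown
# Working rules

- **Think before coding.** State assumptions explicitly; if the request is
  ambiguous, present the interpretations and ask before writing code.
- **Simplicity first.** Write the minimum code that solves the problem; no
  extra features, no abstractions for single-use code.
- **Surgical changes.** Touch only what the request requires; match the
  existing style and do not refactor adjacent code.
- **Goal-driven execution.** Agree on a verifiable success criterion
  (ideally a failing test), then loop until it passes.
```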
Why This Works
The repo is not really teaching the model new coding skills. It is giving it guardrails.
By packaging these principles into a reusable CLAUDE.md, the repo turns good engineering habits into persistent behavior. This leads to fewer unnecessary edits, fewer rewrites caused by overengineering, and more clarifying questions before the model starts coding.

Hermes Agent is a self-improving AI agent from Nous Research that builds skills from your work, improves them over time, and remembers across sessions.
Most AI agents reset every conversation. You teach them your codebase structure, they forget. You show them your deployment workflow, gone. You explain your preferences, erased.
This happens because most agents treat memory as an afterthought. They save conversation history but do not actually learn from it. They do not build skills or improve their own behavior.
How Hermes Learns and Improves
Hermes fixes this with a closed learning loop. After you complete a complex task, it autonomously creates a skill for it. That skill becomes permanent. Next time a similar task comes up, it uses the skill and improves it based on what worked.
It has a memory system with periodic nudges that prompt it to persist knowledge. Full-text (FTS5) session search with LLM summarisation recalls past conversations, and Honcho dialectic user modelling builds a deepening profile of who you are over time.
Key Capabilities
Works with any model (OpenRouter, GLM, Kimi, MiniMax, OpenAI, or your own endpoint)
Model switching without code changes
Multi-platform access (Telegram, Discord, Slack, WhatsApp, Signal, CLI)
Voice memo transcription and conversation continuity across platforms
Built-in cron scheduler for automations
Delegates and parallelises work by spawning isolated subagents
That’s all for today. Thank you for reading today’s edition. See you in the next issue with more AI Engineering insights.
PS: We curate this AI Engineering content for free, and your support means everything. If you find value in what you read, consider sharing it with a friend or two.
Your feedback is valuable: If there’s a topic you’re stuck on or curious about, reply to this email. We’re building this for you, and your feedback helps shape what we send.
WORK WITH US
Looking to promote your company, product, or service to 200K+ AI developers? Get in touch today by replying to this email.

