Train AI Agents with RL - Agent Lightning from Microsoft
PLUS: The Smol Training Playbook from Hugging Face
In today’s newsletter:
Agent Lightning – Train AI Agents with Reinforcement Learning
Build AI Agents Without Writing a Single Line of Code
The Smol Training Playbook – Train LLMs End-to-End
Reading time: 3 minutes.
Microsoft released Agent Lightning, an open-source Python framework that lets you train and improve AI agents without rewriting your existing logic.
It works with existing setups such as LangChain, AutoGen, and the OpenAI Agents SDK, adding a training layer that helps agents learn from experience and feedback.
Traditional agent frameworks could only execute workflows. They could not improve over time. Enhancing performance required manual prompt tuning or full retraining.
Agent Lightning changes that by introducing a lightweight training layer on top of your current agent. You can define reward functions, capture execution traces, and use reinforcement learning to make your agent better with each iteration.
Key Highlights:
Compatible with LangChain, AutoGen, and other agent frameworks
Adds a training loop with minimal code changes
Supports reinforcement learning, prompt tuning, and supervised fine-tuning
Automatically captures prompts, actions, and rewards for optimization
Allows defining custom reward functions for domain-specific tuning
It’s 100% open source.
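To make the reward idea concrete, here is a minimal sketch of a custom reward function scored over a captured execution trace. The trace structure and the `compute_reward` name are illustrative assumptions for this newsletter, not Agent Lightning's actual API; consult the project's documentation for the real interfaces.

```python
# Illustrative sketch only: the trace format and function name below are
# assumptions, not Agent Lightning's actual API.

def compute_reward(trace, expected_answer):
    """Score one agent trajectory: 1.0 for a correct final answer,
    minus a small per-step penalty to favor shorter solutions."""
    final_output = trace[-1]["output"]
    correct = 1.0 if expected_answer.lower() in final_output.lower() else 0.0
    step_penalty = 0.01 * len(trace)
    return max(correct - step_penalty, 0.0)

# A captured trace: each step records the prompt, the action taken,
# and the model's output.
trace = [
    {"prompt": "What is 2 + 2?", "action": "calculate", "output": "4"},
]
reward = compute_reward(trace, expected_answer="4")
```

A reward function like this is what turns a static agent into a trainable one: the framework collects (prompt, action, reward) tuples from runs and feeds them to an RL optimizer, so domain-specific scoring logic is the main thing you write.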
LangChain released LangSmith Agent Builder, a no-code framework that lets anyone create and run AI agents without writing a single line of code.
It’s built for developers who want autonomous agents that can connect to tools, run multi-step workflows, and adapt over time.
Unlike visual workflow tools like n8n or OpenAI’s Agent Builder, it doesn’t rely on fixed paths. The LLM dynamically generates the agent’s prompts, tools, and triggers from a short natural language description.
It also features adaptive memory that lets the agent update its own instructions based on user feedback or edge cases for future runs.
Key Features:
Conversational Setup – Guided agent creation with no prompt engineering needed, auto-generating prompts, tool connections, and triggers.
Adaptive Memory – Self-updating instructions and tool setups based on your feedback, eliminating manual prompt edits.
Tool Integration – Securely connect services like Gmail, Slack, Linear, and LinkedIn through built-in OAuth and MCP support with permission control.
Agent Inbox – A central monitoring hub showing agent status (idle, busy, errored) and notifications for required attention.
Hugging Face released the Smol Training Playbook, a complete guide to training Large Language Models from scratch.
It’s designed for developers who want to understand every stage of the training pipeline, from dataset creation to distributed GPU runs and post-training optimization.
The playbook completes the trilogy that began with FineWeb for dataset building and Ultrascale for large-scale infrastructure, making it a practical reference for anyone training compact language models efficiently.
Here’s what you’ll learn:
Prepare and curate datasets for model pretraining
Configure and run distributed training across multi-GPU clusters
Tune hyperparameters systematically for optimal performance
Apply post-training methods to enhance quality
Log, compare, and reproduce experiments for reliable evaluation
Manage scaling and efficiency trade-offs in real training environments
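As one example of the hyperparameter-tuning stage above, here is a sketch of a linear-warmup-plus-cosine-decay learning-rate schedule, a common choice in LLM pretraining. The function name and the specific values (`max_lr`, `warmup_steps`, and so on) are illustrative assumptions, not taken from the playbook itself.

```python
import math

def lr_schedule(step, max_lr=3e-4, min_lr=3e-5,
                warmup_steps=1_000, total_steps=100_000):
    """Linear warmup to max_lr, then cosine decay to min_lr.
    Values here are illustrative, not from the playbook."""
    if step < warmup_steps:
        # Ramp up linearly so early, high-variance gradients
        # don't destabilize training.
        return max_lr * (step + 1) / warmup_steps
    # Cosine-decay the remaining steps down to the floor min_lr.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

Schedules like this interact with batch size and total token budget, which is exactly the kind of trade-off the playbook walks through when scaling runs up or down.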
That’s a Wrap
That’s all for today. Thank you for reading today’s edition. See you in the next issue with more AI Engineering insights.
PS: We curate this AI Engineering content for free, and your support means everything. If you find value in what you read, consider sharing it with a friend or two.
Your feedback is valuable: If there’s a topic you’re stuck on or curious about, reply to this email. We’re building this for you, and your feedback helps shape what we send.
WORK WITH US
Looking to promote your company, product, or service to 150K+ AI developers? Get in touch today by replying to this email.