Build and Deploy LLM agents just using natural language!


.. PLUS: Transformers & LLMs cheatsheet from Stanford

Sumanth P
June 20, 2025

In today’s newsletter:

Build and deploy LLM agents just using natural language
DeepEval - Pytest for LLM applications
Transformers & LLMs cheatsheet from Stanford's CME 295

Reading time: 3 minutes.

AutoAgent

AutoAgent is a fully automated, zero-code LLM agent framework that lets you create and deploy LLM agents using just natural language.

Key Features:

Agentic-RAG – Built-in self-managing vector database, which the project claims outperforms LangChain.
Zero-Code Agent & Workflow Creation – Just use natural language, no coding needed.
Universal LLM Support – Works with OpenAI, Anthropic, DeepSeek, vLLM, Hugging Face & more.
Flexible Interaction – Supports both function-calling & ReAct modes.
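
To make the difference between the two interaction modes concrete, here is a minimal sketch in plain Python. It is not AutoAgent's code or wire format: the `TOOLS` registry, the helper names, and the `Action: name[json args]` text convention are all assumptions for illustration. In function-calling mode the model returns a structured tool call; in ReAct mode the call has to be parsed out of free text.

```python
import json
import re

# Hypothetical tool registry; a real framework would register many tools.
TOOLS = {"add": lambda a, b: a + b}


def parse_function_call(message):
    """Function-calling mode: the model's reply is a structured JSON tool call."""
    call = json.loads(message)
    return call["name"], call["arguments"]


def parse_react_step(text):
    """ReAct mode: the tool call is embedded in free text as 'Action: name[json args]'."""
    match = re.search(r"Action:\s*(\w+)\[(.*)\]", text)
    if match is None:
        raise ValueError("no Action line found in model output")
    return match.group(1), json.loads(match.group(2))


def run_tool(name, arguments):
    """Look up the tool by name and call it with keyword arguments."""
    return TOOLS[name](**arguments)


# Both replies below are made-up model outputs that request the same tool call.
structured_reply = '{"name": "add", "arguments": {"a": 2, "b": 3}}'
react_reply = 'Thought: I need to add the numbers.\nAction: add[{"a": 2, "b": 3}]'

print(run_tool(*parse_function_call(structured_reply)))
print(run_tool(*parse_react_step(react_reply)))
```

Either path ends in the same `run_tool` call; the modes differ only in how the model expresses its intent, which is why a framework can support both behind one tool registry.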

It’s 100% open source.

👉 Link to the GitHub Repo

DeepEval

DeepEval is an open-source framework that makes it easy to evaluate, test, and debug LLM applications.

It incorporates the latest research to evaluate LLM outputs using metrics such as G-Eval, answer relevancy, faithfulness and tool correctness.

Key Features:

CI/CD + pytest support for automated testing
20+ evals for RAG, agents, and chatbots (G-Eval, faithfulness, bias)
Custom metrics and local runs
Generate synthetic data from your own knowledge base
Test for 40+ safety issues (bias, toxicity, prompt injections)
Benchmark LLMs with MMLU, TruthfulQA, HumanEval, and more
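
As a rough illustration of the pytest-style workflow, here is a minimal sketch in plain Python. It does not use DeepEval's API: `tool_correctness` and `assert_metric` are hypothetical helpers, the tool names are made up, and the scoring rule (the fraction of expected tools that the agent actually called) is an assumption chosen for simplicity.

```python
def tool_correctness(expected_tools, called_tools):
    """Hypothetical metric: fraction of expected tools the agent actually called."""
    expected = set(expected_tools)
    if not expected:
        return 1.0  # nothing was required, so nothing can be missing
    return len(expected & set(called_tools)) / len(expected)


def assert_metric(score, threshold):
    """Fail like a normal test assertion when the score is below the threshold."""
    assert score >= threshold, f"score {score:.2f} is below threshold {threshold:.2f}"


def test_agent_calls_expected_tools():
    # pytest collects functions named test_*; the inputs here are illustrative.
    score = tool_correctness(
        expected_tools=["search", "calculator"],
        called_tools=["search", "calculator", "weather"],
    )
    assert_metric(score, threshold=0.8)
```

Saved in a file such as `test_agent.py`, this runs with a plain `pytest` invocation, which is also what lets checks like this slot into a CI/CD pipeline.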

👉 Link to the GitHub Repo

Transformers & LLMs cheatsheet

Stanford's CME 295 course has published a set of cheatsheets for learning Transformers and LLMs.

These concise, high-quality cheatsheets cover:

Transformers: self-attention, architecture, variants, optimization techniques (sparse attention, low-rank attention, flash attention)
LLMs: prompting, finetuning (SFT, LoRA), preference tuning, optimization techniques (mixture of experts, distillation, quantization)
Applications: LLM-as-a-judge, RAG, agents, reasoning models (train-time and test-time scaling from DeepSeek-R1)
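
For readers new to the first topic on that list, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is not taken from the cheatsheets, and it assumes the simplest setting: a single head, no masking, and no learned projection matrices, so the queries, keys, and values are passed in directly.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V, one query row at a time."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output is the attention-weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Two identical keys receive equal weight, so the output is the mean of the values.
print(self_attention([[1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]], [[2.0, 0.0], [4.0, 0.0]]))
```

The variants named in the list (sparse, low-rank, and flash attention) change how this same computation is approximated or scheduled, not what it computes at its core.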

👉 Link to the GitHub Repo

That’s a Wrap

That’s all for today. Thank you for reading today’s edition. See you in the next issue with more AI Engineering insights.

PS: We curate this AI Engineering content for free, and your support means everything. If you find value in what you read, consider sharing it with a friend or two.

Your feedback is valuable: If there’s a topic you’re stuck on or curious about, reply to this email. We’re building this for you, and your feedback helps shape what we send.
