[Hands-on] Build a PDF RAG Agent
PLUS: Extract structured data and insights from documents
In today’s newsletter:
[Hands-on] PDF RAG Agent using Milvus and Agno
ContextGem: Extract structured data and insights from documents with just a few lines of Python code
Reading time: 3 minutes.
PDF RAG Agent using Milvus and Agno
Let’s build a PDF RAG agent that understands your PDF documents and answers your questions in real time.
The agent pulls context from the indexed PDF files stored in the vector database. If needed, it can also fall back to web search to provide more relevant answers.
Tech Stack: Agno for agent orchestration, Milvus as the vector database, OpenAI's gpt-4o-mini as the LLM, and DuckDuckGo for web-search fallback.
Step 1: Install Milvus
We use Milvus, an open-source vector database, to store document embeddings.
Milvus supports flexible deployments via Kubernetes, Docker, or Lite mode via pip install.
Let's run the standalone_embed.sh script to launch Milvus in Docker standalone mode.
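A sketch of the launch commands, assuming the script location published in the Milvus docs (verify the URL against the current documentation):

```shell
# Download the standalone install script from the Milvus repository
curl -sfL https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh -o standalone_embed.sh

# Launch Milvus in Docker standalone mode; the server listens on port 19530
bash standalone_embed.sh start
```

Once the container is up, Milvus is reachable at http://localhost:19530.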

Step 2: Set Up the Vector Database
Use Milvus to store your PDF embeddings and support semantic search:
First, connect to Milvus to store and search vector embeddings.
Agno simplifies this with an abstraction for the Milvus client.
Provide the collection name, embedding model, and server info to create the collection, enabling semantic search in our RAG Agent.

Step 3: Building the PDF RAG Agent
Next, let's build the PDF RAG agent.
The get_rag_agent function sets up an Agno Agent with the gpt-4o-mini model, a PDFKnowledgeBase, and the DuckDuckGo tool for web search.
This design lets the agent try the PDF knowledge base first and fall back to web search if needed.

Step 4: Loading Knowledge & Initializing the Agent
When a PDF is processed, we instantiate a PDFKnowledgeBase with the file path and our Milvus vector_db.
We then pass it into the get_rag_agent function to initialize a fully ready-to-chat agent, which we store as a session variable.

Step 5: Define Agent Interaction
Finally, define the chat logic. Once the user prompt is passed to the agent, it plans and takes actions.
It first queries the vector database. If no relevant results are returned, it falls back to DuckDuckGo and responds with the final output.
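The interaction loop can be sketched as below, assuming Agno's `Agent.run` returns a response object with a `.content` field:

```python
def chat(agent, prompt):
    """Send a user prompt to the agent and return the final answer.

    The agent plans its own tool use: it queries the Milvus-backed
    knowledge base first and falls back to DuckDuckGo web search
    if nothing relevant is found.
    """
    response = agent.run(prompt)  # returns a run-response object
    return response.content

# Streaming variant for an interactive chat UI:
# agent.print_response("What does section 3 of the PDF cover?", stream=True)
```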

Milvus is the vector database powering our vector search.
It's open-source, incredibly fast, and scalable. It can store and query millions (or even billions) of vectors efficiently.
You can find the entire code for the app in this GitHub repo → (don’t forget to star the repo)
Extract structured data and insights from documents with just a few lines of Python code!
ContextGem is a free, open-source LLM framework that makes extracting structured data and insights from documents radically easier, with minimal code.
Most LLM frameworks require a lot of boilerplate just to extract basic information, which slows down development and adds unnecessary complexity.
ContextGem solves this with a simple and flexible API that handles all the heavy lifting for you.
Key Features:
Extract structured data and insights from documents (containing text and images)
Identify and analyze key aspects (topics, categories, clauses) within documents
Extract specific concepts (entities, facts, conclusions, assessments) from documents
Build complex, multi-LLM extraction workflows through a simple, intuitive API
Create multi-level extraction pipelines (aspects containing concepts, hierarchical aspects, etc.)
It’s 100% Open Source.

That’s a Wrap
That’s all for today. Thank you for reading today’s edition. See you in the next issue with more AI Engineering insights.
PS: We curate this AI Engineering content for free, and your support means everything. If you find value in what you read, consider sharing it with a friend or two.
Your feedback is valuable: If there’s a topic you’re stuck on or curious about, reply to this email. We’re building this for you, and your feedback helps shape what we send.
WORK WITH US
Looking to promote your company, product, or service to 100K+ AI developers? Get in touch today by replying to this email.