Google Released Python Library for Data Extraction

PLUS: AI Engineering Toolkit

Sumanth P
August 15, 2025

In today’s newsletter:

LangExtract – Google’s new Python library for data extraction
AI Engineering Toolkit – 100+ curated libraries for building AI systems
MCP Toolbox – Build AI agents with database access

Reading time: 3 minutes.

LangExtract

Google released a Python library for data extraction!

LangExtract is a Python library that extracts structured information from unstructured text, with precise source grounding and interactive visualization.

Key Features:

Precise source grounding – maps each extraction to its exact location in the source text for easy traceability and verification
Reliable structured outputs – delivers consistent schema-based results using few-shot examples with supported models like Gemini
Optimized for long documents – uses chunking, parallel processing, and multi-pass extraction for high recall
Interactive visualization – generates an HTML file to review extracted entities in their original context
Adaptable to any domain – handles any extraction task from just a few examples, no fine-tuning needed
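
To make the source grounding and long-document ideas above concrete, here is a minimal pure-Python sketch. It does not use LangExtract or its API: the `Extraction` class, `chunk_text`, `extract_from_chunk`, and `extract` are hypothetical names, and a regex stands in for the model call. It only illustrates the general pattern of splitting a document into overlapping chunks, processing them in parallel, and mapping every result back to character offsets in the original text.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# Stand-in "model" (an assumption for this sketch): finds dosages such as "500 mg".
DOSE = re.compile(r"\b\d+\s?mg\b")


@dataclass(frozen=True)
class Extraction:
    label: str  # kind of entity, e.g. "dosage"
    text: str   # exact substring copied from the source
    start: int  # character offset in the ORIGINAL document
    end: int    # exclusive end offset


def chunk_text(text: str, size: int, overlap: int) -> list[tuple[int, str]]:
    """Split text into overlapping windows, keeping each window's start offset."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    step = size - overlap
    return [(i, text[i:i + size]) for i in range(0, max(len(text) - overlap, 1), step)]


def extract_from_chunk(offset: int, chunk: str, doc_len: int) -> list[Extraction]:
    """Run the stand-in extractor on one chunk and shift spans to document offsets."""
    found = []
    for m in DOSE.finditer(chunk):
        start, end = offset + m.start(), offset + m.end()
        cut_left = m.start() == 0 and offset > 0
        cut_right = m.end() == len(chunk) and end < doc_len
        if cut_left or cut_right:
            # The match touches an interior chunk edge and may be truncated.
            # This sketch assumes the overlap is longer than any extraction,
            # so a neighbouring chunk sees the same span whole.
            continue
        found.append(Extraction("dosage", m.group(), start, end))
    return found


def extract(text: str, size: int = 40, overlap: int = 10, workers: int = 4) -> list[Extraction]:
    """Chunk the text, extract from chunks in parallel, then merge the results."""
    chunks = chunk_text(text, size, overlap)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_chunk = list(
            pool.map(lambda c: extract_from_chunk(c[0], c[1], len(text)), chunks)
        )
    # Overlapping chunks can report the same span twice; a set removes duplicates.
    unique = {e for found in per_chunk for e in found}
    return sorted(unique, key=lambda e: e.start)


SAMPLE = (
    "Patient took 500 mg of ibuprofen in the morning "
    "and later 250 mg of aspirin before bed at night."
)
```

Calling `extract(SAMPLE)` returns one `Extraction` per dosage, and slicing `SAMPLE[e.start:e.end]` reproduces `e.text` exactly, which is what makes each result traceable to its place in the source. Multi-pass extraction and the HTML review step are out of scope for this sketch.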

It’s 100% open source.

👉 Check out the GitHub Repo

AI Engineering Toolkit

A curated list of 100+ LLM libraries for building, deploying, evaluating, and scaling AI systems.

We’ve curated this repo to help you pick the right tools and frameworks for building LLM applications.

Categories of LLM Libraries include:

Tooling for AI Engineers: vector databases, orchestration and workflows, RAG, evaluation and testing, model management, data collection and web scraping
Agent Frameworks: frameworks for building and running AI agents
LLM Development, Inference & Safety: libraries for training, fine-tuning, efficient model execution, and ensuring secure, reliable operation
Infrastructure & Deployment: local development and serving, production serving, inference platforms

👉 Check out the GitHub Repo (don’t forget to star it)

Build AI Agents with Database Access

Google released MCP Toolbox for Databases, an open-source MCP (Model Context Protocol) server that helps AI agents interact with SQL databases safely and efficiently.

It simplifies tool development by handling infrastructure-level concerns like connection pooling, authentication, and observability, so you can focus on defining what the agent should do.

Key Features:

Fast development: Define tools declaratively and integrate them in under 10 lines of code
Improved performance: Built-in connection pooling and efficient query execution
Secure by default: Integrated authentication for safer data access
Built-in observability: Metrics and tracing with OpenTelemetry
Multi-database support: Works with PostgreSQL, MySQL, Cloud SQL, AlloyDB, and more
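
The combination of declarative tools and built-in connection pooling can be illustrated with a small, self-contained Python sketch. This is not the MCP Toolbox API or its configuration format: the `TOOLS` dictionary, the `Pool` class, `call_tool`, and the `hotels` table are all hypothetical, and SQLite from the standard library stands in for a production database. The point is the shape of the idea: a tool is declared as data (a name, a description, a parameterized SQL statement), and a shared layer handles connections and argument checking.

```python
import os
import sqlite3
import tempfile
from contextlib import contextmanager
from queue import Queue

# Hypothetical declarative tool definitions: the agent only ever sees these
# names and parameters, never free-form SQL.
TOOLS = {
    "search_hotels_by_city": {
        "description": "List hotel names in a given city.",
        "statement": "SELECT name FROM hotels WHERE city = :city ORDER BY name",
        "parameters": ["city"],
    },
}


class Pool:
    """Tiny fixed-size pool: connections are opened once and then reused."""

    def __init__(self, path: str, size: int = 2):
        self._conns: Queue = Queue(maxsize=size)
        for _ in range(size):
            self._conns.put(sqlite3.connect(path, check_same_thread=False))

    @contextmanager
    def connection(self):
        conn = self._conns.get()  # blocks while every connection is in use
        try:
            yield conn
        finally:
            self._conns.put(conn)  # hand the connection back for reuse

    def close(self) -> None:
        while not self._conns.empty():
            self._conns.get().close()


def call_tool(pool: Pool, name: str, **args) -> list[tuple]:
    """Validate arguments against the declaration, then run the bound statement."""
    spec = TOOLS[name]
    if set(args) != set(spec["parameters"]):
        raise ValueError(f"{name} expects exactly {spec['parameters']}")
    with pool.connection() as conn:
        # Named placeholders keep caller input out of the SQL text itself.
        return conn.execute(spec["statement"], args).fetchall()


def seed_demo_database(path: str) -> None:
    """Create a small example table so the sketch has something to query."""
    conn = sqlite3.connect(path)
    with conn:  # commits on success
        conn.execute("CREATE TABLE hotels (name TEXT, city TEXT)")
        conn.executemany(
            "INSERT INTO hotels VALUES (?, ?)",
            [("Hilton Basel", "Basel"), ("Hyatt Regency", "Basel"), ("Best Western", "Bern")],
        )
    conn.close()


def demo() -> list[tuple]:
    """Seed a temporary database and invoke the declared tool once."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "hotels.db")
        seed_demo_database(path)
        pool = Pool(path, size=2)
        try:
            return call_tool(pool, "search_hotels_by_city", city="Basel")
        finally:
            pool.close()
```

Because the SQL lives in the declaration and inputs are bound as parameters, the caller can only run the statements that were defined up front, which is one way to read the "safely" in the description above.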

👉 Check out the GitHub Repo

That’s a Wrap

That’s all for today. Thank you for reading, and see you in the next issue with more AI Engineering insights.

PS: We curate this AI Engineering content for free, and your support means everything. If you find value in what you read, consider sharing it with a friend or two.

Your feedback is valuable: If there’s a topic you’re stuck on or curious about, reply to this email. We’re building this for you, and your feedback helps shape what we send.

WORK WITH US

Looking to promote your company, product, or service to 120K+ AI developers? Get in touch today by replying to this email.