Google Released Python Library for Data Extraction

PLUS: AI Engineering Toolkit

In today’s newsletter:

  • LangExtract – Google’s new Python library for data extraction

  • AI Engineering Toolkit – 100+ curated libraries for building AI systems

  • MCP Toolbox – Build AI agents with database access

Reading time: 3 minutes.

Google released a Python library for data extraction!

LangExtract is a Python library that extracts structured information from unstructured text documents, with precise source grounding and interactive visualization.

Key Features:

  • Precise source grounding – maps each extraction to its exact location in the source text for easy traceability and verification

  • Reliable structured outputs – delivers consistent schema-based results using few-shot examples with supported models like Gemini

  • Optimized for long documents – uses chunking, parallel processing, and multi-pass extraction for high recall

  • Interactive visualization – generates an HTML file to review extracted entities in their original context

  • Adaptable across domains – define a new extraction task with just a few examples, no fine-tuning needed

It’s 100% open source.
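
To make the source grounding and chunking features listed above more concrete, here is a minimal plain-Python sketch of the underlying idea. It does not use LangExtract or reproduce its API. The names `GroundedExtraction`, `ground`, and `chunk_with_offsets` are hypothetical, and matching by exact substring is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass(frozen=True)
class GroundedExtraction:
    # Hypothetical record type for this sketch, not a LangExtract class.
    label: str
    text: str
    start: int  # inclusive character offset into the source
    end: int    # exclusive character offset into the source


def ground(source: str, label: str, extracted: str) -> Optional[GroundedExtraction]:
    """Return the first verbatim occurrence of `extracted` in `source`, or None."""
    start = source.find(extracted)
    if start == -1:
        # Not present verbatim, so it cannot be traced back to the source.
        return None
    return GroundedExtraction(label, extracted, start, start + len(extracted))


def chunk_with_offsets(source: str, size: int, overlap: int) -> Iterator[Tuple[int, str]]:
    """Yield (offset, chunk) pairs so chunk-local positions map back to the document."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must be in [0, size)")
    step = size - overlap
    for offset in range(0, len(source), step):
        yield offset, source[offset:offset + size]
        if offset + size >= len(source):
            break  # the last chunk already reaches the end of the text


source = "Patient was given 250 mg of amoxicillin twice daily."
hit = ground(source, "dosage", "250 mg")
# Slicing the source with the stored offsets recovers the extracted text.
assert hit is not None and source[hit.start:hit.end] == hit.text
```

Storing offsets next to each extraction is what makes a result checkable: a reviewer, or a visualization, can jump to the exact span instead of trusting the extracted string on its own.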

The AI Engineering Toolkit is a curated list of 100+ LLM libraries for building, deploying, evaluating, and scaling AI systems.

We’ve curated this repo to help you pick the right tools and frameworks for building LLM applications.

Categories of LLM Libraries include:

  • Tooling for AI Engineers: vector databases, orchestration and workflows, RAG, evaluation and testing, model management, data collection and web scraping

  • Agent Frameworks: frameworks for building and running AI agents

  • LLM Development, Inference & Safety: libraries for training, fine-tuning, efficient model execution, and ensuring secure, reliable operation

  • Infrastructure & Deployment: local development and serving, production serving, inference platforms

👉 Check out the GitHub repo (don’t forget to star it)

Google released MCP Toolbox for Databases, an open-source MCP server that helps AI agents interact with SQL databases safely and efficiently.

It simplifies tool development by handling infrastructure-level concerns like connection pooling, authentication, and observability, so you can focus on defining what the agent should do.

Key Features:

  • Fast development: Define tools declaratively and integrate them in under 10 lines of code

  • Improved performance: Built-in connection pooling and efficient query execution

  • Secure by default: Integrated authentication for safer data access

  • Built-in observability: Metrics and tracing with OpenTelemetry

  • Multi-database support: Works with PostgreSQL, MySQL, Cloud SQL, AlloyDB, and more
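
As a rough illustration of what a declaratively defined database tool can look like, the sketch below uses only Python's standard `sqlite3` module. It is not the MCP Toolbox API or its configuration format. `SqlTool`, `SEARCH_HOTELS`, `make_demo_db`, and the `hotels` table are hypothetical names invented for this example.

```python
from __future__ import annotations

import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class SqlTool:
    # Hypothetical declarative tool: the SQL is fixed up front, and the
    # agent only supplies values for the named parameters.
    name: str
    description: str
    statement: str
    parameters: tuple[str, ...]

    def invoke(self, conn: sqlite3.Connection, **kwargs: object) -> list[tuple]:
        if set(kwargs) != set(self.parameters):
            raise ValueError(f"{self.name} expects exactly {self.parameters}")
        # Named placeholders keep user-supplied values out of the SQL text.
        return conn.execute(self.statement, kwargs).fetchall()


SEARCH_HOTELS = SqlTool(
    name="search-hotels-by-city",
    description="List hotels located in the given city.",
    statement="SELECT name, city FROM hotels WHERE city = :city ORDER BY name",
    parameters=("city",),
)


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a few made-up rows for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hotels (name TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO hotels VALUES (?, ?)",
        [("Alpine Lodge", "Zurich"), ("Harbor Inn", "Basel"), ("Rhine View", "Basel")],
    )
    return conn


conn = make_demo_db()
rows = SEARCH_HOTELS.invoke(conn, city="Basel")
```

Fixing the statement in the tool definition means the agent chooses which tool to call and with what values, but never writes SQL itself. That is one common way to keep database access predictable.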

That’s a Wrap

That’s all for today. Thank you for reading, and see you in the next issue with more AI Engineering insights.

PS: We curate this AI Engineering content for free, and your support means everything. If you find value in what you read, consider sharing it with a friend or two.

Your feedback is valuable: If there’s a topic you’re stuck on or curious about, reply to this email. We’re building this for you, and your feedback helps shape what we send.

WORK WITH US

Looking to promote your company, product, or service to 120K+ AI developers? Get in touch today by replying to this email.