4 Open Source Libraries to Fine-tune LLMs

Fine-tune Qwen3, Llama 4, Gemma 3, Phi-4 & Mistral

In today’s newsletter:

  • 4 Open Source Libraries to Fine-tune LLMs

Reading time: 3 minutes.

Fine-tuning Open-Source LLMs

Fine-tuning large language models used to be slow and expensive.

Now, with recent open-source tools, it's become much easier: you no longer need large infrastructure or a big budget.

You can fine-tune models like Llama, Mistral, Gemma, and Qwen using modest resources.

Here are four libraries that are helping make that possible:

Unsloth lets you fine-tune models like Llama, Mistral, and Gemma on limited hardware, including a single GPU or a Colab notebook.

It supports adapter-based methods like LoRA and QLoRA, and avoids the need to merge weights or retrain from scratch.

This can be useful if you're experimenting locally or working with minimal compute resources.

Key Features:

  • Fast, memory-efficient fine-tuning for Llama, Mistral, and Gemma

  • Runs in Google Colab or local machines

  • Supports adapter-only fine-tuning to reduce memory usage

  • Frequently updated for the latest open-weight models
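
Here's a minimal sketch of what a QLoRA-style run looks like with Unsloth; the model ID and hyperparameters below are illustrative, and exact arguments can shift between versions:

```python
# Minimal QLoRA fine-tuning sketch with Unsloth (illustrative values).
from unsloth import FastLanguageModel

# Load a 4-bit quantized base model; small enough for a single consumer GPU.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # assumed model ID
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small fraction of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,          # LoRA rank
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
# From here, train with a standard trainer (e.g. trl's SFTTrainer).
```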

DeepSpeed is a training-optimization library for fine-tuning large language models that don't fit comfortably in GPU memory.

It supports techniques like ZeRO optimizer sharding, model parallelism, and gradient checkpointing to make better use of GPU memory, and it can run across multiple GPUs or machines.

It's useful if you're working with larger models in a high-compute setup.

Key Features:

  • Distributed training across GPUs or compute nodes

  • ZeRO optimizer for massive memory savings

  • Optimized for fast inference and large-scale training

  • Works well with HuggingFace and PyTorch-based models
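
As a sketch of how this looks in practice, you can hand a ZeRO config to the HuggingFace Trainer; the values below are illustrative, and the "auto" fields are filled in by the integration:

```python
# Sketch: enabling DeepSpeed ZeRO stage 2 through HuggingFace's Trainer.
from transformers import TrainingArguments

ds_config = {
    "train_micro_batch_size_per_gpu": "auto",  # resolved by the HF integration
    "gradient_accumulation_steps": "auto",
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,            # shard optimizer states and gradients across GPUs
        "overlap_comm": True,  # overlap gradient communication with backward pass
    },
}

training_args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=4,
    fp16=True,
    deepspeed=ds_config,  # Trainer handles deepspeed.initialize() internally
)
# Launch the training script with: deepspeed train.py
```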

Axolotl is a fine-tuning framework for language models that uses simple YAML configuration files.

It supports methods like LoRA and QLoRA, making it easier to fine-tune models such as Llama and Mistral without writing a lot of custom code.

It's a good option if you want to run experiments with less setup and boilerplate.

Key Features:

  • Supports PEFT techniques (LoRA, QLoRA)

  • Easy YAML-based configuration system

  • Built-in support for HuggingFace models

  • Optimized for ease of use and reproducibility
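
To give a feel for the workflow, here's a sketch of a QLoRA config; the keys follow Axolotl's published examples, but names and defaults can change between releases:

```yaml
# qlora.yml — illustrative Axolotl config for a QLoRA run
base_model: meta-llama/Llama-3.1-8B   # assumed model ID
load_in_4bit: true
adapter: qlora

lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true   # apply LoRA to all linear layers

datasets:
  - path: tatsu-lab/alpaca  # any HF dataset in a supported format
    type: alpaca

sequence_len: 2048
micro_batch_size: 2
gradient_accumulation_steps: 4
num_epochs: 1
learning_rate: 0.0002
output_dir: ./outputs/qlora-llama
```

You'd then launch it with something like `accelerate launch -m axolotl.cli.train qlora.yml`.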

LLaMA Factory supports fine-tuning and alignment of more than 100 LLMs and VLMs using either full fine-tuning or parameter-efficient methods.

It offers built-in support for alignment techniques like DPO (Direct Preference Optimization) and RLHF (Reinforcement Learning from Human Feedback).

LLaMA Factory is a solid choice for teams that need a modular CLI interface and support for advanced fine-tuning strategies.

Key Features:

  • Full fine-tuning and adapter-based PEFT methods

  • Alignment support: DPO, PPO, SFT, and RLHF

  • Covers LLMs and multimodal models (VLMs)

  • CLI-driven with support for distributed training
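
LLaMA Factory is also config-driven; a LoRA SFT run can be described roughly like this (keys mirror the project's example configs, with illustrative values):

```yaml
# llama3_lora_sft.yaml — illustrative LLaMA Factory config
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct  # assumed model ID
stage: sft               # swap for dpo/ppo for alignment runs
do_train: true
finetuning_type: lora
lora_target: all

dataset: alpaca_en_demo  # a demo dataset bundled with the repo
template: llama3
cutoff_len: 2048

per_device_train_batch_size: 2
gradient_accumulation_steps: 4
learning_rate: 0.0001
num_train_epochs: 1.0
output_dir: saves/llama3-8b/lora/sft
```

Then launch with `llamafactory-cli train llama3_lora_sft.yaml`; swapping `stage: sft` for `dpo` switches the same config style over to preference alignment.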


That’s a Wrap

That’s all for today. Thank you for reading. See you in the next issue with more AI Engineering insights.

PS: We curate this AI Engineering content for free, and your support means everything. If you find value in what you read, consider sharing it with a friend or two.

Your feedback is valuable: If there’s a topic you’re stuck on or curious about, reply to this email. We’re building this for you, and your feedback helps shape what we send.

WORK WITH US

Looking to promote your company, product, or service to 120K+ AI developers? Get in touch today by replying to this email.