pruna

Model Serving & MLOps
Python · Latest v0.3.3 · 1mo ago

Features

  • Accelerates inference with advanced optimization techniques
  • Reduces model size while preserving quality
  • Lowers computational costs for cheaper deployments
  • Decreases energy consumption for greener AI

Recent releases

v0.3.3 Breaking risk
Breaking changes
  • Minimum Python version bumped to 3.13
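
A minimal sketch of a pre-upgrade check, assuming the minimum version stated above. The helper name `meets_minimum` is hypothetical and not part of Pruna; it only compares the interpreter's major and minor version against the listed minimum.

```python
import sys

# Minimum interpreter version, taken from the breaking change listed above.
MINIMUM_PYTHON = (3, 13)


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM_PYTHON) -> bool:
    """Return True when the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not meets_minimum():
        found = ".".join(str(part) for part in sys.version_info[:3])
        print(f"Python {found} is older than the listed minimum 3.13")
```

Running this in the target environment before upgrading shows whether the interpreter needs to be updated first.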
Notable features
  • Add lm-eval to Pruna Metrics
  • DINO Score v3 implementation
  • Initial rapidata support and benchmark integration for PrunaDataModule with PartiPrompts
Full changelog

The juiciest bits 🧃

Algorithm & compatibility improvements ⚙️

  • feat: FA3-FP8-extension by @Marius-Graml in https://github.com/PrunaAI/pruna/pull/552
  • feat: extend moe model check to multimodal ones and add block quantization parameters by @llcnt in https://github.com/PrunaAI/pruna/pull/605
  • feat: moe kernel tuning by @llcnt in https://github.com/PrunaAI/pruna/pull/482

Lots of new tools for benchmarking and evaluation 📊

  • feat(evaluation): Add lm-eval to Pruna Metrics by @sky-2002 in https://github.com/PrunaAI/pruna/pull/380
  • feat(metrics): DINO Score v3 by @davidberenstein1957 in https://github.com/PrunaAI/pruna/pull/568
  • feat: add column_map support to collate functions by @zamal-db in https://github.com/PrunaAI/pruna/pull/561
  • feat: initial implementation for rapidata by @begumcig in https://github.com/PrunaAI/pruna/pull/581
  • feat: add benchmark support to PrunaDataModule and implement PartiPrompts by @davidberenstein1957 in https://github.com/PrunaAI/pruna/pull/502

Python 3.13 and PyTorch 2.11 support is here 🐍

  • build: bump python 3.13 by @gsprochette in https://github.com/PrunaAI/pruna/pull/624

Some bug fixing 🐞 and maintenance

  • fix(evaluation): replace bare raises with proper exceptions and add text_generation_quality request by @zamal-db in https://github.com/PrunaAI/pruna/pull/560
  • fix: protect lm-eval import to allow evaluation-agent import without extra by @gsprochette in https://github.com/PrunaAI/pruna/pull/586
  • fix(torchao): update imports of quantizer by @ParagEkbote in https://github.com/PrunaAI/pruna/pull/549
  • fix: wrap callable enum values with enum.member for python 3.13 by @gsprochette in https://github.com/PrunaAI/pruna/pull/583
  • fix: remove pruna-pro hook from pre-commit by @ParagEkbote in https://github.com/PrunaAI/pruna/pull/572
  • fix: cache handling in SmashConfig due to invalid path exception by @ParagEkbote in https://github.com/PrunaAI/pruna/pull/598
  • fix: pre-download sage_attention kernel before applying backend, remove pinned fa3 kernel version by @Marius-Graml in https://github.com/PrunaAI/pruna/pull/578
  • fix: pin peft >= 0.18.0, < 0.19.0 for torchao compatibility issues by @davidberenstein1957 in https://github.com/PrunaAI/pruna/pull/630

We’ve made a bunch of improvements to make installing, testing, and developing locally faster and more reliable:

  • ci: fix too many requests http error in the cpu tests by @begumcig in https://github.com/PrunaAI/pruna/pull/577
  • ci: add uv virtual environment cache to setup-uv-project action by @davidberenstein1957 in https://github.com/PrunaAI/pruna/pull/559
  • ci: separate extra installs by @begumcig in https://github.com/PrunaAI/pruna/pull/622
  • build: make index pypi default and pythonanywhere explicit and setup python through uv by @gsprochette in https://github.com/PrunaAI/pruna/pull/613
  • ci: restrict build to manual dispatch and version tags by @gsprochette in https://github.com/PrunaAI/pruna/pull/633
  • test: explicit combo names and switch stable fast fixture by @gsprochette in https://github.com/PrunaAI/pruna/pull/582

We also updated our PR template and README for smoother contributions:

  • docs: update readme cta by @sdiazlor in https://github.com/PrunaAI/pruna/pull/591
  • docs: updated PR template by @minettekaum in https://github.com/PrunaAI/pruna/pull/576
  • docs: added comment about vibe coded solutions to pr template by @minettekaum in https://github.com/PrunaAI/pruna/pull/606

New Faces in the Garden 👩‍🌾

  • @zamal-db made their first contribution in https://github.com/PrunaAI/pruna/pull/560 (and followed it up right away with a second one in #561)
  • @sky-2002 made their first contribution in https://github.com/PrunaAI/pruna/pull/380 and added the entire lm-eval harness to Pruna!

Full Changelog: https://github.com/PrunaAI/pruna/compare/v0.3.2...v0.3.3

v0.3.2 Breaking risk
⚠ Upgrade required
  • Pinned transformers<5.0.0 to maintain compatibility
  • Enforced PR title format via GitHub Actions workflow
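
A rough way to check an environment against the `transformers<5.0.0` pin above, using only the standard library. This is a sketch under simplifying assumptions: the helper names are hypothetical, only the leading numeric part of the version is compared, and pre-release or local version suffixes are ignored. For full PEP 440 semantics, `SpecifierSet` from the `packaging` library is the more robust choice.

```python
import re
from importlib import metadata
from typing import Optional


def release_tuple(version: str, width: int = 3) -> tuple:
    """Leading numeric release segment, zero-padded: '5.0' -> (5, 0, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric release segment in {version!r}")
    parts = [int(piece) for piece in match.group().split(".")]
    parts += [0] * (width - len(parts))  # pad so '5.0' equals '5.0.0'
    return tuple(parts)


def is_below(version: str, upper_bound: str) -> bool:
    """True when `version` sorts strictly below `upper_bound`."""
    return release_tuple(version) < release_tuple(upper_bound)


def installed_version(package: str) -> Optional[str]:
    """Installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("transformers")
    if found is None:
        print("transformers is not installed")
    elif is_below(found, "5.0.0"):
        print(f"transformers {found} satisfies <5.0.0")
    else:
        print(f"transformers {found} violates the <5.0.0 pin")
```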
Notable features
  • Added distiller, pruner, decoder, distributer, compilation, recoverer, and enhancer algorithms, plus the Sage Attention algorithm
  • Moved three tutorials from Pruna Pro to open‑source Pruna
Full changelog

It’s been almost a year since we open-sourced Pruna (time flies when you're compressing models ✨).
Since then, the community has grown in ways we couldn’t have imagined: new contributors, new ideas, and so many improvements coming from all directions.

To celebrate this milestone, we’ve started bringing more of the algorithms that previously lived in our closed-source stack into the open-source Pruna ecosystem. The goal is simple: give the community access to more of the tools we use internally so everyone can experiment, optimize, and build faster.

This release is a big step in that direction, with a whole wave of new algorithms joining the repo, along with stability improvements, compatibility fixes, and some nice quality-of-life upgrades.

From all of us at Pruna: thank you for being part of this journey 💜
Let’s see what landed in v0.3.2 🔮

The juiciest bits 🧃

A whole garden of new algorithms 🌱

The optimization ecosystem keeps growing with a new set of algorithm building blocks:

  • feat: add distiller algorithm by @minettekaum in https://github.com/PrunaAI/pruna/pull/479
  • feat: add pruner algorithm by @minettekaum in https://github.com/PrunaAI/pruna/pull/470
  • feat: decoder algorithm by @minettekaum in https://github.com/PrunaAI/pruna/pull/444
  • feat: add distributer algorithm by @minettekaum in https://github.com/PrunaAI/pruna/pull/459
  • feat: add compilation algorithms by @minettekaum in https://github.com/PrunaAI/pruna/pull/443
  • feat: add recoverer algorithms by @minettekaum in https://github.com/PrunaAI/pruna/pull/491
  • feat: add enhancers algorithms by @minettekaum in https://github.com/PrunaAI/pruna/pull/469
  • feat: Sage Attention Algorithm by @Marius-Graml in https://github.com/PrunaAI/pruna/pull/455

More tutorials to help you get started 📚

Getting started with Pruna just got easier thanks to new tutorials:

  • docs/flux2-klein-tutorial: Flow changed in the tutorial by @minettekaum in https://github.com/PrunaAI/pruna/pull/522
  • docs: moving three tutorials from Pruna Pro to Pruna by @minettekaum in https://github.com/PrunaAI/pruna/pull/539

Algorithms are learning to play nicely together 🤝

We also improved compatibility between algorithms and model configurations:

  • feat: add disjoint algorithm compatibility by @gsprochette in https://github.com/PrunaAI/pruna/pull/520

Pruning some bugs 🐞 and maintenance 👩‍🌾

A lot of work went into improving stability, compatibility, and developer tooling:

  • [CI] Use a Stable Cache Key to prevent warnings in gh-actions by @ParagEkbote in https://github.com/PrunaAI/pruna/pull/456
  • fix: kernels version by @begumcig in https://github.com/PrunaAI/pruna/pull/504
  • refactor: formalize artifact saving by @gsprochette in https://github.com/PrunaAI/pruna/pull/492
  • ci: mark latest torchao as incompatible with latest diffusers by @gsprochette in https://github.com/PrunaAI/pruna/pull/528
  • fix: filter out non reapplied algorithms in resmash by @gsprochette in https://github.com/PrunaAI/pruna/pull/525
  • fix: change moe model for compatibility with earlier transformers versions by @gsprochette in https://github.com/PrunaAI/pruna/pull/533
  • build: pin transformers<5.0.0 by @gsprochette in https://github.com/PrunaAI/pruna/pull/532
  • ci: enforce pr title format via a github actions workflow by @SaboniAmine in https://github.com/PrunaAI/pruna/pull/538
  • fix: align sage_attn default hyperparameter return type with refactor by @gsprochette in https://github.com/PrunaAI/pruna/pull/540
  • fix: symmetric compatibility by @gsprochette in https://github.com/PrunaAI/pruna/pull/555
  • test: extend fast cpu tests so tests default as cpu by @gsprochette in https://github.com/PrunaAI/pruna/pull/556
  • build: update ty to v0.0.20 and align type-checking config by @rensortino in https://github.com/PrunaAI/pruna/pull/535
  • fix: fix awq tracing fx problem by @begumcig in https://github.com/PrunaAI/pruna/pull/551
  • fix: enable invalid assignment ty checks by @begumcig in https://github.com/PrunaAI/pruna/pull/553
  • fix: fix ty invalid-argument-type by @begumcig in https://github.com/PrunaAI/pruna/pull/554
  • fix: remove deprecated pynvml, remove torchmetrics restrictions by @begumcig in https://github.com/PrunaAI/pruna/pull/566
  • docs: update links in contributing file by @sdiazlor in https://github.com/PrunaAI/pruna/pull/541

🌱 New faces in the garden

  • @Marius-Graml made their first contribution in https://github.com/PrunaAI/pruna/pull/455
  • @rensortino made their first contribution in https://github.com/PrunaAI/pruna/pull/535

Full Changelog: https://github.com/PrunaAI/pruna/compare/v0.3.1...v0.3.2

v0.3.1 Breaking risk
⚠ Upgrade required
  • Pin torchao==0.12.0, numpydoc>=1.6.0, and ty==0.0.1a21 for compatibility
  • Explicit uv version set in GitHub Actions CI to reduce flakiness
Breaking changes
  • Minimum Python version bumped to 3.10
  • PyTorch and torch ecosystem upgraded (minimum versions increased)
Notable features
  • KID metric added for evaluation
  • Tiny datasets added for lightweight experiments
  • Mixture-of-Experts efficiency improvements: reduced number of experts per token
Full changelog

This release is mainly about upgrading our minimum PyTorch and torch ecosystem versions ✨
After the bigger structural changes in v0.3.0, we focused on giving Pruna’s dependency stack a little glow-up, smoothing out compatibility issues and making CI more predictable across environments 💅🌸

The juiciest bits 🧃:

PyTorch & torch ecosystem upgrades

  • Minimum Python version bumped to 3.10
    by @ParagEkbote

  • Minimum PyTorch / torch stack upgraded, with careful pinning and unpinning
    by @gsprochette

  • fix: raise error for pytorch compatibility in gptq by @begumcig


Evaluation & experimentation extras

  • KID metric added
    feat: Kid metric added by @minettekaum

  • Tiny datasets for lightweight experiments
    feat: add tiny datasets for lightweight experiments by @begumcig

  • MoE efficiency improvements, reduced number of experts per token
    feat: reduce nb experts per token in moe architectures by @llcnt
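
To illustrate what "experts per token" means in the MoE item above, here is a generic top-k routing sketch. It is not Pruna's implementation: the function name and the renormalization choice are assumptions, and real MoE layers differ in how they compute and normalize router weights. The point is only that a smaller `experts_per_token` means fewer expert forward passes for each token.

```python
import math


def route_token(router_logits, experts_per_token):
    """Pick the top-k experts for one token and renormalize their weights.

    Returns a dict mapping expert index -> mixing weight (weights sum to 1).
    """
    if not 1 <= experts_per_token <= len(router_logits):
        raise ValueError("experts_per_token must be between 1 and the number of experts")
    # Indices of the k largest router logits.
    ranked = sorted(range(len(router_logits)), key=lambda i: router_logits[i], reverse=True)
    selected = ranked[:experts_per_token]
    # Softmax restricted to the selected experts (shifted by the max for stability).
    peak = max(router_logits[i] for i in selected)
    scores = {i: math.exp(router_logits[i] - peak) for i in selected}
    total = sum(scores.values())
    return {i: score / total for i, score in scores.items()}


if __name__ == "__main__":
    logits = [2.0, 0.5, 1.0, -1.0]  # hypothetical router output for 4 experts
    print(route_token(logits, 2))   # two experts run for this token
    print(route_token(logits, 1))   # only the best-scoring expert runs
```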


Algorithm & SmashConfig improvements

  • Target modules extended to quantizers
    feat: add target modules to quantizers by @gsprochette

  • Flora extended to Flux, adding flora_backbone_calls_per_step
    feat: extended flora to flux by @try1233

Pruning some bugs 🐞 & maintenance 👩‍🌾:

  • fix: missing default parameters in smash config by @gsprochette in https://github.com/PrunaAI/pruna/pull/496
  • fix: missing parameter from all pruna enums by @begumcig in https://github.com/PrunaAI/pruna/pull/495
  • [Tests] Fix Warnings by replacing deprecated methods in Sphinx by @ParagEkbote in https://github.com/PrunaAI/pruna/pull/458
  • fix: pin the kernel version for torch by @begumcig in https://github.com/PrunaAI/pruna/pull/481
  • [CI] Set explicit uv version in gh-actions and send authenticated requests to reduce flakiness by @ParagEkbote in https://github.com/PrunaAI/pruna/pull/473
  • fix: protect pre-existing layers attribute from hqq overwrite by @gsprochette in https://github.com/PrunaAI/pruna/pull/484
  • build: bump stable-fast-pruna requirement by @gsprochette in https://github.com/PrunaAI/pruna/pull/498
  • fix: initialization of base tester class by @johannaSommer in https://github.com/PrunaAI/pruna/pull/460
  • fix: remove debugger logs in FA3 call by @johannaSommer in https://github.com/PrunaAI/pruna/pull/461
  • Enable TruffleHog in pre-commit by @ParagEkbote in https://github.com/PrunaAI/pruna/pull/439
  • Pin torchao==0.12.0 to avoid PyTorch ABI warnings; also pin numpydoc>=1.6.0 and ty==0.0.1a21 for compatibility, by @ParagEkbote in https://github.com/PrunaAI/pruna/pull/417
  • build: delete uv.lock and gitignore it by @gsprochette in https://github.com/PrunaAI/pruna/pull/457

New faces in the garden 💐

We’re very excited to welcome a new Pruner to the team 💖✨

  • @try1233 recently joined Pruna and wasted absolutely no time jumping in! Already shipping features by extending Flora to Flux and making meaningful contributions from day one 🚀🌸
    Big “hit the ground running” energy.

We also want to give a warm, sparkly shoutout to our very own first-time contributor 💫

  • @minettekaum made their first contribution by adding the KID metric to our evaluation suite 🧪💗

We’re so happy to be building together 💖

Full Changelog: https://github.com/PrunaAI/pruna/compare/v0.3.0...v0.3.1

About

  • Stars: 1,190
  • Forks: 89
  • Language: Python

Install & Platforms

  • Install via: pip
  • Platforms: linux, macos, windows
