Glq

v0.1.5 Feature

This release contains one notable change for engineering teams evaluating rollout: a fix to Mistral3 multimodal model quantization.

Published 4mo Model Serving & MLOps

✓ No known CVEs patched in this version

Topics

inference llm model-compression pytorch quantization

Summary

AI summary

Fixes Mistral3 multimodal model quantization to produce loadable checkpoints.

Full changelog

Fix Mistral3 multimodal model quantization: Quantizing Mistral3-family models (e.g. mistralai/Ministral-3-3B-Base-2512) now produces correct checkpoints loadable via AutoModelForCausalLM. Previously, the save pipeline iterated the full multimodal model instead of the text backbone, producing broken key prefixes and inflated file sizes.
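
The changelog entry above attributes the broken key prefixes and inflated file sizes to saving the full multimodal model rather than the text backbone. The sketch below illustrates that general idea with plain dictionaries. It is not GLQ's actual save code: the parameter names, the `language_model.` prefix, the element counts, and the `backbone_state` helper are all hypothetical and chosen only for illustration.

```python
# Hypothetical parameter names and element counts; the real Mistral3
# module layout and GLQ's save pipeline may differ.
full_model = {
    "vision_tower.patch_conv.weight": 1_000,
    "multi_modal_projector.linear.weight": 500,
    "language_model.layers.0.self_attn.q_proj.weight": 4_000,
    "language_model.lm_head.weight": 2_000,
}


def backbone_state(state, prefix="language_model."):
    """Keep only text-backbone entries and strip the wrapper prefix."""
    return {
        key[len(prefix):]: value
        for key, value in state.items()
        if key.startswith(prefix)
    }


text_only = backbone_state(full_model)

# Saving the full model keeps vision entries and wrapper-prefixed keys;
# saving the backbone yields fewer entries with unprefixed keys.
full_size = sum(full_model.values())
text_size = sum(text_only.values())
print(sorted(text_only))
print(full_size, text_size)
```

In this toy setup, a loader that expects a text-only model would look for keys such as `layers.0.self_attn.q_proj.weight`, which exist only in the backbone view.
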
Ministral-3-3B-Base-2512 benchmarks (A10G, 16 calibration samples):

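| Method | Bits per weight | Perplexity (PPL) | PPL vs bf16 | GPU memory (MB) | Throughput (tok/s) |
|--------|-----------------|------------------|-------------|-----------------|--------------------|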
| bf16 | 16 | 5.91 | 1.00x | 7348 | 37.0 |
| GLQ 3-bit | 3 | 6.47 | 1.09x | 3788 | 11.4 |
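
To make the trade-offs in the table easier to read, the following sketch derives relative figures from the two rows. The numbers are copied from the table; the dictionary layout and variable names are assumptions made for this example only.

```python
# Values copied from the benchmark table (A10G, 16 calibration samples).
rows = {
    "bf16": {"bpw": 16, "ppl": 5.91, "gpu_mb": 7348, "tok_s": 37.0},
    "GLQ 3-bit": {"bpw": 3, "ppl": 6.47, "gpu_mb": 3788, "tok_s": 11.4},
}

base = rows["bf16"]
quant = rows["GLQ 3-bit"]

# Perplexity ratio: how much worse the quantized model scores than bf16.
ppl_ratio = quant["ppl"] / base["ppl"]

# Fraction of GPU memory saved relative to bf16.
mem_saved = 1 - quant["gpu_mb"] / base["gpu_mb"]

# Throughput of the quantized model relative to bf16.
speed_ratio = quant["tok_s"] / base["tok_s"]

print(f"PPL ratio: {ppl_ratio:.2f}x")
print(f"GPU memory saved: {mem_saved:.1%}")
print(f"Relative throughput: {speed_ratio:.2f}x")
```

On these figures, the 3-bit checkpoint uses roughly half the GPU memory of bf16 at a perplexity ratio that matches the 1.09x in the table, while generating tokens at about a third of the bf16 rate.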

pip install 'glq==0.1.5'
