Glq

v0.2.2 Feature

This release adds 5 notable features for engineering teams evaluating rollout.

Published 2mo Model Serving & MLOps
✓ No known CVEs patched

Topics

inference llm model-compression pytorch quantization

Summary

AI summary

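v0.2.2 adds Hessian-based sensitivity profiling with mixed-precision per-layer bit allocation in the 2 to 4 bpw range. It also rolls in the v0.2.1 fixes (KV cache bug, TC kernel autotune keys) and re-measured perplexity results.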

Full changelog

New features

  • Hessian-based sensitivity profiling for per-layer bit allocation
  • --bpw 2.5 auto-allocates 2/3bpw per layer to hit target average
  • --min-bpw 2 --max-bpw 4 enables full {2,3,4} bpw range per layer
  • Greedy marginal-gain optimizer considers all upgrade jumps (2→3, 2→4, 3→4)
  • New module: glq/sensitivity.py with allocate_bpw() + print_allocation_summary()
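
The greedy marginal-gain idea in the list above can be sketched as follows. This is a minimal illustration, not the code in glq/sensitivity.py: the function name `greedy_allocate`, its signature, and the error model (quantization error proportional to sensitivity times 4^(-bits)) are assumptions made for the example, and the per-layer sensitivity scores are taken as given rather than derived from a Hessian.

```python
def greedy_allocate(sizes, sensitivity, target_bpw, levels=(2, 3, 4)):
    """Hypothetical sketch of greedy marginal-gain bit allocation.

    sizes:       parameter count per layer
    sensitivity: per-layer sensitivity score (assumed precomputed)
    target_bpw:  desired parameter-weighted average bits per weight
    levels:      bit widths a layer is allowed to use
    """
    lo = min(levels)
    bits = [lo] * len(sizes)
    # Bits left to spend once every layer has the minimum width.
    budget = (target_bpw - lo) * sum(sizes)

    def err(i, b):
        # Assumed error model: each extra bit cuts the error by 4x.
        return sensitivity[i] * sizes[i] * 4.0 ** (-b)

    while True:
        best = None  # (gain per extra bit, layer index, new width, cost)
        for i, b in enumerate(bits):
            for nb in levels:
                if nb <= b:
                    continue
                cost = (nb - b) * sizes[i]
                if cost > budget + 1e-9:
                    continue  # this jump would overshoot the target average
                gain = (err(i, b) - err(i, nb)) / cost
                if best is None or gain > best[0]:
                    best = (gain, i, nb, cost)
        if best is None:
            return bits
        _, i, nb, cost = best
        bits[i] = nb
        budget -= cost


sizes = [100, 100, 100, 100]
sens = [8.0, 1.0, 3.0, 0.5]
print(greedy_allocate(sizes, sens, 2.5, levels=(2, 3)))     # [3, 2, 3, 2]
print(greedy_allocate(sizes, sens, 3.0, levels=(2, 4)))     # [4, 2, 4, 2]
print(greedy_allocate(sizes, sens, 3.0, levels=(2, 3, 4)))  # [4, 3, 3, 2]
```

Every candidate jump is scored by error reduction per extra bit spent, so a 2→4 jump competes directly with 2→3 and 3→4. Under the assumed error model single-bit steps usually win when all widths are available; the larger jump matters when the intermediate width is excluded, as in the (2, 4) call.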

Results (SmolLM3-3B, WikiText-2, 128 nsamples, L40S)

| Model | Eff. BPW | Perplexity | vs bf16 |
|-------|----------|------------|---------|
| bf16 | 16.00 | 7.04 | 1.00x |
| GLQ 4bpw | 4.00 | 7.19 | 1.02x |
| GLQ 3.5bpw mixed | 3.50 | 7.20 | 1.02x |
| GLQ 3bpw | 3.00 | 7.64 | 1.09x |
| GLQ 3bpw mixed (2+4) | 3.00 | 7.65 | 1.09x |
| GLQ 2.5bpw mixed | 2.50 | 8.08 | 1.15x |
| GLQ 2bpw | 2.00 | 9.61 | 1.36x |

GLQ 3.5bpw mixed nearly matches uniform 4bpw quality (perplexity 7.20 vs 7.19) at 12.5% fewer bits per weight. GLQ 3bpw mixed (2+4) matches uniform 3bpw (7.65 vs 7.64) at the same effective 3.00 bpw.
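
The arithmetic behind the Eff. BPW and "vs bf16" columns can be reproduced with a few lines. This is an illustrative sketch: `effective_bpw` and `storage_saving` are hypothetical helper names, and the saving counts weight bits only, ignoring any codebook or metadata overhead, which the release notes do not quantify.

```python
def effective_bpw(sizes, bits):
    """Parameter-weighted average bits per weight across layers."""
    return sum(n * b for n, b in zip(sizes, bits)) / sum(sizes)


def storage_saving(bpw, baseline_bpw):
    """Fraction of weight storage saved relative to a baseline width."""
    return 1.0 - bpw / baseline_bpw


# Half the parameters at 2 bits and half at 4 bits average to 3.0 bpw.
print(effective_bpw([100, 100], [2, 4]))  # 3.0
# 3.5 bpw against uniform 4 bpw.
print(storage_saving(3.5, 4.0))           # 0.125
# Perplexity ratio of GLQ 3.5bpw mixed against bf16, from the table.
print(round(7.20 / 7.04, 2))              # 1.02
```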

5-task lm-eval accuracy (SmolLM3-3B)

| Method | Avg | % of bf16 |
|--------|-----|-----------|
| bf16 | 0.709 | 100% |
| GLQ 4bpw | 0.699 | 98.6% |
| GLQ 2bpw | 0.623 | 87.9% |

Also in this release

  • Fix KV cache bug (v0.2.1): 0.6 → 14.0 tok/s decode
  • Remove B from TC kernel autotune keys (v0.2.1)
  • Fair perplexity re-measurement on L40S with 128 nsamples
