Release history
UQLM releases
Review required
v0.5.0
Breaking risk
Dependencies
normalized_probability deprecation + long‑form UQ
v0.2.7
Breaking risk
⚠ Upgrade required
- Update import statements for UQResult as described under Breaking changes.
Breaking changes
- UQResult import statement changed: previous `from uqlm.scorers.baseclass.uncertainty import UncertaintyQuantifier` → new `from uqlm.utils.results import UQResult`
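Code that has to run against releases on both sides of this change could resolve the import at runtime. The following is a minimal sketch, not part of uqlm: `import_first` and `UQRESULT_CANDIDATES` are hypothetical names, and the second candidate assumes the legacy module also exposed `UQResult` (the release note gives that module path but names `UncertaintyQuantifier`).

```python
import importlib


def import_first(candidates):
    """Return the first attribute that resolves from (module_name, attribute) pairs."""
    errors = []
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError) as exc:
            # Remember why this candidate failed, then try the next one.
            errors.append(f"{module_name}.{attr}: {exc}")
    raise ImportError("no candidate could be imported:\n" + "\n".join(errors))


# Hypothetical candidate list: the new location from the release note first,
# then the legacy module (assumption: it exposed UQResult before v0.2.7).
UQRESULT_CANDIDATES = [
    ("uqlm.utils.results", "UQResult"),
    ("uqlm.scorers.baseclass.uncertainty", "UQResult"),
]
```

With uqlm installed, `UQResult = import_first(UQRESULT_CANDIDATES)` would pick whichever location resolves. Projects that pin v0.2.7 or later can simply use the new import directly.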
Notable features
- `plot_ranked_auc` to compute AUPRC and rank the results in a color‑coded bar plot
- `plot_filtered_accuracy` to compute scorer‑specific filtered LLM accuracy at various confidence thresholds
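To make the two quantities concrete, here is a rough sketch of what "AUPRC ranking" and "filtered accuracy at a confidence threshold" can mean. It is an illustration only, not uqlm's implementation: the function names are hypothetical, the inputs are assumed to be per-response confidence scores with 0/1 correctness labels, and AUPRC is approximated by average precision.

```python
def average_precision(scores, labels):
    """Approximate AUPRC as the mean precision at the rank of each correct response."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision among the top `rank` responses
    return total / hits if hits else 0.0


def filtered_accuracy(scores, labels, threshold):
    """Accuracy over responses whose confidence score is at least `threshold`."""
    kept = [label for score, label in zip(scores, labels) if score >= threshold]
    return sum(kept) / len(kept) if kept else None  # None: nothing passed the filter


def rank_scorers(scores_by_scorer, labels):
    """Return (scorer_name, average_precision) pairs, best scorer first."""
    results = [(name, average_precision(s, labels)) for name, s in scores_by_scorer.items()]
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

A plotting helper would then draw the output of `rank_scorers` as a bar chart, or `filtered_accuracy` across a grid of thresholds as one line per scorer.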
Full changelog
Highlights
- New utility plotting functions:
  - `plot_ranked_auc` to compute AUPRC (rather than the current AUROC only) and rank them in a color-coded bar plot (as seen in our research paper)
  - `plot_filtered_accuracy` to compute scorer-specific filtered LLM accuracy at various confidence thresholds (as seen in our research paper)
- Automated Docs site build
- Breaking change: the `UQResult` import statement is changed to the following:
  - Previous import: `from uqlm.scorers.baseclass.uncertainty import UncertaintyQuantifier`
  - New import: `from uqlm.utils.results import UQResult`
What's Changed
- ci: manage dependencies in CI with poetry for consistency by @trumant in https://github.com/cvs-health/uqlm/pull/160
- Feat: Visualization utility functions by @mohitcek in https://github.com/cvs-health/uqlm/pull/161
- #29 GitHub actions to automate documentation site build on new release by @dimtsap in https://github.com/cvs-health/uqlm/pull/100
- v0.2.6 updates by @dylanbouchard in https://github.com/cvs-health/uqlm/pull/168
- Update Utility Visualization function by @mohitcek in https://github.com/cvs-health/uqlm/pull/170
- Patch release: v0.2.7 by @dylanbouchard in https://github.com/cvs-health/uqlm/pull/169
New Contributors
- @trumant made their first contribution in https://github.com/cvs-health/uqlm/pull/160
Full Changelog: https://github.com/cvs-health/uqlm/compare/v0.2.6...v0.2.7
Review required
v0.2.0
Breaking risk
Breaking upgrade
Dependencies
BLEURT deprecation + progress bars