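This is a bug-fix release of datasets (4.8.5). It fixes JSON decoding before pandas and list/dict conversions, batching for table-formatted datasets, and iterable map resume state.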
Published 1mo
RAG & Retrieval
✓ No known CVEs patched in this version
Topics
ai
artificial-intelligence
computer-vision
dataset-hub
datasets
machine-learning
huggingface
llm
natural-language-processing
nlp
numpy
pandas
pytorch
speech
tensorflow
Summary
AI summary: Fixed JSON decoding before DataFrame.to_json and related conversions.
Full changelog
Main bug fixes
- fix: decode Json() values before calling DataFrame.to_json() (#8116) by @Brianzhengca in https://github.com/huggingface/datasets/pull/8122
- Fix: decode JSON type before to_list or to_dict is called by @ItsTania in https://github.com/huggingface/datasets/pull/8137
- Fix batching for table-formatted datasets by @bluehyena in https://github.com/huggingface/datasets/pull/8126
- Fix iterable map resume state by @Brianzhengca in https://github.com/huggingface/datasets/pull/8147
- don't embed remote files in download_and_prepare to parquet by @lhoestq in https://github.com/huggingface/datasets/pull/8150
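The two JSON fixes above concern decoding values before they are exported. The sketch below is not the library's implementation and does not use the datasets API. It uses only the Python standard library and assumes, for illustration, that a JSON-typed column is held as serialized strings. The rows and the `decode_json_columns` helper are hypothetical. It shows why decoding first matters: serializing an already-encoded string nests JSON inside JSON.

```python
import json

# Hypothetical rows: the "meta" column holds JSON-encoded strings, standing in
# for values of a JSON-typed column (an assumption made for illustration).
rows = [
    {"id": 1, "meta": json.dumps({"lang": "en", "tags": ["a", "b"]})},
    {"id": 2, "meta": json.dumps({"lang": "fr", "tags": []})},
]


def decode_json_columns(rows, json_columns):
    """Return copies of the rows with the named columns parsed from JSON strings."""
    decoded = []
    for row in rows:
        new_row = dict(row)  # shallow copy so the input rows stay untouched
        for col in json_columns:
            value = new_row.get(col)
            if isinstance(value, str):
                new_row[col] = json.loads(value)
        decoded.append(new_row)
    return decoded


# Serializing without decoding embeds a JSON string inside JSON (double encoding).
naive = json.dumps(rows)

# Decoding first yields nested objects in the exported output.
fixed = json.dumps(decode_json_columns(rows, ["meta"]))
```

Reading `naive` back gives a string in the `meta` field, while reading `fixed` back gives a dictionary. Conversions such as to_list or to_dict raise the same concern, since callers generally expect Python objects rather than encoded strings.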
Other improvements and bug fixes
- Parse agent traces by @lhoestq in https://github.com/huggingface/datasets/pull/8113
- 🔒 Pin GitHub Actions to commit SHAs by @paulinebm in https://github.com/huggingface/datasets/pull/8114
- chore: bump doc-builder SHA for PR upload workflow by @rtrompier in https://github.com/huggingface/datasets/pull/8134
- Remove print statement in JSON processing by @lhoestq in https://github.com/huggingface/datasets/pull/8136
- Don't include files list DatasetInfo (and remove old stuff) by @lhoestq in https://github.com/huggingface/datasets/pull/8128
- update ci uer by @lhoestq in https://github.com/huggingface/datasets/pull/8139
- fix warning in ci by @lhoestq in https://github.com/huggingface/datasets/pull/8140
- fix mask in embed_storage for remote files by @lhoestq in https://github.com/huggingface/datasets/pull/8151
- fix original_files missing in ci json test by @lhoestq in https://github.com/huggingface/datasets/pull/8152
- Fix null in embed storage by @lhoestq in https://github.com/huggingface/datasets/pull/8154
- Fix base_path in integration tests by @lhoestq in https://github.com/huggingface/datasets/pull/8155
New Contributors
- @paulinebm made their first contribution in https://github.com/huggingface/datasets/pull/8114
- @Brianzhengca made their first contribution in https://github.com/huggingface/datasets/pull/8122
- @bluehyena made their first contribution in https://github.com/huggingface/datasets/pull/8126
- @rtrompier made their first contribution in https://github.com/huggingface/datasets/pull/8134
- @ItsTania made their first contribution in https://github.com/huggingface/datasets/pull/8137
Full Changelog: https://github.com/huggingface/datasets/compare/4.8.4...4.8.5
About datasets
The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools