Releases: lhotse-speech/lhotse
v1.33.0 - Himalayan Vulture
Dependencies
- Make lilcom optional and fix failing unit tests on MacOS by @pzelasko in #1555
- Add optional torchcodec support by @pzelasko in #1562
Recipes
- Notsofar ihm recipe by @Lakoc in #1551
- Add oto_speech dataset recipe by @Lakoc in #1552
- Fix cached manifest reading in some recipes by @pzelasko in #1560
New features
- Support loading multiple non-overlapping custom recordings in MixedCut by @pzelasko in #1553
- Respect LHOTSE_IO_BACKEND for reading AudioSource(type='url'); update docs by @pzelasko in #1557
- Add AudioSamples(mono_downmix=True) to handle mixed single/multi channel batches gracefully by @pzelasko in #1563
- Chunking functionality by @nune-tadevosyan in #1556
- Add CutSet.mix(..., tag="noise"), MixTrack.{is_snr_reference,mute} bools, and MixedCut.unmix(tag=...) by @pzelasko in #1559
Fixes
- Fix cuts conversion to hf datasets by @mohsen-goodarzi in #1546
- Fix invalid escape sequence warnings in iwslt22_ta by @yfyeung in #1540
- extend SimpleCutSampler to work better with CutConcatenate by @KarelVesely84 in #1520
- refactor(ais): Add backward compatibility and improve error handling in AISBatchLoader by @gaikwadabhishek in #1542
- Remove hardcoded timeout from AIStore client by @gaikwadabhishek in #1549
- Fix for AIStore client 1.23 by @pzelasko in #1565
- fix: handle truncated batch stream with sequential GET fallback by @gaikwadabhishek in #1566
- fix: skip zero-duration supervisions in index_supervisions() by @gaikwadabhishek in #1567
- fix numpy broadcast dtype issue in loudness normalization by @tnq177 in #1561
New Contributors
- @nune-tadevosyan made their first contribution in #1556
- @tnq177 made their first contribution in #1561
Full Changelog: v1.32.2...v1.33.0
v1.32.2 - Blood Pheasant
What's Changed
- NSF grant acknowledgment in README.md by @pzelasko in #1539
- Fix CutSampler initialization for newer PyTorch versions by @pzelasko in #1543
Full Changelog: v1.32.1...v1.32.2
v1.32.1 - Blood Pheasant
Fix issue with importing Lhotse v1.32.0 on Windows.
Full Changelog: v1.32.0...v1.32.1
v1.32.0 - Blood Pheasant
Recipes
- Add recipe for NOTSOFAR-1 by @domklement in #1517
- Add LibriMix full and WHAM noise preparation recipes by @Lakoc in #1518
- LibriSpeechMix recipe addition. by @Lakoc in #1521
- Restore punctuation in AMI recipe by @Lakoc in #1519
- Add chime6 download by @domklement in #1526
- Grid recipe fix by @ialmajai in #1502
New features
- Add new augmentation: codec compression (GSM, Opus, Vorbis, MP3) by @racoiaws in #1510
- Add new augmentation: lowpass using back-and-forth resampling via libsox by @racoiaws in #1511
- Support cut.load_custom_video() and collate_video(..., recording_field='custom_video') by @pzelasko in #1525
- Add AISBatchLoader for efficient batch data loading from AIStore by @gaikwadabhishek in #1529
- Add AIStore batch loading support in AudioSamples class by @gaikwadabhishek in #1534
Fixes and enhancements
- Fix Windows test failure in test/test_utils.py caused by NamedTemporaryFile by @denini08 in #1504
- fix: resolve AttributeError in DynamicBucketer by @somniumism in #1523
- fix: Add Image support to Cut.iter_data() method by @gaikwadabhishek in #1530
- Add unit tests for AISBatchLoader and fix manifest tracking by @gaikwadabhishek in #1531
- avoid bug appearing with OnTheFlyFeatures and PerturbVolume cut_transform by @KarelVesely84 in #1527
New Contributors
- @ialmajai made their first contribution in #1502
- @denini08 made their first contribution in #1504
- @Lakoc made their first contribution in #1519
- @somniumism made their first contribution in #1523
Full Changelog: 1.31.0...v1.32.0
v1.31.1 - Expedition 33
(skipping 1.31.0 because I accidentally uploaded and deleted this version in the past in PyPI, and can't re-use it now)
New recipes
New features
- Prioritize torchaudio FFMPEG backend for video data by @pzelasko in #1482
- Add image loading capabilities via Pillow by @pzelasko in #1483
- Add new augmentation: soft/hard clipping by @racoiaws in #1512
- Randomized shard slicing for Lhotse Shar by @pzelasko in #1505
Fixes and enhancements
- minor fix: remove execute permission for all recipes by @yfyeung in #1485
- Refactor IO to support both files and URLs for Features, Array, and Image by @pzelasko in #1486
- update edacc download link by @teowenshen in #1495
- allow reading kaldi text file with utterances having empty references by @KarelVesely84 in #1496
- Support accessing MixedCut.custom by @pzelasko in #1499
- Support not slicing custom recordings by @pzelasko in #1494
- Support left padding in supervision_intervals and supervision_masks by @yfyeung in #1492
- fix cannot join current thread when drop_last=True by @hoangtran9122 in #1487
- Allow up to half a second of duration mismatch between audio and manifest by @pzelasko in #1513
- Allow random reads of files from inside of tar files by @pzelasko in #1514
New Contributors
- @hoangtran9122 made their first contribution in #1487
Full Changelog: v1.30.3...1.31.0
v1.30.3 - Nirvana patch 3
What's Changed
- [fix] avoid importing librosa repeatedly by @yuekaizhang in #1475
- Support setting initial shard offset for writing by @pzelasko in #1476
- Testing: CUDA compatibility, remove unused tests, add torchaudio resampling by @pzelasko in #1480
Full Changelog: v1.30.2...v1.30.3
v1.30.2 - Nirvana patch 2
v1.30.1 - Nirvana patch 1
Patch release with bug fixes to AIStore and multi-storage-client logic amongst others.
What's Changed
- Fix the returned value for multi-job to_shar export by @pzelasko in #1470
- fix: add timeout to aistore client initialization by @gaikwadabhishek in #1465
- Restore kaldifeat in CI by @pzelasko in #1471
- fix: add LHOTSE_MSC_BACKEND_FORCED flag to only enforce MSCIOBackend for non-MSC URLs by @jayya2 in #1472
New Contributors
- @gaikwadabhishek made their first contribution in #1465
Full Changelog: v1.30.0...v1.30.1
v1.30.0 - Nirvana
New features
Learn more about multi-storage-client here.
Bug fixes and other enhancements
- Fixes forgotten hardcoded path in the original recipe by @m-wiesner in #1438
- [docs] minor fix by @pengzhendong in #1439
- Update CI by @pzelasko in #1452
- Avoid overwriting existing custom fields in dnsmos annotation by @pkufool in #1450
- pad cuts by num_samples as default by @pengzhendong in #1449
- Remove validate_cut_set from qa.py by @t13m in #1458
- Fix edge cases in mismatching num samples in audio collation by @pzelasko in #1462
- Fix more edge cases in audio collation by @pzelasko in #1463
- Add features_dtype arg to collate_features function by @t13m in #1456
New Contributors
Full Changelog: v1.29.0...v1.30.0
v1.29.0 - Potion of Everlasting Vigor
What's Changed
Recipes
- Recipe for the Chinese Dysarthric Speech Database by @JinZr in #1423
- Optimized ReazonSpeech download speed using hf datasets features by @yuta0306 in #1434
New features
- Option to save audio in the original format when exporting to shar by @anteju in #1422
- CutSet.from_huggingface_dataset() for importing HF datasets by @pzelasko in #1433
- Extend AIStore serialization backend to writing by @pzelasko in #1435
Other improvements
- change max_frames to max_duration in docs by @pengzhendong in #1419
- add opensmile url by @pengzhendong in #1424
- File reading IO refactoring into backends by @pzelasko in #1421
- Fix .m4a support in some setups (possibly for other formats not supported by libsndfile) by @racoiaws in #1427
- add to_dict for CustomFieldMixin class by @pengzhendong in #1426
- Fix consecutive same sampler selection in round robin sampler with num_workers>1 by @pzelasko in #1432
- Fixed copying MixedCut with custom attributes set by @pzelasko in #1436
New Contributors
Full Changelog: v1.28.0...v1.29.0