Pull requests: datajuicer/data-juicer

Pull requests list

Replace bs4 stub with beautifulsoup4 in dependencies
#977 opened May 12, 2026 by justinwolfington
2 tasks
Add video_human_3d_pose_mapper.
#976 opened May 10, 2026 by Qirui-jiao Collaborator
Labels: dj:op (issues/PRs about some specific OPs), enhancement (New feature or request)
Update num_proc handling for vllm and Ray mode
#973 opened May 1, 2026 by ArdalanM
Add normal map op, optimal flow op, and universal segmentation op for videos.
#970 opened Apr 27, 2026 by Qirui-jiao Collaborator
Labels: dj:multimodal (issues/PRs about multimodal data processing), dj:op (issues/PRs about some specific OPs), enhancement (New feature or request)
[WIP] feat(agent): training-ready data recipes, learnable-value mappers, cross-model similarity
#969 opened Apr 20, 2026 by yxdyc Collaborator
Labels: agent (related to agent), dj:op (issues/PRs about some specific OPs), dj:post-tuning (issues/PRs about post-tuning scenarios)
[WIP] feat: add persistent custom operator registry
#968 opened Apr 15, 2026 by cmgzn Collaborator
Add face keypoints/animal pose ops & Extend ops for frame-sequence input
#966 opened Apr 14, 2026 by Qirui-jiao Collaborator
Labels: dj:op (issues/PRs about some specific OPs), enhancement (New feature or request)
refactor: declarative schema for configuration
#963 opened Apr 8, 2026 by cmgzn Collaborator
better parallelism in partitioned ray executor
#945 opened Mar 17, 2026 by cyruszhang Collaborator Draft
[WIP] feat: Integrate ElasticJuicer Core Modules
#934 opened Mar 11, 2026 by fengrui-z Collaborator
1 of 4 tasks
Feat: update vla ops and add val pipeline demo
#931 opened Mar 6, 2026 by Cathy0908 Collaborator
[WIP] arXiv/PDF to Markdown mappers + dj-op one-shot runner
#917 opened Feb 14, 2026 by yxdyc Collaborator
Labels: dj:op (issues/PRs about some specific OPs)
[WIP] Multi-branch executor
#916 opened Feb 13, 2026 by yxdyc Collaborator
Labels: dj:core (issues/PRs about the core functions of Data-Juicer), enhancement (New feature or request)
[WIP] feat: Add combined_logical_filter operator with AND/OR support
#914 opened Feb 13, 2026 by yxdyc Collaborator
Labels: dj:op (issues/PRs about some specific OPs)
Feat: Support paimon, iceberg, hudi, delta lake, hdfs data source.
#911 opened Feb 11, 2026 by Dludora Collaborator
[WIP] Feat: Add RayImageBTSMinhashDeduplicator
#897 opened Jan 29, 2026 by Dludora Collaborator
Depth seg new op
#862 opened Dec 22, 2025 by archernsy
Labels: dj:op (issues/PRs about some specific OPs)
[NewOp] Add group_diversity_filter op
#745 opened Jul 22, 2025 by lingzhq Collaborator
Add lidar object segmentation op
#736 opened Jul 14, 2025 by Qirui-jiao Collaborator