[VARIANT] Create low-level Variant library with json_to_variant implementation #1030

harshmotw-db · 2025-06-25T02:34:44Z

What changes are proposed in this pull request?

This PR introduces a low-level Variant library with a json_to_variant function, which is similar to parse_json in Spark. The function is written so that the caller owns the memory the output is written to. The caller needs to implement VariantMemoryManager, which has the methods borrow_value_buffer, borrow_metadata_buffer, ensure_value_buffer_size and ensure_metadata_buffer_size.
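
To make the caller-owned memory model concrete, here is a minimal sketch of what such a trait and a Vec-backed implementation might look like. Only the trait name and the four method names come from the description above; the signatures, the String error type, VecMemoryManager and write_bytes are assumptions made for illustration and are not taken from this PR's code.

```rust
/// Hypothetical shape of the caller-implemented trait. The method names are
/// from the PR description; the signatures are assumed for this sketch.
pub trait VariantMemoryManager {
    /// Hand out the buffer that encoded values are written into.
    fn borrow_value_buffer(&mut self) -> &mut [u8];
    /// Hand out the buffer that the metadata dictionary is written into.
    fn borrow_metadata_buffer(&mut self) -> &mut [u8];
    /// Make sure the value buffer holds at least `size` bytes.
    fn ensure_value_buffer_size(&mut self, size: usize) -> Result<(), String>;
    /// Make sure the metadata buffer holds at least `size` bytes.
    fn ensure_metadata_buffer_size(&mut self, size: usize) -> Result<(), String>;
}

/// Hypothetical caller-side implementation backed by two growable Vecs.
#[derive(Default)]
pub struct VecMemoryManager {
    value: Vec<u8>,
    metadata: Vec<u8>,
}

impl VariantMemoryManager for VecMemoryManager {
    fn borrow_value_buffer(&mut self) -> &mut [u8] {
        &mut self.value
    }

    fn borrow_metadata_buffer(&mut self) -> &mut [u8] {
        &mut self.metadata
    }

    fn ensure_value_buffer_size(&mut self, size: usize) -> Result<(), String> {
        // Grow (zero-filled) only when the current buffer is too small.
        if self.value.len() < size {
            self.value.resize(size, 0);
        }
        Ok(())
    }

    fn ensure_metadata_buffer_size(&mut self, size: usize) -> Result<(), String> {
        if self.metadata.len() < size {
            self.metadata.resize(size, 0);
        }
        Ok(())
    }
}

/// Stand-in for library code (this is NOT json_to_variant): writes `bytes`
/// at `offset` in the value buffer, asking the caller for space first.
/// Returns the offset just past the written bytes.
pub fn write_bytes<M: VariantMemoryManager>(
    mm: &mut M,
    offset: usize,
    bytes: &[u8],
) -> Result<usize, String> {
    let end = offset
        .checked_add(bytes.len())
        .ok_or_else(|| "size overflow".to_string())?;
    mm.ensure_value_buffer_size(end)?;
    mm.borrow_value_buffer()[offset..end].copy_from_slice(bytes);
    Ok(end)
}
```

With this shape the library never allocates on its own: it asks for capacity, borrows the buffer, and writes in place, so a caller could just as well back the buffers with an arena or a fixed-size region and return an error from the ensure_* methods when a limit is hit.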

How was this change tested?

Several unit tests manually compare the constructed variants against raw bytes. Implementing variant_to_json should increase coverage and make further tests easier to write. The PR already contains many tests, and more will be added.

TODO:

Test UTF-8 strings with varying character widths.
Test size limit exceeded errors.
More testing on variant objects - nesting, different offset sizes, is_large, keys in different languages etc.
Formalize errors - currently the errors thrown by this library are a little rough.
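
As a starting point for the UTF-8, offset-size and is_large items above, the quantities those tests would need to vary can be sketched as below. The helper names (utf8_widths, min_offset_size, needs_large_count) are hypothetical and not part of this PR. The 1 to 4 byte offset range and the 255-element threshold for is_large are my reading of the Variant encoding spec and should be checked against it.

```rust
/// Byte width of each character in `s` when encoded as UTF-8 (1 to 4 bytes).
pub fn utf8_widths(s: &str) -> Vec<usize> {
    s.chars().map(char::len_utf8).collect()
}

/// Smallest number of bytes (1 to 4) that can hold `max_offset`.
/// Assumption: offsets are encoded with 1, 2, 3 (u24) or 4 bytes.
pub fn min_offset_size(max_offset: u32) -> u8 {
    match max_offset {
        0..=0xFF => 1,
        0x100..=0xFFFF => 2,
        0x1_0000..=0xFF_FFFF => 3,
        _ => 4,
    }
}

/// Assumption: is_large is set when the element count no longer fits in one
/// byte, so the count is stored in 4 bytes instead of 1.
pub fn needs_large_count(num_elements: usize) -> bool {
    num_elements > 0xFF
}
```

A test string such as "aé€😀" covers all four UTF-8 widths: it has 4 characters but occupies 10 bytes, which is the length an encoder has to write. Inputs sized just below and just above 0xFF, 0xFFFF and 0xFF_FFFF bytes exercise each offset-size boundary.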

scovich · 2025-06-25T14:37:04Z

qq: How does this relate to the ongoing work to support variant in arrow-rs?
https://github.com/apache/arrow-rs/tree/main/parquet-variant

harshmotw-db added 13 commits June 24, 2025 19:30

saving before changing

d354089

intermediate commit

29cd6de

added basic memory allocator

f2d054a

intermediate progress on parse_json

1343252

commit before changing memory management

4d0a86f

removed Rc Refcell dependence

3162f54

add more functionality

5746515

added more tests

c5920a6

verified large_size and u24 offset size in array

416775e

renamed stuff

13da037

metadata creation

003e32a

formalized crate

c686381

implemented json_to_variant barring some more testing

db8a8c1

github-actions bot assigned harshmotw-db Jun 25, 2025

github-actions bot added the breaking-change label (Change that require a major version bump) Jun 25, 2025

harshmotw-db added 3 commits June 24, 2025 19:43

Merge branch 'main' of https://github.com/harshmotw-db/delta-kernel-rs …

afd597e

…into variant_library_independent

clippy fizx

71f56e7

fix dependencies

097df7a

harshmotw-db mentioned this pull request Jun 26, 2025

[VARIANT] Add support for the json_to_variant API apache/arrow-rs#7783

Merged

nicklan requested a review from scovich July 24, 2025 00:56

scovich removed their request for review July 24, 2025 01:02

nicklan requested a review from scovich July 24, 2025 01:07

harshmotw-db closed this Oct 15, 2025
