7 changes: 7 additions & 0 deletions docs/src/guide/migration.md
@@ -6,6 +6,13 @@ stable and breaking changes should generally be communicated (via warnings) for
give users a chance to migrate. This page documents the breaking changes between releases and gives advice on how to
migrate.

## 1.0.0

* The `SearchResult` returned by scalar indices must now output information about null values.
Instead of containing a `RowIdTreeMap`, it now contains a `NullableRowIdSet`. Expressions that
resolve to null values must be included in search results in the null set. This ensures that
`NOT` can be applied to index search results correctly.
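
To see why the null set matters for `NOT`, consider a toy model of an index result over the column `x = [1, 2, None, 3]`. The sketch below is only an illustration in plain Python: `NullableRows` and its fields are hypothetical names, not the API of `NullableRowIdSet`, and row positions stand in for row ids.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NullableRows:
    """Hypothetical stand-in for an index search result (not Lance's API)."""

    true_rows: frozenset  # rows where the predicate evaluated to TRUE
    null_rows: frozenset  # rows where the predicate evaluated to NULL

    def negate(self, all_rows: frozenset) -> "NullableRows":
        # NOT TRUE -> FALSE, NOT FALSE -> TRUE, NOT NULL -> NULL,
        # so the null rows carry over unchanged.
        return NullableRows(
            all_rows - self.true_rows - self.null_rows, self.null_rows
        )


all_rows = frozenset({0, 1, 2, 3})  # positions in x = [1, 2, None, 3]

# Result of `x < 2`: row 0 matches, row 2 is null.
lt_two = NullableRows(true_rows=frozenset({0}), null_rows=frozenset({2}))

# Without a null set, NOT can only complement the matches,
# which wrongly keeps the null row 2.
naive_not = all_rows - lt_two.true_rows

# With the null set, NOT (x < 2) keeps only rows 1 and 3.
null_aware_not = lt_two.negate(all_rows)

print(sorted(naive_not))                 # [1, 2, 3]
print(sorted(null_aware_not.true_rows))  # [1, 3]
```

Because the null rows survive negation unchanged, applying `negate` twice returns the original matches, which is the behavior a double `NOT` needs.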

## 0.39

* The `lance` crate no longer re-exports utilities from `lance-arrow` such as `RecordBatchExt` or `SchemaExt`. In the
15 changes: 12 additions & 3 deletions python/python/tests/test_scalar_index.py
@@ -1798,13 +1798,14 @@ def test_json_index():
)


def test_null_handling(tmp_path: Path):
def test_null_handling():
    tbl = pa.table(
        {
            "x": [1, 2, None, 3],
            "y": ["a", "b", "c", None],
        }
    )
    dataset = lance.write_dataset(tbl, tmp_path / "dataset")
    dataset = lance.write_dataset(tbl, "memory://test")

    def check():
        assert dataset.to_table(filter="x IS NULL").num_rows == 1
@@ -1813,11 +1814,19 @@ def check():
        assert dataset.to_table(filter="x < 5").num_rows == 3
        assert dataset.to_table(filter="x IN (1, 2)").num_rows == 2
        assert dataset.to_table(filter="x IN (1, 2, NULL)").num_rows == 2
        assert dataset.to_table(filter="x > 0 OR (y != 'a')").num_rows == 4
        assert dataset.to_table(filter="x > 0 AND (y != 'a')").num_rows == 1
        assert dataset.to_table(filter="y != 'a'").num_rows == 2
        # NOT should exclude nulls (issue #4756)
        assert dataset.to_table(filter="NOT (x < 2)").num_rows == 2
        assert dataset.to_table(filter="NOT (x IN (1, 2))").num_rows == 1
        # Double NOT
        assert dataset.to_table(filter="NOT (NOT (x < 2))").num_rows == 1

    check()
    dataset.create_scalar_index("x", index_type="BITMAP")
    check()
    dataset.create_scalar_index("x", index_type="BTREE")
    dataset.create_scalar_index("y", index_type="BTREE")
    check()
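
The row counts asserted in `check()` follow from SQL three-valued logic, where a filter keeps a row only when the predicate is exactly TRUE. As a cross-check, here is a minimal pure-Python sketch that evaluates a few of the same predicates row by row. It does not use Lance at all; `tri_not`, `tri_and`, `tri_or`, `compare`, and `count` are hypothetical helpers written for this illustration, and `None` stands in for SQL NULL.

```python
import operator
from typing import Callable, Optional

Tri = Optional[bool]  # True, False, or None standing in for SQL NULL


def tri_not(a: Tri) -> Tri:
    return None if a is None else not a


def tri_and(a: Tri, b: Tri) -> Tri:
    # FALSE dominates AND, even when the other side is NULL.
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def tri_or(a: Tri, b: Tri) -> Tri:
    # TRUE dominates OR, even when the other side is NULL.
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def compare(value, op: Callable, constant) -> Tri:
    # Any comparison against NULL yields NULL.
    return None if value is None else op(value, constant)


# Same data as the test table: columns x and y zipped into rows.
rows = list(zip([1, 2, None, 3], ["a", "b", "c", None]))


def count(predicate) -> int:
    # Keep a row only when the predicate is exactly TRUE.
    return sum(1 for x, y in rows if predicate(x, y) is True)


x_gt_0 = lambda x, y: compare(x, operator.gt, 0)
y_ne_a = lambda x, y: compare(y, operator.ne, "a")
x_lt_2 = lambda x, y: compare(x, operator.lt, 2)

or_count = count(lambda x, y: tri_or(x_gt_0(x, y), y_ne_a(x, y)))
and_count = count(lambda x, y: tri_and(x_gt_0(x, y), y_ne_a(x, y)))
not_count = count(lambda x, y: tri_not(x_lt_2(x, y)))
double_not_count = count(lambda x, y: tri_not(tri_not(x_lt_2(x, y))))

print(or_count, and_count, not_count, double_not_count)  # 4 1 2 1
```

The third row (`x` is NULL, `y` is `"c"`) is the interesting one: it passes the `OR` filter because `y != 'a'` is TRUE, but it is dropped by both `x < 2` and `NOT (x < 2)` because each evaluates to NULL.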

