Changes from 41 commits (61 commits total)
54583b0
Get bazel tests working on Spark 3.2.
tswitzer-netflix May 14, 2025
5cc1c5e
Whoops.
tswitzer-netflix May 14, 2025
9032fb9
bazel setup continued
Aug 9, 2025
89d2e5c
Merge remote-tracking branch 'origin' into tswitzer/bazel-fixes
Aug 9, 2025
6ff1378
Refactor .gitignore and clean up TwoStackLiteAggregatorTest and spark…
Aug 9, 2025
5ec81bc
fixed test artifacts
Aug 10, 2025
bf74405
added circle ci support for bazel
Aug 10, 2025
f68f185
circleci doesn't have bazel preinstalled
Aug 10, 2025
81849c0
scalafmt
Aug 10, 2025
2590233
install directly on the machine so someone doesn't have to rebuild th…
Aug 10, 2025
745063c
circleci config
Aug 10, 2025
fb76daa
running as root :(
Aug 10, 2025
f7afb00
moved bazel config into env variable before repo fetching
Aug 10, 2025
49d1dbf
update Bazel configuration to use integer for RULES_PYTHON_ENABLE_PYSTAR
Aug 10, 2025
fc6ed3c
refactor CI configuration and update dependencies for consistency
Aug 10, 2025
4fc7804
simplified bazel setup with circleci
Aug 10, 2025
a7afe6a
cooment about setup
Aug 10, 2025
0d9d664
lets see if nonroot works
Aug 10, 2025
b511467
run Bazel tests as nonroot user for improved security
Aug 10, 2025
4e62e6c
permission issue on the setup
Aug 10, 2025
2abcb55
crashed circleci
Aug 10, 2025
30ef632
more logging, trying to get rid of segfault
Aug 11, 2025
ecb219f
flakey test didn't help
Aug 11, 2025
bac7804
test util changes for backwards compatibility
Aug 13, 2025
76ad6ff
refine file filtering logic in getFilePaths to exclude non-config files
Aug 15, 2025
caab562
Revert "refine file filtering logic in getFilePaths to exclude non-co…
Aug 17, 2025
6c63016
Add CircleCI configuration for optimized test performance and resourc…
Aug 17, 2025
7e9ea1a
fewer cores
Aug 17, 2025
3f5735e
removed job changes that seem to break tests
Aug 17, 2025
ef5d23a
old artifact
Aug 17, 2025
78714bd
dropping ram for segfault
Aug 17, 2025
4d6aa61
new base image
Aug 17, 2025
f3b8cb8
mkdir
Aug 17, 2025
92ea500
seperating regular and spark tests
Aug 17, 2025
de6a26a
reverting to stable
Aug 17, 2025
4e78bf8
use base executor with thrift and such preinstalled
Aug 17, 2025
3390fc4
Revert "use base executor with thrift and such preinstalled"
Aug 17, 2025
dc01256
bazel wasn't installed
Aug 17, 2025
16b70d8
mkdir
Aug 17, 2025
9c07817
really flakey with bazel
Aug 17, 2025
d557101
Add Ignore annotation to FetcherTest
Aug 18, 2025
12f69e5
Update .devcontainer/install_thrift.sh
abbywh Aug 20, 2025
114d7c2
bazel cache support
Sep 14, 2025
a8cd09b
removing bad emojis
Sep 14, 2025
6f8e243
Merge branch 'bazel-nflx-fixes' of https://github.com/abbywh/chronon …
Sep 14, 2025
4a3b338
lets try removing context (idk circleci)
Sep 14, 2025
5456ca1
lets expand elsewhere
Sep 14, 2025
3a94142
Update .circleci/config.yml
pengyu-hou Sep 15, 2025
b4b6b3d
Update .circleci/config.yml
pengyu-hou Sep 15, 2025
638be4f
explicit export BuildBuddyAPIKey
pengyu-hou Sep 15, 2025
893da4b
fix
pengyu-hou Sep 15, 2025
716fa5e
debug
pengyu-hou Sep 15, 2025
ff03ab7
fix
pengyu-hou Sep 15, 2025
1f0f3bf
circleci suggestion
Sep 21, 2025
a1e5421
name must equal value
Sep 21, 2025
a19d34c
another debug try
Sep 21, 2025
24c182e
trying something new to expand outside the string
Sep 21, 2025
035ee45
maybe we expand in bazelrc?
Sep 21, 2025
a0650bd
yaml issue?
Sep 21, 2025
afa0df6
hopefully it is just an env var?
Sep 21, 2025
5b2e5bf
action env?
Sep 21, 2025
5 changes: 4 additions & 1 deletion .bazelrc
@@ -1,3 +1,6 @@
try-import %workspace%/.circleci.bazelrc
try-import %workspace%/.maven.bazelrc

## Disable remote cache completely when --config=local is passed
build:local --remote_cache=

@@ -19,7 +22,7 @@ common:spark_3.1 --define spark_version=3.1
common:spark_3.2 --define spark_version=3.2
common:spark_3.5 --define spark_version=3.5
# Default Spark version
common --define spark_version=3.1
common --define spark_version=3.2

# Flink versions
common:flink_1.16 --define flink_version=1.16
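The new `build:local --remote_cache=` line clears any configured remote cache when `--config=local` is passed. A hedged sketch of the intended usage — the target patterns below are illustrative, not taken from this PR:

```
# Build and test entirely locally, ignoring the remote cache:
bazel build --config=local //...
bazel test --config=local //...
```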
1 change: 1 addition & 0 deletions .bazelversion
@@ -0,0 +1 @@
6.4.0
5 changes: 4 additions & 1 deletion .circleci/Dockerfile
@@ -34,9 +34,12 @@ RUN apt-get update && apt-get -y -q install \
openjdk-8-jdk \
pkg-config \
sbt \
bazelisk \
&& apt-get clean

# Install bazelisk directly from GitHub releases
RUN curl -Lo /usr/local/bin/bazel https://github.com/bazelbuild/bazelisk/releases/latest/download/bazelisk-linux-amd64 \
&& chmod +x /usr/local/bin/bazel

# Install thrift
RUN curl -sSL "http://archive.apache.org/dist/thrift/$THRIFT_VERSION/thrift-$THRIFT_VERSION.tar.gz" -o thrift.tar.gz \
&& mkdir -p /usr/src/thrift \
39 changes: 38 additions & 1 deletion .circleci/config.yml
@@ -195,6 +195,42 @@ jobs:
destination: spark_warehouse.tar.gz
when: on_fail

"Bazel Tests":
executor: docker_baseimg_executor_xxlarge
steps:
- checkout
# TODO build/publish image
# Airbnb would have to set up/own this
- run:
name: Run Bazel Setup
command: |
# Add Bazel GPG key and repository
curl -fsSL https://bazel.build/bazel-release.pub.gpg | gpg --dearmor | sudo tee /usr/share/keyrings/bazel-archive-keyring.gpg > /dev/null
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/bazel-archive-keyring.gpg] https://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
# Update package list and install specific Bazel version
sudo apt update
sudo apt install -y bazel-6.4.0
# Set up bazel-6.4.0 as the default bazel command
sudo update-alternatives --install /usr/bin/bazel bazel /usr/bin/bazel-6.4.0 100
- run:
# TODO remote cache
# Airbnb would have to set up/own this
name: Run Bazel Tests
shell: /bin/bash -leuxo pipefail
command: |
sudo mkdir -p /root/.cache/bazel
useradd --create-home --shell /bin/bash nonroot
chown -R nonroot:nonroot /chronon
chown -R nonroot:nonroot /root/.cache/bazel # Give access to the bazel cache
export JAVA_OPTS="-XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=4G -Xmx4G -Xms2G"
sudo -u nonroot -H -s /bin/bash -c "bazel version && bazel build //... && bazel test //... --test_output=streamed --test_summary=detailed --local_ram_resources=HOST_RAM*0.7 --local_cpu_resources=HOST_CPUS*0.7"
- store_test_results:
path: bazel-testlogs
- store_artifacts:
path: bazel-testlogs
destination: bazel-test-logs
when: on_fail

workflows:
build_test_deploy:
jobs:
@@ -222,4 +258,5 @@ workflows:
- "Pull Docker Image"
- "Scala 13 -- Iceberg Table Utils Tests":
requires:
- "Pull Docker Image"
- "Bazel Tests"
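The `--local_ram_resources=HOST_RAM*0.7 --local_cpu_resources=HOST_CPUS*0.7` flags in the job above cap Bazel's local scheduler at roughly 70% of the machine. A hedged sketch of what the RAM cap works out to, assuming a Linux host with `/proc/meminfo` — this calculation is illustrative, not part of the PR:

```shell
# Illustrative only: approximate what Bazel's HOST_RAM*0.7 resolves to.
total_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)  # total RAM in kB
ram_mb=$(( total_kb / 1024 ))                              # convert to MB
cap_mb=$(awk -v m="$ram_mb" 'BEGIN { printf "%d", m * 0.7 }')
echo "HOST_RAM*0.7 ~= ${cap_mb} MB"
```

Leaving ~30% of RAM and CPU headroom is one way to avoid the OOM-driven segfaults several of the commits above were chasing.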
25 changes: 25 additions & 0 deletions .devcontainer/devcontainer.json
@@ -0,0 +1,25 @@
{
"name": "Chronon Development",
"image": "mcr.microsoft.com/devcontainers/base:ubuntu-22.04",
"features": {
"ghcr.io/devcontainers/features/java:1": {
"version": "11"
},
"ghcr.io/devcontainers/features/python:1": {
"version": "3.10"
}
},
"postCreateCommand": ".devcontainer/setup.sh",
"customizations": {
"vscode": {
"extensions": [
"scalameta.metals",
"ms-python.python",
"ms-python.pylint",
"bazelbuild.vscode-bazel"
]
}
},
"forwardPorts": [8080, 3000],
"remoteUser": "vscode"
}
48 changes: 48 additions & 0 deletions .devcontainer/install_thrift.sh
@@ -0,0 +1,48 @@
#!/bin/bash

set -efx

# Clean up any existing thrift_build directory
rm -rf thrift_build
mkdir thrift_build
pushd thrift_build

# Download archive and verify it matches our expected checksum.
THRIFT_HTTP_ARCHIVE=https://archive.apache.org/dist/thrift/0.13.0/thrift-0.13.0.tar.gz
THRIFT_ARCHIVE=thrift.tar.gz
THRIFT_EXPECTED_CHECKSUM_SHA256=7ad348b88033af46ce49148097afe354d513c1fca7c607b59c33ebb6064b5179
curl "$THRIFT_HTTP_ARCHIVE" -o "$THRIFT_ARCHIVE"
THRIFT_ACTUAL_CHECKSUM_SHA256=$(sha256sum "$THRIFT_ARCHIVE" | awk '{ print $1 }')
if [ "$THRIFT_EXPECTED_CHECKSUM_SHA256" != "$THRIFT_ACTUAL_CHECKSUM_SHA256" ]; then
echo "Checksum does not match expected value" >&2
echo " - location: $THRIFT_HTTP_ARCHIVE" >&2
echo " - expected: $THRIFT_EXPECTED_CHECKSUM_SHA256" >&2
echo " - obtained: $THRIFT_ACTUAL_CHECKSUM_SHA256" >&2
exit 1
fi

echo "Building Thrift from source"
# Build thrift from source.
mkdir src
tar zxvf thrift.tar.gz -C src --strip-components=1
pushd src

# Install build dependencies
sudo apt update
sudo apt install -y build-essential libssl-dev pkg-config flex bison

# Configure and build
./configure --without-python --without-cpp
make

# Install
sudo make install

popd

# Verify installation
thrift -version

popd

echo "Thrift 0.13.0 installation completed successfully"
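The download-verify-fail pattern in `install_thrift.sh` generalizes beyond Thrift. A hedged sketch of it as a reusable helper — `verify_sha256` and the demo file below are hypothetical, not part of the PR:

```shell
# Hypothetical helper mirroring install_thrift.sh's checksum check.
verify_sha256() {
  local file="$1" expected="$2"
  local actual
  actual=$(sha256sum "$file" | awk '{ print $1 }')
  if [ "$expected" != "$actual" ]; then
    echo "Checksum does not match expected value" >&2
    echo " - expected: $expected" >&2
    echo " - obtained: $actual" >&2
    return 1
  fi
}

# Demo: a file always matches its own freshly computed digest.
printf 'demo payload' > /tmp/checksum_demo.txt
expected=$(sha256sum /tmp/checksum_demo.txt | awk '{ print $1 }')
verify_sha256 /tmp/checksum_demo.txt "$expected" && echo "checksum OK"
```

Failing hard on a mismatch (rather than warning) is the right call here, since the archive host is fetched over plain HTTP in the Dockerfile.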
27 changes: 27 additions & 0 deletions .devcontainer/setup.sh
@@ -0,0 +1,27 @@
#!/bin/bash

# Install Bazel 6.4.0 (required by this project)
echo "Installing Bazel 6.4.0..."

# Add Bazel GPG key and repository
curl -fsSL https://bazel.build/bazel-release.pub.gpg | gpg --dearmor | sudo tee /usr/share/keyrings/bazel-archive-keyring.gpg > /dev/null
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/bazel-archive-keyring.gpg] https://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list

# Update package list and install specific Bazel version
sudo apt update
sudo apt install -y bazel-6.4.0

# Set up bazel-6.4.0 as the default bazel command
sudo update-alternatives --install /usr/bin/bazel bazel /usr/bin/bazel-6.4.0 100

# Install Thrift compiler (0.13.0 to match project requirements)
echo "Installing Thrift 0.13.0..."
./.devcontainer/install_thrift.sh

# Verify installations
echo "Verifying installations..."
bazel version
python --version
java -version

echo "Setup complete!"
5 changes: 4 additions & 1 deletion .gitignore
@@ -31,7 +31,10 @@ cs
*.venv
# Documentation builds
docs/build/

thrift_build/*
project/*
.metals/*
.claude/*
# Python distribution and packaging
api/py/dist/
api/py/eggs/
17 changes: 1 addition & 16 deletions .ijwb/.bazelproject
@@ -1,16 +1 @@
directories:
# Add the directories you want added as source here
# By default, we've added your entire workspace ('.')
.

# Automatically includes all relevant targets under the 'directories' above
derive_targets_from_directories: true

targets:
# If source code isn't resolving, add additional targets that compile it here

additional_languages:
# Uncomment any additional languages you want supported
python
scala
java
import tools/ide_support/intellij/default_view.bazelproject
4 changes: 4 additions & 0 deletions .jvmopts
@@ -0,0 +1,4 @@
-Xms2G
-Xmx4G
-XX:+CMSClassUnloadingEnabled
-XX:MaxPermSize=4G
6 changes: 6 additions & 0 deletions WORKSPACE
@@ -123,6 +123,12 @@ load("@io_bazel_rules_scala//scala:toolchains.bzl", "scala_register_toolchains")

scala_register_toolchains()

load("@io_bazel_rules_scala//testing:junit.bzl", "junit_repositories", "junit_toolchain")

junit_repositories()

junit_toolchain()

load("@io_bazel_rules_scala//testing:scalatest.bzl", "scalatest_repositories", "scalatest_toolchain")

scalatest_repositories()
8 changes: 4 additions & 4 deletions aggregator/BUILD.bazel
@@ -57,13 +57,13 @@ scala_library(
]),
)

scala_test_suite(
scala_junit_test(
name = "test",
srcs = glob(["src/test/scala/ai/chronon/aggregator/test/*.scala"]),
suffixes = ["Test"],
visibility = ["//visibility:public"],
deps = [
":aggregator",
":test-lib",
"//api:api-lib",
"//api:api-models",
maven_artifact("junit:junit"),
@@ -100,9 +100,9 @@

java_export(
name = "aggregator-export",
maven_coordinates = "ai.chronon:aggregator_(scala_version):$(version)",
maven_coordinates = "ai.chronon:aggregator_$(scala_version):$(version)",
pom_template = ":generate_pom",
runtime_deps = [
":aggregator",
],
)
@@ -19,8 +19,11 @@ package ai.chronon.aggregator.test
import ai.chronon.aggregator.base.ApproxDistinctCount
import junit.framework.TestCase
import org.junit.Assert._
import org.junit.Test

class ApproxDistinctTest extends TestCase {

@Test
def testErrorBound(uniques: Int, errorBound: Int, lgK: Int): Unit = {
val uniqueElems = 1 to uniques
val duplicates = uniqueElems ++ uniqueElems ++ uniqueElems
@@ -32,6 +35,7 @@ class ApproxDistinctTest extends TestCase {
assertTrue(Math.abs(estimated - uniques) < errorBound)
}

@Test
def testMergingErrorBound(uniques: Int, errorBound: Int, lgK: Int, merges: Int): Unit = {
val chunkSize = uniques / merges
assert(chunkSize > 0)
@@ -50,12 +54,14 @@
assertTrue(Math.abs(estimated - uniques) < errorBound)
}

@Test
def testErrorBounds(): Unit = {
testErrorBound(uniques = 100, errorBound = 1, lgK = 10)
testErrorBound(uniques = 1000, errorBound = 20, lgK = 10)
testErrorBound(uniques = 10000, errorBound = 300, lgK = 10)
}

@Test
def testMergingErrorBounds(): Unit = {
testMergingErrorBound(uniques = 100, errorBound = 1, lgK = 10, merges = 10)
testMergingErrorBound(uniques = 1000, errorBound = 20, lgK = 10, merges = 4)
@@ -3,11 +3,14 @@ package ai.chronon.aggregator.test
import ai.chronon.aggregator.base.{ApproxHistogram, ApproxHistogramIr}
import junit.framework.TestCase
import org.junit.Assert._
import org.junit.Test

import java.util
import scala.jdk.CollectionConverters._

class ApproxHistogramTest extends TestCase {

@Test
def testHistogram(): Unit = {
val approxHistogram = new ApproxHistogram[String](3)
val counts = (1L to 3).map(i => i.toString -> i).toMap
@@ -18,6 +21,7 @@ class ApproxHistogramTest extends TestCase {
assertEquals(toHashMap(counts), approxHistogram.finalize(ir))
}

@Test
def testSketch(): Unit = {
val approxHistogram = new ApproxHistogram[String](3)
val counts = (1L to 4).map(i => i.toString -> i).toMap
@@ -29,6 +33,7 @@ class ApproxHistogramTest extends TestCase {
assertEquals(toHashMap(expected), approxHistogram.finalize(ir))
}

@Test
def testMergeSketches(): Unit = {
val approxHistogram = new ApproxHistogram[String](3)
val counts1: Map[String, Long] = Map("5" -> 5L, "4" -> 4, "2" -> 2, "1" -> 1)
@@ -52,6 +57,7 @@
assertTrue(ir.histogram.isEmpty)
}

@Test
def testMergeHistograms(): Unit = {
val approxHistogram = new ApproxHistogram[String](3)
val counts1: Map[String, Long] = Map("4" -> 4L, "2" -> 2)
@@ -76,6 +82,7 @@
assertTrue(ir.sketch.isEmpty)
}

@Test
def testMergeHistogramsToSketch(): Unit = {
val approxHistogram = new ApproxHistogram[String](3)
val counts1: Map[String, Long] = Map("4" -> 4L, "3" -> 3)
@@ -101,6 +108,7 @@
assertTrue(ir.histogram.isEmpty)
}

@Test
def testMergeSketchAndHistogram(): Unit = {
val approxHistogram = new ApproxHistogram[String](3)
val counts1: Map[String, Long] = Map("5" -> 5L, "3" -> 3, "2" -> 2, "1" -> 1)
@@ -125,6 +133,7 @@
assert(ir.histogram.isEmpty)
}

@Test
def testNormalizeHistogram(): Unit = {
val approxHistogram = new ApproxHistogram[String](3)
val counts = (1L to 3).map(i => i.toString -> i).toMap
@@ -135,6 +144,7 @@
assertEquals(ir, normalized)
}

@Test
def testNormalizeSketch(): Unit = {
val approxHistogram = new ApproxHistogram[String](3)
val counts = (1L to 4).map(i => i.toString -> i).toMap