feat: added new `database.getSize()` api by lvca · Pull Request #3045 · ArcadeData/arcadedb

lvca · 2025-12-17T17:00:58Z

What does this PR do?

A brief description of the change being made with this pull request.

Motivation

What inspired you to submit this pull request?

Related issues

A list of issues either fixed, containing architectural discussions, otherwise relevant
for this Pull Request.

Additional Notes

Anything else we should know when reviewing?

Checklist

I have run the build using mvn clean package command
My unit tests cover both failure and success scenarios

gemini-code-assist · 2025-12-17T17:01:18Z

Summary of Changes

Hello @lvca, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new API to determine the total size of a database on disk. This feature provides a straightforward way to monitor the storage consumption of a database instance, which can be crucial for resource management and capacity planning. The implementation carefully aggregates file sizes within the database directory, and its functionality is thoroughly validated with new unit tests.

Highlights

New database.getSize() API: A new getSize() method has been added to the BasicDatabase and DatabaseInternal interfaces, allowing users to query the total disk space occupied by a database.
Implementation in LocalDatabase: The getSize() method is implemented in LocalDatabase to calculate the cumulative size of all regular files within the database's directory, providing an accurate measure of its footprint.
Comprehensive Unit Tests: A new test file, DatabaseGetSizeTest.java, has been introduced with extensive unit tests covering various scenarios for the getSize() API, including empty databases, databases with documents, vertices, edges, and indexes, and behavior after data deletion.
API Delegation in Server Components: The new getSize() API has been integrated into ServerDatabase and ReplicatedDatabase by delegating the call to their respective wrapped or proxied database instances, ensuring consistent functionality across different deployment contexts.
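
To illustrate the directory walk described in the highlights, here is a minimal standalone sketch that sums the sizes of all regular files under a directory. It is not the ArcadeDB implementation: `DirectorySize` and its `of` method are hypothetical names, and the sketch rethrows I/O failures as `UncheckedIOException` rather than the project's own exception type.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper (not ArcadeDB code): total size in bytes of all regular files under a directory.
final class DirectorySize {
  private DirectorySize() {
  }

  static long of(final Path dir) {
    if (!Files.exists(dir))
      return 0L; // a missing directory is reported as zero bytes

    // Files.walk returns a lazy stream that must be closed, hence try-with-resources.
    try (Stream<Path> stream = Files.walk(dir)) {
      return stream.filter(Files::isRegularFile).mapToLong(p -> {
        try {
          return Files.size(p);
        } catch (IOException e) {
          // Lambdas passed to mapToLong cannot throw checked exceptions.
          throw new UncheckedIOException(e);
        }
      }).sum();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(final String[] args) throws IOException {
    final Path dir = Files.createTempDirectory("size-demo");
    Files.write(dir.resolve("a.bin"), new byte[100]);
    Files.createDirectories(dir.resolve("sub"));
    Files.write(dir.resolve("sub").resolve("b.bin"), new byte[28]);
    System.out.println(DirectorySize.of(dir)); // prints 128
  }
}
```

Directories themselves are excluded by the `Files::isRegularFile` filter, so only file contents count toward the total.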


¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

mergify · 2025-12-17T17:02:35Z

🧪 CI Insights

Here's what we observed from your CI run for 272e5ab.

🟢 All jobs passed!

But CI Insights is watching 👀

gemini-code-assist

Code Review

This pull request introduces a new database.getSize() API to calculate the total size of the database files on disk. The implementation in LocalDatabase correctly walks the database directory and sums the sizes of all files. The new API is also exposed through the ServerDatabase and ReplicatedDatabase wrappers, and a comprehensive set of unit tests has been added to validate the new functionality. The changes are well-implemented, but I have one suggestion to improve the consistency of exception handling in the getSize() method.

gemini-code-assist · 2025-12-17T17:05:20Z

engine/src/main/java/com/arcadedb/database/LocalDatabase.java

+      try {
+        final Path dir = Path.of(databasePath);
+        if (!Files.exists(dir))
+          return 0L;
+        try (Stream<Path> stream = Files.walk(dir)) {
+          return stream.filter(Files::isRegularFile).mapToLong(p -> {
+            try {
+              return Files.size(p);
+            } catch (IOException e) {
+              throw new UncheckedIOException(e);
+            }
+          }).sum();
+        }
+      } catch (UncheckedIOException e) {
+        throw new DatabaseOperationException("Error calculating database size", e.getCause());
+      }


The exception handling in this block could be more consistent. An IOException thrown by Files.walk(dir) is not caught here and would be handled by the outer executeInReadLock method, leading to a generic error message. In contrast, an IOException from Files.size(p) is wrapped in UncheckedIOException and caught, resulting in a more specific error message.

To ensure all I/O errors during size calculation are handled uniformly with a clear message, I suggest catching both IOException and UncheckedIOException within this block.

try {
  final Path dir = Path.of(databasePath);
  if (!Files.exists(dir))
    return 0L;
  try (Stream<Path> stream = Files.walk(dir)) {
    return stream.filter(Files::isRegularFile).mapToLong(p -> {
      try {
        return Files.size(p);
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }).sum();
  }
} catch (IOException e) {
  throw new DatabaseOperationException("Error calculating database size", e);
} catch (UncheckedIOException e) {
  throw new DatabaseOperationException("Error calculating database size", e.getCause());
}
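
The uniform handling suggested above can also be shown in isolation. The sketch below is only an illustration under stated assumptions: `SizeCalculationException` is a hypothetical stand-in for `DatabaseOperationException`, and `UniformIo` and `IoSupplier` are made-up names. Either kind of I/O failure ends up wrapped with the same message and the original `IOException` as its cause.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for DatabaseOperationException so the sketch compiles on its own.
final class SizeCalculationException extends RuntimeException {
  SizeCalculationException(final String message, final Throwable cause) {
    super(message, cause);
  }
}

final class UniformIo {
  // Hypothetical functional interface: a computation that may throw a checked IOException.
  @FunctionalInterface
  interface IoSupplier<T> {
    T get() throws IOException;
  }

  private UniformIo() {
  }

  // Runs the body and converts both checked and unchecked I/O failures
  // into one exception type carrying the same message.
  static <T> T call(final String message, final IoSupplier<T> body) {
    try {
      return body.get();
    } catch (IOException e) {
      // Thrown directly, e.g. by a call such as Files.walk(dir).
      throw new SizeCalculationException(message, e);
    } catch (UncheckedIOException e) {
      // Thrown from inside a stream lambda; unwrap to the original IOException.
      throw new SizeCalculationException(message, e.getCause());
    }
  }
}
```

A caller would wrap the whole size computation in `UniformIo.call("Error calculating database size", () -> ...)`, so the error message no longer depends on which call failed.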

feat: added new database.getSize() api

272e5ab

lvca added this to the 25.12.1 milestone Dec 17, 2025

lvca self-assigned this Dec 17, 2025

lvca added the enhancement New feature or request label Dec 17, 2025

lvca merged commit 4215ba0 into main Dec 17, 2025
5 of 8 checks passed

lvca deleted the database-size-api branch December 17, 2025 17:01

gemini-code-assist bot reviewed Dec 17, 2025


robfrank pushed a commit that referenced this pull request Feb 11, 2026

feat: added new database.getSize() api (#3045)

b6d75c7

(cherry picked from commit 4215ba0)
