@vamsikarnika vamsikarnika commented Aug 28, 2025

Description

Motivation and Context

Improves performance by reducing the number of calls to the metastore.

Release Notes

== NO RELEASE NOTE ==

tanjialiang and others added 30 commits August 26, 2025 12:19
…iterOperator (prestodb#25846)

Summary:
Pull Request resolved: prestodb#25846

Pass the Operator Context's Runtime Stats down into the `TableWriteOperator`'s Page Sink.

Specifically this diff makes the following changes:

a) `TableWriteOperator` passes its `RuntimeStats` into the Page Sink it creates via `PageSinkManager.createPageSink`
b) When the `PageSinkManager.createPageSink` is provided `RuntimeStats`, these `RuntimeStats` are passed into the `Session.toConnectorSession` call, which creates a `FullConnectorSession` instance
c) When `Session.toConnectorSession` is provided `RuntimeStats`, it passes this into the `FullConnectorSession` instance it constructs
d) Add a `Builder` to `FullConnectorSession`, which allows providing a `RuntimeStats` instance to `FullConnectorSession` at construction-time. `FullConnectorSession.getRuntimeStats()` now returns the `RuntimeStats` which was set at construction-time. If no `RuntimeStats` were provided at construction-time, then `FullConnectorSession.getRuntimeStats()` defaults to return the `Session` object's `RuntimeStats`—this preserves backwards compatibility.
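
The fallback described in (d) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the diff: `SketchRuntimeStats`, `SketchSession`, and `SketchConnectorSession` are hypothetical, simplified stand-ins for `RuntimeStats`, `Session`, and `FullConnectorSession`, and the `add`/`get` methods are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for RuntimeStats: a map of named counters.
final class SketchRuntimeStats {
    private final Map<String, Long> metrics = new HashMap<>();

    void add(String name, long value) {
        metrics.merge(name, value, Long::sum);
    }

    long get(String name) {
        return metrics.getOrDefault(name, 0L);
    }
}

// Hypothetical stand-in for Session, which owns its own stats object.
final class SketchSession {
    private final SketchRuntimeStats runtimeStats = new SketchRuntimeStats();

    SketchRuntimeStats getRuntimeStats() {
        return runtimeStats;
    }
}

// Hypothetical stand-in for FullConnectorSession with a builder.
final class SketchConnectorSession {
    private final SketchSession session;
    private final Optional<SketchRuntimeStats> runtimeStats;

    private SketchConnectorSession(Builder builder) {
        this.session = builder.session;
        this.runtimeStats = Optional.ofNullable(builder.runtimeStats);
    }

    // Returns the stats supplied at construction time, else the session's stats.
    SketchRuntimeStats getRuntimeStats() {
        return runtimeStats.orElseGet(session::getRuntimeStats);
    }

    static Builder builder(SketchSession session) {
        return new Builder(session);
    }

    static final class Builder {
        private final SketchSession session;
        private SketchRuntimeStats runtimeStats; // null means "not provided"

        private Builder(SketchSession session) {
            this.session = session;
        }

        Builder setRuntimeStats(SketchRuntimeStats runtimeStats) {
            this.runtimeStats = runtimeStats;
            return this;
        }

        SketchConnectorSession build() {
            return new SketchConnectorSession(this);
        }
    }
}
```

In this sketch, a caller that passes operator-level stats through `setRuntimeStats` sees its metrics land in that object, while a caller that omits the call keeps the previous behavior of writing to the session's stats.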

All changes preserve forward-compatibility.

## Context

Without this change, the `FullConnectorSession`'s `RuntimeStats` points to the `Session`'s `RuntimeStats`. All metrics added to the `Session`'s `RuntimeStats` from within an operator on the worker side are discarded. That is, all runtime metrics added to the Connector Session's `RuntimeStats` while executing `TableWriterOperator` were being discarded.

Specifically, in Meta, the stats from our internal filesystem implementation were missing.

Passing the Operator Context's `RuntimeStats` instance down into Connector Session is the simplest way to fix this.

Additionally, since the previous `RuntimeStats` for `TableWriteOperator`'s `FullConnectorSession` were always discarded, we can be confident that replacing them with the `OperatorContext` `RuntimeStats` will not break anyone else's code.

Differential Revision: D80675849
There is an existing HiveClientConfig property, hive.orc.use-column-names, to access ORC files by column name, but no corresponding session property.
This commit moves the existing HiveClientConfig property to HiveCommonClientConfig and introduces a session property in HiveCommonSessionProperties.
It also implements changes accordingly in DwrfAggregatedPageSourceFactory, OrcAggregatedPageSourceFactory, OrcSelectivePageSourceFactory and OrcBatchPageSourceFactory.
Constructors in those classes do not take boolean useOrcColumnNames anymore. Tests where those are used have also been changed.
Hive connector documentation has been changed.
An integration test has been added to TestHiveDistributedQueries.java.
Helper function created in HiveTestUtils to replace function in TestHiveIntegrationSmokeTest.
Remove superfluous constructors that have hiveClientConfig in parameter list from DwrfAggregatedPageSourceFactory.java and OrcAggregatedPageSourceFactory.java and change explicit calls in HiveTestUtils.java.

Closes-Issue: prestodb#24134

Add additional test with different column names to TestHiveDistributedQueries.java
The test framework client now receives statement execution results with the
`clearTransactionId` and `startTransactionId` flags embedded.
Velox provides a function to install the Arrow library. We don’t need
to copy and paste the same code here and can reuse it.
The EXTRA_ARROW_OPTIONS variable allows passing custom Arrow library
build options, which is used here to request that Arrow Flight
be built.
Reuse the existing Velox VarcharType to implement the type Char(n) in protocol.
Add a SystemConfig "char-n-type-enabled" to guard this feature.
Note this will make Char(n) type carry the behavior of VarcharType type. It is a different
behavior from Char(n) type in Presto today, where it has a fixed number
of characters. We suppose the user could call rpad() if today's behavior is needed.
…#25902)

## Description

This PR updates the GitHub Action to publish Maven artifacts with the
central publishing method. Since the Maven repo doesn't allow an
executable jar (with a shell script) to be published, we create a GitHub
release and publish the jars there.

Need fix in release branch:
prestodb#25900

Sample release for executable jars
https://github.com/unix280/presto/releases

## Motivation and Context


## Impact
Release 0.294

## Test Plan
Tested the GitHub release in my repo:
https://github.com/unix280/presto/actions/runs/17272968441
Tested the maven publishing in local env

## Contributor checklist

- [ ] Please make sure your submission complies with our [contributing
guide](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md),
in particular [code
style](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#code-style)
and [commit
standards](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#commit-standards).
- [ ] PR description addresses the issue accurately and concisely. If
the change is non-trivial, a GitHub Issue is referenced.
- [ ] Documented new properties (with its default value), SQL syntax,
functions, or other functionality.
- [ ] If release notes are required, they follow the [release notes
guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines).
- [ ] Adequate tests were added if applicable.
- [ ] CI passed.

## Release Notes

```
== NO RELEASE NOTE ==
```
@pdabre12 has been voted as module committer for the Presto sidecar module.

Also, I fixed a bug where project committers could not approve some C++ code. Per our contributing guide, project committers must be capable of approving all code (although C++ module committers are preferred for approving and merging C++ code).
…restodb#25687)

Summary:
Added the endpoint for the Java worker, similar to the one on the C++ worker.
We won't be using the worker-load, since going forward we will be focusing
on the C++ worker only.

Differential Revision: D79471792
WriteMapping support for the decimal type is already present for writing values but is missing from the query builder.
This PR adds the write function to the query builder's buildSql function.
…bc write mappings

These types are missing in the new write mapping interface. If implemented, this will add them back.
Added the `iceberg.engine.hive.lock-enabled` configuration property to enable or disable table locks
when Iceberg accesses a Hive table. This can be overridden with the table property
`engine.hive.lock-enabled`.
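
The override order described above can be sketched as follows. This is a minimal illustration, not the connector's code: the class `HiveLockSettingSketch` and its method are hypothetical, and the assumption is that a table property, when present, takes precedence over the catalog-level value of `iceberg.engine.hive.lock-enabled`.

```java
import java.util.Map;

final class HiveLockSettingSketch {
    // Table property name taken from the commit message.
    static final String TABLE_PROPERTY = "engine.hive.lock-enabled";

    private HiveLockSettingSketch() {}

    // Hypothetical resolution order: the table property wins when it is set,
    // otherwise the catalog-level configuration value is used.
    static boolean isLockEnabled(boolean catalogValue, Map<String, String> tableProperties) {
        String override = tableProperties.get(TABLE_PROPERTY);
        if (override == null) {
            return catalogValue;
        }
        return Boolean.parseBoolean(override);
    }
}
```

With the catalog value set to `true`, a table that carries `engine.hive.lock-enabled=false` would resolve to `false` in this sketch, and a table without the property would resolve to `true`.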
The map function will not sort a json object by its keys, despite the
json_parse function sorting the same input.
If implemented, this will sort json objects.

Resolves prestodb#24207
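
The idea of emitting JSON object keys in sorted order can be sketched as follows. This is an illustration only and not the Presto implementation: `SortedJsonObjectSketch` is a hypothetical helper, it handles only a flat map of string keys to integers, and it assumes the keys need no JSON escaping.

```java
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

final class SortedJsonObjectSketch {
    private SortedJsonObjectSketch() {}

    // Renders a flat map as a JSON object with its keys in sorted order.
    // Assumption: keys contain no characters that require JSON escaping.
    static String toSortedJson(Map<String, Integer> entries) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        // Copying into a TreeMap orders the entries by key.
        for (Map.Entry<String, Integer> entry : new TreeMap<>(entries).entrySet()) {
            json.add("\"" + entry.getKey() + "\":" + entry.getValue());
        }
        return json.toString();
    }
}
```

Whatever order the entries were inserted in, the rendered object lists them by key, so two maps with the same contents produce the same text.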
Summary:
- Add abstract class BuiltInSpecialFunctionNamespaceManager
  - Add BuiltInNativeFunctionNamespaceManager
- Refactor BuiltInPluginFunctionNamespaceManager to extend the abstract
class
- Deduplicate sidecar function registry logic by moving some of it to
presto-main-base module from presto-native-sidecar-plugin module
- Add function name conflict logic to FunctionAndTypeManager that
overrides SQL built in functions but does not override Java built in
functions.
- Add retry logic for fetching the function registry from the worker, with
a retry interval of 1 minute
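
The fixed-interval retry in the last item can be sketched as follows. This is a minimal sketch, not the code in this change: `FixedIntervalRetrySketch` and its parameters are hypothetical, and the attempt limit is an assumption (the commit message states only the 1-minute interval).

```java
import java.util.Optional;
import java.util.function.Supplier;

final class FixedIntervalRetrySketch {
    private FixedIntervalRetrySketch() {}

    // Calls fetch until it yields a value or maxAttempts is reached,
    // sleeping intervalMillis between failed attempts.
    static <T> Optional<T> fetchWithRetry(Supplier<Optional<T>> fetch, int maxAttempts, long intervalMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<T> result = fetch.get();
            if (result.isPresent()) {
                return result;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(intervalMillis);
                }
                catch (InterruptedException e) {
                    // Preserve the interrupt flag and stop retrying.
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }
}
```

A caller would pass `60_000L` as the interval to match the 1-minute retry described above; a much smaller value is convenient in tests.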

Note: `show functions` will show both built-in functions in the same
namespace. This is similar to the existing behavior of the regular native sidecar
namespace when enabled with the default presto.default prefix. The `show
functions` logic is not addressed in this change. Unit tests for
`show functions` can be added as well.

Tests:
Added unit tests that enable the flag for this feature and verify that it
overrides the SQL function implementation properly.

## Release Notes
Please follow [release notes
guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines)
and fill in the release notes below.

```
== NO RELEASE NOTE ==
```
Summary:
Fix to support GCC14 build

- Replace `{}` with an explicit empty container to avoid the following error within optionals:

          error: converting to 'std::in_place_t' from list would use explicit constructor

  `{}` leads to copy-initialization, which is not allowed because the `in_place_t` constructor is marked explicit.

- Include `chrono` in `Duration.h`, as GCC 14 requires it

- Correct the include directory path for proxygen

- Ignore errors associated with `template-id-cdtor`, as GCC 14 fails the build on constructors that use a template-id

Rollback Plan:


```
== NO RELEASE NOTE ==
```


Differential Revision: D80784416

Pulled By: pratikpugalia
Presto-main was split into presto-main and presto-main-base. Update
paths in codeowners file to reflect the change.
…stodb#25750)

Summary:
I added a threshold for logging memory pool allocations in
facebookincubator/velox#14437.
In this change I'm adding the corresponding session property to configure
the threshold.

Differential Revision: D80066283
Co-authored-by: Christian Zentgraf <[email protected]>
## Description

This PR is the fix from branch release-0.294 (prestodb#25900), to fix
Maven release issues.

## Motivation and Context
Merge the fix from release branch into master branch

## Impact
Newer releases

## Test Plan
Tested with release 0.294

## Release Notes

```
== NO RELEASE NOTE ==
```
…5357

Fix prestodb#25357
Added type mapping table for Delta Lake to PrestoDB

Co-Authored-By: Steve Burnett <[email protected]>
Co-Authored-By: Jalpreet Singh Nanda <[email protected]>
Summary:

Adds output row stats for sapphire-velox related sink operators
Properly close write file on broadcast write

Reviewed By: singcha

Differential Revision: D81271224
shrinidhijoshi and others added 23 commits September 12, 2025 12:04
This commit introduces mutual TLS authentication for the Arrow
Flight connector, including necessary configuration options for
both the client and server.

It also includes fixes to the CI pipeline and C++ tests to
ensure the new mTLS functionality is properly validated.

co-authored-by: Ajas Mangal <[email protected]>
co-authored-by: Elbin Pallimalil <[email protected]>
co-authored-by: Thanzeel Hassan <[email protected]>
…source node (prestodb#26031)

Summary:

Sapphire-Velox might send multiple task sources with the same source node. The task manager doesn't expect this and directly sends the splits of each task source to the Velox task. Since Sapphire-Velox sends all splits at once for each Velox task, all such task sources have the no-more-splits flag set. This hits the check failure in the recently added no-more-splits check in the Velox task's split-add API. This PR fixes the issue by merging the splits from multiple task sources if they share the same source node id.
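
The merge step can be sketched as follows. The actual fix is in the C++ task manager; this Java sketch only illustrates the grouping logic. `TaskSourceSketch` and `TaskSourceMergerSketch` are hypothetical types, splits are represented as plain strings, and treating the merged source as finished when either input was finished is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified task source: a source node id, its splits,
// and a no-more-splits flag.
final class TaskSourceSketch {
    final String planNodeId;
    final List<String> splits;
    final boolean noMoreSplits;

    TaskSourceSketch(String planNodeId, List<String> splits, boolean noMoreSplits) {
        this.planNodeId = planNodeId;
        this.splits = new ArrayList<>(splits);
        this.noMoreSplits = noMoreSplits;
    }
}

final class TaskSourceMergerSketch {
    private TaskSourceMergerSketch() {}

    // Merges sources that share a planNodeId, keeping the first-seen order
    // of node ids and the arrival order of splits.
    static List<TaskSourceSketch> merge(List<TaskSourceSketch> sources) {
        Map<String, TaskSourceSketch> merged = new LinkedHashMap<>();
        for (TaskSourceSketch source : sources) {
            TaskSourceSketch existing = merged.get(source.planNodeId);
            if (existing == null) {
                merged.put(source.planNodeId, source);
                continue;
            }
            List<String> splits = new ArrayList<>(existing.splits);
            splits.addAll(source.splits);
            // Assumption: the merged source is finished if either input was.
            merged.put(source.planNodeId, new TaskSourceSketch(
                    source.planNodeId, splits, existing.noMoreSplits || source.noMoreSplits));
        }
        return new ArrayList<>(merged.values());
    }
}
```

After merging, each source node id appears once, so the splits for a node are added in a single call before its no-more-splits flag is applied.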

Reviewed By: zacw7, tanjialiang

Differential Revision: D82367224
…stoTask

Previously, prestoTask->createFinishTimeMs was set after the lock scope,
potentially not reflecting the actual task creation finish time. Now, the
assignment is moved inside the lock, right after the task is created and
assigned, to more accurately capture when the task creation completes.
## Description

1. Enable more features for the Prestissimo image by default by adding the
flags below when building the image:
```
-DPRESTO_ENABLE_REMOTE_FUNCTIONS=ON
-DPRESTO_ENABLE_JWT=ON
-DPRESTO_STATS_REPORTER_TYPE=PROMETHEUS
-DPRESTO_MEMORY_CHECKER_TYPE=LINUX_MEMORY_CHECKER
-DPRESTO_ENABLE_SPATIAL=ON 
```
2. Use cache mount on ccache directory to accelerate local build
3. Added ARM_BUILD_TARGET for arm build
4. Fixed error in centos dependency image when building arrow
 
## Motivation and Context
By default the image is built with 
```
-DPRESTO_ENABLE_TESTING=OFF
-DPRESTO_ENABLE_PARQUET=ON
-DPRESTO_ENABLE_S3=ON
```

Add more features so that users can try them without rebuilding the image.

## Impact
Release

## Test Plan
Build and test

## Release Notes
Please follow [release notes
guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines)
and fill in the release notes below.

```
== NO RELEASE NOTE ==
```
…traction

This commit introduces two new overloaded functions:
1. array_sort(): accepts an array and a lambda expression that extracts sort keys, then sorts the array in ascending order by those keys.
2. array_sort_desc(): accepts an array and a lambda expression that extracts sort keys, then sorts the array in descending order by those keys.

For example:

  array_sort(ARRAY['hello', 'hi', 'world'], x -> length(x))
  -- Returns: ['hi', 'hello', 'world']

  array_sort(ARRAY[row('apples', 23), row('bananas', 12)], x -> x[2])
  -- Returns: [row('bananas', 12), row('apples', 23)]

  array_sort_desc(ARRAY['hello', 'hi', 'world'], x -> length(x))
  -- Returns: ['hello', 'world', 'hi']

  array_sort_desc(ARRAY[row('apples', 23), row('bananas', 12)], x -> x[2])
  -- Returns: [row('apples', 23), row('bananas', 12)]

The implementation leverages the same code generation approach to optimize key extraction based on element and key types.
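
The semantics of sorting by an extracted key can be sketched in plain Java as follows. This is not the code-generated implementation from the commit: `KeySortSketch` is a hypothetical helper that uses `Comparator.comparing`, and it relies on `List.sort` being stable, so elements with equal keys keep their input order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

final class KeySortSketch {
    private KeySortSketch() {}

    // Returns a copy of the input sorted ascending by the extracted key.
    static <T, K extends Comparable<K>> List<T> sortByKey(List<T> input, Function<T, K> key) {
        List<T> copy = new ArrayList<>(input);
        Comparator<T> byKey = Comparator.comparing(key);
        copy.sort(byKey);
        return copy;
    }

    // Returns a copy of the input sorted descending by the extracted key.
    static <T, K extends Comparable<K>> List<T> sortByKeyDesc(List<T> input, Function<T, K> key) {
        List<T> copy = new ArrayList<>(input);
        Comparator<T> byKey = Comparator.comparing(key);
        copy.sort(byKey.reversed());
        return copy;
    }
}
```

Calling `sortByKey` on the list `hello, hi, world` with the key `(String s) -> s.length()` yields `hi, hello, world`, and `sortByKeyDesc` yields `hello, world, hi`, mirroring the first and third SQL examples above.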
Upgrade org.jdbi:jdbi3-core from 3.4.0 to 3.49.5 and
org.jdbi:jdbi3-sqlobject from 3.4.0 to 3.49.5.

This upgrade fixes the following vulnerabilities:
CVE-2024-1597, CVE-2023-32697,
CVE-2023-2976, CVE-2022-41946,
CVE-2022-41853, CVE-2022-31197,
CVE-2022-26520, CVE-2022-23221,
CVE-2022-21724, CVE-2021-42392,
CVE-2020-8908, CVE-2020-13692,
CVE-2018-10237.
Upgrade org.glassfish.jaxb:jaxb-runtime:2.3.1 to :4.0.5
Addresses CVE-2020-15250.
Changes adapted from trino/PR#11336, 12951, 14175
Original commits:
d4c73389bbdb6b48c24a0969b259286b05a99ade
565700985baff0c4b29fdb1e3e26139a29318b9e
ec8b9fd2b2cc9c8bc78c0ca1317dc34fcf2c48c7
98fc1ee8b29fca86f2a1b3abe4989524940333a6
1aea489884346822c812b1a242acc286e3e1248e
8bd17171a8469b9351e2fd7d9f2f49f4af9ea209
Author: kasiafi

Modifications were made to adapt to Presto including:
Change CatalogName to ConnectorId
Change Symbol to VariableReferenceExpression
TableFunctionNode extends InternalPlanNode instead of PlanNode.
Add applyTableFunction to all implementations of Metadata
Add empty ConnectorTableLayoutHandle to TableHandle in MetadataManager::applyTableFunction
Remove PlannerContext and replace it with Metadata

Co-authored-by: kasiafi <[email protected]>
Co-authored-by: Pratik Joseph Dabre <[email protected]>
Co-authored-by: Xin Zhang <[email protected]>
Partial cherry-pick but contains the following commits
trinodb/trino@e8a8b5ab
trinodb/trino@7b98764a

Co-authored-by: Stephen Yugel <[email protected]>
Co-authored-by: Szymon Homa <[email protected]>
Co-authored-by: Mateusz Gajewski <[email protected]>
Partial cherry-pick of the following commits -
trinodb/trino@3879f455
trinodb/trino@cd3da24c
trinodb/trino@dcb6f0bf
trinodb/trino@7cdd1336
trinodb/trino@8b8b0bec
trinodb/trino@15e53ffd

Co-authored-by: Stephen Yugel <[email protected]>
Co-authored-by: lukasz-walkiewicz <[email protected]>
Co-authored-by: Nik Hodgkinson <[email protected]>
PrestoTask can be created in different endpoints:
- getTaskStatus
- getTaskInfo
- receive task update
etc

PrestoTask can be created in getTaskStatus, but it won't be able to create velox plan and
start. It has to wait until receiving taskUpdate

Make taskCreationTime represent the time between receiving first taskUpdate
and task creation time
@vamsikarnika vamsikarnika force-pushed the improve_metastore_calls branch from 2fe233d to 7b45616 Compare September 17, 2025 16:17
@vamsikarnika vamsikarnika force-pushed the improve_metastore_calls branch 4 times, most recently from 300a6f6 to cbfeefa Compare September 30, 2025 15:37
@vamsikarnika vamsikarnika force-pushed the improve_metastore_calls branch from 698a75d to 6f1fd8a Compare October 1, 2025 04:32