Added ANTLR parse tree persistent caching for procedures#4547
Added ANTLR parse tree persistent caching for procedures#4547manisha-deshpande wants to merge 10 commits intobabelfish-for-postgresql:BABEL_5_X_DEVfrom
Conversation
0e2dd88 to
751d3e8
Compare
| create_date SYS.DATETIME NOT NULL, | ||
| modify_date SYS.DATETIME NOT NULL, | ||
| definition sys.NTEXT DEFAULT NULL, | ||
| antlr_parse_tree JSONB DEFAULT NULL, -- JSONB serialized ANTLR parse tree for caching |
There was a problem hiding this comment.
babelfish_function_ext regression test fails.
Column can be queried from postgres side, not babelfish side due to unsupported datatype.
Should the column type be TEXT instead? Would that affect storage space?
--- /home/runner/work/babelfish_extensions/babelfish_extensions/test/JDBC/./expected/babelfish_function_ext-vu-cleanup.out 2026-02-06 18:16:42.761144533 +0000
+++ /home/runner/work/babelfish_extensions/babelfish_extensions/test/JDBC/./output/babelfish_function_ext-vu-cleanup.out 2026-02-06 18:42:00.808212640 +0000
@@ -52,7 +52,7 @@
-- babelfish_function_ext entry should have been removed after dropping all these functions/procedure
SELECT * FROM sys.babelfish_function_ext WHERE funcname LIKE 'babel_2877_vu_prepare%';
GO
-~~START~~
-varchar#!#varchar#!#nvarchar#!#text#!#text#!#bigint#!#bigint#!#datetime#!#datetime#!#ntext
-~~END~~
+~~ERROR (Code: 33557097)~~
+
+~~ERROR (Message: data type jsonb is not supported yet)~~
There was a problem hiding this comment.
You can ignore this error for now. We can later add a TDS sender function for JSONB (which just sends it as JSON)
Should the column type be TEXT instead? Would that affect storage space?
Yes, JSONB will allow fast lookups compared to JSON/TEXT which will required deserialization of its own. (Which will become a problem for bigger procedures).
751d3e8 to
6bb6abf
Compare
| * | ||
| * This header provides the interface for serializing and deserializing | ||
| * ANTLR PLtsql parse trees to/from JSONB format. The serialized data is | ||
| * stored in the cross-session cache (babelfish_func_ext) to enable faster |
There was a problem hiding this comment.
it's a catalog rather than a cache - even though the catalog will be cached
|
|
||
| /* Read the value for this key */ | ||
| tok = JsonbIteratorNext(&ctx.it, &v, false); | ||
|
|
robverschoor
left a comment
There was a problem hiding this comment.
addded some comments
Task: BABEL-6037 Signed-off-by: Manisha Deshpande <[email protected]>
…ble and type Task: BABEL-6037 Signed-off-by: Manisha Deshpande <[email protected]>
Task: BABEL-6037 Signed-off-by: Manisha Deshpande <[email protected]>
…E, TRY Statements Task: BABEL-6037 Signed-off-by: Manisha Deshpande <[email protected]>
Task: BABEL-6037 Signed-off-by: Manisha Deshpande <[email protected]>
…d retrieval Task: BABEL-6037 Signed-off-by: Manisha Deshpande <[email protected]>
Signed-off-by: Manisha Deshpande <[email protected]>
Add cross-session ANTLR parse tree caching for T-SQL stored procedures. Serialized parse trees and datums are stored in babelfish_function_ext using nodeToString/stringToNode. On procedure execution, cached results are restored to skip ANTLR re-parsing. Cache reads validate the stored bbf_version and modify_date before deserializing, skipping stale entries from different Babelfish versions or procedures modified with the GUC disabled. Changes: - Add antlr_parse_tree_text, antlr_parse_tree_datums, antlr_parse_tree_modify_date, and antlr_parse_tree_bbf_version columns to sys.babelfish_function_ext - Store serialized parse tree and version in pltsql_store_func_default_positions - Restore and validate cached parse tree in new function pltsql_restore_func_parse_result invoked prior to ANTLR parse Task: BABEL-6037 Signed-off-by: Manisha Deshpande <[email protected]>
6316c43 to
ffb844e
Compare
| DROP TABLE IF EXISTS sys.babelfish_function_ext; | ||
|
|
||
| CREATE TABLE sys.babelfish_function_ext ( |
There was a problem hiding this comment.
Should be ALTER TABLE ADD COLUMN
| antlr_parse_tree_text TEXT DEFAULT NULL, -- Native PG nodeToString() serialized parse tree | ||
| antlr_parse_tree_datums TEXT DEFAULT NULL, -- Native PG nodeToString() serialized datums array | ||
| antlr_parse_tree_modify_date SYS.DATETIME DEFAULT NULL, | ||
| antlr_parse_tree_bbf_version TEXT DEFAULT NULL, |
There was a problem hiding this comment.
Two colums should be sufficent, no ?
- pg_node_tree - the tree
- bbf_version_for_tree
Also why will storing it in binary have performnace improvments ? Wouldn't it have the reverse effect
There was a problem hiding this comment.
I've let it remain as two separate columns for now as the split by delimiter function call on large concatenated strings (for large procedures) might be expensive (unnecessary overhead)
contrib/babelfishpg_tsql/src/guc.c
Outdated
|
|
||
| bool pltsql_enable_create_alter_view_from_pg = false; | ||
| bool pltsql_enable_alter_owner_from_pg = false; | ||
| bool pltsql_enable_procedure_parse_cache = false; |
There was a problem hiding this comment.
Lets use the general term routine applies to both procedures and functions.
Add bbf_version validation, exec-time cache repopulation, and rename/alter/dependency invalidation logic and tests Task: BABEL-6037 Signed-off-by: Manisha Deshpande <[email protected]>
Task: BABEL-6037 Signed-off-by: Manisha Deshpande <[email protected]>
Description
[TBD]
Issues Resolved
BABEL-6037
Test Scenarios Covered
[TBD]
Use case based -
Boundary conditions -
Arbitrary inputs -
Negative test cases -
Minor version upgrade tests -
Major version upgrade tests -
Performance tests -
Tooling impact -
Client tests -
Check List
By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.