Skip to content

Conversation

@ViktorT-11
Copy link
Collaborator

@ViktorT-11 ViktorT-11 commented Aug 26, 2025

This PR addresses Phase 1 of the release plan of comment #9762 (comment) in #9762.

This PR extracts the non-specific lnd specific logic of the sqldb package into a new sqldb/v2 package, and introduces the extra functionality that sqldb/v2 adds.

@ViktorT-11 ViktorT-11 changed the title 2025 08 sqldb v2 separate package sqldb/v2 as separate package Aug 26, 2025
@ViktorT-11 ViktorT-11 changed the base branch from sqldb-v2 to master August 26, 2025 21:43
@ViktorT-11 ViktorT-11 force-pushed the 2025-08-sqldb-v2-separate-package branch 3 times, most recently from 6b5e428 to 6899a39 Compare September 2, 2025 23:30
Copy link
Collaborator

@ellemouton ellemouton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a first pass, looking really good!

go.mod Outdated
Comment on lines 218 to 219
replace github.com/golang-migrate/migrate/v4 => github.com/lightninglabs/migrate/v4 v4.18.2-9023d66a-fork-pr-2

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok afaict, the big thing we added in the fork was the post migration call-backs logic.

Something id just like some clarity on before we push forward here is:
in our own rolled migration strategy in LND today that takes care of code migrations, we know that the migration version table will only be updated once the code migration is complete. Is the same true for the post-migration callback?

Copy link
Collaborator Author

@ViktorT-11 ViktorT-11 Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discussed this with guggero previously.
Currently, no that is not true unfortunately, as the version table will be updated if the SQL migration succeeds, while the code migration fails. But I'm planning to address that asap 🔥!

The current plan is to submit a new PR to our migrate library fork. This PR will add a separate tracking table for code migrations. With this change, the logic will ensure that if the db is initialized and the persisted code migration version is lower than the SQL migration version, the code migration will be re-run prior to any SQL migration.

I haven't gotten to updating the migrate fork yet though. Perhaps we should hold of with merging this one until that change has been merged to the migrate fork, so that we don't need to push a new PR that updates the dependency in lnd though.

Does that sound like a good plan to you @ellemouton?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good!

@ViktorT-11 ViktorT-11 force-pushed the 2025-08-sqldb-v2-separate-package branch 2 times, most recently from 7ca3000 to 233f1fb Compare September 4, 2025 11:32
@ViktorT-11
Copy link
Collaborator Author

Thanks a lot for the review @ellemouton 🔥🎉! Did one separate push to address the latest feedback (and add the contents of #10193) + one push to rebase this PR on master.

I'm not re-requesting a review of you just yet until we have decided on #10175 (comment) @ellemouton.

@ViktorT-11
Copy link
Collaborator Author

Will address failing CI checks on the next push after the next review round.

@ViktorT-11 ViktorT-11 changed the base branch from master to sqldb-v2 September 5, 2025 10:08
@ViktorT-11
Copy link
Collaborator Author

Based the PR on the now updated sqldb-v2 branch 🎉!

@ViktorT-11 ViktorT-11 marked this pull request as ready for review September 5, 2025 10:09
Copy link
Contributor

@GustavoStingelin GustavoStingelin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work here! I'm excited to see this moving forward and to use it in btcwallet.
I’ve opened a PR with some related work already, but this looks more polished and complete.

I’ve added a few comments as well.

Comment on lines 53 to 55
// MigrateTableName is the name of the table used by golang-migrate to
// track the current schema version.
MigrateTableName string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about renaming this to MigrationTableName? Or, to avoid repeating the word “migration”, maybe TrackingTableName would work better. To me, MigrateTableName sounds more like the table that is going to be migrated, rather than the one used for tracking migrations.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is part of the new MigrationStream functionality this PR introduces, I changed this to your suggestion without including it in a separate commit.

Comment on lines 57 to 62
// Schemas is the embedded file system containing the SQL migration
// files.
Schemas embed.FS

// SQLFileDirectory is the directory containing the SQL migration files.
SQLFileDirectory string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we’re using two different names to describe the DDL migrations. One is Schemas, which might be confusing since it sounds like the literal database.schema relation. The other is SQLFileDirectory, which I think works well. What do you think about renaming it to something like DDLFiles and DDLFilesDirectory (or something along those lines) to make the intent clearer?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or SQLFiles and SQLFilesDirectory as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as #10175 (comment) :)

Comment on lines 15 to 25
// SqliteStore is a database store implementation that uses a sqlite backend.
type SqliteStore struct {
cfg *SqliteConfig

*BaseDB
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it could be clearer that it's a fake implementation.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in a separate commit as this initial comment was copied from lnd.

Comment on lines +9 to +17
// maxSQLiteBatchSize is the maximum number of items that can be
// included in a batch query IN clause for SQLite. This was determined
// using the TestSQLSliceQueries test.
maxSQLiteBatchSize = 32766

// maxPostgresBatchSize is the maximum number of items that can be
// included in a batch query IN clause for Postgres. This was determined
// using the TestSQLSliceQueries test.
maxPostgresBatchSize = 65535
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this value applies to all query parameters used in the query, not only for those used in the IN statement, so if this limit is reached when we have other parameters aside the IN, it will return an error.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment was copied from lnd, but I think you might be right. We should research this and address it in a follow-up if correct.

Comment on lines +106 to +117
// Ensure the migration tracker table exists before running migrations.
// This table tracks migration progress and ensures compatibility with
// SQLC query generation. If the table is already created by an SQLC
// migration, this operation becomes a no-op.
migrationTrackerSQL := `
CREATE TABLE IF NOT EXISTS migration_tracker (
version INTEGER UNIQUE NOT NULL,
migration_time TIMESTAMP NOT NULL
);`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this different from the MigrationStream table name?

Comment on lines +121 to +131
// TearDown stops the underlying docker container.
func (f *TestPgFixture) TearDown(t testing.TB) {
err := f.pool.Purge(f.resource)
require.NoError(t, err, "Could not purge resource")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be passed as a test Cleanup function inside the fixture creation

	t.Cleanup(func() {
		fixture.TearDown(t)
	})

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 17 to 19
// ErrRetriesExceeded is returned when a transaction is retried more
// than the max allowed valued without a success.
ErrRetriesExceeded = errors.New("db tx retries exceeded")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// ErrRetriesExceeded is returned when a transaction is retried more
// than the max allowed valued without a success.
ErrRetriesExceeded = errors.New("db tx retries exceeded")
// ErrTxRetriesExceeded is returned when a transaction is retried more
// than the max allowed valued without a success.
ErrTxRetriesExceeded = errors.New("db tx retries exceeded")

Maybe ErrTxRetriesExceeded to avoid confusion with other types of retry.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added as a separate commit.

Comment on lines 21 to 29
// postgresErrMsgs are strings that signify retriable errors resulting
// from serialization failures.
postgresErrMsgs = []string{
"could not serialize access",
"current transaction is aborted",
"not enough elements in RWConflictPool",
"deadlock detected",
"commit unexpectedly resulted in rollback",
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe postgresRetriableErrMsgs to be clearer.

Suggested change
// postgresErrMsgs are strings that signify retriable errors resulting
// from serialization failures.
postgresErrMsgs = []string{
"could not serialize access",
"current transaction is aborted",
"not enough elements in RWConflictPool",
"deadlock detected",
"commit unexpectedly resulted in rollback",
}
// postgresRetriableErrMsgs are strings that signify retriable errors resulting
// from serialization failures.
postgresRetriableErrMsgs = []string{
"could not serialize access",
"current transaction is aborted",
"not enough elements in RWConflictPool",
"deadlock detected",
"commit unexpectedly resulted in rollback",
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added as a separate commit.

Comment on lines 279 to 281
// NewTestSqliteDB is a helper function that creates an SQLite database for
// testing.
func NewTestSqliteDB(t testing.TB, streams []MigrationStream) *SqliteStore {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should separate the test helpers in another file from the functions used in "production".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added as a separate commit.

Comment on lines 12 to 15
t.Cleanup(func() {
pgFixture.TearDown(t)
})

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can move this to DB creation, like done in SQLite.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added as a separate commit :)

@GustavoStingelin
Copy link
Contributor

Anything that I can do to help put energy into this?

@ViktorT-11
Copy link
Collaborator Author

Hey @GustavoStingelin! Thank you so much for the review above 🙏!

Anything that I can do to help put energy into this?

Apologies, I should probably have informed you this in this PR earlier:
We're waiting for this PR in the dependent migrate fork to be merged before I'm going to start pushing for this PR again, including addressing your great feedback:
lightninglabs/migrate#3

As that PR will require updates of the implementation of this PR, it makes most sense that it is finalised and merged before changing this dependent PR.

In terms of areas where you can help out: Continuing to review this PR once the above is merged and I've addressed your feedback here would be very helpful. Additionally, if you want to, reviewing the above migrate PR would also be very helpful 🙏.

testPgUser = "test"
testPgPass = "test"
testPgDBName = "test"
PostgresTag = "15"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To what extent is making the Postgres DB version 15 intentional? PostgreSQL 15 was released 4 years ago, and between that time and now (PostgreSQL 18), I heuristically bet there are significant performance optimizations that have been added to PostgreSQL. I think we could target a more recent version, like the pre-latest release (PostgreSQL 17 seems like a sweet spot)

Related:
https://endoflife.date/postgresql

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Version 15 was chosen as it is what we currently use in tapd.

It's really solid feedback that this should be configurable, so thanks and I definitely agree 🙏! In order to avoid scope creep, let's do that in a follow-up PR(s).

The same applies for the rest of your comments as well. Once merged, I'll open an issue that tracks all of your feedback points, so that we can tackle those in follow-up PRs :).

testPgUser = "test"
testPgPass = "test"
testPgDBName = "test"
PostgresTag = "15"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relatedly, it looks like the PostgresTag version is not configurable. I understand the intent may be to enforce a specific version for all consumers, but this may not always be feasible now or in the future. Each project may have evolving dynamics that require a specific PostgreSQL version. I think it's safe to make it configurable with a default value that can be overridden when needed

func NewTestPgFixture(t testing.TB, expiry time.Duration) *TestPgFixture {
// Use a sensible default on Windows (tcp/http) and linux/osx (socket)
// by specifying an empty endpoint.
pool, err := dockertest.NewPool("")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are gonna use docker container anyway here, Why not using testcontainers which is out of the box, more mature, and maintainable solution. Seeing https://testcontainers.com/modules/postgresql/?language=go

Credits:
Originally inspired by @GustavoStingelin's PR: btcsuite/btcwallet#1027

fmt.Sprintf("POSTGRES_PASSWORD=%v", testPgPass),
fmt.Sprintf("POSTGRES_DB=%v", testPgDBName),
"listen_addresses='*'",
},
Copy link
Contributor

@mohamedawnallah mohamedawnallah Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those environment variables need to be configurable. It is up to the API consumer to pass the intended parameters. We can have some sort of base arguments and additional arguments.

  • If no additional args provided, the base args would be used
  • If additional args provided and no overlapping with base args, the base args would be augmented
  • If additional args provided and there is overlapping with base args, the base args would be overriden by the corresponding values of the additional args and augmented with the new ones if any

Comment on lines +58 to +64
Cmd: []string{
"postgres",
"-c", "log_statement=all",
"-c", "log_destination=stderr",
"-c", "max_connections=5000",
},
Copy link
Contributor

@mohamedawnallah mohamedawnallah Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to the above comment

pool *dockertest.Pool
resource *dockertest.Resource
host string
port int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are we making sure here there is no port collision in case of running multiple PostgreSQL DB containers?


// Tell docker to hard kill the container in "expiry" seconds.
require.NoError(t, resource.Expire(uint(expiry.Seconds())))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relatedly, That all docker containers "management" lifecycle part here and there across this PR is managed out of the box by testcontainers


// MapSQLError attempts to interpret a given error as a database agnostic SQL
// error.
func MapSQLError(err error) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to agnostically map that SQL error? How would doing that be helpful? It looks like at the application layer there is a state to signal whether we are dealing with SQLite or PostgreSQL. We could have separate functions for this: one for MapPostgreSQLError and another for MapSQLiteError. That way the errors would be extendable in the future, given that there is almost a major release version for PostgreSQL every year (hence, heuristically, new errors)

"INTEGER PRIMARY KEY": "BIGSERIAL PRIMARY KEY",
"TIMESTAMP": "TIMESTAMP WITHOUT TIME ZONE",
"UNHEX": "DECODE",
}
Copy link
Contributor

@mohamedawnallah mohamedawnallah Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is having an intermediary schema patching approach helpful and reliable over time?

Why not make the schema creation process redundant for each target DB (e.g., one for PostgreSQL and one for SQLite)? That way we can have flexibility and reliability in how we define schemas according to the target DB.

To boil down my arguments for why this intermediary schema patching design approach is unreliable long-term and even with some reasonable time in short-term:

  1. PostgreSQL has frequent version releases (approximately one per year) with EOL releases every ~5 years. These uncontrolled changes naturally require updates to the replacement mappings, which may necessitate more migrations or make the process increasingly complex.

  2. There was an internal issue discussed in LL where an explicit AUTO_INCREMENT was required for intended behavior (which, as observed, required a migration in the next release). Issues like this could be completely avoided by not using patching and instead following a "Write Everything Twice" approach for schema creation. Note: this wouldn't apply to queries, since queries are easier to write in a universally-agnostic manner than schema definitions.

  3. Future database support concerns: What if there's a plan to support another database? Would the schema patching map become a multi-type hierarchy? We'd eventually end up solving a simple problem with a complex solution.

For example: What's the problem with having the entire graph schema creation written twice—once for SQLite and once for PostgreSQL?

Seeing: https://github.com/lightningnetwork/lnd/blob/master/sqldb/sqlc/migrations/000008_graph.up.sql

cc: @yyforyongyu for input need – since the btcwallet SQL project will eventually depend on this sqldb package, having this intermediary patching approach could potentially impact the reliability and criticality of the btcwallet sqlization project given the vision for btcwallet sqlization to stand for a decade, with that patching, and time constraint those parameters don't look like align with that vision

In the upcoming commits, we will introduce a new sqldb module, sqldb
version 2.

The intention of the new sqldb module, is to make it generalizable so
that it contains no `lnd` specific code, to ensure that it can be reused
in other projects.

This commit adds the base of the new module, but does not include any
implementation yet, as that will be done in the upcoming commits.
This commit moves all non lnd-specific code of sqldb/v1 to the new
sqldb/v2 module.

Note however, that without additional changes, this package still needs
to reference lnd, as references to the lnd `sqlc` package is required
without further changes. Those changes will be introduced in the
upcoming commits, to fully decouple the new sqldb/v2 module from lnd.
@ViktorT-11 ViktorT-11 force-pushed the 2025-08-sqldb-v2-separate-package branch from 233f1fb to 20f91ed Compare December 19, 2025 16:05
This commit introduces a new struct named `MigrationStream`, which
defines a structure for migrations SQL migrations.

The `MigrationStream` struct contains the SQL migrations which will be
applied, as well as corresponding post-migration code migrations which
will be executed afterwards. The struct also contains fields which
define how the execution of the migrations are tracked.

Importantly, it is also possible to define multiple different
`MigrationStream`s which are executed, to for example define one `prod`
and one `dev` migration stream.
This commit updates the `sqldb/v2` package to utilize the new
`MigrationStream` type for executing migrations, instead of passing
`[]MigrationConfig`'s directly.
This commit updates the definition of the `BaseDB` struct to decouple
it from lnd`s `sqlc` package. We also introduce new fields to the struct
to make it possible to track the database type used at runtime.
In order to make it possible to replace `tapd`'s internal `sqldb`
package with the new generic `sqldb/v2` package, we need to make sure
that all features and functionality that currently exist in the `tapd`
package are also present in the new `sqldb/v2` package.

This commit adds such additional missing features to the `sqldb/v2`
package.
Previously, the test db helper files were suffixed with "_test", which
would indicate that the files specifically contained tests.
However, these files actually contain helper functions to be used
in tests, and are not tests themselves. To better reflect their
purpose, the files have been renamed to instead be prefixed with
"test_".
The docs of the `SqliteStore` for no sqlite build environments
previously didn't clarify that the actual `SqliteStore` implementation
under such build tag environments, didn't actually implement a real
sqlite store. This commit clarifies that in the docs.
rename the `ErrRetriesExceeded` error to `ErrTxRetriesExceeded`.
Rename the `postgresErrMsgs` list to `postgresRetriableErrMsgs`, in
order to clarify its usage.
Move the tear down of the postgres db fixture on the test cleanup to the
constructor, instead of requiring the callsites to implement that.
@ViktorT-11 ViktorT-11 force-pushed the 2025-08-sqldb-v2-separate-package branch from 20f91ed to d2bc562 Compare December 20, 2025 00:02
@ViktorT-11
Copy link
Collaborator Author

Thanks a lot for the reviews @GustavoStingelin & @mohamedawnallah 🎉!

Since the migrate dependency fix has now been merged, I've now updated this PR to use that new dependency. I've also addressed some of your feedback.

Even though the feedback you've both submitted is really solid and should definitely be added in the future, I've been quite restrictive in which feedback I've addressed so far.

The reasoning behind this is that I think I need to clarify exactly what the intent of this initial version of the sqldb/v2 package is:
The initial intent of this package is to ensure that our repos which currently have SQL db implementations (mainly lnd, tapd & litd), should not need to implement their own db implementation package, and should instead be able to reference this package as the base for their SQL db implementation, so that we do not need to redefine the same functionality in all repos.

Therefore, what this PR currently implements, is ONLY mainly a combination of the non repo specific db code that's already present in those repos. Additionally in order to make that code useable to execute migrations, this PR also introduces the concept of the MigrationStream, which the unique repo can then define the exact contents of in order to execute the migrations for that specific repo.

I.e., this repo mainly only carries over code that already is live in the other repos, in order to ensure that this PR is quite easy to review, and that the scope of the PR remains quite small.

In order to avoid scope creep, I suggest that we add the additional functionality that you've suggested in your feedback in follow-up PRs, just so that we can get the sqldb/v2 package released first, and then expand it's functionality with additional functionality that doesn't exist in our repos already.

Therefore, I'll only address feedback in this PR which is points out bugs, or if you notice currently live functionality in our repos which I've missed including in this PR.

For the additional feedback, once this PR is merged, I'll open an issue which includes the additional functionality you've suggested, so that we can track it and tackle those in follow-up PRs.

One final note: Since some of your suggestions were basically only renaming of fields which are already present & live in the DB code in our other repos, I addressed those in separate commits at the end of the PR. Since such commits should be super easy to review and should stall this PR, I opted to including those.

Copy link
Contributor

@GustavoStingelin GustavoStingelin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just left a comment for a potential issue, otherwise LGTM!

Comment on lines +271 to +275
// Report the current version of the database before the migration.
currentDbVersion, _, err := driver.Version()
if err != nil {
return fmt.Errorf("unable to get current db version: %w", err)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we are missing an !errors.Is(err, migrate.ErrNilVersion) check for fresh databases? Since no migrations have been applied yet.

Also, what is the difference between this value currentDbVersion and migrationVersion? They look equivalent to me.

Finally, how do we ensure this logic is actually exercised and working as expected?

@lightninglabs-deploy
Copy link

@ziggie1984: review reminder
@mohamedawnallah: review reminder
@ViktorT-11, remember to re-request review from reviewers when ready

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants