
@dvkashapov
Contributor

@dvkashapov dvkashapov commented Jul 4, 2025

API changes and user behavior:

  • Default behavior for database access.

Default is alldbs permissions.

Database Permissions (db=)

  • Accessing particular database
> ACL SETUSER test1 on +@all ~* resetdbs db=0,1 nopass
"user test1 on nopass sanitize-payload ~* resetchannels db=0,1 +@all"
  • (Same behavior without usage of resetdbs)
> ACL SETUSER test1 on +@all ~* db=0,1 nopass
"user test1 on nopass sanitize-payload ~* resetchannels db=0,1 +@all"
  • Multiple selectors can be provided
> ACL SETUSER test1 on nopass (db=0,1 +@write +select ~*) (db=2,3 +@read +select ~*)
"user test1 on nopass sanitize-payload resetchannels alldbs -@all (~* resetchannels db=0,1 -@all +@write +select) (~* resetchannels db=2,3 -@all +@read +select)"
  • Restricting special commands which access databases as part of the command.

The user needs access to both the command and the database(s) referenced in the command to run these commands.

  1. SWAPDB
  2. SELECT
  3. MIGRATE
  4. MOVE - (The SELECT command for the source database would already have gone through.) The user needs access to the target database.
  • Restricting special commands that don't specify a database number but access multiple databases.

The user needs to have access to both the commands and all databases (alldbs) to run these commands.

  1. FLUSHALL - Access all databases
  2. CLUSTER commands that access all databases:
    • CANCELSLOTMIGRATIONS
    • COUNTKEYSINSLOT
    • GETKEYSINSLOT
    • MIGRATESLOTS
  • New connection establishment behavior
    New client connection gets established to DB 0 by default. Authentication and authorisation are decoupled and the user can connect/authenticate and further perform SELECT operation.

(Do we want to extend HELLO?) Alternative suggestion by @madolson: Extend the HELLO command to pass the dbid to which the user should be connected after authentication, if they have the right set of permissions. I think adoption of this would take a long time.

  • Observability
    Without ACL LOG support for flagging denied database permissions, it will be difficult for a user/administrator to identify malicious activity. So I think we should extend ACL LOG to record users that received a permission-denied error while accessing a database.

  • Module API

  • Introduce VM_ACLCheckPermissions(ValkeyModuleUser *user, ValkeyModuleString **argv, int argc, int dbid); API
  • Stop support of VM_ACLCheckCommandPermissions.
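
The two command classes described above (commands that name their target databases, and commands that touch every database) can be sketched as a single check. This is only an illustrative model, not the code in this PR: `db_perms`, `db_scope`, and `db_access_ok` are hypothetical names, and it assumes that an all-databases command requires the `alldbs` flag itself rather than an explicit list that happens to cover every database.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of one selector's database permissions. */
typedef struct {
    bool alldbs;     /* selector carries `alldbs` */
    const int *dbs;  /* explicit db=<list> ids, ignored when alldbs is set */
    size_t ndbs;
} db_perms;

/* Hypothetical classification of how a command reaches databases. */
typedef enum {
    DB_SCOPE_CURRENT, /* ordinary command: only the selected database */
    DB_SCOPE_TARGETS, /* names databases in its arguments: SELECT, MOVE, SWAPDB */
    DB_SCOPE_ALL      /* touches every database: FLUSHALL */
} db_scope;

static bool perms_allow_db(const db_perms *p, int dbid) {
    if (p->alldbs) return true;
    for (size_t i = 0; i < p->ndbs; i++)
        if (p->dbs[i] == dbid) return true;
    return false;
}

/* Returns true when the selector may run a command of the given scope.
 * `current` is the connection's selected database; `targets` holds the
 * database ids parsed from the command arguments (may be NULL if none). */
bool db_access_ok(const db_perms *p, db_scope scope, int current,
                  const int *targets, size_t ntargets) {
    switch (scope) {
    case DB_SCOPE_ALL:
        /* Assumption: only the alldbs flag satisfies an all-databases command. */
        return p->alldbs;
    case DB_SCOPE_TARGETS:
        /* Every database named in the command must be permitted. */
        for (size_t i = 0; i < ntargets; i++)
            if (!perms_allow_db(p, targets[i])) return false;
        return true;
    case DB_SCOPE_CURRENT:
    default:
        return perms_allow_db(p, current);
    }
}
```

With a selector limited to `db=0,1`, a SWAPDB between 0 and 1 passes this check, a SWAPDB between 0 and 2 does not, and FLUSHALL is rejected until the selector has `alldbs`.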

Open Issues

#2309 (comment)

Resolves: #1336

@dvkashapov
Contributor Author

Hello! I would greatly appreciate any feedback from ACL experts)
@xbasel @hpatro @madolson

@hwware hwware added the major-decision-pending Major decision pending by TSC team label Jul 4, 2025
@hwware
Member

hwware commented Jul 4, 2025

I would like to first discuss this new feature in the weekly meeting. Thanks

@dvkashapov dvkashapov marked this pull request as ready for review July 14, 2025 14:22
@madolson madolson requested a review from hpatro July 14, 2025 14:54
@madolson
Member

Just discussed in the weekly meeting. It seems like we are all aligned on adding more database features. We don't want folks to use databases instead of true multi-tenancy, like running multiple containers, but there are still plenty of workloads that could benefit from having access control on databases for namespacing. So we'll move forward with this feature.

Collaborator

@hpatro hpatro left a comment

Thanks for working on this @JoBeR007.

High level comment:

  • I would prefer us to come up with a symbol for DB, let's say ^, and use that instead of db= as a prefix to have the same experience as other resources. Then it would look like:

All database access: ACL SETUSER alice ^*
Selective database access: ACL SETUSER bob ^1 ^2

  • With DB I find providing both ALLOW and DENY would be helpful. Possibly extend it to other resources in a separate PR.

Restrict admin db access via (-^?):

Allow all except db 0: ACL SETUSER alice -^0

  • For FLUSHALL and FLUSHDB the command can get accepted as ASYNC and while it's getting executed the permission could change. I think the correct behavior is to still process the command completely even if the permission has changed at a later point. (The PR has the correct behavior).

src/acl.c Outdated
Comment on lines 1784 to 1802
if (keyidxptr) {
    if (cmd->proc == selectCommand)
        *keyidxptr = 1;
    else if (cmd->proc == moveCommand)
        *keyidxptr = 2;
    else if (cmd->proc == swapdbCommand)
        *keyidxptr = (i == 0) ? 1 : 2;
    else
        *keyidxptr = 0;
}
Collaborator

need to handle flushdb here. This part of the code doesn't seem very readable. Let me think more and come up with something.

Contributor Author

Agreed that this needs improvement, don't really have an idea for how to do it cleaner.

Contributor Author

> need to handle flushdb here

It is handled as the general case in the last else if, because FLUSHDB relates to the current dbid.

> This part of the code doesn't seem very readable

Added todo here, still can't think of clean way to set keyidxptr
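
One possible direction for the readability concern, offered only as a sketch and not as what the PR does: keep the argument positions in a small lookup table instead of an if/else chain. The table, the `db_arg_spec` type, and `db_arg_index` are hypothetical names; the positions mirror the snippet under review (SELECT at 1, MOVE at 2, SWAPDB at 1 and 2, everything else 0). It looks commands up by name for self-containment, whereas the real code compares `cmd->proc`.

```c
#include <stddef.h>
#include <strings.h> /* strcasecmp (POSIX) */

#define MAX_DB_ARGS 2

/* Hypothetical table entry: argv positions that hold database ids.
 * A position of 0 means "no database argument in this slot". */
typedef struct {
    const char *name;
    int db_argv_pos[MAX_DB_ARGS];
} db_arg_spec;

static const db_arg_spec db_arg_specs[] = {
    {"select", {1, 0}}, /* SELECT index */
    {"move",   {2, 0}}, /* MOVE key db */
    {"swapdb", {1, 2}}, /* SWAPDB index1 index2 */
};

/* Returns the argv index of the i-th database argument of `cmdname`,
 * or 0 when the command has no such argument (for example FLUSHDB,
 * which operates on the currently selected database). */
int db_arg_index(const char *cmdname, int i) {
    size_t n = sizeof(db_arg_specs) / sizeof(db_arg_specs[0]);
    if (i < 0 || i >= MAX_DB_ARGS) return 0;
    for (size_t c = 0; c < n; c++) {
        if (strcasecmp(cmdname, db_arg_specs[c].name) == 0)
            return db_arg_specs[c].db_argv_pos[i];
    }
    return 0;
}
```

Adding a command with a database argument then means adding one table row rather than another branch.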

@dvkashapov
Contributor Author

dvkashapov commented Aug 1, 2025

Thanks for the review @hpatro!
I see where you're coming from in terms of one token per symbol. Do you think the originally proposed syntax (db=<list> / db!=<list>) is worse than one token per symbol? To me, an explicit list seemed like the better idea for specifying multiple selectors because it is more readable and more compact.
Also, about ALLOW and DENY ACLs: I suggest that this PR adds only ALLOW policies. In a separate PR I will add DENY policies for different rules, and from that point we can evaluate the usefulness of each DENY policy.

@madolson madolson moved this to Optional for next release candidate in Valkey 9.1 Aug 4, 2025
@madolson madolson moved this from Optional for next release candidate to Todo in Valkey 9.1 Aug 6, 2025
@madolson
Member

madolson commented Sep 3, 2025

@dvkashapov Just to be transparent with you: this PR was opened at a time when we were working on the 9.0 release, so we weren't going to merge this feature into it. Now that the release candidates are mostly done, we can start reviewing this again for the next release. I think we'll have more time now to review it and make progress.

> With DB I find providing both ALLOW and DENY would be helpful. Possibly extend it to other resources in a separate PR.

This is inconsistent with the rest of the ACL system. Today everything is positive grants. Lots of people think that if you do -get +@string, the negation of get overrules the adding of the string category. That isn't correct, as you will still have access to get. Having allows and denies for databases will just add on to that confusion. If we want to have negative policies, we should have a more generic solution. I'm fine with another issue for this though.
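
The ordering point can be illustrated with a toy model. The sketch below assumes rules are applied strictly left to right onto a permission bitmap, so a later grant overrides an earlier removal. The three-command universe, the `@string` membership table, and `apply_rules` are all hypothetical simplifications, not the parser in src/acl.c.

```c
#include <stdbool.h>
#include <string.h>

/* Toy command universe; the real server has many more commands. */
enum { CMD_GET, CMD_SET, CMD_DEL, CMD_COUNT };
static const char *cmd_names[CMD_COUNT] = {"get", "set", "del"};
/* Toy category membership: GET and SET belong to @string, DEL does not. */
static const bool in_string_category[CMD_COUNT] = {true, true, false};

/* Applies rules left to right onto `allowed` (CMD_COUNT entries).
 * Supports only +cmd, -cmd, +@string and -@string.
 * Returns false on the first rule it does not understand. */
bool apply_rules(const char **rules, int nrules, bool *allowed) {
    for (int r = 0; r < nrules; r++) {
        const char *rule = rules[r];
        if (rule[0] != '+' && rule[0] != '-') return false;
        bool grant = (rule[0] == '+');
        const char *what = rule + 1;

        if (strcmp(what, "@string") == 0) {
            /* A category rule overwrites every member, whatever came before. */
            for (int c = 0; c < CMD_COUNT; c++)
                if (in_string_category[c]) allowed[c] = grant;
            continue;
        }

        int c;
        for (c = 0; c < CMD_COUNT; c++)
            if (strcmp(what, cmd_names[c]) == 0) break;
        if (c == CMD_COUNT) return false;
        allowed[c] = grant;
    }
    return true;
}
```

In this model `-get +@string` leaves GET allowed, because the category grant is applied after the removal, while `+@string -get` removes it. There is no separate deny list that could win regardless of order.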

> For FLUSHALL and FLUSHDB the command can get accepted as ASYNC and while it's getting executed the permission could change. I think the correct behavior is to still process the command completely even if the permission has changed at a later point. (The PR has the correct behavior).

From the end user perspective, the FLUSH always happens synchronously. The data is just freed async. I think this behavior is fine.

> All database access: ACL SETUSER alice ^*
> Selective database access: ACL SETUSER bob ^1 ^2

I think this is less readable than the original proposal, but it is more consistent. I don't know how strongly I feel.

> I see where you're coming from in terms of one token per symbol. Do you think originally proposed syntax (db=<list> / db!=<list>) is worse than one token per symbol? For me explicit list seemed like a better idea in terms of specifying multiple selectors because it provided better readability and seemed more compact.
> Also about ALLOW and DENY ACL: I suggest in this PR we add only ALLOW policies and as a separate PR I will add DENY policy for different rules and from that point we will think about usefulness of each DENY policy.

- Implemented database permissions using `db=<id>`, `db!=<id>`, `alldbs`, `resetdbs` ACL rules
- Added SELECTOR_FLAG_ALLDBS and SELECTOR_FLAG_DBLIST_NEGATED flags
- Updated SELECT, MOVE, SWAPDB, FLUSHDB commands with CMD_CROSS_DB flag, FLUSHALL - CMD_ALL_DBS
- Extended ACL checks with database access verification
- Added ACL_DENIED_DB error type for database permission violations
- Maintained backward compatibility with default access to all databases

Signed-off-by: Daniil Kashapov <[email protected]>
@dvkashapov
Contributor Author

@madolson @hpatro Hello! It would be awesome to merge this in 9.1!
I think I will rewrite to ACL SETUSER alice ^dbid this week. There is some talk about named databases, and in that case something like ACL SETUSER alice ^db_name1 ^db_name2 looks better than ACL SETUSER bob db=db_name1,db_name2. What do you think?

@madolson
Member

madolson commented Oct 27, 2025

In the discussion, it came up that we are missing the following other database features:

  1. No database listed for ACL LOG
  2. No database listed for MONITOR; you can parse the output and track it yourself, which is annoying
  3. No database listed for COMMAND LOG
  4. Monitor will show information for all databases. (Maybe fine)
  5. Does FLUSHALL need all database access?
  6. Do we need a DBLIST operation?
  7. There was an ask to split pub/sub based off of databases, [NEW] Add complete isolation between databases (pub/sub) #1868.
  8. Add support for named databases [NEW] Introduce database logical names #1601

@codecov

codecov bot commented Oct 27, 2025

Codecov Report

❌ Patch coverage is 90.59406% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.32%. Comparing base (d16788e) to head (d35a3a7).

Files with missing lines    Patch %    Lines
src/module.c                8.33%      11 Missing ⚠️
src/acl.c                   93.79%     8 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #2309      +/-   ##
============================================
- Coverage     72.45%   72.32%   -0.13%     
============================================
  Files           128      128              
  Lines         70485    70674     +189     
============================================
+ Hits          51068    51118      +50     
- Misses        19417    19556     +139     
Files with missing lines    Coverage Δ
src/commands.def            100.00% <ø> (ø)
src/db.c                    93.35% <100.00%> (+0.19%) ⬆️
src/intset.c                100.00% <100.00%> (ø)
src/lua/script_lua.c        89.81% <100.00%> (+0.01%) ⬆️
src/multi.c                 97.33% <100.00%> (+0.08%) ⬆️
src/server.c                88.43% <100.00%> (+0.03%) ⬆️
src/server.h                100.00% <ø> (ø)
src/sort.c                  94.82% <100.00%> (+0.01%) ⬆️
src/acl.c                   90.53% <93.79%> (+0.22%) ⬆️
src/module.c                9.77% <8.33%> (+<0.01%) ⬆️

... and 10 files with indirect coverage changes

@madolson
Member

The use cases we need clear syntax for:

  1. User can add a database to their existing supported databases
# Add DB 1 to user1's permissions
ACL SETUSER user1 db+=1
  2. User can remove a database from their accessible databases - Do we need to support this use case?
ACL SETUSER user1 db-=1,2,3
  3. User can override all of their existing databases with new databases
ACL SETUSER user1 resetdbs db+=1,2,3,4

@hpatro will follow up to try to help clarify the most consistent way to implement the syntax.

There is one more open question: do we want the default to be secure? For Redis 7, we changed the default permissions of an ACL user from allchannels to nochannels. Do we want to do the same thing here?

@dvkashapov
Contributor Author

Awesome! This is really descriptive and answers a lot of questions. Proposed syntax is good, I'm OK with it! Answering your first message:

> ACL LOG, MONITOR, COMMANDLOG etc.

We could assume that @admin commands need all-database access to execute. That seems to cover those cases but may be unintuitive; in that case we can return an error with a helpful message.

> Monitor will show information for all databases

I will take a look at the MONITOR implementation; maybe we can make it per-db, defaulting to all databases.

> Does FLUSHALL need all database access

Yes, FLUSHALL needs all database access.

> Do we need a DBLIST operation?

Yes, DBLIST would be useful once we have named databases.

> There was an ask to split pub/sub based off of databases

I don't know much about channels right now, so I can't say how hard it would be to make them per-db.

> Add support for named databases

In terms of ACLs for named databases, if we decide to make a name an alias for some dbid, then the implementation won't change much. We would only need to resolve the name to a dbid.

> Do we want to make the default secure by default?

This would be good for security, but it would be a breaking change for all existing ACLs for all users; maybe this is too much.

Important question: are we going with per-user permissions, not per-selector as currently in the PR?

@hpatro
Collaborator

hpatro commented Oct 27, 2025

We were just trying to think through in the weekly meeting what other gaps exist around database support. @dvkashapov, these don't need to be worked on as part of this change.

> Important question: We're going with per-user permissions, not per-selector like currently in PR?

It's still per-selector; maybe the example suggested above introduced some ambiguity.

@allenss-amazon
Member

I started an issue for dealing with ACL issues for search module. #2764

Seems like this work intersects with that, in particular, the current search code would not enforce these ACL capabilities and would cause security issues.

@dvkashapov
Contributor Author

dvkashapov commented Oct 28, 2025

> still per-selector

But then db+=1,2,3 would only apply to the root selector and would not be useful for a user with more than one selector; the same goes for the db-=1,2,3 case. Do we want to give users the ability to edit individual selectors? Then we would need to identify them with some IDs and add ACL LIST-SELECTORS user.

@hpatro
Collaborator

hpatro commented Oct 28, 2025

> still per-selector
>
> But then db+=1,2,3 would only apply to root selector, and will not be useful for user with >1 selector, same for db-=1,2,3 case. Do we want to give users ability to edit individual selector?

That has been the case with all other restrictions (commands / categories / channels). We allow the operation even if one selector succeeds.
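
The "any selector succeeds" rule can be sketched as follows. It is a simplified model under stated assumptions: a selector is reduced to a database list plus read/write flags, mirroring the `(db=0,1 +@write) (db=2,3 +@read)` example from the PR description, and the `selector` struct, `selector_allows`, and `user_allows` are hypothetical names. The point it shows is that the database and the command must be allowed by the same selector.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified selector: a database list plus
 * two coarse command classes. */
typedef struct {
    const int *dbs;
    size_t ndbs;
    bool can_read;
    bool can_write;
} selector;

/* A selector matches only if it allows BOTH the command class and the db. */
static bool selector_allows(const selector *s, int dbid, bool is_write) {
    bool cmd_ok = is_write ? s->can_write : s->can_read;
    if (!cmd_ok) return false;
    for (size_t i = 0; i < s->ndbs; i++)
        if (s->dbs[i] == dbid) return true;
    return false;
}

/* The operation is allowed as soon as any one selector allows it. */
bool user_allows(const selector *sels, size_t nsels, int dbid, bool is_write) {
    for (size_t i = 0; i < nsels; i++)
        if (selector_allows(&sels[i], dbid, is_write)) return true;
    return false;
}
```

A user with the two selectors above can write in databases 0 and 1 and read in databases 2 and 3, but cannot read in database 0: permissions from different selectors are not combined.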

@dvkashapov
Contributor Author

OK, then I suggest we focus on the db+=1,2,3 syntax, and while I do that, let's discuss db-=1,2,3. Are we targeting more negative ACLs in future releases? If there's no request from the community, then maybe this is not needed?

Signed-off-by: Daniil Kashapov <[email protected]>
@dvkashapov
Contributor Author

dvkashapov commented Nov 15, 2025

@hpatro Thanks for the detailed points, I agree with them! However, I wonder whether db-level permission configuration is necessary here? I suspect it might be a less commonly used feature. Would love to hear your perspective on this.

Regarding the expected behaviour: the current implementation meets all the criteria, but we still lack a test for ACL LOG. I will add it and update the top comment!

Also, AFAIK FLUSHDB and SCAN operate on the current db, so we don't need to handle them in a special way right now.

@dvkashapov
Contributor Author

While writing the test I noticed that for CMD_ALL_DBS we can't correctly set keyidxptr, and as a result the object in ACL LOG is the command itself. Should we leave it like that or use some placeholder?

@dvkashapov
Contributor Author

dvkashapov commented Nov 16, 2025

@hpatro I added VM_ACLCheckDbPermissions, which checks db-level permissions for a user. Do we need anything else?
Also updated the top comment.

Question: We need to handle db-level checks in VM_ACLCheckCommandPermissions because it calls ACLCheckAllUserCommandPerm, which needs a dbid. What should we do in this scenario?

@ranshid
Member

ranshid commented Nov 16, 2025

@dvkashapov I think we probably want to also include the MIGRATE command's destdb flag?
I assume that when ACLs are synced across the cluster it will probably fail on the target when we execute the SELECT, but maybe it is better to fail on the source as well?

Also, cluster-migrateslots should have the ALL_DBS flag, as this is also a cover-all-dbs command, right?

Collaborator

@hpatro hpatro left a comment

Seems mostly good on acl.c barring the resetdbs behavior and code performance on each ACL SETUSER operation.

Going through the remaining files.

src/acl.c Outdated
selector->flags &= ~SELECTOR_FLAG_ALLDBS;
}

char *dblist = zstrdup(dbs_str);
Collaborator

I would suggest creating an sds and operating on it. It improves strlen performance, among other things.

Contributor Author

Rewrote to sds usage, PTAL

}
}

sds getAclErrorMessage(int acl_res, user *user, struct serverCommand *cmd, sds errored_val, int verbose) {
Collaborator

Not particular to this PR: was there a security requirement behind introducing a non-verbose mode that hides the user/resource from the error response?

Contributor Author

I also thought that it would be useful for the user to know.

@hpatro
Collaborator

hpatro commented Nov 16, 2025

> @hpatro Thanks for the detailed points, I agree with them! However, I wonder if db level permissions configuration is necessary here? I suspect it might be a less common feature. Would love to hear your perspective on this.
>
> In regards of the expected behaviour: current implementation fits all the criteria, but we still lack test for ACL LOG, will do it and update the top comment!
>
> Also AFAIK FLUSHDB and SCAN operate on current db, so we don't need to handle them in special way right now.

My bad. You're right SCAN is for the selected db. Let me update the comment. For FLUSHDB, for some weird reason I thought it takes in the dbid as parameter. So, you're right about this as well. :)

@hpatro
Collaborator

hpatro commented Nov 16, 2025

> @dvkashapov I think probably want to also include migrate command destdb flag? I assume when ACLs are synced cross the cluster it will probably fail on the target when we execute the select, but maybe it is better to fail it on the source as well?

I also think fail-fast is better.

> Also cluster-migrateslots should have the ALL_DBS flag as this is also a cover-all-dbs command right?

Not sure about the behavior of this command. Does it use the already authenticated user to perform the operation? Does it fail if the user doesn't have access to some keys in the keyspace? If yes, we should mark it as requiring ALL_DBS access.

@hpatro
Collaborator

hpatro commented Nov 16, 2025

> @hpatro added VM_ACLCheckDbPermissions that checks db-level permissions for user, do we need anything else? Also updated top comment.

Why would one use this check independently?

> Question: We need to handle db-level checks in VM_ACLCheckCommandPermissions because it calls ACLCheckAllUserCommandPerm which needs dbid, what should we do in this scenario?

There are two alternatives to deal with this.

  1. Stop supporting the API with 9.1. We could alter the API or recommend module devs to use a new API VM_ACLCheckPermissions which takes in all the parameters required.
  2. Structure our code in a way that still supports the existing API but the command isn't guaranteed to succeed if ACL database restriction feature is used.

I would recommend going with option 1. @allenss-amazon thoughts?

@hpatro
Collaborator

hpatro commented Nov 16, 2025

@dvkashapov If you don't mind, I will update the top-level comment with this #2309 (comment) as the checklist, so other reviewers can sign off after reading it and you can mark each checkbox as you finish implementing the item.

src/acl.c Outdated
Comment on lines 1022 to 1025
if (errno == ERANGE || endptr == token || *endptr != '\0' ||
    dbid < 0 || dbid >= server.dbnum) {
    return ACLDatabasePermissionError(new_dbs, dblist, EINVAL);
}
Collaborator

It would be better for the user to understand what's incorrect in the command. WDYT?

The current response is a generic syntax error.

127.0.0.1:6379> config get databases
1) "databases"
2) "16"
127.0.0.1:6379> acl setuser hp db+=16
(error) ERR Error in ACL SETUSER modifier 'db+=16': Syntax error
127.0.0.1:6379> acl setuser hp db+=0
OK

Contributor Author

True, will add new errno in this case.
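
A minimal sketch of how the two failure modes could be told apart, so that `db+=16` with 16 configured databases reports a range problem rather than a syntax error. The `dbid_status` enum and `parse_dbid` are hypothetical names for illustration; the PR's actual error plumbing goes through `ACLDatabasePermissionError` and errno values, which are not reproduced here.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical result codes separating malformed input from a valid
 * number that falls outside the configured database range. */
typedef enum { DBID_OK, DBID_SYNTAX, DBID_OUT_OF_RANGE } dbid_status;

/* Parses `token` as a database id in [0, dbnum).
 * On DBID_OK the id is stored in *out; otherwise *out is untouched.
 * Note: strtol itself tolerates leading whitespace and a '+' sign. */
dbid_status parse_dbid(const char *token, int dbnum, int *out) {
    char *endptr;
    errno = 0;
    long v = strtol(token, &endptr, 10);

    /* No digits consumed, or trailing garbage: not a number at all. */
    if (endptr == token || *endptr != '\0') return DBID_SYNTAX;

    /* A well-formed number that overflowed or is outside the range. */
    if (errno == ERANGE || v < 0 || v >= dbnum) return DBID_OUT_OF_RANGE;

    *out = (int)v; /* safe: 0 <= v < dbnum, and dbnum is an int */
    return DBID_OK;
}
```

The caller can then map `DBID_OUT_OF_RANGE` to a message that names the valid range, and keep the generic syntax error for `DBID_SYNTAX`.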

Collaborator

@hpatro hpatro left a comment

It seems dangerous to set the databases config to a large value. The output buffer could be quite large, and the server's processing time for ACL SETUSER is also quite long with such a high database count. I believe this could be considered a denial-of-service vector.

% src/valkey-server --daemonize yes --logfile valkey.log --databases 100000000
% src/valkey-cli
127.0.0.1:6379> acl setuser hp on resetdbs db-=0
OK
(12.23s)
127.0.0.1:6379> acl list
<response too big and takes too long to get printed out>

@hpatro
Copy link
Collaborator

hpatro commented Nov 17, 2025

Did a dry run locally. A few things caught my eye:

  1. A large databases value seems dangerous and can block the main thread for a long period. Database-level access control #2309 (review)
  2. The ACL DB permission set flow misbehaves when allow and deny permissions are set in the same operation. Database-level access control #2309 (comment)
  3. An allow or deny permission beyond the allowed db range throws a generic error. Database-level access control #2309 (comment)
  4. Allow and deny permissions that lead to no db permission produce an empty allow db permission response instead of resetdbs. Database-level access control #2309 (comment)
  5. A user connected to a database without permission won't be able to run any commands until selecting a database to which they have access.
src/valkey-cli -a pass --user hp
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> acl list
(error) NOPERM No permissions to access database
127.0.0.1:6379> select 1
OK
127.0.0.1:6379[1]> acl list
1) "user default on nopass sanitize-payload ~* &* alldbs +@all"
2) "user hp on nopass sanitize-payload resetchannels db+=1 +@all"

Signed-off-by: Daniil Kashapov <[email protected]>
Contributor Author

@dvkashapov dvkashapov left a comment

@hpatro Applied some of review suggestions, will get to work on the rest of them, PTAL at my comments.

Signed-off-by: Daniil Kashapov <[email protected]>
@dvkashapov
Contributor Author

dvkashapov commented Nov 17, 2025

> There are two alternatives to deal with this.
>
>   1. Stop supporting the API with 9.1. We could alter the API or recommend module devs to use a new API VM_ACLCheckPermissions which takes in all the parameters required.
>   2. Structure our code in a way that still supports the existing API but the command isn't guaranteed to succeed if ACL database restriction feature is used.
>
> I would recommend going with option 1. @allenss-amazon thoughts?

I'm OK with the first option, but I guess this is another major decision and we need feedback from the community on it.
We should probably discuss this in the weekly meeting, wdyt?

@madolson
Member

Decisions from the core meeting:

  1. I would like us to stick to alldbs permissions though.
    1. Seems like the right decision.
  2. What should the initial syntax for setting databases be?
    1. We will start with just db=<id1>,<id2>
  3. Behavior of being in a database without having permissions
    1. As long as command is not accessing the keyspace, allow.
    2. Need to double check keyless commands.
  4. Module APIs
    1. We should implement a new top level API that also passes in the database. Other APIs are maybe TBD.

@madolson madolson moved this from Todo to In Progress in Valkey 9.1 Nov 17, 2025

Labels

major-decision-pending Major decision pending by TSC team

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

[NEW] Support database level ACL

6 participants