Harmonise Drive and Bucket client interfaces#53
Conversation
Brings in multipart-upload support (PR HumanBrainProject#52). Conflict in bucket.py imports resolved to keep both BucketDir (harmonisation) and the new multipart utils constants (master). Also routes Bucket.upload_local_file through Bucket.upload by path rather than opening the file first, so multipart_upload can persist its resume manifest for >1G uploads.
|
@xgui3783 This is a big one, so please take your time! I think this is needed to improve the usability of the library, as a step towards a 1.0 release. |
xgui3783
left a comment
There was a problem hiding this comment.
Thanks for the Herculean effort.
I made some suggestions, but no major changes at this point.
I would recommend merge this already (with or without the suggestions)
Once merged, I will try to catch some time to make some PRs on cosmetic issues such as typing and tests, which would also allow me to test how/if it works.
| def connect(username=None, password=None, token=None, env=""): | ||
| client = DriveApiClient(username, password, token, env) | ||
| return client | ||
| def connect(username=None, password=None, token=None, env="", target="drive"): |
There was a problem hiding this comment.
| def connect(username=None, password=None, token=None, env="", target="drive"): | |
| from typing import Literal | |
| def connect(username=None, password=None, token=None, env="", target: Literal["drive", "bucket"]="drive"): |
There was a problem hiding this comment.
perhaps this would allow easier time for static type checking to figure out the correct value to use for target
| def __init__(self, client): | ||
| self.client = client | ||
|
|
||
| def get_file_by_url(self, file_url): |
There was a problem hiding this comment.
| def get_file_by_url(self, file_url): | |
| def get_file_by_url(self, file_url: str): |
|
|
||
| def get_content(self): | ||
| """Get the content of the file""" | ||
| def get_content(self, *, progress=False): |
There was a problem hiding this comment.
I wonder if the method should be called iter_content and returns an Iterator[byte] .
This would help in constrained devices, where content may not fit on memory.
| "Pass recursive=True to delete every object under this prefix." | ||
| ) | ||
| for obj in self.bucket.ls(prefix=self._api_prefix() or None): | ||
| obj.delete() |
There was a problem hiding this comment.
| obj.delete() | |
| obj.delete(recursive=recrusive) |
There was a problem hiding this comment.
I don't have static type checker, but obj can be BucketDir here, right?
Summary
The goal of this PR is to harmonise the interface between the Drive and Bucket clients,
to make it easier for users to switch between them.
For backwards compatibility, every existing call still works unchanged; all changes are additive.
I've added
DeprecationWarningfor some methods where harmonised alternatives have been added.Legend:
✓= already existed ·+= added by this PR ·—= out of scope or not applicable on this storage type.Entry point
ebrains_drive.connect(..., target="drive")target="bucket")Client class (
DriveApiClient/BucketApiClient).repos/.bucketsmanager.fileURL-resolution helperBucketFile)client.create_new(name, ...)DeprecationWarning→ useclient.buckets.create_bucket)client.delete_bucket(name, ...)DeprecationWarning→ useclient.buckets.delete_bucketorbucket.delete())Manager (
Repos/Buckets)Repos)Buckets)list()list_repos()/list_buckets(search=None)GET /v1/buckets)get(name)get_repo(id)/get_bucket(name)create(name, ...)create_repo(name, ...)/create_bucket(name, ...)BucketApiClient.create_new)delete(name)delete_bucket)delete_bucket(name, ...)get_dataset(id, ...)get_repos_by_name,get_repo_by_url,get_default_repo,get_repo_by_local_path,get_repos_by_filterContainer (
Repo/Bucket)Repo)Bucket).namedelete()(instance)client.buckets.delete_bucket(self.name))is_readonly()self.perm→self.permissionTruewhenrole in (None, "viewer"))get_file(path)get_dir(path)SeafDir)BucketDir)ls(prefix=None)get_dir(...).ls())upload(filelike_or_path, name)get_dir("/").upload(...))upload_local_file(path, name=None, overwrite=False)get_dir("/").upload_local_file(...))multipart_upload(filelike_or_path, name)upload()when size > 1G)File (
SeafFile/DataproxyFile)SeafFile)DataproxyFile)get_content()get_content(*, progress=False)get_download_link()delete()rename(newname)moveTo(dst_dir, dst_repo_id=None)(camelCase)move_to(...)(snake_case alias)copyTo(dst_dir, dst_repo_id=None)(camelCase)copy_to(...)copyTo)PUT /copy:copy_to(dst_name=None, *, dst_bucket=None))get_share_link()_SeafDirentBase)Directory-only (
SeafDir/BucketDir)SeafDir)BucketDir)ls(...)entity_type=,force_refresh=)recursive=Falseuses nativedelimiter="/";recursive=Trueyields all files flat)get_file(name)repo.get_file)get_dir(name)repo.get_dir)mkdir(name)upload(fileobj_or_path, name)upload_local_file(path, name=None, overwrite=False)create_empty_file,share_to_user,check_exists,download(zip)check_existsis also added onBucketDirdelete(*, recursive=False)recursive=True)Shared structural protocols (new
ebrains_drive.base)typing.Protocol,@runtime_checkable)StorageClientDriveApiClient,BucketApiClientContainerManagerRepos,BucketsContainerRepo,BucketStorageObjectSeafFile,DataproxyFileBackwards compatibility
BucketApiClient.create_new/delete_bucketemitDeprecationWarningand delegate to the new canonical entry points.