Add limit to DefaultFileStatisticsCache #19052

@alamb

Description

@alamb

Also, the listing file statistics cache does not seem to have any memory limit, unlike the metadata cache, for example.

Is that by design, or do you think we need to add a similar limit for this cache too?

Originally posted by @bharath-techie in #18971 (comment)

Basically the cache used in ListingTable comes from here:
https://github.com/apache/datafusion/blob/81512da2b0aaa474f6c4ba205b05eea7b3095176/datafusion/core/src/datasource/listing_table_factory.rs#L188-L187

That code, somewhat unobviously, sets a DefaultFileStatisticsCache here:
https://github.com/apache/datafusion/blob/9f725d9c7064813cda0de0f87d115354b68d76e6/datafusion/catalog-listing/src/table.rs#L260-L259

The DefaultFileStatisticsCache has no limit:
https://github.com/apache/datafusion/blob/7d8b8602ad1be2f61f6a8ebb253ace9d85304ea7/datafusion/execution/src/cache/cache_unit.rs#L41-L40
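
To make the request concrete, here is a minimal sketch of what a size-limited statistics cache could look like. It is not DataFusion's code or API: `BoundedStatsCache`, its methods, and the caller-supplied `size_bytes` estimate are all hypothetical names chosen for illustration, and least-recently-used eviction is just one possible policy.

```rust
use std::collections::HashMap;

/// One cached value plus the bookkeeping needed for eviction.
struct Entry<V> {
    value: V,
    size_bytes: usize,
    last_used: u64,
}

/// Hypothetical byte-bounded cache keyed by file path (illustration only,
/// not the actual DefaultFileStatisticsCache).
pub struct BoundedStatsCache<V> {
    limit_bytes: usize,
    used_bytes: usize,
    tick: u64, // logical clock used to order accesses
    entries: HashMap<String, Entry<V>>,
}

impl<V> BoundedStatsCache<V> {
    pub fn new(limit_bytes: usize) -> Self {
        Self { limit_bytes, used_bytes: 0, tick: 0, entries: HashMap::new() }
    }

    /// Total estimated size of everything currently cached.
    pub fn used_bytes(&self) -> usize {
        self.used_bytes
    }

    /// Look up a path and mark it as recently used.
    pub fn get(&mut self, path: &str) -> Option<&V> {
        self.tick += 1;
        let tick = self.tick;
        let entry = self.entries.get_mut(path)?;
        entry.last_used = tick;
        Some(&entry.value)
    }

    /// Remove a path, releasing its share of the budget.
    pub fn remove(&mut self, path: &str) -> Option<V> {
        let entry = self.entries.remove(path)?;
        self.used_bytes -= entry.size_bytes;
        Some(entry.value)
    }

    /// Insert a value with a caller-supplied size estimate.
    /// Returns false if the value alone exceeds the limit and was not cached.
    pub fn put(&mut self, path: &str, value: V, size_bytes: usize) -> bool {
        // Replace any existing entry for this path.
        self.remove(path);
        if size_bytes > self.limit_bytes {
            return false;
        }
        // Evict least-recently-used entries until the new value fits.
        // The linear scan is O(n) per eviction; fine for a sketch.
        while self.used_bytes + size_bytes > self.limit_bytes {
            let victim = self
                .entries
                .iter()
                .min_by_key(|(_, e)| e.last_used)
                .map(|(k, _)| k.clone());
            match victim {
                Some(key) => {
                    self.remove(&key);
                }
                None => break,
            }
        }
        self.tick += 1;
        let entry = Entry { value, size_bytes, last_used: self.tick };
        self.entries.insert(path.to_string(), entry);
        self.used_bytes += size_bytes;
        true
    }
}
```

A real implementation would presumably need interior mutability for concurrent access and a way to estimate the in-memory size of each statistics entry. Both are left out of this sketch.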
