Skip to content

Conversation

@mattbit
Copy link
Member

@mattbit mattbit commented Aug 25, 2023

This is caused by bad formatting of newlines in csv read/write, see here.

Fixing also a series of other issues:

  • Fix cache loading when cache is disabled
  • Fix broken cache for text generation models (and added a test)
  • Fix broken cache path on Windows
  • Fix broken CSV writing on Windows
  • Ensure regression models output is casted to float

@mattbit mattbit requested a review from andreybavt August 25, 2023 10:37
@linear
Copy link

linear bot commented Aug 25, 2023

GSK-1617 Bug with ModelCache on Windows

Model cache fails on Windows with regression models, breaking all models.

self.model_type = model_type
self._warmed_up = False

def warm_up_from_disk(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not warm up at creation time instead of read_from_cache ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I don’t want to warm up if the cache is disabled. But it’s still better to create the ModelCache object in the model, so that it can be used if the cache is enabled at a later time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in case cache is disabled the cleanest will be not to initialize _cache property at all and then warm it up in the ModelCache constructor depending on a cache type (in memory only vs FS backed), WDYT?

Copy link
Member Author

@mattbit mattbit Aug 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But cache can be disabled temporarily. So if we don’t initialize the _cache property at all, we would need to check if it exists at every prediction, initializing it there and possibly warm it up at prediction time, in addition to checking that cache is enabled. That’s because cache could have been disabled when model was initialized, but enabled later.

In any case I would avoid doing expensive and persistent operations upon construction of the instance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok let's keep it this way

@mattbit mattbit requested a review from andreybavt August 25, 2023 11:01
@andreybavt andreybavt merged commit c01b5b8 into main Aug 25, 2023
@mattbit mattbit deleted the task/GSK-1617-cache-bug branch August 25, 2023 12:58
@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

94.7% 94.7% Coverage
0.0% 0.0% Duplication

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants