feat: reduce Hatchet payload size by removing words from topic chunk …#857

Draft
deardarlingoose wants to merge 1 commit into main from feat/payload-thinning
@deardarlingoose
Contributor

Fat payloads choke the Postgres notification mechanism, causing RPC errors in Hatchet. Those errors confuse any statistical log analysis and make it unmanageable.

I assume it also shadows some other existing errors.

no-mistaken part:

…workflows

Remove ~6.5MB of redundant Word data from Hatchet task boundaries:

  • Remove words from TopicChunkInput/TopicChunkResult (child workflow I/O)
  • detect_topics maps words from local chunks by chunk_index instead
  • TopicsResult carries empty transcript words (persisted to DB already)
  • extract_subjects refetches topics from DB instead of task output
  • Clear topics at detect_topics start for retry idempotency
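
A minimal sketch of the idea behind the bullets above, not the code in this PR. Only `TopicChunkInput`, `TopicChunkResult`, `detect_topics` and `chunk_index` come from the description; every field, the `Word` and `Topic` shapes, `run_child`, and the in-memory `store` standing in for the DB are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    # Hypothetical word shape; the real model may carry more fields.
    text: str
    start: float
    end: float


@dataclass
class TopicChunkInput:
    # Thin child-workflow input: no per-word data crosses the task boundary.
    chunk_index: int
    chunk_text: str


@dataclass
class TopicChunkResult:
    # Thin child-workflow output: the parent re-attaches words itself.
    chunk_index: int
    title: str


@dataclass
class Topic:
    title: str
    words: list[Word] = field(default_factory=list)


def run_child(chunk: TopicChunkInput) -> TopicChunkResult:
    # Stand-in for the child workflow (hypothetical helper).
    return TopicChunkResult(chunk.chunk_index, f"topic {chunk.chunk_index}")


def detect_topics(
    chunks: list[list[Word]],
    store: dict[str, list[Topic]],
    transcript_id: str,
) -> int:
    # Clear earlier results first so a retried task does not duplicate topics.
    store[transcript_id] = []

    inputs = [
        TopicChunkInput(i, " ".join(w.text for w in chunk))
        for i, chunk in enumerate(chunks)
    ]
    results = [run_child(inp) for inp in inputs]

    for res in results:
        # Words come from the parent's local chunks, looked up by chunk_index.
        words = chunks[res.chunk_index]
        store[transcript_id].append(Topic(res.title, words))

    # Return only a small summary; downstream tasks refetch topics from the store.
    return len(store[transcript_id])
```

Calling `detect_topics` twice with the same `transcript_id` leaves the same number of topics in the store, which is the retry behaviour the last bullet describes.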

