@meirk-brd
Collaborator

Summary:

  • Introduce fetch_pdf tool that streams PDFs via Bright Data's unblocker with size caps and extracts text using pdf-parse, guarded by a shared p-limit to prevent memory spikes.
  • Add pdf-parse and p-limit runtime dependencies.
  • Harden PDF handling with try/catch around streaming and parsing; surface clean errors to clients.
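The flow described in the summary (size-capped streaming, concurrency-limited parsing, errors returned as values) can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `readCapped`, `createLimiter`, `extractPdfText`, and the 10 MiB default cap are hypothetical names and values. The parser is passed in as a parameter so the sketch runs without dependencies, and `createLimiter` is a hand-rolled stand-in for what p-limit provides. Wrapping both buffering and parsing in the limiter is also an assumption.

```typescript
import { Readable } from "node:stream";

// Hypothetical default cap; the PR does not state the actual limit.
const MAX_PDF_BYTES = 10 * 1024 * 1024;

// Assumed parser shape: takes the PDF bytes, resolves to an object with `text`.
type PdfParser = (data: Buffer) => Promise<{ text: string }>;
type Limiter = <T>(task: () => Promise<T>) => Promise<T>;
type PdfResult = { ok: true; text: string } | { ok: false; error: string };

/** Buffer a stream, aborting as soon as it grows past maxBytes. */
export async function readCapped(
  stream: Readable,
  maxBytes: number = MAX_PDF_BYTES
): Promise<Buffer> {
  const chunks: Buffer[] = [];
  let total = 0;
  for await (const chunk of stream) {
    const buf = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk);
    total += buf.length;
    if (total > maxBytes) {
      stream.destroy(); // stop pulling data we will never use
      throw new Error(`PDF exceeds size cap of ${maxBytes} bytes`);
    }
    chunks.push(buf);
  }
  return Buffer.concat(chunks);
}

/** Minimal stand-in for p-limit: at most `concurrency` tasks run at once. */
export function createLimiter(concurrency: number): Limiter {
  let active = 0;
  const waiting: Array<() => void> = [];
  return async function run<T>(task: () => Promise<T>): Promise<T> {
    if (active >= concurrency) {
      // Wait until a finishing task hands its slot over to us.
      await new Promise<void>((resolve) => waiting.push(resolve));
    } else {
      active++;
    }
    try {
      return await task();
    } finally {
      const next = waiting.shift();
      if (next) next(); // pass the slot directly to the next waiter
      else active--;
    }
  };
}

/** Read and parse a PDF stream, returning errors as values instead of throwing. */
export async function extractPdfText(
  stream: Readable,
  parse: PdfParser,
  limit: Limiter,
  maxBytes: number = MAX_PDF_BYTES
): Promise<PdfResult> {
  try {
    const text = await limit(async () => {
      const data = await readCapped(stream, maxBytes);
      return (await parse(data)).text;
    });
    return { ok: true, text };
  } catch (err) {
    // Surface a clean message to the client rather than crashing the process.
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

In a real tool handler, one limiter instance would be created at module scope and shared across all requests, so that a burst of large PDFs is parsed a few at a time rather than all at once.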

Testing:

  • Invoke fetch_pdf against an arXiv PDF and confirm that errors are reported cleanly, without crashing the process.
