Handle very small pdf's #580

Merged
Borewit merged 1 commit into sindresorhus:main from eric-yuan-vanta:min-pdf
Feb 18, 2023

eric-yuan-vanta (Contributor) commented Feb 10, 2023

Handle tiny PDFs, and add a test PDF smaller than 1350 bytes (the test PDF is MIT licensed and comes from https://brendanzagaeski.appspot.com/0004.html).

FYI, generating an empty PDF from Google Drive produces a file of similar size (~700 bytes).

resolves #579

eric-yuan-vanta (Contributor, Author)

Hey @sindresorhus, I would appreciate a review on this when you get a chance. Thanks!

eric-yuan-vanta changed the title from "Handle tiny pdf's" to "Handle very small pdf's" Feb 13, 2023
sindresorhus requested a review from Borewit February 14, 2023 05:58
eric-yuan-vanta requested review from sindresorhus and removed request for Borewit February 14, 2023 17:18
eric-yuan-vanta (Contributor, Author)

I re-requested review and GitHub seemed to automatically remove @Borewit. Not sure why!

eric-yuan-vanta requested review from Borewit and removed request for sindresorhus February 17, 2023 17:05
throw errors

lint

lint

simplify

wrap all AI reads in try catch

move only ingore to try catch, early return pdf

Revert "move only ingore to try catch, early return pdf"

This reverts commit 3b90419.
const buffer = Buffer.alloc(Math.min(maxBufferSize, tokenizer.fileInfo.size));
await tokenizer.readBuffer(buffer, {mayBeLess: true});
try {
await tokenizer.ignore(1350);
Collaborator

This remains a sensitive point, but it was not introduced by this PR.
Maybe we should drop support for specialized PDF formats, as text-based formats are out of scope.
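
For readers following the thread, here is a minimal sketch of why the `tokenizer.ignore(1350)` call in the snippet above is sensitive. It assumes the tokenizer rejects with an end-of-stream error when asked to skip more bytes than the input holds. `TinyTokenizer`, `EndOfStreamError`, `tryIgnore`, and `detectPdf` are hypothetical stand-ins written for illustration; they are not the strtok3 or file-type APIs, and the merged change may be structured differently. Whatever further inspection follows the skip is omitted here.

```javascript
// Hypothetical error type, standing in for the tokenizer's end-of-stream error.
class EndOfStreamError extends Error {
	constructor() {
		super('End-Of-Stream');
		this.name = 'EndOfStreamError';
	}
}

// Hypothetical in-memory tokenizer (an assumption, not the real tokenizer).
class TinyTokenizer {
	constructor(bytes) {
		this.bytes = bytes;
		this.position = 0;
		this.fileInfo = {size: bytes.length};
	}

	// Skips `length` bytes; rejects if that would run past the end of the data.
	async ignore(length) {
		if (this.position + length > this.bytes.length) {
			throw new EndOfStreamError();
		}

		this.position += length;
		return length;
	}
}

// Resolves to true if the skip succeeded, false if the input was too short.
async function tryIgnore(tokenizer, length) {
	try {
		await tokenizer.ignore(length);
		return true;
	} catch (error) {
		if (error instanceof EndOfStreamError) {
			return false; // Input is shorter than `length`; not a real failure.
		}

		throw error; // Anything else is unexpected and should surface.
	}
}

// Reports a PDF from its "%PDF" header, even when the file is tiny.
async function detectPdf(bytes) {
	const header = [0x25, 0x50, 0x44, 0x46]; // "%PDF"
	if (!header.every((value, index) => bytes[index] === value)) {
		return undefined;
	}

	const tokenizer = new TinyTokenizer(bytes);
	const skipped = await tryIgnore(tokenizer, 1350);

	// `skipped` only tells the caller whether deeper inspection was possible.
	return {ext: 'pdf', mime: 'application/pdf', inspectedFurther: skipped};
}
```

With this shape, a PDF of roughly 700 bytes is still reported as a PDF instead of surfacing an end-of-stream error, while errors of any other kind continue to propagate.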

Development

Successfully merging this pull request may close these issues.

End of stream error when checking the file type of a pdf