Releases: smalot/pdfparser
v2.2.0
What's Changed
- Rework documentation by @rubenvanerk in #513
- fixes #520 (missing r in composer command in README.md) by @k00ni in #521
- Added font info to dataTm by @shtayerc in #516
- Added calculateTextWidth function to Font by @shtayerc in #517
- Add issue template by @rubenvanerk in #524
Full Changelog: v2.1.0...v2.2.0
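The two additions by @shtayerc (#516, #517) could be used roughly as sketched below. This is a minimal sketch under several assumptions: Composer's autoloader is already loaded, the Config option is enabled through a setter named `setDataTmFontInfoHasToBeIncluded`, the font id and font size are appended to each `getDataTm()` entry at index 2 and 3, and `calculateTextWidth` accepts the text plus a by-reference array for characters without a known width. The helper names `firstPageFontInfo` and `widthInFirstFont` are hypothetical and not part of the library.

```php
<?php

use Smalot\PdfParser\Config;
use Smalot\PdfParser\Parser;

/**
 * Hypothetical helper: return text, font id and font size for each
 * text fragment on the first page of the PDF at $path.
 */
function firstPageFontInfo(string $path): array
{
    $config = new Config();
    // Assumed setter name for the option added in #516.
    $config->setDataTmFontInfoHasToBeIncluded(true);

    $pdf = (new Parser([], $config))->parseFile($path);
    $pages = $pdf->getPages();
    if ([] === $pages) {
        return [];
    }

    $rows = [];
    foreach ($pages[0]->getDataTm() as $entry) {
        $rows[] = [
            'text' => $entry[1],
            // Assumed layout: index 2 is the font id, index 3 the font size.
            'fontId' => $entry[2] ?? null,
            'fontSize' => $entry[3] ?? null,
        ];
    }

    return $rows;
}

/**
 * Hypothetical helper: width of $text according to the document's first
 * font. Returns null if the document has no font; the library call itself
 * may also return null.
 */
function widthInFirstFont(string $path, string $text): ?float
{
    $font = (new Parser())->parseFile($path)->getFirstFont();
    if (null === $font) {
        return null;
    }

    // Characters without a known width are collected in $missing.
    $missing = [];

    return $font->calculateTextWidth($text, $missing);
}
```

The width is computed against the first font only to keep the sketch short; real code would look up the font that belongs to each fragment.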
v2.1.0
What's Changed
- Fix encoding for encoding dictionary without Type item. by @likemusic in #500
- Added decodeMemoryLimit to Config to avoid memory leaks. by @b3n-l in #476
- added short example how to parse base64 encoded PDFs by @granjero in #493
- Make horizontal offset configurable by @rubenvanerk in #505
- Link docs to wiki instead of pdfparser.org by @rubenvanerk in #506
- Add return types to tests methods. Fix todos in phpDocs. Add method's descriptions for Font class. by @likemusic in #509
Full Changelog: v2.0.1...v2.1.0
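Three of the v2.1.0 changes (#476, #493, #505) affect how the parser is configured or fed. A minimal sketch of how they might be combined, assuming Composer's autoloader is already loaded and that the setters are named `setDecodeMemoryLimit` and `setHorizontalOffset`. The limit of 1000000 (assumed to be in bytes) and the empty offset are illustrative values only, and the helpers `decodeBase64Pdf` and `buildParser` are hypothetical, not part of the library.

```php
<?php

use Smalot\PdfParser\Config;
use Smalot\PdfParser\Parser;

/**
 * Hypothetical helper: strictly decode a base64 string into raw PDF bytes.
 */
function decodeBase64Pdf(string $base64): string
{
    // Strict mode returns false if the input contains characters
    // outside the base64 alphabet.
    $binary = base64_decode($base64, true);
    if (false === $binary) {
        throw new InvalidArgumentException('Input is not valid base64.');
    }

    return $binary;
}

/**
 * Hypothetical helper: build a parser with the v2.1.0 options set.
 */
function buildParser(): Parser
{
    $config = new Config();
    // Illustrative value, assumed to be in bytes; not a recommendation.
    $config->setDecodeMemoryLimit(1000000);
    // Illustrative value; the best horizontal offset depends on the PDF.
    $config->setHorizontalOffset('');

    return new Parser([], $config);
}

// Usage, with $base64String holding the encoded document:
// $pdf = buildParser()->parseContent(decodeBase64Pdf($base64String));
// echo $pdf->getText();
```

Decoding in strict mode is a design choice of this sketch: it fails early on malformed input instead of handing partially decoded bytes to the parser.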
v2.0.1
Bugfix release
For PHP 7 users: In 2.0.0 we used a function which is PHP 8 only. It was fixed in #486.
- Font.php: Optimization of the uchr function by @mariuszkrzaczkowski in #467
- Fix Scrutinizer-integration: mark PageTest::testGetTextPullRequest457 as "memory-heavy" by @k00ni in #481
- Fixes #478 (/Index problem) by @yasheena in #479
Full Changelog: v2.0.0...v2.0.1
v2.0.0
Breaking Changes
❗All function parameters and return types are now typed. If you pass values that do not match the declared types, you may receive TypeErrors. Most of the typing affects internal code, so typical usage should not be affected. If you call internal functions, please check your code before going into production.
We initially planned to release this as 1.2.0 but ultimately jumped to 2.0.0 so that the backward-compatibility (BC) breaks land in a major release (see #480).
Highlights
- massive code refactoring (thanks to @jee7, #440)
- workaround to enable FPDFs (thanks to @izabala, #453)
- Added a cache for the Document's object dictionary, which also results in better performance in some cases (thanks to @jee7, #434)
- prevent endless loops during `Page->getText()` in some cases (thanks to @Nickmanbear, #457)
- Fixes invalid return type on unknown glyphs (thanks to @PrinsFrank, #459)
- Fix TypeError on `Document::getFirstFont` when no fonts are available (thanks to @PrinsFrank, #461)
- Fix TypeError on default font when no fonts available (#466, thanks to @PrinsFrank)
- Fix extractRawData, extractDecodedRawData, getDataTm and getDataXY not working with PDF files produced by FPDI/FPDF (#454, thanks to @izabala)
- Test backend was improved by @j0k3r (#460)
v1.2.0-RC2
❗Not production ready - We reworked our code base and added typed parameters as well as return values. If you find anything, please drop us a comment. Further information can be found in #468. Thank you in advance!❗
Changes since v1.2.0-RC1
- Fix TypeError on default font when no fonts available (#466, thanks to @PrinsFrank)
- Fix extractRawData, extractDecodedRawData, getDataTm and getDataXY not working with PDF files produced by FPDI/FPDF (#454, thanks to @izabala)
Further information about changes and fixes in 1.2.0 can be found here: https://github.com/smalot/pdfparser/releases/tag/v1.2.0-RC1
v1.2.0-RC1
Bug fix and performance release
❗Not production ready - We reworked our code base and added typed parameters as well as return values. If you find anything, please drop us a comment. Further information can be found in #468. Thank you in advance!❗
Highlights:
- massive code refactoring (thanks to @jee7, #440)
- workaround to enable FPDFs (thanks to @izabala, #453)
- Added a cache for the Document's object dictionary, which also results in better performance in some cases (thanks to @jee7, #434)
- prevent endless loops during `Page->getText()` in some cases (thanks to @Nickmanbear, #457)
- Fixes invalid return type on unknown glyphs (thanks to @PrinsFrank, #459)
- Fix TypeError on `Document::getFirstFont` when no fonts are available (thanks to @PrinsFrank, #461)
@j0k3r improved our test backend.
v1.1.0
Maintenance and small performance boost
PDFs with images can be parsed with less resource consumption (like memory) from now on. @Connum added a feature with #441 to ignore image data. It must be enabled manually though. You can do it easily:
    use Smalot\PdfParser\Config;
    use Smalot\PdfParser\Parser;

    // Skip image data while parsing to reduce memory usage.
    $config = new Config();
    $config->setRetainImageContent(false);

    $parser = new Parser([], $config);
    // $parser->parseFile(...)

Besides that, we fixed a problem with Scrutinizer (part of our test infrastructure).
v1.0.2
v1.0.1
v1.0.0
Highlights
- Removed support for PHP 5.6 and 7.0; PHP 7.1 or newer is now required❗
- extended `Config.php` with white space characters: it allows developers to override the regex used for white space recognition (#411, thanks @LucianoHanna)
- Fixed some test-infrastructure related issues (#412, #413, #414)