Releases: smalot/pdfparser

v2.2.0

12 Apr 10:06
193515a

Full Changelog: v2.1.0...v2.2.0

v2.1.0

03 Feb 14:07
4551cd0

What's Changed

  • Fix encoding for encoding dictionary without Type item. by @likemusic in #500
  • Added decodeMemoryLimit to Config to avoid memory leaks. by @b3n-l in #476
  • Added a short example of how to parse base64-encoded PDFs by @granjero in #493
  • Make horizontal offset configurable by @rubenvanerk in #505
  • Link docs to wiki instead of pdfparser.org by @rubenvanerk in #506
  • Add return types to test methods. Fix todos in phpDocs. Add method descriptions for Font class. by @likemusic in #509
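
The two new Config options and the base64 example from this release could be combined roughly as in the sketch below. It assumes the Config setters are named setDecodeMemoryLimit() and setHorizontalOffset() and that in-memory data is parsed with Parser::parseContent(); the limit value, the offset string, and the input file name are placeholders chosen for illustration, so check them against the version you have installed.

```php
<?php

require 'vendor/autoload.php';

use Smalot\PdfParser\Config;
use Smalot\PdfParser\Parser;

$config = new Config();

// Cap the memory used while decompressing streams, in bytes (#476).
// Assumption: 1000000 is an arbitrary example value.
$config->setDecodeMemoryLimit(1000000);

// String used for horizontal offsets between text fragments (#505).
// Assumption: an empty string is only an illustrative choice.
$config->setHorizontalOffset('');

$parser = new Parser([], $config);

// Hypothetical input: a file holding a PDF as a base64 string (#493).
$base64Pdf = file_get_contents('document.b64');
$pdf = $parser->parseContent(base64_decode($base64Pdf));

echo $pdf->getText();
```

Both options are opt-in, so a Parser created without a Config keeps its previous behavior.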

Full Changelog: v2.0.1...v2.1.0

v2.0.1

23 Nov 08:42
768d1d6

Bugfix release

For PHP 7 users: In 2.0.0 we used a function which is PHP 8 only. It was fixed in #486.

Full Changelog: v2.0.0...v2.0.1

v2.0.0

16 Nov 12:02
4d6864a

Breaking Changes

❗All function parameters and return types are now typed. If you pass values that do not match the declared types, you may get a TypeError. Most of the changes are internal and should not affect you. If you use internal functions, please check your code before going into production.

We initially planned to release this as 1.2.0 but jumped to 2.0.0 so that the backward-compatibility (BC) breaks land in a major release (see #480).

Highlights

  • Massive code refactoring (thanks to @jee7, #440)
  • Workaround to support PDFs generated by FPDF (thanks to @izabala, #453)
  • Added an object cache dictionary to Document, which also improves performance in some cases (thanks to @jee7, #434)
  • Prevent endless loops during Page->getText() in some cases (thanks to @Nickmanbear, #457)
  • Fix invalid return type on unknown glyphs (thanks to @PrinsFrank, #459)
  • Fix TypeError on Document::getFirstFont when no fonts are available (thanks to @PrinsFrank, #461)
  • Fix TypeError on default font when no fonts are available (thanks to @PrinsFrank, #466)
  • Fix extractRawData, extractDecodedRawData, getDataTm and getDataXY not working with PDF files produced by FPDI/FPDF (thanks to @izabala, #454)
  • Improved test backend (thanks to @j0k3r, #460)

v1.2.0-RC2

18 Oct 05:38
4d6864a

Pre-release

Not production ready: we reworked our code base and added typed parameters and return types. If you find any problems, please leave us a comment. Further information can be found in #468. Thank you in advance!❗

Changes since v1.2.0-RC1

  • Fix TypeError on default font when no fonts are available (thanks to @PrinsFrank, #466)
  • Fix extractRawData, extractDecodedRawData, getDataTm and getDataXY not working with PDF files produced by FPDI/FPDF (thanks to @izabala, #454)

Further information about changes and fixes in 1.2.0 can be found here: https://github.com/smalot/pdfparser/releases/tag/v1.2.0-RC1

v1.2.0-RC1

15 Oct 08:55
5ed3040

Pre-release

Bug fix and performance release

Not production ready: we reworked our code base and added typed parameters and return types. If you find any problems, please leave us a comment. Further information can be found in #468. Thank you in advance!❗

Highlights:

  • Massive code refactoring (thanks to @jee7, #440)
  • Workaround to support PDFs generated by FPDF (thanks to @izabala, #453)
  • Added an object cache dictionary to Document, which also improves performance in some cases (thanks to @jee7, #434)
  • Prevent endless loops during Page->getText() in some cases (thanks to @Nickmanbear, #457)
  • Fix invalid return type on unknown glyphs (thanks to @PrinsFrank, #459)
  • Fix TypeError on Document::getFirstFont when no fonts are available (thanks to @PrinsFrank, #461)

@j0k3r improved our test backend.

v1.1.0

16 Aug 07:07
43e436f

Maintenance and small performance boost

PDFs with images can now be parsed with lower resource consumption, memory in particular. In #441, @Connum added an option to ignore image data. It is not active by default and must be enabled manually:

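use Smalot\PdfParser\Config;
use Smalot\PdfParser\Parser;

// Discard raw image data while parsing to reduce memory usage.
$config = new Config();
$config->setRetainImageContent(false);
// Pass the config as the second constructor argument.
$parser = new Parser([], $config);
// $parser->parseFile(...)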

Besides that, we fixed a problem with Scrutinizer (part of our test infrastructure).

v1.0.2

21 Jun 07:50
35c8812

Bugfix release

  • Don't throw an exception if there is no base encoding defined (as of PDF 1.5 Reference Table 5.11) - #433, thanks @LucianoHanna

v1.0.1

08 Jun 06:46
b32bb7a

Bugfix release

  • Fixed decode octal regex (#421, thanks @gdiasb12)
  • Fixed remaining places which use Config class and threw exceptions (#420, #424, thanks @TivoSoho)

v1.0.0

28 Apr 08:00
d4148fd

Highlights

  • Removed support for PHP 5.6 and 7.0; PHP 7.1 or newer is now required❗
  • Extended Config.php with configurable whitespace characters, which allows developers to override the regex used for whitespace recognition (#411, thanks @LucianoHanna)
  • Fixed some test-infrastructure related issues (#412, #413, #414)
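
The whitespace override from #411 might be used along the lines of the sketch below. It assumes the Config setters are named setPdfWhitespaces() and setPdfWhitespacesRegex(); the character set shown is only an illustrative value, not a recommendation, and the input file name is hypothetical.

```php
<?php

require 'vendor/autoload.php';

use Smalot\PdfParser\Config;
use Smalot\PdfParser\Parser;

$config = new Config();

// Assumption: these setter names match the Config class of your version.
// Characters treated as whitespace while parsing (illustrative value).
$config->setPdfWhitespaces("\0\t\n\f\r ");
// Matching character class used in the parser's regular expressions.
$config->setPdfWhitespacesRegex('[\0\t\n\f\r ]');

$parser = new Parser([], $config);

// Hypothetical input file.
$pdf = $parser->parseFile('document.pdf');

echo $pdf->getText();
```

The two values describe the same character set in two forms, so they should be changed together.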