This package provides a class to extract text from a PDF.
use Spatie\PdfToText\Pdf;
echo Pdf::getText('book.pdf'); //returns the text from the pdf
Spatie is a webdesign agency based in Antwerp, Belgium. You'll find an overview of all our open source projects on our website.
Support us
We invest a lot of resources into creating best-in-class open source packages. You can support us by buying one of our paid products.
We highly appreciate you sending us a postcard from your hometown, mentioning which of our package(s) you are using. You'll find our address on our contact page. We publish all received postcards on our virtual postcard wall.
Requirements
Behind the scenes this package leverages pdftotext. You can verify that the binary is installed on your system by issuing this command:
which pdftotext
If it is installed, it will return the path to the binary.
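If which is not available in your shell, the built-in command -v performs the same lookup. The following is a minimal sketch; the function name find_pdftotext is hypothetical and not part of this package:

```shell
#!/bin/sh
# Hypothetical helper: print the path of the pdftotext binary,
# or print a hint on stderr and return 1 when it is not on PATH.
find_pdftotext() {
    if bin_path=$(command -v pdftotext); then
        printf '%s\n' "$bin_path"
    else
        echo "pdftotext not found; install poppler first" >&2
        return 1
    fi
}
```

Calling find_pdftotext prints the same path that which pdftotext would report when the binary is installed.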
To install the binary you can use this command on Ubuntu or Debian:
apt-get install poppler-utils
On a Mac you can install the binary using brew:
brew install poppler
If you're on RedHat, CentOS, Rocky Linux, or Fedora, use this:
yum install poppler-utils
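The three install commands above can be wrapped in a small helper that prints the one matching the package manager found on the current system. This is only a sketch: the function name suggest_install is hypothetical, and it assumes the first package manager found is the one you want to use.

```shell
#!/bin/sh
# Hypothetical helper: print the install command for the first
# package manager found on PATH. It only prints; it installs nothing.
suggest_install() {
    if command -v apt-get >/dev/null 2>&1; then
        echo "apt-get install poppler-utils"
    elif command -v yum >/dev/null 2>&1; then
        echo "yum install poppler-utils"
    elif command -v brew >/dev/null 2>&1; then
        echo "brew install poppler"
    else
        echo "no supported package manager found" >&2
        return 1
    fi
}
```

Run suggest_install and copy the printed command, adding sudo where your system requires it.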
Installation
You can install the package via composer:
composer require spatie/pdf-to-text
Usage
Extracting text from a PDF is easy.
$text = (new Pdf())
->setPdf('book.pdf')
->text();
Or easier:
echo Pdf::getText('book.pdf');
By default the package will assume that the pdftotext command is located at /usr/bin/pdftotext.
If it is located elsewhere, pass its binary path to the constructor:
$text = (new Pdf('/custom/path/to/pdftotext'))
->setPdf('book.pdf')
->text();
or as the second parameter to the getText static method:
echo Pdf::getText('book.pdf', '/custom/path/to/pdftotext');
Sometimes you may want to use pdftotext options. To do so, set them using the setOptions method:
$text = (new Pdf())
->setPdf('table.pdf')
->setOptions(['layout', 'r 96'])
->text()
;
or as the third parameter to the getText static method:
echo Pdf::getText('book.pdf', null, ['layout', 'opw myP1$$Word']);
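Each option string looks like a pdftotext flag without its leading hyphen. Assuming the package passes the options to the binary that way (an assumption about its internals, not something stated here), the setOptions example above would correspond to the command line printed by this sketch. The helper name equivalent_command is hypothetical; -layout and -r are documented pdftotext flags.

```shell
#!/bin/sh
# Hypothetical helper (not part of the package): print the pdftotext
# command line that a list of option strings would correspond to,
# assuming each option is passed with a leading hyphen.
equivalent_command() {
    file=$1
    shift
    cmd="pdftotext"
    for opt in "$@"; do
        cmd="$cmd -$opt"
    done
    # The trailing "-" tells pdftotext to write the text to standard output.
    printf '%s\n' "$cmd $file -"
}

equivalent_command table.pdf 'layout' 'r 96'
# prints: pdftotext -layout -r 96 table.pdf -
```

Running the printed command by hand can be a quick way to check that an option does what you expect before passing it through the package.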
Please note that successive calls to setOptions() will overwrite options passed in during previous calls.
If you need to make multiple calls to add options (for example, if you need to pass in default options when creating the Pdf object from a container and then add context-specific options elsewhere), you can use the addOptions() method:
$text = (new Pdf())
->setPdf('table.pdf')
->setOptions(['layout', 'r 96'])
->addOptions(['f 1'])
->text()
;
Change log
Please see CHANGELOG for more information about what has changed recently.
Testing
composer test
Contributing
Please see CONTRIBUTING for details.
Security
If you've found a security-related bug, please mail security@spatie.be instead of using the issue tracker.
Credits
About Spatie
Spatie is a webdesign agency based in Antwerp, Belgium. You'll find an overview of all our open source projects on our website.
License
The MIT License (MIT). Please see License File for more information.