2017 © Pedro Peláez
 

library pdf-to-text

Extract text from a pdf

spatie/pdf-to-text

  • Wednesday, March 7, 2018
  • by Spatie
  • Repository
  • 13 Watchers
  • 158 Stars
  • 48,394 Installations
  • PHP
  • 4 Dependents
  • 0 Suggesters
  • 38 Forks
  • 1 Open issues
  • 7 Versions
  • 33 % Grown

The README.md

Extract text from a pdf

This package provides a class to extract text from a pdf.

use Spatie\PdfToText\Pdf;

echo Pdf::getText('book.pdf'); //returns the text from the pdf

Spatie is a web design agency based in Antwerp, Belgium. You'll find an overview of all our open source projects on our website.

Support us

We invest a lot of resources into creating best-in-class open source packages. You can support us by buying one of our paid products.

We highly appreciate you sending us a postcard from your hometown, mentioning which of our package(s) you are using. You'll find our address on our contact page. We publish all received postcards on our virtual postcard wall.

Requirements

Behind the scenes this package leverages pdftotext. You can verify whether the binary is installed on your system by issuing this command:

which pdftotext

If it is installed, the command prints the path to the binary.

To install the binary you can use this command on Ubuntu or Debian:

apt-get install poppler-utils

On a Mac you can install the binary using Homebrew:

brew install poppler

If you're on RedHat, CentOS, Rocky Linux or Fedora use this:

yum install poppler-utils

Installation

You can install the package via composer:

composer require spatie/pdf-to-text

Usage

Extracting text from a pdf is easy.

$text = (new Pdf())
    ->setPdf('book.pdf')
    ->text();

Or easier:

echo Pdf::getText('book.pdf');

By default the package will assume that the pdftotext command is located at /usr/bin/pdftotext. If it is located elsewhere, pass its binary path to the constructor:

$text = (new Pdf('/custom/path/to/pdftotext'))
    ->setPdf('book.pdf')
    ->text();

or as the second parameter to the getText static method:

echo Pdf::getText('book.pdf', '/custom/path/to/pdftotext');
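
When the location of the binary differs between machines, the path can also be resolved at runtime instead of being hard-coded. The sketch below is only an illustration: findBinary is a hypothetical helper, not part of this package, and it assumes a Unix-like system where the executable is named pdftotext and sits in one of the directories listed in PATH.

```php
<?php

// Hypothetical helper (not part of spatie/pdf-to-text): returns the full
// path of an executable found on PATH, or null when none is found.
function findBinary(string $name): ?string
{
    $directories = explode(PATH_SEPARATOR, (string) getenv('PATH'));

    foreach ($directories as $directory) {
        $candidate = rtrim($directory, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR . $name;

        if (is_file($candidate) && is_executable($candidate)) {
            return $candidate;
        }
    }

    return null;
}

$binary = findBinary('pdftotext');

if ($binary === null) {
    echo "pdftotext was not found on PATH\n";
} else {
    echo "pdftotext found at {$binary}\n";
    // With the package installed, the path can be handed over as shown above:
    // echo \Spatie\PdfToText\Pdf::getText('book.pdf', $binary);
}
```

This mirrors what `which pdftotext` does on the command line, so a null result suggests the binary still needs to be installed.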

Sometimes you may want to use pdftotext options. To do so you can set them up using the setOptions method.

$text = (new Pdf())
    ->setPdf('table.pdf')
    ->setOptions(['layout', 'r 96'])
    ->text()
;

or as the third parameter to the getText static method:

echo Pdf::getText('book.pdf', null, ['layout', 'opw myP1$$Word']);
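
The options used in these examples correspond to pdftotext command-line flags: -layout keeps the original physical layout as well as possible, -r sets the resolution in DPI, -f sets the first page to convert, and -opw supplies the owner password. As a rough mental model, each option string could be turned into command-line arguments as in the sketch below. The toArguments function is hypothetical and written only for illustration; the package's internal handling of options may differ.

```php
<?php

// Hypothetical helper, for illustration only: turns option strings such as
// 'layout' or 'r 96' into pdftotext-style command-line arguments.
function toArguments(array $options): array
{
    $arguments = [];

    foreach ($options as $option) {
        // Split "r 96" into the flag name and its value (at most two parts).
        $parts = explode(' ', trim($option), 2);

        // pdftotext flags start with a single dash: -layout, -r, -f, -opw.
        $arguments[] = '-' . ltrim($parts[0], '-');

        if (isset($parts[1])) {
            $arguments[] = $parts[1];
        }
    }

    return $arguments;
}

print_r(toArguments(['layout', 'r 96'])); // ['-layout', '-r', '96']
```

Under that assumption, the first example above would be roughly equivalent to running pdftotext with `-layout -r 96` against table.pdf.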

Please note that successive calls to setOptions() will overwrite options passed in during previous calls.

If you need to make multiple calls to add options (for example if you need to pass in default options when creating the Pdf object from a container, and then add context-specific options elsewhere), you can use the addOptions() method:

$text = (new Pdf())
    ->setPdf('table.pdf')
    ->setOptions(['layout', 'r 96'])
    ->addOptions(['f 1'])
    ->text()
;
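
The difference between overwriting and appending can be seen in isolation with a small stand-in class. OptionBag below is hypothetical and only mimics the behaviour described above; it is not the package's Pdf class and does not call pdftotext.

```php
<?php

// Hypothetical stand-in for illustration only; not the package's Pdf class.
final class OptionBag
{
    /** @var string[] */
    private array $options = [];

    // Replaces whatever was set before.
    public function setOptions(array $options): self
    {
        $this->options = array_values($options);

        return $this;
    }

    // Appends to the options that are already present.
    public function addOptions(array $options): self
    {
        $this->options = array_merge($this->options, array_values($options));

        return $this;
    }

    public function all(): array
    {
        return $this->options;
    }
}

$replaced = (new OptionBag())->setOptions(['layout', 'r 96'])->setOptions(['f 1'])->all();
$combined = (new OptionBag())->setOptions(['layout', 'r 96'])->addOptions(['f 1'])->all();

print_r($replaced); // only 'f 1' remains
print_r($combined); // 'layout', 'r 96' and 'f 1'
```

In a container setup, the defaults would go through setOptions() once, and every later, context-specific addition through addOptions().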

Change log

Please see CHANGELOG for more information about what has changed recently.

Testing

 composer test

Contributing

Please see CONTRIBUTING for details.

Security

If you've found a security-related bug, please mail security@spatie.be instead of using the issue tracker.

Credits

About Spatie

Spatie is a web design agency based in Antwerp, Belgium. You'll find an overview of all our open source projects on our website.

License

The MIT License (MIT). Please see License File for more information.

The Versions

All versions are released under the MIT license. Source: https://github.com/spatie/pdf-to-text

  • dev-master (9999999-dev), 7 March 2018
  • 1.1.0 (1.1.0.0), 7 March 2018
  • 1.0.3 (1.0.3.0), 20 February 2018
  • 1.0.2 (1.0.2.0), 13 November 2017
  • 1.0.1 (1.0.1.0), 16 March 2016
  • 1.0.0 (1.0.0.0), 31 December 2015
  • 0.0.1 (0.0.1.0), 31 December 2015