This bundle works with Apache Tika.
Configuration
File config.yml:
funstaff_tika:
    tika_path: /path/to/tika-app-1.0.jar
    output_format: ~   # default: xml
    output_encoding: ~ # default: UTF-8
    logging: ~         # Uses the Symfony2 default. Set this param to force logging.
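To make these options concrete, here is a minimal sketch of how they could map onto a Tika app command line. It is not the bundle's actual implementation: the helper name `buildTikaCommand` is hypothetical, and the only assumption about Tika itself is that the app jar accepts the `--xml`, `--html`, `--text` and `--encoding=X` command-line options.

```php
<?php

/**
 * Hypothetical helper (not part of the bundle): builds the shell command
 * that would run the Tika app jar against a single document.
 */
function buildTikaCommand(string $tikaPath, string $document, ?string $format = null, ?string $encoding = null): string
{
    // A null value (~ in YAML) falls back to the defaults noted in config.yml.
    $format = $format ?? 'xml';
    $encoding = $encoding ?? 'UTF-8';

    if (!in_array($format, ['xml', 'html', 'text'], true)) {
        throw new InvalidArgumentException(sprintf('Unsupported output format "%s".', $format));
    }

    // Quote every user-supplied value before it reaches the shell.
    return sprintf(
        'java -jar %s --%s --encoding=%s %s',
        escapeshellarg($tikaPath),
        $format,
        escapeshellarg($encoding),
        escapeshellarg($document)
    );
}

echo buildTikaCommand('/path/to/tika-app-1.0.jar', '/path/to/foo', 'text'), PHP_EOL;
```

Passing `null` for the format or encoding mirrors leaving the option as `~` in config.yml, so the documented defaults apply.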
Examples
Extract content
$tika = $this->get('funstaff.tika')
    ->setOutputFormat('text')
    ->addDocument('foo', '/path/to/foo')
    ->extractContent();
Extract metadata
$tika = $this->get('funstaff.tika')
    ...
    ->extractMetadata();
Extract content and metadata
$tika = $this->get('funstaff.tika')
    ...
    ->extractAll();
Work with data
foreach ($tika->getDocuments() as $document) {
    $content = $document->getContent();
    $metadata = $document->getMetadata();
    $author = $metadata->get('Author');
}
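For background on where values such as `Author` come from: when run with `--metadata`, the Tika app prints one `Name: value` pair per line. The sketch below parses output of that shape into an array. The function `parseTikaMetadata` and the sample text are illustrative assumptions, not the bundle's own code or real extraction results.

```php
<?php

/**
 * Hypothetical helper (not part of the bundle): turns "Name: value" lines
 * into an associative array keyed by metadata name.
 */
function parseTikaMetadata(string $output): array
{
    $metadata = [];
    foreach (preg_split('/\R/', trim($output)) as $line) {
        // Split on the first ": " only, since values may themselves contain colons.
        $parts = explode(': ', $line, 2);
        if (count($parts) === 2) {
            $metadata[$parts[0]] = $parts[1];
        }
    }

    return $metadata;
}

// Made-up sample output, for illustration only.
$sample = "Author: Jane Doe\nContent-Type: application/pdf\ntitle: Report: Q1";
$metadata = parseTikaMetadata($sample);

echo $metadata['Author'], PHP_EOL;
```

Lines without a `: ` separator are skipped, so blank or malformed lines do not produce empty keys.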
Credits
Thanks to all users who gave feedback and committed code: https://github.com/Funstaff/FunstaffTikaBundle