
cakephp-plugin exclude-similar-docs

Exclude similar documents

samokspv/exclude-similar-docs

  • Friday, January 2, 2015
  • by samokspv
  • Repository
  • 1 Watchers
  • 1 Stars
  • 2 Installations
  • PHP
  • 0 Dependents
  • 0 Suggesters
  • 0 Forks
  • 2 Open issues
  • 10 Versions
  • 0 % Grown

The README.md

ExcludeSimilarDocs

ExcludeSimilarDocs plugin for CakePHP.

Use it if you want to exclude similar documents from a list. Two exclusion types are available:

  • simple - excludes documents by exact-duplicate search
  • shingles - excludes documents by near-duplicate search (shingles algorithm)
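To illustrate the idea behind the shingles type, here is a minimal sketch of near-duplicate scoring. It is not the plugin's implementation: the function names sketchShingles and sketchSimilarity are hypothetical, shingles are assumed to be runs of whole words, and the score is assumed to be the Jaccard index of the two shingle sets expressed as a percentage.

```php
<?php
// Hypothetical helper (not part of the plugin): split a text into
// overlapping shingles of $length consecutive words and hash each one.
function sketchShingles($text, $length)
{
    $words = preg_split('/\s+/', strtolower(trim($text)), -1, PREG_SPLIT_NO_EMPTY);
    $shingles = array();
    $count = count($words) - $length + 1;
    for ($i = 0; $i < $count; $i++) {
        $shingles[] = md5(implode(' ', array_slice($words, $i, $length)));
    }
    return array_unique($shingles);
}

// Hypothetical helper: similarity of two texts in percent, computed as
// (shared shingles) / (all distinct shingles) * 100. This is an assumed
// metric; the plugin may score similarity differently.
function sketchSimilarity($a, $b, $length = 3)
{
    $sa = sketchShingles($a, $length);
    $sb = sketchShingles($b, $length);
    if (empty($sa) || empty($sb)) {
        return 0.0; // texts shorter than one shingle cannot be compared
    }
    $common = count(array_intersect($sa, $sb));
    $total = count(array_unique(array_merge($sa, $sb)));
    return 100.0 * $common / $total;
}
```

Under these assumptions, two documents would count as near-duplicates when their score exceeds a chosen threshold.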

Installation

cd my_cake_app/app
git clone git@github.com:samokspv/exclude-similar-docs.git Plugin/ExcludeSimilarDocs

or, if your application is a git repository, add it as a submodule:

cd my_cake_app
git submodule add "git@github.com:samokspv/exclude-similar-docs.git" "app/Plugin/ExcludeSimilarDocs"

Then load the plugin in Config/bootstrap.php:

CakePlugin::load('ExcludeSimilarDocs', array('bootstrap' => true, 'routes' => false));

Usage

Anywhere in your code:

App::uses('ExcludeSimilarDocs', 'ExcludeSimilarDocs.Utility');

Configure::write('ExcludeSimilarDocs', array(
    'types' => array(
        'simple' => array(
            'fields' => array(
                'title',
                'description'
            ) // document fields used for comparison
        ),
        'shingles' => array(
            'fields' => array(
                'title',
                'description'
            ), // document fields used for comparison
            'length' => 10, // length of a single shingle
            'allowSimilarity' => 1, // allowed similarity between documents, in percent
            'stopWords' => array(), // your own prepositions and conjunctions to strip from texts
            'stopSymbols' => array() // your own punctuation marks and other symbols to strip from texts
        )
    )
));
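The stopWords and stopSymbols options describe a text-cleaning step that happens before comparison. The following is only a sketch of what such a step might look like; sketchCleanText is a hypothetical name, and the order of operations (lowercase, strip symbols, drop words) is an assumption rather than the plugin's documented behaviour.

```php
<?php
// Hypothetical helper (not the plugin's code): normalise a text by
// replacing stop symbols with spaces and then dropping stop words.
function sketchCleanText($text, array $stopWords, array $stopSymbols)
{
    $text = strtolower($text);
    // Replace every stop symbol with a space so adjacent words stay separated.
    $text = str_replace($stopSymbols, ' ', $text);
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    // Remove stop words (compared case-insensitively).
    $words = array_diff($words, array_map('strtolower', $stopWords));
    return implode(' ', $words);
}

echo sketchCleanText('Police, and the protesters!', array('and', 'the'), array(',', '!'));
// prints "police protesters"
```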

For example:

---------------
(type - simple)

// array of document objects with title/description fields (default)
$arrayOfObjDocs = array(
    objDoc1, 
        /* 
        title = 
            'title 1'
        description = 
            'description 1'
        */
    objDoc2, 
        /* 
        title = 
            'title 1'
        description = 
            'description 2'
        */
    objDoc3 
        /* 
        title = 
            'title 3'
        description = 
            'description 1'
        */
    //...
);

$arrayOfObjDocs = ExcludeSimilarDocs::exclude($arrayOfObjDocs);
debug($arrayOfObjDocs); 
/*
    array(
        objDoc1 // objDoc2 and objDoc3 are excluded: 'title 1' and 'description 1' are already present
        //...
    );
*/
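The behaviour shown in the simple example (a document is dropped as soon as any compared field repeats a value already kept) can be sketched as follows. This is an illustration inferred from the example output, not the plugin's source; sketchExcludeSimple is a hypothetical name, and documents are assumed to be plain objects with public title and description properties.

```php
<?php
// Hypothetical helper (not the plugin's code): keep a document only if
// none of its compared fields repeats a value from a document kept earlier.
function sketchExcludeSimple(array $docs, array $fields = array('title', 'description'))
{
    $seen = array(); // $seen[field][value] = true for every kept document
    $kept = array();
    foreach ($docs as $doc) {
        $duplicate = false;
        foreach ($fields as $field) {
            if (isset($seen[$field][$doc->$field])) {
                $duplicate = true;
                break;
            }
        }
        if ($duplicate) {
            continue;
        }
        foreach ($fields as $field) {
            $seen[$field][$doc->$field] = true;
        }
        $kept[] = $doc;
    }
    return $kept;
}

$docs = array(
    (object)array('title' => 'title 1', 'description' => 'description 1'),
    (object)array('title' => 'title 1', 'description' => 'description 2'),
    (object)array('title' => 'title 3', 'description' => 'description 1'),
);
$kept = sketchExcludeSimple($docs);
echo count($kept); // prints 1: only the first document survives
```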

-----------------
(type - shingles)

// array of document objects with title/description fields (default)
$arrayOfObjDocs = array(
    objDoc1, 
        /* 
        title = 
            'Hong Kong Security Chief Says 6 Police Seen on Video Beating Protester Have Been Reassigned'
        description = 
            'Hong Kong security chief says 6 police seen on video beating protester have been reassigned'
        */

    objDoc2, 
        /* 
        title = 
            'Hong Kong police seen beating activist reassigned'
        description = 
            'HONG KONG (AP) — Hong Kong’s security chief says some police officers who were captured on video kicking a democracy protester during an operation to clear demonstrators from a tunnel have been reassigned. Secretary for Security Lai Tung-kwok said Wednesday that the officers were moved to other posts and that the police department is carrying out an investigation. (MORE: Violence flares in Hong Kong as protests continue) Local television channel TVB showed footage of around six plainclothes police officers taking a man around the side of a building, placing him on the ground and...'
        */

    objDoc3
        /* 
        title = 
            'Violence flares in Hong Kong as protests continue'
        description = 
            'Police officers push protesters out to a nearby park to clear the main roads outside government headquarters in Hong Kong’s Admiralty, Wednesday. Pic: AP. Violence flared again in Hong Kong last night as police clashed with protesters. Police are reported to have beaten pro-democracy protesters as they attempted to clear the streets after more than two weeks of rallies. Tuesday night’s clashes followed a tense two days on Hong Kong’s streets. Events have been so fast that even reporting has proven a challenge. Some barricades have been removed, others built. A major road...'
        */
    //...
);

$arrayOfObjDocs = ExcludeSimilarDocs::exclude($arrayOfObjDocs, array('type' => 'shingles'));
debug($arrayOfObjDocs); 
/*
    array(
        objDoc2,
        objDoc3
        //...
    );
*/

The Versions

All versions are described as "Exclude similar documents" and are released under the MIT license.

  • dev-master (9999999-dev), 02/01 2015, http://github.com/samokspv/exclude-similar-docs
  • 1.0.8 (1.0.8.0), 02/01 2015, http://github.com/samokspv/exclude-similar-docs
  • 1.0.7 (1.0.7.0), 02/01 2015, http://github.com/samokspv/exclude-similar-docs
  • 1.0.6 (1.0.6.0), 02/01 2015, http://github.com/samokspv/exclude-similar-docs
  • 1.0.5 (1.0.5.0), 25/12 2014, http://github.com/samokspv/exclude-similar-docs
  • 1.0.4 (1.0.4.0), 30/10 2014, http://github.com/samokspv/exclude-similar-docs
  • 1.0.3 (1.0.3.0), 30/10 2014, http://github.com/samokspv/exclude-similar-docs
  • 1.0.2 (1.0.2.0), 30/10 2014, http://github.com/samokspv/exclude-similar-docs
  • 1.0.1 (1.0.1.0), 29/10 2014, http://github.com/samokspv/exclude-similar-docs
  • 1.0.0 (1.0.0.0), 29/10 2014, http://github.com/samokspv/ExcludeSimilarDocs