dev-master
9999999-dev http://pear.php.net/package/Text_LanguageDetect
Detects the language of a given piece of text.
BSD
Requires
- php >=5.3.0
by Nicholas Pisarro
by Popadjuk Oleh
by Webmil
language detect
Detects the language of a given piece of text.
The package attempts to detect the language of a text sample by comparing the sample's ranked trigram (3-gram) frequencies against stored trigram frequency tables for known languages.
It implements a version of the technique originally proposed by Cavnar & Trenkle (1994) in "N-Gram-Based Text Categorization".
This is a fork of Text_LanguageDetect 0.3.0 (alpha).
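The rank-order comparison described above can be illustrated with a minimal sketch. This is not the package's implementation: the function names `rankedTrigrams`, `outOfPlaceDistance`, and `closestLanguage` are hypothetical, the two one-sentence "profiles" are toy data, and the sketch assumes valid UTF-8 input and an available mbstring extension. It also reports a distance (lower is better), whereas the package reports similarity scores (higher is better).

```php
<?php
// Illustrative sketch of rank-order trigram matching (after Cavnar & Trenkle).
// NOT the package's code: every function name here is hypothetical.

/** Map each trigram of $text to its frequency rank (0 = most frequent). */
function rankedTrigrams($text, $limit = 300)
{
    // Lowercase and collapse every run of non-letters into a single space.
    $clean = preg_replace('/[^\p{L}]+/u', ' ', mb_strtolower($text, 'UTF-8'));
    $clean = ' ' . trim($clean) . ' ';

    $counts = array();
    $len = mb_strlen($clean, 'UTF-8');
    for ($i = 0; $i + 3 <= $len; $i++) {
        $gram = mb_substr($clean, $i, 3, 'UTF-8');
        $counts[$gram] = isset($counts[$gram]) ? $counts[$gram] + 1 : 1;
    }

    arsort($counts); // most frequent trigram first
    return array_flip(array_slice(array_keys($counts), 0, $limit));
}

/** Sum of rank differences; a trigram missing from the profile costs the maximum penalty. */
function outOfPlaceDistance(array $sample, array $profile)
{
    $maxPenalty = count($profile);
    $distance = 0;
    foreach ($sample as $gram => $rank) {
        $distance += isset($profile[$gram]) ? abs($rank - $profile[$gram]) : $maxPenalty;
    }
    return $distance;
}

/** Return the key of the profile with the smallest distance to $text. */
function closestLanguage($text, array $profiles)
{
    $sample = rankedTrigrams($text);
    $best = null;
    $bestDistance = PHP_INT_MAX;
    foreach ($profiles as $lang => $profile) {
        $d = outOfPlaceDistance($sample, $profile);
        if ($d < $bestDistance) {
            $best = $lang;
            $bestDistance = $d;
        }
    }
    return $best;
}

// Toy profiles built from one sentence each; real profiles need far more text.
$profiles = array(
    'english' => rankedTrigrams('the quick brown fox jumps over the lazy dog and then the cat'),
    'german'  => rankedTrigrams('der schnelle braune fuchs springt über den faulen hund und die katze'),
);

echo closestLanguage('the dog and the fox', $profiles), "\n";
```

Ranking trigrams instead of comparing raw counts keeps the comparison independent of sample length, which is the main appeal of the rank-order approach.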
- PHP version: PHP 5.3 or newer
- PHP extension: pcre
- PHP extension: mbstring (optional)
<?php
// Assumption: the package was installed with Composer, so its autoloader is available.
require __DIR__ . '/vendor/autoload.php';

use TextLanguageDetect\TextLanguageDetect;
use TextLanguageDetect\LanguageDetect\TextLanguageDetectException;

$l = new TextLanguageDetect();

// List the languages the detector knows about.
echo "Supported languages:\n";
try {
    $langs = $l->getLanguages();
    sort($langs);
    echo implode(', ', $langs) . "\n\n";
} catch (TextLanguageDetectException $e) {
    die($e->getMessage());
}

$text = <<<EOD
Hallo! Das ist ein Text in deutscher Sprache.
Mal sehen, ob die Klasse erkennt, welche Sprache das hier ist.
EOD;

try {
    // Return 2-letter language codes only.
    $l->setNameMode(2);
    // Ask for the 4 best matches.
    $result = $l->detect($text, 4);
    print_r($result);
} catch (TextLanguageDetectException $e) {
    die($e->getMessage());
}
Output:

Supported languages:
albanian, arabic, azeri, bengali, bulgarian, cebuano, croatian, czech, danish, dutch, english, estonian, farsi, finnish, french, german, hausa, hawaiian, hindi, hungarian, icelandic, indonesian, italian, kazakh, kyrgyz, latin, latvian, lithuanian, macedonian, mongolian, nepali, norwegian, pashto, pidgin, polish, portuguese, romanian, russian, serbian, slovak, slovene, somali, spanish, swahili, swedish, tagalog, turkish, ukrainian, urdu, uzbek, vietnamese, welsh

Array
(
    [de] => 0.40703703703704
    [nl] => 0.2880658436214
    [en] => 0.28333333333333
    [da] => 0.23452674897119
)
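The result array maps language codes to similarity scores, so callers usually need to decide whether the top score is trustworthy. A minimal sketch of that step follows. The helper `bestMatch` and its threshold parameters are hypothetical and not part of the package; the scores are copied from the sample output above rather than produced by a live call.

```php
<?php
// Scores copied from the sample output; in real use this would be the array returned by detect().
$result = array(
    'de' => 0.40703703703704,
    'nl' => 0.2880658436214,
    'en' => 0.28333333333333,
    'da' => 0.23452674897119,
);

/**
 * Hypothetical helper: return the best-scoring language code, or null when the
 * top score is below $minScore or leads the runner-up by less than $minMargin.
 */
function bestMatch(array $scores, $minScore = 0.0, $minMargin = 0.0)
{
    if (count($scores) === 0) {
        return null;
    }
    arsort($scores); // highest similarity first
    $langs = array_keys($scores);
    $top = $scores[$langs[0]];
    $runnerUp = isset($langs[1]) ? $scores[$langs[1]] : 0.0;
    if ($top < $minScore || ($top - $runnerUp) < $minMargin) {
        return null;
    }
    return $langs[0];
}

// The thresholds 0.3 and 0.1 are arbitrary illustration values, not recommendations.
var_dump(bestMatch($result, 0.3, 0.1));
```

Requiring a margin over the runner-up guards against short or mixed-language samples, where closely related languages such as Dutch and German can score similarly.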
Nicholas Pisarro - infinityminusnine+pear@gmail.com
BSD license: http://www.debian.org/misc/bsd.license