camspiers/statistical-classifier
Latest stable version: 0.8.0
Composer install command:
composer require camspiers/statistical-classifier
Package description
A PHP implementation of Complement Naive Bayes and SVM statistical classifiers, including a structure for building other classifiers, multiple data sources, and multiple caching backends.
README
PHP Classifier uses semantic versioning. It is currently at major version 0, so the public API should not be considered stable.
What is it?
PHP Classifier is a text classification library with a focus on reuse, customizability and performance. Classifiers can be used for many purposes, but are particularly useful in detecting spam.
Features
- Complement Naive Bayes Classifier
- SVM (libsvm) Classifier
- Highly customizable (easily modify or build your own classifier)
- Command-line interface via separate library (phar archive)
- Multiple data import types to get your data into the classifier (directory of files, database queries, JSON, serialized arrays)
- Multiple types of model caching
- Compatible with HipHop VM
Installation
$ composer require camspiers/statistical-classifier
SVM Support
For SVM support, both libsvm and php-svm are required. For installation instructions, refer to php-svm.
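Before constructing the SVM classifier, it can help to confirm that the extension is actually loaded. The following is a minimal sketch, assuming php-svm is installed as the PECL `svm` extension and exposes a global `SVM` class; the helper name `svmAvailable` is hypothetical and not part of this library.

```php
<?php
// Hypothetical helper (not part of statistical-classifier): reports whether
// the php-svm extension appears to be usable in the current PHP runtime.
function svmAvailable(): bool
{
    // Assumption: php-svm registers as the "svm" extension and defines a
    // global SVM class. The second argument disables autoloading.
    return extension_loaded('svm') && class_exists('SVM', false);
}

if (!svmAvailable()) {
    // Fail early with a readable message instead of a fatal "class not found".
    fwrite(STDERR, "SVM support is unavailable: install libsvm and php-svm first.\n");
}
```

A check like this could run once at bootstrap, falling back to the Complement Naive Bayes classifier when the extension is missing.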
Usage
Non-cached Naive Bayes
use Camspiers\StatisticalClassifier\Classifier\ComplementNaiveBayes;
use Camspiers\StatisticalClassifier\DataSource\DataArray;

// Build a small training set: category first, then the document text
$source = new DataArray();
$source->addDocument('spam', 'Some spam document');
$source->addDocument('spam', 'Another spam document');
$source->addDocument('ham', 'Some ham document');
$source->addDocument('ham', 'Another ham document');

$classifier = new ComplementNaiveBayes($source);
$classifier->is('ham', 'Some ham document'); // bool(true)
$classifier->classify('Some ham document'); // string "ham"
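For readers unfamiliar with the technique, the idea behind complement naive Bayes can be sketched in plain PHP. This is an illustrative simplification, not the library's implementation: it omits any term weighting or normalization the library may apply, and the function names (`tokenize`, `trainComplement`, `classifyComplement`) and the smoothing parameter `$alpha` are assumptions made for this example. Each category is scored by how well the document matches every *other* category, and the category with the lowest such score is chosen.

```php
<?php
// Illustrative complement naive Bayes sketch. All names here are hypothetical
// and independent of the statistical-classifier library.

function tokenize(string $text): array
{
    // Lowercase the text and split on anything that is not a letter a-z.
    return preg_split('/[^a-z]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
}

/**
 * @param array $documents category => list of document strings
 * @return array category => token => log weight
 */
function trainComplement(array $documents, float $alpha = 1.0): array
{
    $counts = []; // category => token => occurrences
    $vocab = [];  // token => true
    foreach ($documents as $category => $texts) {
        $counts[$category] = [];
        foreach ($texts as $text) {
            foreach (tokenize($text) as $token) {
                $counts[$category][$token] = ($counts[$category][$token] ?? 0) + 1;
                $vocab[$token] = true;
            }
        }
    }

    $weights = [];
    foreach (array_keys($counts) as $category) {
        // Sum token counts over every category except the current one.
        $complement = [];
        foreach ($counts as $other => $tokenCounts) {
            if ($other === $category) {
                continue;
            }
            foreach ($tokenCounts as $token => $n) {
                $complement[$token] = ($complement[$token] ?? 0) + $n;
            }
        }
        // Additive smoothing keeps unseen tokens from producing log(0).
        $total = array_sum($complement) + $alpha * count($vocab);
        foreach (array_keys($vocab) as $token) {
            $weights[$category][$token] = log((($complement[$token] ?? 0) + $alpha) / $total);
        }
    }
    return $weights;
}

function classifyComplement(array $weights, string $text): ?string
{
    $best = null;
    $bestScore = INF;
    foreach ($weights as $category => $tokenWeights) {
        $score = 0.0;
        foreach (tokenize($text) as $token) {
            // Tokens never seen in training contribute nothing.
            $score += $tokenWeights[$token] ?? 0.0;
        }
        // The lowest complement score means the poorest fit to the other categories.
        if ($score < $bestScore) {
            $bestScore = $score;
            $best = $category;
        }
    }
    return $best;
}

$weights = trainComplement([
    'spam' => ['Some spam document', 'Another spam document'],
    'ham'  => ['Some ham document', 'Another ham document'],
]);
echo classifyComplement($weights, 'Some ham document'), "\n"; // prints "ham"
```

Scoring against the complement rather than the category itself tends to be more robust when categories have very different amounts of training data, which is one reason the approach is popular for text classification.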
Non-cached SVM
use Camspiers\StatisticalClassifier\Classifier\SVM;
use Camspiers\StatisticalClassifier\DataSource\DataArray;

// Build a small training set: category first, then the document text
$source = new DataArray();
$source->addDocument('spam', 'Some spam document');
$source->addDocument('spam', 'Another spam document');
$source->addDocument('ham', 'Some ham document');
$source->addDocument('ham', 'Another ham document');

$classifier = new SVM($source);
$classifier->is('ham', 'Some ham document'); // bool(true)
$classifier->classify('Some ham document'); // string "ham"
Caching models
Caching models requires maximebf/CacheCache, which can be installed via Packagist. Additional caching systems can be easily integrated.
Cached Naive Bayes
use Camspiers\StatisticalClassifier\Classifier\ComplementNaiveBayes;
use Camspiers\StatisticalClassifier\Model\CachedModel;
use Camspiers\StatisticalClassifier\DataSource\DataArray;

$source = new DataArray();
$source->addDocument('spam', 'Some spam document');
$source->addDocument('spam', 'Another spam document');
$source->addDocument('ham', 'Some ham document');
$source->addDocument('ham', 'Another ham document');

// Store the trained model under 'mycachename' using a file-based cache backend
$model = new CachedModel(
    'mycachename',
    new CacheCache\Cache(
        new CacheCache\Backends\File(
            array(
                'dir' => __DIR__
            )
        )
    )
);

$classifier = new ComplementNaiveBayes($source, $model);
$classifier->is('ham', 'Some ham document'); // bool(true)
$classifier->classify('Some ham document'); // string "ham"
Cached SVM
use Camspiers\StatisticalClassifier\Classifier\SVM;
use Camspiers\StatisticalClassifier\Model\SVMCachedModel;
use Camspiers\StatisticalClassifier\DataSource\DataArray;

$source = new DataArray();
$source->addDocument('spam', 'Some spam document');
$source->addDocument('spam', 'Another spam document');
$source->addDocument('ham', 'Some ham document');
$source->addDocument('ham', 'Another ham document');

// SVMCachedModel is imported above, so it is referenced by its short name
$model = new SVMCachedModel(
    __DIR__ . '/model.svm',
    new CacheCache\Cache(
        new CacheCache\Backends\File(
            array(
                'dir' => __DIR__
            )
        )
    )
);

$classifier = new SVM($source, $model);
$classifier->is('ham', 'Some ham document'); // bool(true)
$classifier->classify('Some ham document'); // string "ham"
Unit testing
statistical-classifier/ $ composer install --dev
statistical-classifier/ $ phpunit
Statistics
- Total downloads: 37.01k
- Monthly downloads: 0
- Daily downloads: 0
- Favorites: 175
- Views: 1
- Dependent projects: 1
- Recommendations: 0
Other information
- License: MIT
- Last updated: 2013-03-14