nmapx/bmdm-soundex

Latest stable version: 2.1.0

Composer install command:

composer create-project nmapx/bmdm-soundex

Package description

Beider-Morse plus Daitch-Mokotoff soundex

README

This is a fork of the algorithm developed by Alexander Beider and Stephen P. Morse for phonetic matching of names and words. The algorithm produces fewer false hits than soundex() and metaphone(). It can also be used with some non-Latin alphabets without transliteration.
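To illustrate the kind of false hit mentioned above, here is a minimal sketch that uses only PHP's built-in soundex(). The two names are arbitrary illustrations and are not taken from the project.

```php
<?php

// Two different names that the classic built-in soundex() cannot tell apart.
$names = ['Robert', 'Rupert'];

$codes = array_map('soundex', $names);

// Print each name next to its four-character soundex code.
foreach ($names as $i => $name) {
    echo $name, ' => ', $codes[$i], PHP_EOL;
}

// Both names collapse to the same code, so a lookup by code alone
// would treat them as a match.
$collide = ($codes[0] === $codes[1]);
echo $collide ? "same code\n" : "different codes\n";
```

Both names map to R163, which is why a coarse four-character key tends to return unrelated names together.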

Credits

Authors: Alexander Beider, Paris and Stephen P. Morse, San Francisco
Website: http://stevemorse.org/phoneticinfo.htm (source download, information and contacts)

Information

Sixteen languages are currently supported: Czech, Dutch, English, French, German, Greek (and Greek Latin), Hebrew, Hungarian, Italian, Latvian, Polish, Portuguese, Romanian, Russian (Latin and Cyrillic), Spanish, Turkish. BMPM (Beider-Morse Phonetic Matching), and BMDM as its derivative, can also parse Hebrew names by Ashkenazic and Sephardic rules.

Differences

The goal of this fork is to get rid of deprecated and global functions and global variables, and to present the algorithm in an OOP-like style. Some fixes and modifications were also made for unification purposes. With the procedural code left behind, the algorithm can now be included in frameworks and third-party applications without a headache. Experimental Latvian language support has been added.

Requirements

PHP 5.4+; mbstring extension
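The two requirements can be verified before loading the library. The following is a minimal sketch; checkBmdmRequirements() is a hypothetical helper written for this example and is not part of the package.

```php
<?php

/**
 * Return a list of unmet requirements (empty when everything is fine).
 * Hypothetical helper, not part of the library.
 */
function checkBmdmRequirements()
{
    $problems = [];

    // The README asks for PHP 5.4 or newer.
    if (version_compare(PHP_VERSION, '5.4.0', '<')) {
        $problems[] = 'PHP 5.4+ is required, found ' . PHP_VERSION;
    }

    // The README lists the mbstring extension as required.
    if (!extension_loaded('mbstring')) {
        $problems[] = 'The mbstring extension is not loaded';
    }

    return $problems;
}

// Print every unmet requirement; prints nothing when all are satisfied.
foreach (checkBmdmRequirements() as $problem) {
    echo $problem, PHP_EOL;
}
```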

Performance

I strongly encourage using PHP 7.0 or newer because of the major performance improvements over the 5.x versions, especially in array processing, which is crucial for BMDM. There is also built-in caching support: make sure that the ./runtime directory is writable and let BMDM precompile and cache its runtime rules. Caching also lowers I/O load. The original README includes charts of performance with and without caching and a link to the test results.
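Whether the cache directory is writable can be checked from a deployment script. This is a sketch under assumptions: ensureWritableDir() is a hypothetical helper, and the path you pass in should point at the ./runtime directory of your own installation. The demonstration below uses a throwaway directory under the system temp path instead.

```php
<?php

/**
 * Make sure a directory exists and is writable.
 * Hypothetical helper, not part of the library.
 */
function ensureWritableDir($dir)
{
    // Create the directory (and any missing parents) if it is absent.
    if (!is_dir($dir) && !mkdir($dir, 0775, true) && !is_dir($dir)) {
        return false;
    }

    return is_writable($dir);
}

// Demonstration against a temporary directory, removed again afterwards.
$demoDir = sys_get_temp_dir() . '/bmdm-runtime-demo-' . uniqid();
var_dump(ensureWritableDir($demoDir));
rmdir($demoDir);
```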

Usage

Include BMDM.php or, better, use Composer to install: composer require dautkom/bmdm

<?php

// Run `composer install` first
require "../vendor/autoload.php";
$bmdm = new \dautkom\bmdm\BMDM();

// Process the string and retrieve both numeric and phonetic keys
$p = $bmdm->set('Hello world')->soundex();

// Try to guess the string's language
$l = $bmdm->set('Grzegorz')->guess();

// Retrieve all supported languages
$g = $bmdm->getLanguages();

// Process the string with the Beider-Morse algorithm and retrieve phonetic keys
$b = $bmdm->set('ברצלונה')->bm->soundex();

// Try to guess the string's language and retrieve only language names
$l = $bmdm->set('Grzegorz')->bm->getLanguageNames();

// Retrieve Daitch-Mokotoff soundex values
// Only Latin symbols are supported
$d = $bmdm->set('Grzegorz')->dm->soundex();

Ashkenazic and Sephardic support

<?php

require "../vendor/autoload.php";

// Using 'ash' upon init will load Ashkenazi phonetic rules
// Use 'sep' instead of 'ash' to init Sephardic rules
$bmdm = new \dautkom\bmdm\BMDM('ash');

Multiple languages in one string

<?php

require "../vendor/autoload.php";
$bmdm = new \dautkom\bmdm\BMDM();

$p = $bmdm->set('This is Спарта!')->soundex();

Different languages matching

<?php

require "../vendor/autoload.php";
$bmdm = new \dautkom\bmdm\BMDM();

// Words in different languages with the same pronunciation
// in most cases give intersections in results.

print_r($bmdm->set('Zelinska')->soundex());
print_r($bmdm->set('Зелинска')->soundex());

// ## Latin string
// Array
// (
//     [input] => zelinska
//     [numeric] => Array
//         (
//             [0] => Array
//                 (
//                     [0] => 486450
//                 )
// 
//         )
// 
//     [phonetic] => Array
//         (
//             [0] => Array
//                 (
//                     [0] => zYlnzki
//                     [1] => zilnzki
//                 )
// 
//         )
// 
// )
//
// ## Cyrillic string 
// Array
// (
//     [input] => зелинска
//     [numeric] => Array
//         (
//             [0] => Array
//                 (
//                     [0] => 486450
//                 )
// 
//         )
// 
//     [phonetic] => Array
//         (
//             [0] => Array
//                 (
//                     [0] => zYlnzka
//                     [1] => zYlnzko
//                     [2] => zilnzka
//                     [3] => zilnzko
//                 )
// 
//         )
// 
// )
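One way to test two results for a match is to flatten their key arrays and intersect them. The sketch below does this with the values copied from the sample output above; sharedKeys() is a hypothetical helper and not part of the library, and the keys are cast to strings because print_r does not show their type.

```php
<?php

/**
 * Flatten two nested key arrays and return the keys present in both.
 * Hypothetical helper, not part of the library.
 */
function sharedKeys(array $a, array $b)
{
    // Collect every leaf value of a nested array as a string.
    $flatten = function (array $nested) {
        $flat = [];
        array_walk_recursive($nested, function ($value) use (&$flat) {
            $flat[] = (string) $value;
        });
        return $flat;
    };

    return array_values(array_unique(array_intersect($flatten($a), $flatten($b))));
}

// Values copied from the sample output for 'Zelinska' and 'Зелинска'.
$latin = [
    'numeric'  => [[486450]],
    'phonetic' => [['zYlnzki', 'zilnzki']],
];
$cyrillic = [
    'numeric'  => [[486450]],
    'phonetic' => [['zYlnzka', 'zYlnzko', 'zilnzka', 'zilnzko']],
];

print_r(sharedKeys($latin['numeric'], $cyrillic['numeric']));   // shared numeric keys
print_r(sharedKeys($latin['phonetic'], $cyrillic['phonetic'])); // shared phonetic keys
```

For this particular pair the numeric keys intersect (486450) while the phonetic keys do not, so a matcher built this way would want to consult both lists.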

Modification

If you are going to modify the rules, disable the cache during development and clean up the ./runtime directory afterwards. Otherwise stale cached data will be loaded.
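The cleanup step can be scripted. This is a sketch under assumptions: clearCacheDir() is a hypothetical helper, the layout and file names of the cache are not documented here, and the code assumes that every non-hidden regular file directly inside the directory is a cache file that is safe to delete.

```php
<?php

/**
 * Delete the non-hidden regular files directly inside $dir and return
 * how many were removed. Hypothetical helper, not part of the library.
 */
function clearCacheDir($dir)
{
    $removed = 0;
    $entries = is_dir($dir) ? (scandir($dir) ?: []) : [];

    foreach ($entries as $name) {
        // Keep dotfiles such as .gitignore; this also skips "." and "..".
        if ($name[0] === '.') {
            continue;
        }

        $path = $dir . DIRECTORY_SEPARATOR . $name;

        // Subdirectories are left alone; only plain files are removed.
        if (is_file($path) && unlink($path)) {
            $removed++;
        }
    }

    return $removed;
}

// Returns 0 when the directory does not exist.
echo clearCacheDir(__DIR__ . '/runtime'), " cached file(s) removed", PHP_EOL;
```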

License

The project is distributed under the GNU GPL v3 in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

Copyright (c) 2008-2016 Alexander Beider and Stephen P. Morse
Copyright (c) 2013-2016 Olegs Capligins

Statistics

  • Total downloads: 17.71k
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 0
  • Views: 1
  • Dependents: 0
  • Suggesters: 0

GitHub information

  • Stars: 0
  • Watchers: 0
  • Forks: 10
  • Language: PHP

Other information

  • License: GPL-3.0
  • Last updated: 2020-09-09