henrik9999/string-similarity

Latest stable version: 1.0.1

Composer install command:

composer require henrik9999/string-similarity

Package summary

Finds degree of similarity between two strings, based on Dice's Coefficient, which is mostly better than Levenshtein distance.

README

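This implementation counts every occurrence of a bigram separately instead of collapsing repeats into one. The correctness of this behavior is most easily seen when getting the similarity between "GG" and "GGGGGGGG": a set-based comparison would see the single bigram "GG" on both sides and report 1, which is clearly wrong for strings of such different lengths.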
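The bigram counting described above can be sketched as follows. This is an illustrative reimplementation, not the package's source: the function name diceSketch is hypothetical, and the whitespace stripping and byte-based bigrams are assumptions modeled on the Node.js original.

```php
<?php
/**
 * Illustrative Dice coefficient over character bigrams (not the package's code).
 * Every occurrence of a bigram is counted, so a repeated bigram can only be
 * matched as many times as it appears in BOTH strings.
 */
function diceSketch(string $a, string $b): float
{
    // Assumption: whitespace is ignored before comparing.
    $a = preg_replace('/\s+/', '', $a);
    $b = preg_replace('/\s+/', '', $b);

    if ($a === $b) {
        return 1.0; // identical strings, including two empty ones
    }
    if (strlen($a) < 2 || strlen($b) < 2) {
        return 0.0; // no bigram can be formed
    }

    // Count how often each bigram occurs in the first string.
    // Note: substr() works on bytes, so multibyte text is not handled here.
    $counts = [];
    for ($i = 0; $i < strlen($a) - 1; $i++) {
        $bigram = substr($a, $i, 2);
        $counts[$bigram] = ($counts[$bigram] ?? 0) + 1;
    }

    // Each bigram of the second string consumes at most one remaining occurrence.
    $shared = 0;
    for ($i = 0; $i < strlen($b) - 1; $i++) {
        $bigram = substr($b, $i, 2);
        if (($counts[$bigram] ?? 0) > 0) {
            $counts[$bigram]--;
            $shared++;
        }
    }

    // 2 * shared bigrams / (bigrams in $a + bigrams in $b)
    return (2.0 * $shared) / (strlen($a) + strlen($b) - 2);
}

var_dump(diceSketch('GG', 'GGGGGGGG')); // float(0.25)
var_dump(diceSketch('healed', 'sealed')); // float(0.8)
```

"GG" contributes one bigram and "GGGGGGGG" contributes seven, but only one of them can be paired, giving 2 * 1 / (1 + 7) = 0.25 rather than 1.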

This is a PHP implementation of the Node.js package string-similarity.

Usage

Install using:

composer require henrik9999/string-similarity

In your code:

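// Load Composer's autoloader first. If the class lives in a namespace,
// import it with a matching `use` statement.
require __DIR__ . '/vendor/autoload.php';

$stringSimilarity = new StringSimilarity();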

$similarity = $stringSimilarity->compareTwoStrings("healed", "sealed");

$matches = $stringSimilarity->findBestMatch("healed", [
  "edward",
  "sealed",
  "theatre",
]);

API

The package contains two methods:

compareTwoStrings(string $string1, string $string2, bool $casesensitive)

Returns a fraction between 0 and 1, which indicates the degree of similarity between the two strings. 0 indicates completely different strings, 1 indicates identical strings. The comparison is case-sensitive by default.

Arguments
  1. string1 (string): The first string
  2. string2 (string): The second string
  3. casesensitive (bool): If the comparison should be case-sensitive

Order does not make a difference.

Returns

(float): A fraction from 0 to 1, both inclusive. A higher number indicates more similarity.
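As a worked illustration of the 0-to-1 scale, the standard Dice coefficient over the bigram collections X and Y of the two strings can be written out for "healed" and "sealed". This assumes the package follows the usual bigram formulation; it is a sketch of the arithmetic, not taken from the package source.

```latex
s(X, Y) = \frac{2\,|X \cap Y|}{|X| + |Y|}

X = \{\text{he}, \text{ea}, \text{al}, \text{le}, \text{ed}\}, \qquad
Y = \{\text{se}, \text{ea}, \text{al}, \text{le}, \text{ed}\}

s(X, Y) = \frac{2 \cdot 4}{5 + 5} = 0.8
```

The formula is symmetric in X and Y, which is why the argument order does not affect the result.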

Examples
$stringSimilarity->compareTwoStrings("healed", "sealed");
// → 0.8

$stringSimilarity->compareTwoStrings(
  "Olive-green table for sale, in extremely good condition.",
  "For sale: table in very good  condition, olive green in colour."
);
// → 0.6060606060606061

$stringSimilarity->compareTwoStrings(
  "Olive-green table for sale, in extremely good condition.",
  "For sale: green Subaru Impreza, 210,000 miles"
);
// → 0.2558139534883721

$stringSimilarity->compareTwoStrings(
  "Olive-green table for sale, in extremely good condition.",
  "Wanted: mountain bike with at least 21 gears."
);
// → 0.1411764705882353

findBestMatch(string $mainString, array $targetStrings, bool $casesensitive)

Compares mainString against each string in targetStrings.

Arguments
  1. mainString (string): The string to match each target string against.
  2. targetStrings (array): Each string in this array will be matched against the main string.
  3. casesensitive (bool): If the comparison should be case-sensitive.

Returns

(array): An associative array with a ratings key, which holds a similarity rating for each target string; a bestMatch key, which holds the target string most similar to the main string together with its rating; and a bestMatchIndex key, which gives the index of that best match in the targetStrings array.
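A minimal sketch of how such a result might be consumed is shown here. The $result array is hard-coded sample data in the documented shape so the snippet runs without the package, and the 0.3 threshold is a hypothetical cut-off chosen only for illustration.

```php
<?php
// Sample data in the documented shape (hard-coded, not produced by the package here).
$result = [
    'ratings' => [
        ['target' => 'edward',  'rating' => 0.2],
        ['target' => 'sealed',  'rating' => 0.8],
        ['target' => 'theatre', 'rating' => 0.36363636363636365],
    ],
    'bestMatch'      => ['target' => 'sealed', 'rating' => 0.8],
    'bestMatchIndex' => 1,
];

// The single best candidate is available directly.
$best = $result['bestMatch']['target'];

// Rank all candidates from most to least similar (works on a copy).
$ranked = $result['ratings'];
usort($ranked, fn (array $x, array $y): int => $y['rating'] <=> $x['rating']);

// Keep only candidates above a hypothetical minimum rating.
$threshold = 0.3;
$accepted = array_values(
    array_filter($ranked, fn (array $r): bool => $r['rating'] >= $threshold)
);

echo $best, PHP_EOL;                                            // sealed
echo implode(', ', array_column($accepted, 'target')), PHP_EOL; // sealed, theatre
```

Checking the best rating against a threshold is worthwhile because a best match is always reported, even when every target is a poor fit.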

Examples
$stringSimilarity->findBestMatch('Olive-green table for sale, in extremely good condition.', [
  'For sale: green Subaru Impreza, 210,000 miles',
  'For sale: table in very good condition, olive green in colour.',
  'Wanted: mountain bike with at least 21 gears.'
]);
// →
array(3) {
  ["ratings"]=>
  array(3) {
    [0]=>
    array(2) {
      ["target"]=>
      string(45) "For sale: green Subaru Impreza, 210,000 miles"
      ["rating"]=>
      float(0.2558139534883721)
    }
    [1]=>
    array(2) {
      ["target"]=>
      string(62) "For sale: table in very good condition, olive green in colour."
      ["rating"]=>
      float(0.6060606060606061)
    }
    [2]=>
    array(2) {
      ["target"]=>
      string(45) "Wanted: mountain bike with at least 21 gears."
      ["rating"]=>
      float(0.1411764705882353)
    }
  }
  ["bestMatch"]=>
  array(2) {
    ["target"]=>
    string(62) "For sale: table in very good condition, olive green in colour."
    ["rating"]=>
    float(0.6060606060606061)
  }
  ["bestMatchIndex"]=>
  int(1)
}
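The result shape shown above could be assembled with a loop like the following. This is a sketch under stated assumptions, not the package's implementation: findBestSketch and its callable $compare parameter are hypothetical, throwing on an empty target list is a choice made for this sketch, and the stand-in comparator uses PHP's built-in similar_text() rather than Dice's coefficient.

```php
<?php
/**
 * Illustrative only: rate every target with $compare and track the best one.
 * $compare is a hypothetical callable (string, string): float returning 0..1.
 */
function findBestSketch(string $main, array $targets, callable $compare): array
{
    if ($targets === []) {
        // Sketch-level choice: an empty target list has no best match.
        throw new InvalidArgumentException('targets must not be empty');
    }

    $ratings = [];
    $bestIndex = 0;
    foreach (array_values($targets) as $i => $target) {
        $rating = $compare($main, $target);
        $ratings[] = ['target' => $target, 'rating' => $rating];
        // Strict comparison keeps the first target when ratings tie.
        if ($rating > $ratings[$bestIndex]['rating']) {
            $bestIndex = $i;
        }
    }

    return [
        'ratings'        => $ratings,
        'bestMatch'      => $ratings[$bestIndex],
        'bestMatchIndex' => $bestIndex,
    ];
}

// Stand-in comparator built on similar_text(); NOT Dice's coefficient.
$percent = function (string $a, string $b): float {
    similar_text($a, $b, $pct);
    return $pct / 100;
};

$result = findBestSketch('healed', ['edward', 'sealed', 'theatre'], $percent);
echo $result['bestMatch']['target'], PHP_EOL; // sealed
```

Passing the comparator as a callable keeps the ranking logic independent of the similarity measure, so the same loop works with any function that returns a score between 0 and 1.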

Release Notes

1.0.1

  • Made some performance improvements

1.0.0

  • Initial Release

Statistics

  • Total downloads: 12.25k
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 1
  • Views: 2
  • Dependents: 0
  • Suggesters: 0

GitHub information

  • Stars: 1
  • Watchers: 1
  • Forks: 2
  • Language: PHP

Other information

  • License: MIT
  • Last updated: 2022-06-19