ganglio/pds 问题修复 & 功能扩展

解决BUG、新增功能、兼容多环境部署,快速响应你的开发需求

邮箱:yvsm@zunyunkeji.com | QQ:316430983 | 微信:yvsm316

ganglio/pds

最新稳定版本:v1.0.11

Composer 安装命令:

composer require ganglio/pds

包简介

Probabilistic Data Structures to efficiently analyze and mine big datasets

README 文档

README

Probabilistic Data Structures to efficiently analyze and mine big datasets

Latest Stable Version Build Status codecov.io Code Climate License

This package contains a collection of data structures and tools to analyze big amounts of data in a memory efficient way.

Table Of Content

  1. Installation
  2. Namespaces
  3. Interfaces
  4. Classes
  5. Examples

Installation

Install via Composer (make sure you have composer in your path or in your project).

Put the following in your package.json:

{
    "require": {
        "ganglio/PDS": "*"
    }
}

and then run composer install or just run

composer require ganglio/PDS

Namespaces

A number of namespaces are defined in the library.

  • \ganglio\PDS\Bloom
  • \ganglio\PDS\Estimators
  • \ganglio\PDS\Hash
  • \ganglio\PDS\Storage

Interfaces

Estimator

This interface is the basis for cardinality estimators. It defines two methods:

  • add($key) - adds a key to the estimator
  • count() - returns the number of keys added to the estimator

Depending on the implementation the actual class might return an exact estimation, like the Exact class, or an approximation like the HyperLogLog class.

Hash

This interface is the basis for the various hashing classes offered by the package. It defines one method and a constant:

  • hash($str) - performs the actual hashing of the string provided
  • UPPERBOUND - a 32-bit mask to be used by the hashing functions 0xffffffff

Storage

This interface is the basis for the storage classes. It defines four methods:

  • set($key, $value) - sets a value to the key in the storage system
  • get($key) - gets the value stored to the key
  • flush() - flushes the storage system
  • size() - returns the number of keys stored in the storage system

Classes

BitArray (implements Storage)

Implements a single bit array. It's used to implement the Bloob Filter where the set method only accepts Bool as $value.

HyperLogLog (implements Estimator)

Implements the HyperLogLog cardinality estimator algorithm. The actual implementation uses HyperLogLog for big cardinalities and LinearCounting for small ones as it gives a better approximation.

Exact (implements Estimator)

Implement an exact counter. It's primarily a toy class to show how to use the Estimator interface.

Trivial (implements Hash)

Implements a trivial hashing algorithm. Basically adds the ASCII code shifted right by the character position for each characted of the input string and then takes the lower 32 bits. It's a toy class to show how to use the Hash interface.

Pearson (implements Hash)

Implements the Pearson non-cryptographic hashing function.

FVNHash (implements Hash)

Implements the Fowler-Noll-Vo non-cryptographic hashing function. The actual algorithm is the FNV-1 hash.

Generic (implements Hash)

This class is basically a wrapper around the standard PHP hash function. The constructor accepts the algorithm name to use as from the PHP hash_algos() function. If an unknown algorithm is specified it raises an exception, if none is specified MD5 is selected as default.

MultiHash (implements Hash)

This class calculates multiple hashes using different algorithms specified ad arguments of the constructor. It's primarily used in conjunction with the BitArray class to implement the Bloom Filter.

Bloom

This class implements a Bloom Filter, a probabilistic data structure that allows to test if an element is a member of a set with a very small memory footprint.

Examples

TODO

统计信息

  • 总下载量: 32
  • 月度下载量: 0
  • 日度下载量: 0
  • 收藏数: 3
  • 点击次数: 0
  • 依赖项目数: 0
  • 推荐数: 0

GitHub 信息

  • Stars: 3
  • Watchers: 1
  • Forks: 0
  • 开发语言: PHP

其他信息

  • 授权协议: MIT
  • 更新时间: 2015-09-15