koolreport/cleandata

Latest stable version: 1.6.1

Composer install command:

composer require koolreport/cleandata

Package description

Solve the missing data

README

Missing data is always a problem in data analysis and data mining. The cleandata package gives you methods to handle missing data.

Installation

By downloading .zip file

  1. Download the .zip file
  2. Unzip the .zip file
  3. Copy the cleandata folder into the koolreport folder so that it looks like this:
koolreport
├── core
├── cleandata

By composer

composer require koolreport/cleandata

Documentation

Missing values normally reach KoolReport in the form of null values. We handle them either by dropping the row or by filling in a new value.

DropNull

The DropNull process drops any row that has a null value or that reaches a certain number of null occurrences.

Let's look at an example:

$this->src('db')
->query("select * from customers")
->pipe(new DropNull())
->pipe($this->dataStore('clean_data'));

The above is the simplest example of using the DropNull process. Every row that has a null value is dropped. As a result, the returned data contains only those customers with complete information.
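To make the effect concrete, here is a minimal plain-PHP sketch of the same idea. It is not the package's implementation: `dropRowsWithNull` is a hypothetical helper, the sample rows are made up, and for simplicity it treats only a real null as missing.

```php
<?php
// Hypothetical helper, not part of koolreport/cleandata:
// keep only the rows in which no column holds a null value.
function dropRowsWithNull(array $rows): array
{
    return array_values(array_filter($rows, function (array $row): bool {
        // The third argument makes in_array match real null values only.
        return !in_array(null, $row, true);
    }));
}

// Made-up sample data for illustration.
$customers = [
    ["name" => "Ann", "city" => "Paris"],
    ["name" => "Bob", "city" => null],
];

$clean = dropRowsWithNull($customers);
echo count($clean), "\n"; // prints 1
```

Only the row for "Ann" survives, which mirrors what the clean_data store would hold after the pipe above.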

Target certain columns only

Sometimes you want to drop the row only if certain columns have null values:

->pipe(new DropNull(array(
    "targetColumns"=>array("salary","tax")
)))

Exclude some columns

If you want to target all columns except a few that are not important, you can do:

->pipe(new DropNull(array(
    "excludedColumns"=>array("address","city")
)))
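The two options above both narrow down which columns are inspected. A rough sketch of that selection in plain PHP might look like the following. `columnsToCheck` is a hypothetical helper, and treating an empty target list as "all columns" is an assumption, not something the README states.

```php
<?php
// Hypothetical helper: decide which columns are checked for missing values.
function columnsToCheck(array $allColumns, array $targetColumns = [], array $excludedColumns = []): array
{
    // Assumption: an empty target list means "check every column".
    $columns = $targetColumns === []
        ? $allColumns
        : array_intersect($allColumns, $targetColumns);

    // Remove the excluded columns and reindex the result.
    return array_values(array_diff($columns, $excludedColumns));
}

$all = ["name", "salary", "tax", "address", "city"];

$targeted = columnsToCheck($all, ["salary", "tax"]);        // salary, tax
$remaining = columnsToCheck($all, [], ["address", "city"]); // name, salary, tax
```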

Target specific type of columns

For example, you can target number columns only; if any of those columns has a null value, the row is dropped:

->pipe(new DropNull(array(
    "targetColumnType"=>"number"
)))

You can also target the other column types: string, date, datetime, and time.

Threshold

For example, to drop a row only when it contains more than 2 null values:

->pipe(new DropNull(array(
    "thresh"=>3,
)))
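Read literally, the example above means a row with 3 or more nulls is dropped when thresh is 3. The sketch below follows that reading. `shouldDropRow` is a hypothetical helper, and the "count reaches the threshold" rule is an assumption drawn from the sentence above rather than from the package source.

```php
<?php
// Hypothetical helper: count the null values in a row and compare to a threshold.
function shouldDropRow(array $row, int $thresh = 1): bool
{
    // array_filter keeps only the entries for which is_null() returns true.
    $nullCount = count(array_filter($row, 'is_null'));

    // Assumption: the row is dropped once the null count reaches the threshold.
    return $nullCount >= $thresh;
}

// Made-up row with exactly two null values.
$row = ["name" => "Ann", "salary" => null, "tax" => null, "city" => "Paris"];

var_dump(shouldDropRow($row, 3)); // bool(false): only 2 nulls
var_dump(shouldDropRow($row, 2)); // bool(true)
```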

Targeted value

What if the value you want to drop is not null but 0? If 0 is what represents missing data for you, you can do:

->pipe(new DropNull(array(
    "targetValue"=>0,
)))

Of course, you can set any target value, whether it is a number or a string. The default value of targetValue is null.

Strictly Null

By default, the comparison is loose, so an empty string or a 0 value also counts as null. To enable strict comparison of both value and type, set the following:

->pipe(new DropNull(array(
    "strict"=>true,
)))
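The difference comes down to PHP's loose (==) and strict (===) comparison operators. The following sketch shows how a check with such a switch could behave. `isMissing` is a hypothetical helper written for illustration, not the package's code.

```php
<?php
// Hypothetical helper mirroring the "strict" switch described above.
function isMissing($value, $targetValue = null, bool $strict = false): bool
{
    // === compares value and type; == compares after type juggling.
    return $strict ? $value === $targetValue : $value == $targetValue;
}

var_dump(isMissing(0));                // bool(true): 0 == null
var_dump(isMissing(""));               // bool(true): "" == null
var_dump(isMissing(0, null, true));    // bool(false): 0 is not identical to null
var_dump(isMissing(null, null, true)); // bool(true)
```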

FillNull

The FillNull process is another method of cleaning data. Instead of dropping rows with null values, it fills each null value with a new value.

->pipe(new FillNull(array(
    "newValue"=>0
)))

The code above fills all null values with 0.

Targeted value

What if you want to target the 0 value instead? You can do:

->pipe(new FillNull(array(
    "targetValue"=>0,
    "newValue"=>10,
)))

Fill missing value with MEDIAN and MEAN

In the examples above, we fill missing values with a value of our own choosing. However, it is often better to fill them with the mean or median of the column values. You can do:

->pipe(new FillNull(array(
    "newValue"=>FillNull::MEAN,
)))

For the median, you do:

->pipe(new FillNull(array(
    "newValue"=>FillNull::MEDIAN,
)))
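As an illustration of what mean and median filling amount to, here is a small plain-PHP sketch for a single column. All function names and the sample data are hypothetical, and the statistics are computed from the non-null values only, which is an assumption about the intended behavior.

```php
<?php
// Hypothetical helpers, not the package's implementation.
function columnMean(array $values): float
{
    return array_sum($values) / count($values);
}

function columnMedian(array $values): float
{
    sort($values);
    $n = count($values);
    $mid = intdiv($n, 2);
    // Odd count: middle element. Even count: average of the two middle elements.
    return $n % 2 ? (float) $values[$mid] : ($values[$mid - 1] + $values[$mid]) / 2;
}

function fillNullInColumn(array $rows, string $column, callable $statistic): array
{
    // Collect the known (non-null) values of the column.
    $known = [];
    foreach ($rows as $row) {
        if ($row[$column] !== null) {
            $known[] = $row[$column];
        }
    }
    if ($known === []) {
        return $rows; // nothing to compute a statistic from
    }

    // Replace each null with the computed statistic.
    $fill = $statistic($known);
    foreach ($rows as &$row) {
        if ($row[$column] === null) {
            $row[$column] = $fill;
        }
    }
    unset($row);
    return $rows;
}

// Made-up sample data: known salaries are 10, 20 and 60.
$rows = [["salary" => 10], ["salary" => null], ["salary" => 20], ["salary" => 60]];

$byMean = fillNullInColumn($rows, "salary", "columnMean");     // null becomes 30
$byMedian = fillNullInColumn($rows, "salary", "columnMedian"); // null becomes 20
```

With skewed data such as this, the median (20) is less affected by the large value 60 than the mean (30), which is one reason to choose between the two.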

Target some specific columns

You can apply the filling action to specific columns only:

->pipe(new FillNull(array(
    "targetColumns"=>array("salary","tax"),
)))

Exclude some columns

If some columns are not important and their missing values do not matter, you can exclude them:

->pipe(new FillNull(array(
    "excludedColumns"=>array("lastname","gender"),
)))

Target some specific column type

If you want, you can apply the fill to columns of a certain type only, for example number columns:

->pipe(new FillNull(array(
    "targetColumnType"=>"number"
)))

Strictly Null

By default, the comparison is loose, so an empty string or a 0 value also counts as null. To enable strict comparison of both value and type, set the following:

->pipe(new FillNull(array(
    "strict"=>true,
)))

Support

Please use our forum if you need support, so that other people can benefit as well. If your support request needs privacy, you may email us at support@koolreport.com.

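Statistics

  • Total downloads: 141.41k
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 3
  • Clicks: 1
  • Dependent projects: 0
  • Recommendations: 0

GitHub information

  • Stars: 3
  • Watchers: 2
  • Forks: 0
  • Language: PHP

Other information

  • License: MIT
  • Last updated: 2019-05-09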