pathor/url

Latest stable version: 1.0.0

Composer install command:

composer require pathor/url

Package description

Pathor is a PHP library for normalizing, analyzing, and comparing URLs.

README

Overview

Pathor is a PHP library for normalizing, analyzing, and comparing URLs. It is built on top of the League\Uri library and offers an easy-to-use API for common URL-related operations.

Installation

Install the library via Composer:

composer require pathor/url

Features

  • Normalize URLs by standardizing components (scheme, host, path, query, etc.).
  • Generate a consistent fingerprint (hash) for URLs.
  • Compare multiple URLs to check if they are equivalent.
  • Parse URLs into their individual components.
  • Assemble URLs from their components.
  • Customize normalization with handlers and configurations.

Usage

Basic Usage

Here is a quick example of how to use the Pathor library:

use Pathor\Url;

require __DIR__ . '/vendor/autoload.php'; // Composer autoloader

$pathor = new Url();

$url = 'https://www.example.com/path///../a/b/../c//ё//hello world/?ref=google&b=2&a=1&&=&&foo[1]=222&foo[0]=111#hello world';

// Normalize the URL
$normalizedUrl = $pathor->normalize($url);
echo $normalizedUrl, PHP_EOL; // https://www.example.com/path/a/c/%D1%91/hello%20world?a=1&b=2&foo%5B%5D=111&foo%5B%5D=222#hello%20world

// Generate a fingerprint (hash of the normalized URL)
$fingerprint = $pathor->fingerprint($url);
echo $fingerprint, PHP_EOL; // b18e86f5d2da88269fd0895af1178d8305ae78fe3fa3e61195af6b50a60f333d

// Compare URLs: all three normalize to the same string
$isEqual = $pathor->equals(
    'https://www.example.com/path/a/c/%D1%91/hello%20world?a=1&b=2&foo%5B%5D=111&foo%5B%5D=222#hello%20world',
    'https://www.example.com/path///../a/b/../c//ё//hello world/?ref=google&b=2&a=1&&=&&foo[1]=222&foo[0]=111#hello world',
    'https://www.example.com/path//a/b/../c//ё//hello world/?ref=google&b=2&a=1&&=&&&foo[]=111&foo[]=222#hello world',
);
var_dump($isEqual); // Outputs: bool(true)

// Get URL details
$details = $pathor->details($url);
var_dump($details); // Outputs an array with parsed and normalized components

Examples

Examples can be found here.

Configuration

The Url class can be customized with configuration options to adjust the normalization behavior. These options include:

  • fingerprint: Set the hashing algorithm for URL fingerprints (default: sha256).
  • query: Customize query string handling.
    • withoutDuplicates: Remove duplicate query parameters.
    • withoutEmptyPairs: Remove empty query parameters.
    • withoutNumericIndices: Drop numeric array indices, so foo[0]=111 becomes foo[]=111 (as in the Basic Usage output).
    • withSortedParams: Sort query parameters alphabetically.
    • withoutTrackingParams: Remove known tracking parameters (e.g., utm_source).
    • trackingParamsList: The parameter names treated as tracking parameters.
  • path: Customize path normalization.
    • withoutDotSegments: Remove . and .. segments in the path.
    • withoutEmptySegments: Remove empty segments from the path.
    • withoutTrailingSlash: Remove trailing slashes.
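
To make the query options more concrete, here is a minimal standalone sketch of what removing empty pairs, tracking parameters, and duplicates, then sorting, could look like on a raw query string. The function name sketchNormalizeQuery and its default tracking list are assumptions made for illustration; this is not Pathor's implementation, and the library's exact rules may differ.

```php
<?php

// Hypothetical helper, not part of Pathor: mimics the query options on a raw query string.
function sketchNormalizeQuery(
    string $query,
    array $trackingParams = ['utm_source', 'utm_medium', 'utm_campaign'] // assumed list
): string {
    $pairs = [];
    foreach (explode('&', $query) as $pair) {
        // withoutEmptyPairs: skip "" and "=" fragments
        if ($pair === '' || $pair === '=') {
            continue;
        }
        [$key, $value] = array_pad(explode('=', $pair, 2), 2, '');
        // withoutTrackingParams: skip keys found in the tracking list
        if (in_array($key, $trackingParams, true)) {
            continue;
        }
        // withoutDuplicates: identical key=value pairs collapse to one entry
        $pairs[$key . '=' . $value] = true;
    }
    $pairs = array_keys($pairs);
    sort($pairs, SORT_STRING); // withSortedParams
    return implode('&', $pairs);
}

echo sketchNormalizeQuery('b=2&a=1&&=&utm_source=google&a=1'), PHP_EOL; // a=1&b=2
```

Each option maps to one branch of the loop, which is why turning an option off in a real configuration would amount to skipping that branch.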

Default Configuration

$config = [
    'fingerprint' => 'sha256', // https://www.php.net/manual/en/function.hash-algos.php

    'query' => [
        'withoutDuplicates' => true,
        'withoutEmptyPairs' => true,
        'withoutNumericIndices' => true,
        'withSortedParams' => true,
        'withoutTrackingParams' => true,
        'trackingParamsList' => Url::QUERY_TRACKING_PARAMS, // written as static::QUERY_TRACKING_PARAMS inside the class
    ],

    'path' => [
        'withoutDotSegments' => true,
        'withoutEmptySegments' => true,
        'withoutTrailingSlash' => true,
    ],
];

$pathor = new Url($config);

Handlers (Custom normalization)

Custom handlers allow you to define specific rules for processing URL components. Each handler is a function that receives the normalized value first and the original value second, and returns the value to use for that component.

Example:

$handlers = [
    'scheme' => fn(?string $normalized, ?string $original): ?string => $normalized,
    'user' => fn(?string $normalized, ?string $original): ?string => $normalized,
    'password' => fn(?string $normalized, ?string $original): ?string => $normalized,
    'host' => fn(?string $normalized, ?string $original): ?string => $original === null ? null : strtoupper($original),
    'port' => fn(?int $normalized, ?int $original): ?int => $normalized,
    'path' => fn(?string $normalized, ?string $original): ?string => $normalized,
    'query' => fn(?string $normalized, ?string $original): ?string => $normalized,
    'fragment' => fn(?string $normalized, ?string $original): ?string => $normalized,
];

$pathor = new Url(handlers: $handlers);
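
As a further illustration, a handler can be written and exercised on its own before it is passed to the constructor. The sketch below defines a hypothetical host handler, $stripWww, that drops a leading "www."; the name and the rule are assumptions for the example, and only the (normalized, original) signature is taken from the handler list above.

```php
<?php

// Hypothetical handler: drop a leading "www." from the already-normalized host.
// The (normalized, original) parameter order follows the handler example above.
$stripWww = static function (?string $normalized, ?string $original): ?string {
    if ($normalized === null) {
        return null; // URLs without a host stay untouched
    }
    return str_starts_with($normalized, 'www.') ? substr($normalized, 4) : $normalized;
};

// The closure can be checked in isolation:
var_dump($stripWww('www.example.com', 'WWW.Example.com')); // string(11) "example.com"
var_dump($stripWww(null, null));                           // NULL

// Assumed wiring, mirroring the constructor call shown above:
// $pathor = new \Pathor\Url(handlers: ['host' => $stripWww]);
```

Handling null explicitly matters because every handler parameter is nullable, and passing null to string functions is deprecated as of PHP 8.1.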

Documentation

normalize(string $url): string

Normalizes a given URL by standardizing its components. By default, it:

  • Lowercases the scheme and host.
  • Removes duplicate query parameters.
  • Removes empty query parameters.
  • Sorts query parameters alphabetically.
  • Removes known tracking parameters (e.g., utm_source).
  • Removes . and .. segments in the path.
  • Removes empty segments from the path.
  • Removes trailing slashes.
  • Applies further steps visible in the examples, such as converting internationalized hosts to Punycode, percent-encoding unsafe characters, and dropping the default port.

Example:

$normalized = $pathor->normalize('HTTP://Example.COM/../a/B/./');
echo $normalized; // Outputs: http://example.com/a/B

$normalized = $pathor->normalize('https://сайт.рф');
echo $normalized; // Outputs: https://xn--80aswg.xn--p1ai
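
The path rules can be sketched in a few lines of plain PHP. The helper below, sketchNormalizePath, is hypothetical and is not Pathor's implementation. It resolves dot segments first, treating empty segments as real segments, and only then drops the empty ones; that ordering is an inference from the Basic Usage output, where /path///../a keeps the path segment.

```php
<?php

// Hypothetical helper, not Pathor's implementation: resolves "." and "..",
// then drops empty segments, which also removes any trailing slash.
function sketchNormalizePath(string $path): string
{
    $segments = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '.') {
            continue;              // "current directory" adds nothing
        }
        if ($segment === '..') {
            array_pop($segments);  // step up one segment; a no-op when empty
            continue;
        }
        $segments[] = $segment;    // empty segments are kept for now
    }
    // Drop empty segments only after the dot segments have been resolved.
    $segments = array_filter($segments, static fn (string $s): bool => $s !== '');
    return '/' . implode('/', $segments);
}

echo sketchNormalizePath('/../a/B/./'), PHP_EOL;            // /a/B
echo sketchNormalizePath('/path///../a/b/../c//'), PHP_EOL; // /path/a/c
```

If empty segments were removed before the dot segments, the second call would return /a/c instead, because ".." would then pop "path".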

fingerprint(string $url): string

Generates a hash based on the normalized URL. The hashing algorithm can be configured.

Example:

$fingerprint = $pathor->fingerprint('https://example.com/path?param=value');

echo $fingerprint; // Outputs a hexadecimal hash string (SHA-256 by default)
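
Conceptually, a fingerprint is a digest of the normalized URL string. The sketch below uses PHP's built-in hash() and hash_algos() to show how the configured algorithm determines the length of the result. The function sketchFingerprint is hypothetical and assumes its input has already been normalized; it is not the library's code.

```php
<?php

// Hypothetical stand-in for a fingerprint step: hash an already-normalized URL.
function sketchFingerprint(string $normalizedUrl, string $algorithm = 'sha256'): string
{
    if (!in_array($algorithm, hash_algos(), true)) {
        throw new InvalidArgumentException("Unknown hash algorithm: {$algorithm}");
    }
    return hash($algorithm, $normalizedUrl);
}

$url = 'https://example.com/path?param=value';

echo strlen(sketchFingerprint($url)), PHP_EOL;        // 64 hex characters for sha256
echo strlen(sketchFingerprint($url, 'md5')), PHP_EOL; // 32 hex characters for md5
```

Because the digest depends on the algorithm, fingerprints produced under different fingerprint settings are not comparable with each other.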

equals(string ...$urls): bool

Compares two or more URLs to check if they are equivalent after normalization. Throws an exception if fewer than two URLs are provided.

Example:

$areEqual = $pathor->equals(
    'https://example.com/?utm_source=google',
    'https://example.com:443?ref=site&=',
    'https://example.com:443/',
    'https://example.com:443/?#',
    'https://example.com:443'
);
var_dump($areEqual); // Outputs: bool(true)
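
The comparison can be thought of as "normalize everything, then check that only one distinct result remains". The sketch below shows that idea with a toy normalizer; sketchEquals, the $toy closure, and the exception type are assumptions for illustration, not the library's internals.

```php
<?php

// Hypothetical helper: true when every URL normalizes to the same string.
// $normalize is any callable that maps a URL string to its normalized form.
function sketchEquals(callable $normalize, string ...$urls): bool
{
    if (count($urls) < 2) {
        throw new InvalidArgumentException('At least two URLs are required.');
    }
    $normalized = array_map($normalize, $urls);
    return count(array_unique($normalized)) === 1;
}

// Toy normalizer (an assumption for illustration): lowercase and trim trailing slashes.
$toy = static fn (string $url): string => rtrim(strtolower($url), '/');

var_dump(sketchEquals($toy, 'https://Example.com/', 'https://example.com'));    // bool(true)
var_dump(sketchEquals($toy, 'https://example.com/a', 'https://example.com/b')); // bool(false)
```

With a real Url instance, the normalizer argument would presumably be something like fn (string $u) => $pathor->normalize($u).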

parse(string $url): array

Breaks a URL into its components, returning an associative array.

Example:

$components = $pathor->parse('https://user:pass@example.com:8080/path?query=value#fragment');

dd($components);

// ^ array:8 [
//   "scheme" => "https"
//   "host" => "example.com"
//   "user" => "user"
//   "password" => "pass"
//   "port" => 8080
//   "path" => "/path"
//   "query" => "query=value"
//   "fragment" => "fragment"
// ]
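
For comparison, PHP's built-in parse_url() returns a similar breakdown but names the password key "pass" and omits absent components. The sketch below reshapes its output into the eight keys listed above. sketchParse is a hypothetical adapter written for this comparison; Pathor itself is built on League\Uri, so this is not how the library parses URLs.

```php
<?php

// Hypothetical adapter: reshape parse_url() output into the eight keys listed above.
function sketchParse(string $url): array
{
    $parts = parse_url($url);
    if ($parts === false) {
        throw new InvalidArgumentException("Malformed URL: {$url}");
    }
    return [
        'scheme'   => $parts['scheme'] ?? null,
        'host'     => $parts['host'] ?? null,
        'user'     => $parts['user'] ?? null,
        'password' => $parts['pass'] ?? null, // parse_url() names this key "pass"
        'port'     => $parts['port'] ?? null, // already an int when present
        'path'     => $parts['path'] ?? null,
        'query'    => $parts['query'] ?? null,
        'fragment' => $parts['fragment'] ?? null,
    ];
}

$components = sketchParse('https://user:pass@example.com:8080/path?query=value#fragment');
var_dump($components['port']);     // int(8080)
var_dump($components['password']); // string(4) "pass"
```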

build(array $components): string

Assembles a URL from its components. Accepts an associative array with keys like scheme, host, path, etc.

Example:

$url = $pathor->build([
    'scheme' => 'https',
    'host' => 'example.com',
    'path' => 'new-path',
    'query' => ['param' => 'value'], // an array (encoded as by http_build_query) or a query string
    'fragment' => 'section'
]);

echo $url; // Outputs: https://example.com/new-path?param=value#section
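
The assembly step can be pictured as string concatenation in the order scheme, authority, path, query, fragment. The sketch below is a minimal, hypothetical assembler (sketchBuild) that accepts the same keys as the example above. It performs no escaping or validation and is not the library's build() method.

```php
<?php

// Hypothetical assembler, not Pathor's build(): joins components into a URL string.
function sketchBuild(array $c): string
{
    $url = '';
    if (isset($c['scheme'])) {
        $url .= $c['scheme'] . ':';
    }
    if (isset($c['host'])) {
        $url .= '//';
        if (isset($c['user'])) {
            $url .= $c['user'] . (isset($c['password']) ? ':' . $c['password'] : '') . '@';
        }
        $url .= $c['host'] . (isset($c['port']) ? ':' . $c['port'] : '');
    }
    if (isset($c['path']) && $c['path'] !== '') {
        // With an authority present, the path has to start with "/"
        $url .= isset($c['host']) ? '/' . ltrim($c['path'], '/') : $c['path'];
    }
    if (isset($c['query'])) {
        // Arrays go through http_build_query(); strings are used as given
        $query = is_array($c['query']) ? http_build_query($c['query']) : $c['query'];
        $url .= $query !== '' ? '?' . $query : '';
    }
    if (isset($c['fragment'])) {
        $url .= '#' . $c['fragment'];
    }
    return $url;
}

echo sketchBuild([
    'scheme' => 'https',
    'host' => 'example.com',
    'path' => 'new-path',
    'query' => ['param' => 'value'],
    'fragment' => 'section',
]), PHP_EOL; // https://example.com/new-path?param=value#section
```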

details(string $url): array

Returns a detailed breakdown of a URL: its fingerprint, the original URL, the normalized URL, and the parsed components of the normalized URL.

Example:

$details = $pathor->details('https://www.example.com:443/path///../a/b/../c//ё//hello world/?ref=google&b=2&a=1&&=&&foo[1]=222&foo[0]=111#hello world');

dd($details);

// ^ array:4 [
//   "fingerprint" => "4c64095f06900806842e22f93ee151ab"
//   "original_url" => "https://www.example.com:443/path///../a/b/../c//ё//hello world/?ref=google&b=2&a=1&&=&&foo[1]=222&foo[0]=111#hello world"
//   "normalized_url" => "https://www.example.com/path/a/c/%D1%91/hello%20world?a=1&b=2&foo%5B%5D=111&foo%5B%5D=222#hello%20world"
//   "parsed_url" => array:8 [
//     "scheme" => "https"
//     "host" => "www.example.com"
//     "user" => null
//     "password" => null
//     "port" => null
//     "path" => "/path/a/c/%D1%91/hello%20world"
//     "query" => "a=1&b=2&foo%5B%5D=111&foo%5B%5D=222"
//     "fragment" => "hello%20world"
//   ]
// ]

Contributing

Contributions are welcome! Please submit pull requests or open issues.

License

This library is licensed under the MIT License. See the LICENSE file for details.

Statistics

  • Total downloads: 2
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 0
  • Clicks: 0
  • Dependents: 0
  • Suggesters: 0

GitHub information

  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Language: PHP

Other information

  • License: MIT
  • Last updated: 2024-12-11