yoeunes/regex-parser

Latest stable version: v1.0.10

Composer install command:

composer require yoeunes/regex-parser

Package description

A powerful PCRE regex parser with lexer, AST builder, validation, ReDoS analysis, and syntax highlighting. Zero dependencies, blazing fast, and production-ready.

README

RegexParser

Treat regular expressions as code.

RegexParser: Parse and analyze PCRE patterns in PHP

RegexParser is a PHP 8.2+ library that parses PCRE regex literals into a typed AST and runs analysis through visitors. It is built for learning, validation, and tooling in PHP projects.

Project goals:

  • Make regex approachable for newcomers with clear explanations and practical examples.
  • Provide a stable foundation for validation, linting, and security analysis.
  • Aim to become a common community reference for working with regex in PHP by staying accurate, transparent, and easy to integrate.

If you are new to regex, start with the Regex Tutorial. If you want a short overview, see the Quick Start Guide.

Getting started

# Install the library
composer require yoeunes/regex-parser

# Try the CLI
vendor/bin/regex explain '/\d{4}-\d{2}-\d{2}/'

What RegexParser provides

  • Parse /pattern/flags into a structured AST.
  • Validate syntax and semantics with precise error locations.
  • Explain patterns in plain English.
  • Analyze potential ReDoS risk (theoretical by default) and provide cautious suggestions.
  • Lint codebases via the CLI.
  • Provide a visitor API for custom tooling.

Philosophy & Accuracy

RegexParser separates what it can guarantee from what is heuristic:

  • Guaranteed: parsing, AST structure, error offsets, and syntax validation for the targeted PHP/PCRE version.
  • Heuristic: ReDoS analysis is structural and conservative; treat it as potential risk unless confirmed.
  • Context matters: PCRE version, JIT, and backtrack/recursion limits change practical impact.
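
Because these settings change how a risky pattern behaves in practice, it can help to record them next to any analysis result. The following is a minimal sketch that uses only PHP built-ins; it is not part of RegexParser, and `pcreRuntimeContext()` is a hypothetical helper name chosen for illustration.

```php
<?php
// Collect the PCRE settings that influence how costly backtracking can get.
// Uses only PHP built-ins; nothing here is RegexParser API.
function pcreRuntimeContext(): array
{
    return [
        'pcre_version'    => PCRE_VERSION,                       // version string of the linked PCRE library
        'jit'             => ini_get('pcre.jit'),                // raw ini value, or false if the setting is unavailable
        'backtrack_limit' => (int) ini_get('pcre.backtrack_limit'),
        'recursion_limit' => (int) ini_get('pcre.recursion_limit'),
    ];
}

foreach (pcreRuntimeContext() as $name => $value) {
    echo $name, ': ', var_export($value, true), PHP_EOL;
}
```

Attaching this output to a report makes it easier for others to reproduce the same behavior.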

How to report a vulnerability responsibly

If you believe a pattern is exploitable:

  1. Run the ReDoS analysis in confirmed mode (ReDoSMode::CONFIRMED in the PHP API) and capture a bounded, reproducible PoC.
  2. Include the pattern, input lengths, timings, JIT setting, and PCRE limits.
  3. Verify impact in the real code path before filing a security issue.

See SECURITY.md for reporting channels.
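
To gather the input lengths and timings from step 2, a small harness along these lines may be enough. This is a sketch built on PHP built-ins only, not a replacement for the library's confirmed mode; `timeMatch()` and its default limit of 1,000,000 are assumptions made for illustration.

```php
<?php
// Time a single preg_match() call while capping pcre.backtrack_limit,
// so a pathological pattern fails fast instead of running unbounded.
// timeMatch() is a hypothetical helper, not RegexParser API.
function timeMatch(string $pattern, string $subject, int $backtrackLimit = 1_000_000): array
{
    $previous = ini_set('pcre.backtrack_limit', (string) $backtrackLimit);

    $start = hrtime(true);
    $result = @preg_match($pattern, $subject); // 1, 0, or false on a PCRE error
    $elapsedMs = (hrtime(true) - $start) / 1e6;
    $error = preg_last_error_msg();            // "No error" when the call succeeded

    if ($previous !== false) {
        ini_set('pcre.backtrack_limit', $previous); // restore the original limit
    }

    return [
        'length' => strlen($subject),
        'ms'     => $elapsedMs,
        'result' => $result,
        'error'  => $error,
    ];
}

// Grow the input gradually and record each run; keep the lengths small.
foreach ([10, 15, 20, 25] as $n) {
    $row = timeMatch('/(a+)+$/', str_repeat('a', $n) . '!');
    printf(
        "len=%d  %.3f ms  result=%s  error=%s\n",
        $row['length'],
        $row['ms'],
        var_export($row['result'], true),
        $row['error']
    );
}
```

Run it with JIT both enabled and disabled, since the limits are applied differently in each case, and include the raw output in the report.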

Safer rewrites (verify behavior)

These techniques reduce backtracking but can change matching behavior. Always validate with tests.

/(a+)+$/     -> /a+$/      (semantics often preserved, but verify captures)
/(a+)+$/     -> /a++$/     (possessive, no backtracking)
/(a|aa)+/    -> /a+/       (only if alternation is redundant)
/(a|aa)+/    -> /(?>a|aa)+/ (atomic, avoids backtracking)
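
One way to validate a rewrite is to run both patterns over a corpus of representative inputs and list the inputs where they disagree. The sketch below uses PHP built-ins only; `diffPatterns()` is a hypothetical helper, and the corpus is a placeholder that should be replaced with inputs from the real code path.

```php
<?php
// Return the subjects on which two patterns disagree.
// Only the match outcome and the overall match (group 0) are compared;
// capture groups legitimately differ between the patterns and need their own checks.
function diffPatterns(string $original, string $rewrite, array $subjects): array
{
    $differences = [];
    foreach ($subjects as $subject) {
        $a = preg_match($original, $subject, $ma);
        $b = preg_match($rewrite, $subject, $mb);
        if ($a !== $b || ($ma[0] ?? null) !== ($mb[0] ?? null)) {
            $differences[] = $subject;
        }
    }
    return $differences;
}

// Keep subjects short when the original pattern is suspected of heavy backtracking.
$subjects = ['', 'a', 'aaa', 'baaa', 'aaab', 'aab aaa'];
var_dump(diffPatterns('/(a+)+$/', '/a++$/', $subjects));

// A possessive quantifier can change results: 'aaa' matches /a+a/ but not /a++a/,
// because a++ consumes every 'a' and never gives one back.
var_dump(diffPatterns('/a+a/', '/a++a/', ['aaa']));
```

An empty result only shows agreement on the tested inputs, so the corpus should include edge cases such as empty strings and near-misses.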

How it works

  • Regex::parse() splits the literal into pattern and flags.
  • The lexer produces a token stream.
  • The parser builds an AST (RegexNode).
  • Visitors walk the AST to validate, explain, analyze, or transform.

For the full architecture, see docs/ARCHITECTURE.md.
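
To make the first step concrete, here is a simplified sketch of splitting a literal into pattern and flags. It is not RegexParser's implementation: `splitLiteral()` is a hypothetical helper, and it assumes the closing delimiter is the last occurrence of that character in the literal.

```php
<?php
// Illustrative only: split a PCRE literal such as '/^a+$/i' into [pattern, flags].
// splitLiteral() is a hypothetical helper, not RegexParser API.
function splitLiteral(string $literal): array
{
    if ($literal === '') {
        throw new InvalidArgumentException('Empty regex literal.');
    }

    $open = $literal[0];
    // Bracket-style delimiters close with their counterpart; others close with themselves.
    $close = ['(' => ')', '[' => ']', '{' => '}', '<' => '>'][$open] ?? $open;

    // Assumption: the closing delimiter is the last occurrence of $close.
    $end = strrpos($literal, $close);
    if ($end === false || $end === 0) {
        throw new InvalidArgumentException('Missing closing delimiter.');
    }

    $flags = substr($literal, $end + 1);
    if (preg_match('/^[a-zA-Z]*\z/', $flags) !== 1) {
        throw new InvalidArgumentException('Unexpected characters after the closing delimiter.');
    }

    return [substr($literal, 1, $end - 1), $flags];
}

[$pattern, $flags] = splitLiteral('/^hello world$/i');
echo $pattern, ' | ', $flags, PHP_EOL;
```

The sketch does not check that the delimiter itself is legal or that each flag is a known modifier; a real parser would need to handle both.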

CLI quick tour

# Parse and validate a pattern
vendor/bin/regex parse '/^hello world$/'

# Get plain English explanation
vendor/bin/regex explain '/\d{4}-\d{2}-\d{2}/'

# Check for potential ReDoS risk (theoretical by default)
vendor/bin/regex analyze '/(a+)+$/'

# Colorize pattern for better readability
vendor/bin/regex highlight '/\d+/'

# Lint your entire codebase
vendor/bin/regex lint src/

PHP API at a glance

use RegexParser\Regex;
use RegexParser\ReDoS\ReDoSMode;

$regex = Regex::create([
    'runtime_pcre_validation' => true,
]);

// Parse a pattern into AST
$ast = $regex->parse('/^hello world$/i');

// Validate syntax and semantics
$result = $regex->validate('/(?<=test)foo/');
if (!$result->isValid()) {
    echo $result->getErrorMessage();
}

// Check for ReDoS risk (theoretical by default)
$analysis = $regex->redos('/(a+)+$/');
echo $analysis->severity->value; // 'critical', 'safe', etc.

// Optional: attempt bounded confirmation
$confirmed = $regex->redos('/(a+)+$/', mode: ReDoSMode::CONFIRMED);
echo $confirmed->isConfirmed() ? 'confirmed' : 'theoretical';

// Get human-readable explanation
echo $regex->explain('/\d{4}-\d{2}-\d{2}/');

Integrations

RegexParser integrates with common PHP tooling:

  • Symfony bundle: docs/guides/cli.md
  • PHPStan: vendor/yoeunes/regex-parser/extension.neon
  • Rector: Custom refactoring rules
  • GitHub Actions: vendor/bin/regex lint in your CI pipeline

Performance

RegexParser ships lightweight benchmark scripts in benchmarks/ to track parser, compiler, and formatter throughput.

  • Run formatter benchmarks: php benchmarks/benchmark_formatters.php
  • Run all benchmarks: for file in benchmarks/benchmark_*.php; do echo "Running $file"; php "$file"; echo; done

Documentation

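The documentation lives in the repository's docs/ directory. The Regex Tutorial and Quick Start Guide are the recommended entry points, and docs/ARCHITECTURE.md covers the internals.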

Contributing

Contributions are welcome! See CONTRIBUTING.md to get started.

# Set up development environment
composer install

# Run tests
composer phpunit

# Check code style
composer phpcs

# Run static analysis
composer phpstan

License

Released under the MIT License.

Support

If you run into issues or have questions, please open an issue on GitHub: https://github.com/yoeunes/regex-parser/issues.

Statistics

  • Total downloads: 6.29k
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 16
  • Views: 1
  • Dependents: 3
  • Suggesters: 0

GitHub information

  • Stars: 15
  • Watchers: 1
  • Forks: 2
  • Language: PHP

Other information

  • License: MIT
  • Last updated: 2025-11-17