codechap/context-trimmer

Composer install command:

composer require codechap/context-trimmer

Package description

A tokenizer-agnostic text preprocessor to trim context for LLMs.

README

A tokenizer-agnostic text preprocessor for trimming context in LLM applications.

Requires PHP 8.2 or higher.

This library provides functions to process, trim, and optimize text for large language model (LLM) context windows. It includes options for removing short words, stripping extraneous punctuation, and compressing whitespace.
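To make the three steps above concrete, here is a minimal sketch in plain PHP of what removing short words, collapsing repeated punctuation, and compressing whitespace could look like. The function names (dropShortWords, collapsePunctuation, compressWhitespace) are hypothetical and are not part of codechap/context-trimmer; the library's own rules for what counts as "short" or "extraneous" may differ.

```php
<?php

declare(strict_types=1);

// Hypothetical helper, not part of the library: keeps only the
// whitespace-separated words that have at least $minLength characters.
function dropShortWords(string $text, int $minLength): string
{
    $words = preg_split('/\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $kept = array_filter(
        $words,
        // Counts Unicode characters with PCRE, so mbstring is not required.
        fn (string $word): bool => preg_match_all('/./us', $word) >= $minLength
    );

    return implode(' ', $kept);
}

// Hypothetical helper: collapses a run of the same punctuation mark
// into a single mark, for example "!!!" becomes "!".
function collapsePunctuation(string $text): string
{
    return preg_replace('/(\p{P})\1+/u', '$1', $text) ?? $text;
}

// Hypothetical helper: squeezes every run of whitespace into one space
// and strips leading and trailing whitespace.
function compressWhitespace(string $text): string
{
    return trim(preg_replace('/\s+/u', ' ', $text) ?? $text);
}

// Chain the three steps on a small sample string.
$sample = "Wait...   is  a  it   ready??";
echo compressWhitespace(collapsePunctuation(dropShortWords($sample, 2))), PHP_EOL;
```

Each step shortens the text without depending on any particular tokenizer, which is the general idea behind trimming context before it is sent to a model.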

Installation

Install via Composer:

composer require codechap/context-trimmer:"dev-master"

Usage

Create a file (for example, run.php) with the following code to see the ContextTrimmer in action:

<?php

require_once 'vendor/autoload.php';

use codechap\ContextTrimmer\ContextTrimmer;

// Load your context from a file
$input = file_get_contents('context.txt');

// Configure and trim the input text using chained setters.
// The parentheses around `new ContextTrimmer()` are required before PHP 8.4.
$result = (new ContextTrimmer())
    ->set('removeShortWords', true)
    ->set('minWordLength', 2)
    ->set('removeExtraneous', true)
    ->set('maxTokens', 50)
    ->trim($input);

// Output the trimmed text segments as JSON
echo json_encode($result, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE);

In this example, ContextTrimmer is configured to remove short words (with minWordLength set to 2), strip extraneous punctuation, and cap each segment at 50 tokens. The trimmed output is returned as an array of text segments.
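The idea of returning an array of size-limited segments can be sketched as follows. How the library actually estimates tokens is not stated here, so this sketch assumes one whitespace-separated word counts as one token. The function name splitByWordBudget is hypothetical and is not part of codechap/context-trimmer.

```php
<?php

declare(strict_types=1);

// Hypothetical helper, not the library's implementation: splits text into
// segments of at most $maxTokens words, treating each whitespace-separated
// word as one token (an assumption made for illustration only).
function splitByWordBudget(string $text, int $maxTokens): array
{
    if ($maxTokens < 1) {
        throw new InvalidArgumentException('maxTokens must be at least 1.');
    }

    $words = preg_split('/\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    // array_chunk groups the words; each group is joined back into a string.
    return array_map(
        fn (array $chunk): string => implode(' ', $chunk),
        array_chunk($words, $maxTokens)
    );
}

// Print the segments as JSON, mirroring the output style of run.php.
$segments = splitByWordBudget('one two three four five', 2);
echo json_encode($segments, JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE), PHP_EOL;
```

A word count is only a rough stand-in for a real tokenizer, so a budget chosen this way should leave some headroom below the model's actual context limit.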

Running Tests

To run the tests, use:

composer test

License

This library is released under the MIT License. See the LICENSE file for details.

Contributing

Contributions and pull requests are welcome! Please follow the existing coding standards and include tests for new functionality.

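Statistics

  • Total downloads: 1
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 0
  • Views: 0
  • Dependents: 0
  • Recommendations: 0

GitHub information

  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Language: PHP

Other information

  • License: MIT
  • Last updated: 2025-02-03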