sundance-solutions/larachain-token-count

Latest stable version: 1.0.0

Composer install command:

composer require sundance-solutions/larachain-token-count

Package description

Quick helper to count tokens

README

GO USE https://github.com/yethee/tiktoken-php 👉

Everything below is superseded by the above ☝️

GPT-3 Approximate Token Counter in PHP

This repository contains a PHP function that approximates the token count of a text string, following the tokenization rules used by OpenAI's GPT-3.

GPT-3, a language model developed by OpenAI, reads text in chunks called tokens. A token can be as short as one character or as long as one word (e.g., 'a', 'apple'). In languages with more complex scripts (such as Chinese or Japanese), a single character can map to multiple tokens. Spaces and punctuation are also considered separate tokens.

The function provided here offers an approximation of how GPT-3 might tokenize a given string, counting words, spaces, and punctuation as separate tokens. This allows you to estimate the number of tokens in a text string without making an API call, which can be useful for monitoring usage or avoiding unnecessary costs.
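The counting rule described above can be sketched in a few lines. This is only an illustration of the idea, not the package's actual implementation: the function name `approximateTokenCount` and the regular expression are assumptions made for this example.

```php
<?php

declare(strict_types=1);

/**
 * Rough token estimate (hypothetical helper, not the package's code):
 * each word, each run of whitespace, and each remaining character
 * (such as punctuation) counts as one token.
 */
function approximateTokenCount(string $text): int
{
    // [\p{L}\p{N}]+   a run of letters/digits  -> one token
    // \s+             a run of whitespace      -> one token
    // [^\s\p{L}\p{N}] any other single char    -> one token
    $count = preg_match_all('/[\p{L}\p{N}]+|\s+|[^\s\p{L}\p{N}]/u', $text);

    // preg_match_all() returns false on error (e.g. invalid UTF-8 input).
    return $count === false ? 0 : $count;
}

// "Hello" + "," + " " + "world" + "!" gives 5 estimated tokens.
echo approximateTokenCount('Hello, world!'), PHP_EOL; // prints 5
```

A real BPE tokenizer will usually produce different counts, so a sketch like this is best treated as a rough estimate.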

Please note that this is a simplified approximation, and GPT-3's actual tokenization may differ slightly. In particular, some words may be split into multiple tokens if they contain special characters or are very long. Additionally, this method may not count tokens accurately for languages other than English, especially those written in non-Latin scripts.

As of the last update in September 2021, OpenAI has not provided a public method for accurately counting tokens the way GPT-3 does. Therefore, this function is an estimation, not a guaranteed accurate count.

Installation

You can install the package via composer:

composer require sundance-solutions/larachain-token-count

Usage

use SundanceSolutions\LarachainTokenCount\Facades\LarachainTokenCount;

$text = "Your document text...";
$results = LarachainTokenCount::count($text);
expect($results)->toEqual(8);
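One way to use an estimate like this for monitoring usage is a budget check before sending text to the API. The sketch below is a minimal, self-contained example: `fitsTokenBudget` and the stand-in `$wordCounter` are hypothetical names introduced here. In a Laravel app you could presumably pass a closure that calls `LarachainTokenCount::count($text)` instead, assuming it returns an integer as the usage example suggests.

```php
<?php

declare(strict_types=1);

/**
 * Returns true when the estimated token count of $text is within $limit.
 * $counter is any callable that maps a string to an integer estimate.
 */
function fitsTokenBudget(string $text, int $limit, callable $counter): bool
{
    return $counter($text) <= $limit;
}

// Stand-in counter for this sketch only: counts words, nothing else.
$wordCounter = static fn (string $text): int => str_word_count($text);

// Three words against a budget of 8, so the text fits.
var_dump(fitsTokenBudget('Your document text...', 8, $wordCounter)); // bool(true)
```

Passing the counter as a callable keeps the check independent of any particular tokenizer, so the estimate can later be swapped for an exact one.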

Testing

composer test

Changelog

Please see CHANGELOG for more information on what has changed recently.

Contributing

Please see CONTRIBUTING for details.

Security Vulnerabilities

Please review our security policy on how to report security vulnerabilities.

Credits

License

The MIT License (MIT). Please see License File for more information.

Statistics

  • Total downloads: 8.64k
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 4
  • Clicks: 1
  • Dependents: 0
  • Suggesters: 0

GitHub information

  • Stars: 4
  • Watchers: 1
  • Forks: 1
  • Language: PHP

Other information

  • License: MIT
  • Last updated: 2023-05-31