guttedgarden/tiktoken
Latest stable version: 1.2.1
Composer install command:
composer require guttedgarden/tiktoken
Package description
PHP 7.4 version of tiktoken
README
A port of Yethee's tiktoken library to PHP 7.4 that does not depend on the Symfony package.
Installation
$ composer require guttedgarden/tiktoken
Usage
use guttedgarden\Tiktoken\EncoderProvider;

$provider = new EncoderProvider();

$encoder = $provider->getForModel('gpt-3.5-turbo-0301');
$tokens = $encoder->encode('Hello world!');
print_r($tokens);
// OUT: [9906, 1917, 0]

$encoder = $provider->get('p50k_base');
$tokens = $encoder->encode('Hello world!');
print_r($tokens);
// OUT: [15496, 995, 0]
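A common reason to encode text is to check its length in tokens against a limit. The following is a minimal sketch of that idea. `fitsTokenBudget()` is a hypothetical helper, not part of this package; it only assumes that the encoder exposes `encode(string)` and returns an array of token IDs, as in the usage shown above.

```php
<?php

/**
 * Hypothetical helper, not part of guttedgarden/tiktoken.
 *
 * @param object $encoder   Any object exposing encode(string): int[]
 * @param string $text      Text to measure
 * @param int    $maxTokens Upper bound on the number of tokens
 */
function fitsTokenBudget(object $encoder, string $text, int $maxTokens): bool
{
    // One array element per token, so count() gives the token length.
    return count($encoder->encode($text)) <= $maxTokens;
}
```

With the package installed, the encoder returned by `$provider->getForModel('gpt-3.5-turbo-0301')` could be passed as `$encoder`; per the output above, 'Hello world!' encodes to three tokens.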
Cache
The encoder relies on external vocabularies, so caching is enabled by default to avoid performance issues.
By default, the cache is stored in the system directory for temporary files.
You can override the cache directory via the TIKTOKEN_CACHE_DIR environment variable
or by calling EncoderProvider::setVocabCache():
use guttedgarden\Tiktoken\EncoderProvider;

$encProvider = new EncoderProvider();
$encProvider->setVocabCache('/path/to/cache');

// Using the provider
Disable cache
You can disable the cache, if needed, in one of the following ways:
- Set the environment variable TIKTOKEN_CACHE_DIR to an empty string.
- Programmatically:
use guttedgarden\Tiktoken\EncoderProvider;

$encProvider = new EncoderProvider();
$encProvider->setVocabCache(null); // disable the cache
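The cache rules described above (temporary directory by default, override via TIKTOKEN_CACHE_DIR, disabled when the variable is an empty string) can be sketched as a small lookup. This is an illustration of the documented behaviour, not the library's actual code: `resolveVocabCacheDir()` is a hypothetical function, and using `sys_get_temp_dir()` directly as the default is an assumption (the library may use a subdirectory of it).

```php
<?php

/**
 * Hypothetical illustration of the documented cache rules.
 * Returns the cache directory, or null when caching is disabled.
 */
function resolveVocabCacheDir(): ?string
{
    $dir = getenv('TIKTOKEN_CACHE_DIR');

    if ($dir === false) {
        // Variable not set: fall back to the temporary-files directory
        // (assumed default; the exact path used by the library may differ).
        return sys_get_temp_dir();
    }

    // An empty string disables the cache; anything else is the directory.
    return $dir === '' ? null : $dir;
}
```

The result matches what one would pass to `setVocabCache()`: a directory path to enable caching there, or `null` to disable it.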
Limitations
- Encoding for GPT-2 is not supported.
- Special tokens (like <|endofprompt|>) are not supported.
License
Statistics
- Total downloads: 17.49k
- Monthly downloads: 0
- Daily downloads: 0
- Favorites: 4
- Views: 1
- Dependents: 0
- Suggesters: 0
Other information
- License: MIT
- Last updated: 2023-10-13