textualization/ropherta-tokenizer
Latest stable version: v0.0.7
Composer install command:
composer require textualization/ropherta-tokenizer
Package description
GPT3Tokenizer (BPE) with the RoBERTa-base vocabulary.
README
This is just a wrapper around GPT3Tokenizer using the HuggingFace RoBERTa vocab and merge files.
See the GPT3Tokenizer documentation for example usage (or the generated test case under tests/).
XLM Tokenizer
To use the multilingual version, the SentencePiece dependency must be initialized and an additional model file must be downloaded:
composer exec -- php -r "require 'vendor/autoload.php'; Textualization\SentencePiece\Vendor::check();"
composer exec -- php -r "require 'vendor/autoload.php'; Textualization\Ropherta\Tokenizer\Vendor::check();"
Sponsors
We thank our sponsor:
Statistics
- Total downloads: 98
- Monthly downloads: 0
- Daily downloads: 0
- Favorites: 1
- Views: 1
- Dependents: 1
- Suggesters: 0
Other information
- License: Apache-2.0
- Last updated: 2023-08-08