Package List - Packagist Composer Package Repository

content-extract/content-processor

Robust PHP library for batch document processing. Extracts content from PDFs/text and generates structured JSON according to user-defined schemas. Now with semantic structuring, OCR support for scanned PDFs, text normalization, and alias-driven field matching. Production-ready, secure, zero unnecess

Version: 1.5.0 Downloads: 17 Stars: 1 Clicks: 14

Time: 2026-04-19 15:27

yebto/watermark-api

PHP SDK for the YEB Watermark API. Add text, image, pattern watermarks to images, PDFs, video and audio.

Version: unknown Downloads: 0 Stars: 0 Clicks: 9

Time: 2026-03-06 22:22

1tomany/pdf-pack-bundle

Symfony bundle for the 1tomany/pdf-pack library

Version: v0.7.1 Downloads: 26 Stars: 1 Clicks: 10

Time: 2026-03-05 03:18

1tomany/pdf-pack

A simple PHP library that makes rasterizing pages and extracting text from PDFs for large language models easy

Version: v0.7.2 Downloads: 43 Stars: 4 Clicks: 10

Time: 2026-03-05 00:06

rembish/text-at-any-cost

Extract plain text from common document formats: DOC, PDF, PPT, RTF, DOCX, ODT, RAR

Version: v1.0.0 Downloads: 5 Stars: 70 Clicks: 6

Time: 2026-02-17 14:30

jcfrane/pdf-text-extractor

A Laravel PDF text extraction package with multiple strategies (PdfParser, XObject, AWS Textract, Tesseract OCR). Handles Canva-generated PDFs, scanned documents, and other edge cases with automatic fallback.

Version: v0.0.3 Downloads: 144 Stars: 2 Clicks: 9

Time: 2026-02-11 09:00

catchadmin/docloader

A PHP document loader and splitter for RAG applications

Version: v1.0.0 Downloads: 1 Stars: 0 Clicks: 8

Time: 2026-02-04 05:12

daniel-jorg-schuppelius/php-pdf-toolkit

PHP 8.2+ library for PDF text extraction with automatic reader selection. Supports embedded text and scanned documents via OCR.

Version: v0.12.2 Downloads: 260 Stars: 0 Clicks: 10

Time: 2026-01-26 11:55

aspose-cloud/aspose-words-cloud

Open, generate, edit, split, merge, compare and convert Word documents. Integrate Cloud API into your solutions to manipulate documents. Convert PDF to Word (DOC, DOCX, ODT, RTF and HTML) and in the opposite direction.

Version: 25.12.0 Downloads: 134.4k Stars: 33 Clicks: 11

Time: 2026-01-04 18:30

typo3/cms-indexed-search

TYPO3 CMS Indexed Search - Provides indexing functionality for TYPO3 pages and records as well as files including PDF, Word, HTML and plain text.

Version: v14.0.1 Downloads: 2.14M Stars: 8 Clicks: 8

Time: 2026-01-04 15:58

vaites/php-apache-tika

Apache Tika bindings for PHP: extracts text from documents and images (with OCR), metadata and more...

Version: v1.4.1 Downloads: 1.39M Stars: 117 Clicks: 8

Time: 2026-01-04 15:52

kartik-v/yii2-export

A library to export server/db data in various formats (e.g. excel, html, pdf, csv etc.)

Version: v1.4.3 Downloads: 3.01M Stars: 160 Clicks: 8

Time: 2026-01-04 15:25

spatie/pdf-to-text

Extract text from a pdf

Version: 1.55.0 Downloads: 5.85M Stars: 992 Clicks: 6

Time: 2026-01-04 10:27

phpoffice/phpword

PHPWord - A pure PHP library for reading and writing word processing documents (OOXML, ODF, RTF, HTML, PDF)

Version: 1.4.0 Downloads: 31.78M Stars: 7.52k Clicks: 14

Time: 2026-01-04 10:06

smalot/pdfparser

Pdf parser library. Can read and extract information from pdf file.

Version: v2.12.2 Downloads: 30.6M Stars: 2.63k Clicks: 7

Time: 2026-01-04 10:04