content-extract/content-processor
Robust PHP library for batch document processing. Extracts content from PDFs/text and generates structured JSON according to user-defined schemas. Now with semantic structuring, OCR support for scanned PDFs, text normalization, and alias-driven field matching. Production-ready, secure, zero unnecess
Time: 2026-04-19 15:27
yebto/watermark-api
PHP SDK for the YEB Watermark API. Add text, image, and pattern watermarks to images, PDFs, video, and audio.
Time: 2026-03-06 22:22
1tomany/pdf-pack-bundle
Symfony bundle for the 1tomany/pdf-pack library
Time: 2026-03-05 03:18
1tomany/pdf-pack
A simple PHP library that makes rasterizing pages and extracting text from PDFs for large language models easy
Time: 2026-03-05 00:06
rembish/text-at-any-cost
Extract plain text from common document formats: DOC, PDF, PPT, RTF, DOCX, ODT, RAR
Time: 2026-02-17 14:30
jcfrane/pdf-text-extractor
A Laravel PDF text extraction package with multiple strategies (PdfParser, XObject, AWS Textract, Tesseract OCR). Handles Canva-generated PDFs, scanned documents, and other edge cases with automatic fallback.
Time: 2026-02-11 09:00
catchadmin/docloader
A PHP document loader and splitter for RAG applications
Time: 2026-02-04 05:12
daniel-jorg-schuppelius/php-pdf-toolkit
PHP 8.2+ library for PDF text extraction with automatic reader selection. Supports embedded text and scanned documents via OCR.
Time: 2026-01-26 11:55
aspose-cloud/aspose-words-cloud
Open, generate, edit, split, merge, compare and convert Word documents. Integrate Cloud API into your solutions to manipulate documents. Convert PDF to Word (DOC, DOCX, ODT, RTF and HTML) and in the opposite direction.
Time: 2026-01-04 18:30
typo3/cms-indexed-search
TYPO3 CMS Indexed Search - Provides indexing functionality for TYPO3 pages and records as well as files including PDF, Word, HTML and plain text.
Time: 2026-01-04 15:58
vaites/php-apache-tika
Apache Tika bindings for PHP: extracts text from documents and images (with OCR), metadata and more...
Time: 2026-01-04 15:52
kartik-v/yii2-export
A library to export server/db data in various formats (e.g. Excel, HTML, PDF, CSV)
Time: 2026-01-04 15:25
phpoffice/phpword
PHPWord - A pure PHP library for reading and writing word processing documents (OOXML, ODF, RTF, HTML, PDF)
Time: 2026-01-04 10:06
smalot/pdfparser
PDF parser library. Can read and extract information from PDF files.
Time: 2026-01-04 10:04