riodevnet/elephscraper
Latest stable version: v1.0.0
Composer install command:
composer require riodevnet/elephscraper
Package description
ElephScraper is a lightweight, PHP-native web scraping toolkit built on Guzzle and Symfony DomCrawler. It provides a clean interface for extracting HTML content, metadata, and structured data from web pages.
README documentation
Fast. Clean. Eleph-style scraping. 🐘⚡
🚀 Features
- ✅ Extract metadata: title, description, keywords, author, charset, canonical, and more
- ✅ Supports Open Graph, Twitter Card, CSRF tokens, and HTTP-equiv headers
- ✅ Extract headings, paragraphs, images, lists, and links
- ✅ Powerful filter() method with support for class/ID/tag-based selectors
- ✅ Return raw HTML or clean plain text
- ✅ Clean return types: string, array, or associative array
- ✅ Built with Guzzle + Symfony DomCrawler + CssSelector
📦 Installation
Install via Composer:
composer require riodevnet/elephscraper
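Requires PHP 7.4 or newer. Note that the filter() examples under Custom DOM Filtering use named arguments, a syntax that is only available in PHP 8.0 or newer.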
🛠️ Basic Usage
<?php
require_once __DIR__ . '/vendor/autoload.php';

use Riodevnet\Elephscraper\ElephScraper;

$scraper = new ElephScraper("https://example.com");

echo $scraper->title();        // "Welcome to Example.com"
echo $scraper->description();  // "Example site for testing"
print_r($scraper->h1());       // ["Main Title", "News"]
print_r($scraper->openGraph());
🧪 Available Methods
🔹 Page Metadata
$scraper->title();
$scraper->description();
$scraper->keywords();
$scraper->keywordString();
$scraper->charset();
$scraper->canonical();
$scraper->contentType();
$scraper->author();
$scraper->csrfToken();
$scraper->image();
🔹 Open Graph & Twitter Card
$scraper->openGraph();                   // All OG meta
$scraper->openGraph("og:title");         // Specific OG tag
$scraper->twitterCard();                 // All Twitter tags
$scraper->twitterCard("twitter:title");
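To make the shape of the Open Graph result more concrete, here is a minimal sketch of collecting `og:*` meta tags into an associative array. It uses PHP's built-in DOM extension rather than ElephScraper, and `extractOpenGraph` is a hypothetical helper name; the library's own implementation and return format may differ.

```php
<?php
// Sketch only: NOT ElephScraper's internals. Uses PHP's built-in DOM extension.

/**
 * Collect <meta property="og:*"> tags into [property => content].
 * (extractOpenGraph is a hypothetical helper, named for illustration.)
 */
function extractOpenGraph(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely perfectly valid; collect parser warnings quietly.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $tags = [];
    foreach ($xpath->query('//meta[starts-with(@property, "og:")]') as $meta) {
        $tags[$meta->getAttribute('property')] = $meta->getAttribute('content');
    }
    return $tags;
}

$html = <<<HTML
<html><head>
<meta property="og:title" content="Example Page">
<meta property="og:type" content="website">
<meta name="description" content="Not an OG tag">
</head><body></body></html>
HTML;

$og = extractOpenGraph($html);
print_r($og); // contains og:title and og:type, but not the description tag
```

Looking up a single tag, as in `openGraph("og:title")`, would then amount to reading one key from this array.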
🔹 Headings & Text
$scraper->h1();
$scraper->h2();
$scraper->h3();
$scraper->h4();
$scraper->h5();
$scraper->h6();
$scraper->p();
🔹 Lists
$scraper->ul(); // all <ul><li> text
$scraper->ol(); // all <ol><li> text
🔹 Images
$scraper->images();       // just src URLs
$scraper->imageDetails(); // src, alt, title
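The difference between the two calls above (plain `src` URLs versus `src`, `alt`, and `title` per image) can be sketched with PHP's built-in DOM extension. This is an illustration under assumptions, not ElephScraper's code: `extractImageDetails` is a hypothetical helper, and the library may represent missing attributes differently.

```php
<?php
// Sketch only: NOT ElephScraper's internals. Uses PHP's built-in DOM extension.

/**
 * Return one [src, alt, title] record per <img> element.
 * (extractImageDetails is a hypothetical helper, named for illustration.)
 */
function extractImageDetails(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // tolerate imperfect HTML
    $doc->loadHTML($html);
    libxml_clear_errors();

    $details = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $details[] = [
            'src'   => $img->getAttribute('src'),
            'alt'   => $img->getAttribute('alt'),
            'title' => $img->getAttribute('title'), // '' when the attribute is absent
        ];
    }
    return $details;
}

$html = <<<HTML
<html><body>
<img src="/logo.png" alt="Logo" title="Home">
<img src="/banner.jpg" alt="Banner">
</body></html>
HTML;

$details = extractImageDetails($html);
$srcs = array_column($details, 'src'); // the "just src URLs" view

print_r($srcs);
print_r($details);
```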
🔹 Links
$scraper->links();       // just hrefs
$scraper->linkDetails(); // full detail
🔍 Custom DOM Filtering
▸ Example: Filter Single Element
$scraper->filter(
    element: 'div',
    attributes: ['id' => 'main'],
    multiple: false,
    extract: ['.title', '#desc', 'p'],
    returnHtml: false
);
▸ Example: Filter Multiple Elements
$scraper->filter(
    element: 'div',
    attributes: ['class' => 'card'],
    multiple: true,
    extract: ['h2', '.subtitle', '#info'],
    returnHtml: false
);
▸ Example: Return HTML Content
$scraper->filter(
    element: 'section',
    attributes: ['class' => 'hero'],
    returnHtml: true
);
Extract selectors support:
- Tag names: h1, p, span, etc.
- Class: .className
- ID: #idName

Output keys are auto-normalized to the original selector.
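One plausible reading of the selector rules above is that each tag, `.class`, or `#id` selector is resolved inside the matched container and the result is keyed by the selector exactly as it was passed in. The sketch below illustrates that reading with PHP's built-in DOM extension. `selectorToXPath` and `extractBySelectors` are hypothetical helpers, attribute matching is simplified to exact equality, and ElephScraper's actual behavior may differ.

```php
<?php
// Sketch only: hypothetical helpers, NOT ElephScraper's implementation.

/** Translate a tag, .class, or #id selector into a relative XPath query. */
function selectorToXPath(string $selector): string
{
    if ($selector === '') {
        throw new InvalidArgumentException('Selector must not be empty.');
    }
    if ($selector[0] === '#') {
        return sprintf('.//*[@id="%s"]', substr($selector, 1));
    }
    if ($selector[0] === '.') {
        // Match one class name among possibly several space-separated ones.
        return sprintf(
            './/*[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
            substr($selector, 1)
        );
    }
    return './/' . $selector; // plain tag name
}

/** Text of the first match per selector, keyed by the selector as given. */
function extractBySelectors(string $html, string $element, array $attributes, array $selectors): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // tolerate imperfect HTML
    $doc->loadHTML($html);
    libxml_clear_errors();
    $xpath = new DOMXPath($doc);

    // Build a container query such as //div[@id="main"] (exact attribute match).
    $query = '//' . $element;
    foreach ($attributes as $name => $value) {
        $query .= sprintf('[@%s="%s"]', $name, $value);
    }
    $container = $xpath->query($query)->item(0);
    if ($container === null) {
        return [];
    }

    $result = [];
    foreach ($selectors as $selector) {
        $node = $xpath->query(selectorToXPath($selector), $container)->item(0);
        $result[$selector] = $node !== null ? trim($node->textContent) : null;
    }
    return $result;
}

$html = <<<HTML
<html><body>
<div id="main">
  <h2 class="title big">Hello</h2>
  <p>Intro text</p>
  <span id="desc">World</span>
</div>
</body></html>
HTML;

$result = extractBySelectors($html, 'div', ['id' => 'main'], ['.title', '#desc', 'p']);
print_r($result); // keys are '.title', '#desc', and 'p'
```

Keying by the original selector string keeps the call site and the result easy to line up, at the cost of keys that contain `.` and `#` characters.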
🤝 Contributing
Found a bug? Want to add features? Open an issue or create a pull request!
📄 License
MIT License © 2025 — ElephScraper
💡 Why ElephScraper?
ElephScraper is your dependable PHP elephant — strong, smart, and always ready to extract the right data.
Statistics
- Total downloads: 3
- Monthly downloads: 0
- Daily downloads: 0
- Favorites: 3
- Clicks: 0
- Dependent projects: 0
- Recommendations: 0
Other information
- License: MIT
- Last updated: 2025-07-03