phikhi/url-to-text

Latest stable version: v1.0.5

Composer install command:

composer require phikhi/url-to-text

Package description

Extract text from a URL

README documentation

README

Extract any text from a remote HTML page. 🚧 WORK IN PROGRESS (do not use) 🚧

Installation

composer require phikhi/url-to-text

Usage

Basic usage

use Phikhi\UrlToText\UrlToText;

$text = (new UrlToText())
    ->from('https://phikhi.com')
    ->extract()
    ->toArray();
/*
[
    'lorem ipsum dolor sit amet',
    'non gloriam sine audentes',
    '...'
];
*/

$text = (new UrlToText())
    ->from('https://phikhi.com')
    ->extract()
    ->toJson();
// '["lorem ipsum dolor sit amet","non gloriam sine audentes","..."]'

$text = (new UrlToText())
    ->from('https://phikhi.com')
    ->extract()
    ->toText();
/*
lorem ipsum dolor sit amet
non gloriam sine audentes
...
*/
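
The three output formats above can be read as different views of the same list of extracted strings. The following is a minimal sketch of that relationship using only PHP built-ins. It is not taken from the library's source: the `$texts` array is a hypothetical stand-in for the result of toArray(), and the use of json_encode() and a newline join is an assumption about how toJson() and toText() might behave.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the array returned by toArray().
$texts = ['lorem ipsum dolor sit amet', 'non gloriam sine audentes'];

// JSON view (assumed): a single string that encodes the list.
$json = json_encode($texts, JSON_THROW_ON_ERROR);

// Plain-text view (assumed): one extracted string per line.
$plain = implode("\n", $texts);

echo $json, PHP_EOL;
echo $plain, PHP_EOL;
```

Decoding the JSON string with json_decode($json, true) gives back the original array, which is a quick way to check that the formats carry the same data.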

Advanced usage

You can customize which tags are parsed:

$text = (new UrlToText())
    ->from('https://phikhi.com')
    ->allow(['div', 'span']) // will add these tags to the existing allowed tags array (H*, p, li, a).
    ->extract()
    ->toArray();

To replace the allowed tags array instead of extending it, pass overwrite: true as the second argument to the allow() method:

$text = (new UrlToText())
    ->from('https://phikhi.com')
    ->allow(['div', 'span'], overwrite: true) // will replace the existing allowed tags array with this one.
    ->extract()
    ->toArray();
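
The difference between extending and overwriting a tag list can be sketched as follows. The function `mergeTags()` and the `$defaults` array are hypothetical, written for illustration only. They are not part of phikhi/url-to-text, and the library's real default list may differ.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of phikhi/url-to-text.
 * Returns the tag list after applying $tags in extend or overwrite mode.
 */
function mergeTags(array $current, array $tags, bool $overwrite = false): array
{
    if ($overwrite) {
        // Overwrite mode: the new tags replace the current list entirely.
        return array_values(array_unique($tags));
    }

    // Extend mode: append the new tags and drop duplicates.
    return array_values(array_unique(array_merge($current, $tags)));
}

// Assumed subset of the defaults, for illustration only.
$defaults = ['h1', 'h2', 'p', 'li', 'a'];

$extended = mergeTags($defaults, ['div', 'span']);
$replaced = mergeTags($defaults, ['div', 'span'], overwrite: true);
```

In this sketch `$extended` keeps the defaults and appends div and span, while `$replaced` contains only div and span. The named argument syntax requires PHP 8.0 or later, as do the usage examples in this README.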

By default, script and style tags are stripped from the DOM before the allowed tags are extracted, which avoids unexpected results during extraction. You can customize the denied tags with the deny() method:

$text = (new UrlToText())
    ->from('https://phikhi.com')
    ->deny(['svg']) // will add the `svg` tag to the existing denied tags array (script, style).
    ->extract()
    ->toArray();

To replace the denied tags array instead of extending it, pass overwrite: true as the second argument to the deny() method:

$text = (new UrlToText())
    ->from('https://phikhi.com')
    ->deny(['svg'], overwrite: true) // will replace the existing denied tags array with this one.
    ->extract()
    ->toArray();
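
To illustrate why denied tags are stripped before allowed tags are read, here is a minimal sketch of a deny-then-allow pass built on PHP's DOM extension. The function `extractTexts()` is hypothetical and is not the library's implementation. It assumes the HTML is already available as a non-empty string and uses ASCII input to sidestep encoding concerns.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of phikhi/url-to-text.
 * Removes denied tags, then collects the trimmed text of each allowed tag.
 */
function extractTexts(string $html, array $allowed, array $denied): array
{
    if ($allowed === []) {
        return [];
    }

    $dom = new DOMDocument();
    // Collect libxml warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    // Step 1: drop denied elements together with everything inside them.
    foreach ($denied as $tag) {
        // Snapshot the matches first, because removal changes the tree.
        foreach (iterator_to_array($xpath->query('//' . $tag)) as $node) {
            $node->parentNode->removeChild($node);
        }
    }

    // Step 2: read the text of the allowed elements in document order.
    $expression = implode(' | ', array_map(fn (string $tag) => '//' . $tag, $allowed));
    $texts = [];
    foreach ($xpath->query($expression) as $node) {
        $text = trim($node->textContent);
        if ($text !== '') {
            $texts[] = $text;
        }
    }

    return $texts;
}

$html = '<html><body><h1>Title</h1><script>var x = 1;</script>'
    . '<p>Hello</p><svg><text>icon</text></svg></body></html>';

$texts = extractTexts($html, ['h1', 'p'], ['script', 'style', 'svg']);
```

Because the denied elements are removed first, text inside a script or svg element cannot reach the result even when it sits inside an allowed tag.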

Statistics

  • Total downloads: 14
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 0
  • Clicks: 4
  • Dependent projects: 0
  • Recommendations: 0

GitHub information

  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Language: PHP

Other information

  • License: MIT
  • Last updated: 2023-03-02