How to start with Crawler & ZyteCrawler
Utilizing Different Crawlers with EasyAI
EasyAI offers two crawlers for adding web crawling to your application: the basic Crawler and the more advanced ZyteCrawler, which uses the Zyte API. The examples below show how to work with both.
Using the Crawler
use EasyAI\Tools\Crawler;
$crawler = new Crawler();
To fetch the full HTML content of a webpage:
// one URL
$html = $crawler->getHtml('https://hosono.ai/en');
// multiple URLs
$urls = [
'https://hosono.ai/en',
'https://hosono.ai/de'
];
$html = $crawler->getHtmls($urls);
Or, to retrieve only the textual content:
// one URL
$text = $crawler->getText('https://hosono.ai/en');
// multiple URLs
$urls = [
'https://hosono.ai/en',
'https://hosono.ai/de'
];
$text = $crawler->getTexts($urls);
Using the ZyteCrawler
The ZyteCrawler works similarly to the Crawler but offers access to the Zyte API's advanced features. Depending on your specific requirements, you can choose the most suitable crawler for your project.
use EasyAI\Tools\ZyteCrawler;
$zyteCrawler = new ZyteCrawler();
To fetch the full HTML content of a webpage:
// one URL
$html = $zyteCrawler->getHtml('https://hosono.ai/en');
// multiple URLs
$urls = [
'https://hosono.ai/en',
'https://hosono.ai/de'
];
$html = $zyteCrawler->getHtmls($urls);
Or, to retrieve only the textual content:
// one URL
$text = $zyteCrawler->getText('https://hosono.ai/en');
// multiple URLs
$urls = [
'https://hosono.ai/en',
'https://hosono.ai/de'
];
$text = $zyteCrawler->getTexts($urls);
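Because both classes are shown with the same four method names (getHtml, getHtmls, getText, getTexts), calling code can be written once and handed either crawler. The sketch below illustrates that idea and is not part of EasyAI: `StubCrawler` and `textsByUrl` are hypothetical names, and the stub exists only so the snippet runs without the library installed. It also assumes that `getTexts()` returns an array with one entry per URL, in input order, which the examples above do not confirm.

```php
<?php
// Hypothetical stand-in so this sketch runs without EasyAI installed.
// It mirrors only the method names used on this page; the real
// Crawler and ZyteCrawler classes may return different types.
final class StubCrawler
{
    public function getText(string $url): string
    {
        return "text of {$url}";
    }

    public function getTexts(array $urls): array
    {
        return array_map(fn (string $url): string => $this->getText($url), $urls);
    }
}

/**
 * Fetch text for several URLs with whichever crawler is passed in,
 * and key each result by its URL.
 *
 * Assumption (not confirmed by the docs): getTexts() returns one
 * entry per URL, in the same order as the input array.
 */
function textsByUrl(object $crawler, array $urls): array
{
    $texts = array_values($crawler->getTexts($urls));
    if (count($texts) !== count($urls)) {
        throw new RuntimeException('Expected one result per URL.');
    }
    return array_combine($urls, $texts);
}

$urls = ['https://hosono.ai/en', 'https://hosono.ai/de'];

// In a real project, pass `new Crawler()` or `new ZyteCrawler()` instead.
$pages = textsByUrl(new StubCrawler(), $urls);

foreach ($pages as $url => $text) {
    echo $url, ' => ', strlen($text), " characters\n";
}
```

Keying the results by URL keeps each page's text attached to its source, and the count check fails loudly if the order or length assumption does not hold for the real classes.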