Python: Installing crawl4ai

1. Project URL:

https://github.com/unclecode/crawl4ai

2. Install with pip:

$ mkdir crawl4ai
$ cd crawl4ai/
$ python3 -m venv venv
$ source venv/bin/activate
(venv) liuhongdi@liuhongdi-pc:/data/python/crawl4ai$ pip install -U crawl4ai
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple

Run the post-installation setup command:

(venv) liuhongdi@liuhongdi-pc:/data/python/crawl4ai$ crawl4ai-setup
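
As a quick sanity check after the two commands above, the installed version can be read from inside the venv. This is a minimal sketch that uses only the standard library; the helper name `installed_version` is made up for illustration and is not part of crawl4ai.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("crawl4ai")
    if found is None:
        print("crawl4ai is not installed in this environment")
    else:
        print(f"crawl4ai {found} is installed")
```

Run it with the venv activated; if it reports that the package is missing, the pip step most likely ran against a different interpreter.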

3. Test the result:

import asyncio
from crawl4ai import AsyncWebCrawler


async def main():
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(
            # url="https://movie.douban.com/explore?support_type=movie&is_all=false&category=%E7%83%AD%E9%97%A8&type=%E5%85%A8%E9%83%A8",
            url="https://baidu.com",
            # js_code="window.scrollTo(0, document.body.scrollHeight);",
            timeout=6000,  # 6-second timeout
            # wait_for="document.querySelector('.drc-subject-card')",
            # wait_for="css:.drc-subject-card"
        )
        # Print the page content converted to Markdown
        print(result.markdown)
        # Dump the whole result object as a JSON string and print it
        result_json = result.model_dump_json()
        print(result_json)


if __name__ == "__main__":
    asyncio.run(main())
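
The JSON string printed by `result.model_dump_json()` can be very long, so it may help to look at a short preview of each top-level field first. The following is a rough sketch using only the standard library; the helper `summarize_json` and the sample payload are hypothetical, and the real field names in a crawl result may differ.

```python
import json


def summarize_json(json_text: str, max_len: int = 60) -> dict:
    """Map each top-level key of a JSON object to a short preview of its value."""
    data = json.loads(json_text)
    summary = {}
    for key, value in data.items():
        # Strings are previewed as-is; other values are re-serialized to JSON text.
        text = value if isinstance(value, str) else json.dumps(value, ensure_ascii=False)
        summary[key] = text if len(text) <= max_len else text[:max_len] + "..."
    return summary


if __name__ == "__main__":
    # Hypothetical stand-in for the string returned by result.model_dump_json().
    sample = json.dumps(
        {"url": "https://baidu.com", "success": True, "html": "<html>" + "x" * 200}
    )
    for key, preview in summarize_json(sample).items():
        print(f"{key}: {preview}")
```

In the test script, the same function could be called on the dumped string in place of printing it in full.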