
        Scrapling: a Python web scraping library

        Collaborative post · 2025-02-19 17:32

        Scrapling is a Python web scraping library that is lightning fast, intelligent, and hard to detect.

        Features

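        • Fast, stealthy HTTP requests
        • Adapts to website changes and tracks elements intelligently
        • Excellent performance: 240 times faster than BeautifulSoup
        • Strong countermeasures against anti-scraping protection, making website defenses easy to bypass
        • Fast JSON serialization: 10 times faster than the standard library
        • Rich text processing: every string has built-in regular expression methods, cleaning methods, and more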
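The last feature above describes strings that carry their own regex and cleaning helpers. A minimal sketch of that idea, using only the standard library, might look like the following. `RichText` and its methods `clean`, `re_all`, and `re_first` are hypothetical names chosen for illustration; they are not Scrapling's actual classes or signatures.

```python
import re


class RichText(str):
    """Hypothetical str subclass for illustration; not Scrapling's real class."""

    def clean(self):
        # Collapse every run of whitespace (spaces, tabs, newlines) into one space
        # and drop leading/trailing whitespace.
        return RichText(" ".join(self.split()))

    def re_all(self, pattern):
        # Return every non-overlapping match of `pattern` as a RichText.
        return [RichText(m.group(0)) for m in re.finditer(pattern, self)]

    def re_first(self, pattern, default=None):
        # Return the first match, or `default` when nothing matches.
        m = re.search(pattern, self)
        return RichText(m.group(0)) if m else default


raw = RichText("  Price:\n\t $19.99   (was $24.50) ")
cleaned = raw.clean()                       # "Price: $19.99 (was $24.50)"
prices = cleaned.re_all(r"\$\d+\.\d{2}")    # ["$19.99", "$24.50"]
```

Because the results are still `RichText` values, calls can be chained, for example `raw.clean().re_first(r"\$\d+\.\d{2}")`.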

        Example code

        from scrapling import Fetcher

        fetcher = Fetcher(auto_match=False)

        # Send an HTTP GET request to a web page and create an Adaptor instance

        page = fetcher.get('https://quotes.toscrape.com/', stealthy_headers=True)

        # Get all text content from all HTML tags in the page except `script` and `style` tags

        page.get_all_text(ignore_tags=('script', 'style'))

        # Get all quote elements; each of these methods returns a list of strings directly (TextHandlers)

        quotes = page.css('.quote .text::text') # CSS selector

        quotes = page.xpath('//span[@class="text"]/text()') # XPath

        quotes = page.css('.quote').css('.text::text') # Chained selectors

        quotes = [element.text for element in page.css('.quote .text')] # Slower than bulk query above

        # Get the first quote element

        quote = page.css_first('.quote') # same as page.css('.quote').first or page.css('.quote')[0]

        # Tired of selectors? Use find_all/find

        # Get all 'div' HTML tags where one of the 'class' values is 'quote'

        quotes = page.find_all('div', {'class': 'quote'})

        # Same as

        quotes = page.find_all('div', class_='quote')

        quotes = page.find_all(['div'], class_='quote')

        quotes = page.find_all(class_='quote') # and so on...

        # Working with elements

        quote.html_content # Get Inner HTML of this element

        quote.prettify() # Prettified version of Inner HTML above

        quote.attrib # Get that element's attributes

        quote.path # DOM path to the element (list of all ancestors from the <html> tag down to the element itself)
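To make the `get_all_text(ignore_tags=...)` call above more concrete, here is a rough standard-library sketch of the general technique: walk the document and collect text nodes while skipping everything nested inside the ignored tags. `TextCollector` and `all_text` are hypothetical names, and the sample HTML is made up; this is not how Scrapling implements the method, only an illustration of the behavior the comment describes.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: gathers text nodes outside of ignored tags."""

    def __init__(self, ignore_tags=("script", "style")):
        super().__init__()
        self.ignore_tags = set(ignore_tags)
        self.skip_depth = 0  # greater than 0 while inside an ignored tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.ignore_tags:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.ignore_tags and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep non-empty text only when we are not inside an ignored tag.
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def all_text(html, ignore_tags=("script", "style"), separator="\n"):
    parser = TextCollector(ignore_tags)
    parser.feed(html)
    parser.close()
    return separator.join(parser.chunks)


sample = """
<html><head><style>.quote { color: red; }</style></head>
<body>
  <div class="quote"><span class="text">To be or not to be.</span></div>
  <script>console.log("ignored");</script>
  <div class="quote"><span class="text">Stay hungry.</span></div>
</body></html>
"""
text = all_text(sample)
```

With the default `ignore_tags`, `text` holds only the two quote strings; passing `ignore_tags=()` would also include the stylesheet and script contents.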
