
Python crawler programming idea (152): grabbing data with Scrapy and saving multiple pieces of captured data with ItemLoader


In the last article, I used ItemLoader to save a single piece of captured data. If you want to save multiple items, or all of the captured data, the parse method needs to return an array of MyscrapyItem objects.
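For reference, here is a minimal sketch of what myscrapy/items.py might contain. The field names title, url, and abstract are assumptions based on the data captured below; the exact definition comes from the previous article. The TakeFirst output processor makes each field load as a single value rather than a list:

import scrapy
from scrapy.loader.processors import TakeFirst

class MyscrapyItem(scrapy.Item):
    # Assumed fields matching the data captured in this example
    title = scrapy.Field(output_processor=TakeFirst())
    url = scrapy.Field(output_processor=TakeFirst())
    abstract = scrapy.Field(output_processor=TakeFirst())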

The following example still crawls the blog list page from the previous article, but it saves all of the blog data on the crawled page, including each blog's title, summary, and URL.

import scrapy
from scrapy.loader import ItemLoader
from bs4 import BeautifulSoup
from myscrapy.items import MyscrapyItem

class ItemLoaderSpider1(scrapy.Spider):
    name = 'ItemLoaderSpider1'
    start_urls = [
        'https://geekori.com/blogsCenter.php?uid=geekori'
    ]

    def parse(self, response):
        # parse returns an array of MyscrapyItem objects
        items = []
        # Get the HTML of every blog <section> node on the list page
        sectionList = response.xpath('//*[@id="all"]/div[1]/section').extract()
        # Process each blog entry through iteration; the loop body is
        # reconstructed from the article's description (extract each
        # blog's title, summary, and URL), so the details are assumptions
        for section in sectionList:
            bs = BeautifulSoup(section, 'html.parser')
            a = bs.find('a')  # the blog title link
            loader = ItemLoader(item=MyscrapyItem(), response=response)
            loader.add_value('title', a.text)
            loader.add_value('url', a.get('href'))
            loader.add_value('abstract', bs.find('p').text)
            items.append(loader.load_item())
        # Returning the list hands every item to Scrapy's item pipeline
        return items
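Assuming the project is set up as in the previous article, the spider can be run from the project directory with Scrapy's built-in feed export, for example:

scrapy crawl ItemLoaderSpider1 -o blogs.json

Each MyscrapyItem returned by parse then becomes one JSON object in blogs.json, so all of the captured blog entries are saved rather than just the first one.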
