
Python Crawler Programming Ideas (151): Grabbing Data with Scrapy and Saving a Single Captured Item with ItemLoader


In the previous article, the parse method returned a MyscrapyItem object in order to save the captured data to the specified file. This article introduces another way to save data: ItemLoader.

Essentially, an ItemLoader object also saves data by returning an item; the difference is that the ItemLoader wraps the item and the response (the object used to obtain the response data from the server) together.

The ItemLoader class constructor has two commonly used parameters: item and response. item specifies the Item object (in this case a MyscrapyItem object), and response specifies the object from which the data is obtained (in this example, response, which is also the second parameter of the parse method).
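For illustration only, a minimal sketch of this constructor call inside parse might look like the fragment below (the myscrapy.items import path is an assumption; MyscrapyItem itself comes from the earlier articles in this series):

from scrapy.loader import ItemLoader
from myscrapy.items import MyscrapyItem  # assumed import path for the Item class

# Inside the spider class:
def parse(self, response):
    # item: the Item object that will hold the data; response: parse's second parameter
    loader = ItemLoader(item=MyscrapyItem(), response=response)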

The following example uses an ItemLoader object and XPath expressions to extract the title, abstract, and URL of the first article in the article list, and stores them in the title, abstract, and href fields respectively. When running the crawler, the "-o" command-line option specifies the file to save to (the file type is determined by its extension); after a successful run, the captured data is saved to that file.

import scrapy
from scrapy.loader import ItemLoader
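The rest of the example was cut off in this copy of the article. The sketch below shows what a complete spider along these lines could look like; the spider name, start URL, and XPath expressions are placeholders, since the real page structure is not shown here:

import scrapy
from scrapy.loader import ItemLoader


class MyscrapyItem(scrapy.Item):
    # Fields named in the article; normally defined in the project's items.py
    title = scrapy.Field()
    abstract = scrapy.Field()
    href = scrapy.Field()


class BlogSpider(scrapy.Spider):
    # Hypothetical spider name and start URL
    name = "BlogSpider"
    start_urls = ["https://www.example.com/blog"]

    def parse(self, response):
        # Wrap the Item and the response together in a single ItemLoader
        loader = ItemLoader(item=MyscrapyItem(), response=response)

        # Placeholder XPath expressions for the first entry in the article list
        loader.add_xpath("title", "//div[@class='post'][1]/h2/a/text()")
        loader.add_xpath("abstract", "//div[@class='post'][1]/p/text()")
        loader.add_xpath("href", "//div[@class='post'][1]/h2/a/@href")

        # load_item() builds and returns the populated MyscrapyItem.
        # Note: without output processors, each field value is stored as a list.
        yield loader.load_item()

Run from inside the Scrapy project, the "-o" option mentioned above would then write the yielded item to a file whose format is chosen by the extension, for example: scrapy crawl BlogSpider -o first_article.json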
