
Python crawler notes: recording the crawl process, list data, pagination, POST requests, saving to a dictionary

Editor: Python

These are notes on building a crawler for a project I have been working on recently.

The website to be crawled is relatively simple.

The problem is that some of the website's data has to be fetched with POST.

For example, viewing the "Projects initiated" section requires a mouse click. At first I thought it was AJAX, but it is not; the data is fetched by JS.

On closer inspection, the page URL actually looks like this:

https://s*****view.php?id=GKUdgjKayCQvY

Part of the URL is omitted here. The URL itself looks unremarkable, but if you inspect the page in the browser, you can see that clicking "Projects initiated" triggers a JS action.

If there is only one page of results, you won't find this JS action. But if there are many pages and you have to click through them, you'll see that JS is involved.

This action includes a POST.

The specific parameters show up in the request, so the requested URL can actually be formed like this:

https://sd.zhiyuanyun.com/app/api/view.php?m=get_opps&type=2&id=89608371&p=3

Here, id is the identifier and p is the page number. The other parameters are defaults.
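As a minimal sketch of how that URL is assembled from its parameters, the snippet below uses the standard library's urlencode. The function name build_url and its default arguments are made up for illustration; only the endpoint and the parameter names come from the URL above.

```python
from urllib.parse import urlencode

BASE_URL = "https://sd.zhiyuanyun.com/app/api/view.php"


def build_url(org_id, page, method="get_opps", type_="2"):
    """Assemble the request URL; only id and p vary between requests."""
    # build_url and its defaults are illustrative names, not from the original post.
    query = urlencode({"m": method, "type": type_, "id": org_id, "p": page})
    return f"{BASE_URL}?{query}"


url = build_url("89608371", 3)
print(url)
```

Called with the id and page from the example above, this reproduces the same query string, which makes it easy to check the parameters before sending any request.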

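Then construct the request. Although the browser shows the action as a POST, the code below passes the parameters in the query string with requests.get, as in the URL above.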

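import requests
from scrapy import Selector  # assumed import; the original does not show where Selector comes from

def get_proj_number(id):
    # headers is assumed to be defined elsewhere in the script
    print(">>>>>>>> Now fetching the organization's total number of projects")
    params = (('m', 'get_opps'), ('type', '2'), ('id', id), ('p', "1"))
    response = requests.get(
        'https://sd.zhiyuanyun.com/app/api/view.php', headers=headers, params=params)
    # Pass the body text explicitly, since response is a requests response
    selector = Selector(text=response.text)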

With that, putting the p parameter into a for loop is all it takes to turn the pages.
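The page-turning loop could look roughly like the sketch below. It assumes the last page number is already known (for example, worked out from the project count); page_params, fetch_page and crawl_pages are hypothetical names, and the fetch function is passed in so the loop can be tried without touching the network.

```python
BASE_URL = "https://sd.zhiyuanyun.com/app/api/view.php"


def page_params(org_id, page):
    """Query parameters for one page; only id and p change."""
    return (("m", "get_opps"), ("type", "2"), ("id", str(org_id)), ("p", str(page)))


def fetch_page(org_id, page, headers=None):
    """Fetch one page of the list and return its HTML text."""
    import requests  # imported here so the helpers above work without requests installed

    response = requests.get(
        BASE_URL, headers=headers, params=page_params(org_id, page), timeout=10)
    response.raise_for_status()
    return response.text


def crawl_pages(org_id, last_page, fetch=fetch_page):
    """Loop p from 1 to last_page and collect whatever fetch returns."""
    pages = []
    for page in range(1, last_page + 1):
        pages.append(fetch(org_id, page))
    return pages
```

In real use, crawl_pages("89608371", 3) would request pages 1 to 3 in order; passing a stub as fetch lets you check the loop itself first.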

Saving the list data

The data page that finally comes back is a list.

So how should this list be saved?

The list contains th and td elements.

So I go straight to the td cells, make a list of them, and zip it once.

I ended up with a simpler approach: a single zip(list).
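One possible way to turn the th and td values into dictionaries is sketched below. It assumes the th texts and td texts have already been extracted into two flat lists; the function name rows_to_dicts and the sample values are made up for illustration and are not taken from the real site.

```python
def rows_to_dicts(th_texts, td_texts):
    """Pair each row of td values with the th column names."""
    width = len(th_texts)
    if width == 0:
        return []
    rows = []
    # td_texts is flat, so cut it into chunks of one table row each
    for start in range(0, len(td_texts), width):
        row = td_texts[start:start + width]
        rows.append(dict(zip(th_texts, row)))
    return rows


# Hypothetical sample values, standing in for the extracted cell texts
th_texts = ["name", "status", "date"]
td_texts = ["Project A", "open", "2021-01-01",
            "Project B", "closed", "2021-02-01"]

records = rows_to_dicts(th_texts, td_texts)
for record in records:
    print(record)
```

Each record is then a plain dict keyed by column name, which is easy to write out as JSON or CSV afterwards.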

