
It doesn't exist! Python says a browser that doesn't give up data doesn't exist!

Editor: Python

Does your code ever leave you completely confused?

Why can other people scrape website xx successfully, while my requests never return any data?

When this happens, it is often because the request wasn't disguised well enough and got identified.

It's like people: you have to wear clothes when you go out, right? If you walk around outside without them, you'll be the most conspicuous person on the street. Who else are they going to catch?

The other common case: my script ran fine before, so why won't it run now? Instead the site throws a message back at me: "The system detects that you frequently visit, please come back later"

All right! Let's take a serious look at how to deal with this.

To disguise a request, first think about how real people access websites.

This time we're talking about disguising the request headers. Before you crawl a website's data, put yourself in the owner's position: if someone else were crawling your data, what would you do?

Would you let others hammer your server with requests whenever they like? Or would you take some countermeasures too?

For example, say I have a website, you've worked out the address of my data, and now you want to crawl it with Python...

Here is a simple example of a service that can be requested:

from flask import Flask

app = Flask(__name__)

# A single endpoint that returns some placeholder data
@app.route('/getInfo')
def hello_world():
    return "Pretend there's a lot of data here"

if __name__ == "__main__":
    app.run(debug=True)

OK, suppose you've now worked out my address.

In other words, you know that requesting /getInfo returns the data.

Feeling great, you start sending requests:

import requests

url = 'http://127.0.0.1:5000/getInfo'
response = requests.get(url)
print(response.text)

Sure enough, you get the data this time.

But wait! Something seems off to me, so I decide to look at the headers of the incoming request:

from flask import Flask, request

app = Flask(__name__)

@app.route('/getInfo')
def hello_world():
    # Log the headers of every incoming request
    print(request.headers)
    return "Pretend there's a lot of data here"

if __name__ == "__main__":
    app.run(debug=True)

And this is what I see in the headers:

Host: 127.0.0.1:5000
User-Agent: python-requests/2.21.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive

User-Agent: python-requests/2.21.0

You're hitting my site straight from a Python script, and you think I won't block you?

So I add a check, and now you can't get the data:

from flask import Flask, request

app = Flask(__name__)

@app.route('/getInfo')
def hello_world():
    # Refuse clients whose User-Agent starts with "python" (e.g. python-requests)
    if str(request.headers.get('User-Agent')).startswith('python'):
        return "The system detects that you frequently visit, please come back later"
    else:
        return "Pretend there's a lot of data here"

if __name__ == "__main__":
    app.run(debug=True)
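If you want to play with the check itself without starting Flask, it can be pulled out into a small function. This is only a sketch: the name is_script_user_agent and the prefix list are made up for illustration and are not part of the site above. Two things differ from the check in the article, both on purpose: the comparison ignores case (Python's built-in urllib announces itself as "Python-urllib/...", with a capital P, which startswith('python') would miss), and a missing header is treated as suspicious.

```python
# Illustrative prefixes of default User-Agent strings sent by common
# HTTP tools. This list is an assumption for the sketch, not a complete set.
SCRIPT_PREFIXES = ("python-requests", "python-urllib", "curl/", "wget/")


def is_script_user_agent(user_agent):
    """Return True when the User-Agent looks like a default tool/library client."""
    if not user_agent:
        # Assumption: a request without any User-Agent is treated as a script.
        return True
    # Lower-case first so "Python-urllib/3.x" is caught as well.
    return user_agent.strip().lower().startswith(SCRIPT_PREFIXES)


if __name__ == "__main__":
    print(is_script_user_agent("python-requests/2.21.0"))
    print(is_script_user_agent("Mozilla/5.0 (Windows NT 6.1; Win64; x64)"))
```

Inside the Flask view, the condition would then become something like is_script_user_agent(request.headers.get('User-Agent')).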

Now when you send your request:

import requests

if __name__ == '__main__':
    url = 'http://127.0.0.1:5000/getInfo'
    response = requests.get(url)
    print(response.text)

The result is:

"The system detects that you frequently visit, please come back later"

I've caught you. You still want the data, so what do you do?

Disguise yourself. Python isn't allowed in, but a browser is, so you can modify your request headers to look like one.

First visit the page in a browser, then capture that request and read its headers.

You can also read the headers in Chrome's developer tools (the Network panel).
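Headers copied out of the browser usually arrive as a block of "Name: value" lines, so it can be handy to turn that text into a dict. Here is a minimal sketch of that step; parse_raw_headers is a made-up helper name, and the sample text simply reuses header names shown earlier in this article. It assumes one header per line and skips HTTP/2 pseudo-headers such as ":method", which the developer tools may list alongside the real ones.

```python
def parse_raw_headers(raw):
    """Turn a pasted block of 'Name: value' lines into a dict of headers."""
    headers = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines and HTTP/2 pseudo-headers like ":authority".
        if not line or line.startswith(":"):
            continue
        # Split on the first colon only, so values such as
        # "127.0.0.1:5000" keep their own colons.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        headers[name.strip()] = value.strip()
    return headers


if __name__ == "__main__":
    copied = """
    Host: 127.0.0.1:5000
    User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64)
    Accept: */*
    """
    print(parse_raw_headers(copied))
```

The resulting dict can be passed to requests.get through its headers argument. In practice you may only need a few of the copied headers (often just User-Agent), so treat the full paste as a starting point.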


Once you have the header information, you can pass it to the requests module and get access easily.

Okay, now you've learned to pretend to be a browser:

import requests

if __name__ == '__main__':
    # Send a browser-like User-Agent instead of the default python-requests one
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36'
    }
    url = 'http://127.0.0.1:5000/getInfo'
    response = requests.get(url, headers=headers)
    print(response.text)

Request it again and you'll find that the response is:

Pretend there's a lot of data here

OK, you've got the data again.
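If you'd like to replay the whole story in one file without installing anything, here is a rough sketch that swaps in Python's standard library for both sides: http.server stands in for the Flask site and urllib.request stands in for requests. The names Handler, fetch and demo are made up for this sketch. The server check is lower-cased because urllib's default User-Agent starts with a capital "Python-urllib", and the server binds to port 0 so the OS picks a free port.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

BLOCKED = "The system detects that you frequently visit, please come back later"
DATA = "Pretend there's a lot of data here"
BROWSER_UA = ("Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36")


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Same idea as the Flask check: refuse User-Agents starting with "python".
        ua = self.headers.get("User-Agent", "")
        body = BLOCKED if ua.lower().startswith("python") else DATA
        payload = body.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch(url, headers=None):
    """GET the url with optional extra headers, ignoring any system proxy."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(url, headers=headers or {})
    with opener.open(request) as response:
        return response.read().decode("utf-8")


def demo():
    """Return (response with default headers, response with a browser User-Agent)."""
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/getInfo"
    try:
        return fetch(url), fetch(url, {"User-Agent": BROWSER_UA})
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for text in demo():
        print(text)
```

The first request goes out with the library's default User-Agent and the second with the browser string from this article, so the two responses should mirror what happened above with requests and Flask.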

All right, that's the end of this article. If it helped you, give it a like and save it!

I'm Panda, and I'll see you in the next article.

