Video scraping, automatic commenting, and automatic liking with Python

Editor: Python

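Preface

Hi everyone, this is Qianqian, who loves looking at pretty girls.

Technology empowers people: it can raise each person's own sense of happiness.

On 快, users can record moments of their lives with photos and short videos, and can interact with fans in real time through livestreams.

Content on 快 covers every aspect of life, and its users are spread all across the country.

Here people can find content they like and people they are interested in, see a more real and interesting world, and let the world discover their own real and interesting selves.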

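Key points:

  • Capturing the network requests behind dynamically loaded data

  • Sending requests with requests

  • Parsing JSON data

Development environment:

  • Python 3.8, to run the code

  • PyCharm 2021.2, to help write the code

  • requests, a third-party Python module for sending HTTP requests to websites

Implementation steps:

  1. Send the request

  2. Get the data

  3. Parse the data

  4. Save the data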

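Video scraping code

I removed the site name from the URLs. Look at how the links are structured and fill it in yourself.

I will post which site this targets in the comments, so keep an eye out.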


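Import modules

import requests  # third-party module for sending requests
import re  # standard library, used to clean up file names

Disguise the request as a browser (request headers)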

headers = {

'content-type': 'application/json',
'Cookie': 'kpf=PC_WEB; kpn=KUAISHOU_VISION; clientid=3; did=web_d3f9d8c2cbebafd126b80eb0b1c13360; client_key=65890b29; didv=1658130458000; userId=270932146; kuaishou.server.web_st=ChZrdWFpc2hvdS5zZXJ2ZXIud2ViLnN0EqABymzXlGDinYWz3v5NKZWKq6Ld14uOvyRNPT3Gi7uJwI8CE4aatjowKRbPtRt5YIE3s2otZdFEzL7kvW1PQuijqUT_qUe4-u0FlfN1S49mhR4QRc9YKQNObXAPYzZRWIRcrSvdohIwUW8TBTSWLUtMlMh2He2FyvNMR-JfhUHaK-YSkwqXKUj-N-zlHTCPp0z0y6cSgrR9RIdlXqIJFifSbxoSsguEA2pmac6i3oLJsA9rNwKEIiB86mXKYIgbGBbtkVuyoy8TCIwZ2uckiTnfAGZiyV9imCgFMAE; kuaishou.server.web_ph=7353170c91b8f7f05c250730c2faea5355e1',
'Host': 'www..com',
'Origin': 'https://www..com',
'Referer': 'https://www..com/search/video?searchKey=%E6%B3%B3%E8%A3%85%E5%B0%8F%E5%A7%90%E5%A7%90',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'
}
for page in range(1, 11):
    # This payload is only sent with POST requests
    json = {

'operationName': "visionSearchPhoto",
'query': "fragment photoContent on PhotoEntity {\n id\n duration\n caption\n likeCount\n viewCount\n realLikeCount\n coverUrl\n photoUrl\n photoH265Url\n manifest\n manifestH265\n videoResource\n coverUrls {\n url\n __typename\n }\n timestamp\n expTag\n animatedCoverUrl\n distance\n videoRatio\n liked\n stereoType\n profileUserTopPhoto\n __typename\n}\n\nfragment feedContent on Feed {\n type\n author {\n id\n name\n headerUrl\n following\n headerUrls {\n url\n __typename\n }\n __typename\n }\n photo {\n ...photoContent\n __typename\n }\n canAddComment\n llsid\n status\n currentPcursor\n __typename\n}\n\nquery visionSearchPhoto($keyword: String, $pcursor: String, $searchSessionId: String, $page: String, $webPageArea: String) {\n visionSearchPhoto(keyword: $keyword, pcursor: $pcursor, searchSessionId: $searchSessionId, page: $page, webPageArea: $webPageArea) {\n result\n llsid\n webPageArea\n feeds {\n ...feedContent\n __typename\n }\n searchSessionId\n pcursor\n aladdinBanner {\n imgUrl\n link\n __typename\n }\n __typename\n }\n}\n",
'variables': {
'keyword': "泳裝小姐姐", 'pcursor': str(page), 'page': "search", 'searchSessionId': "MTRfMjcwOTMyMTQ2XzE2NTg5MjM5NDExODBf5rOz6KOF5bCP5aeQ5aeQXzE4NzQ"}
}
    url = 'https://www..com/graphql'
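The part of the payload that changes from page to page can be isolated in a small helper. This is only a sketch: `build_search_variables` is a hypothetical name that does not appear in the original, and the idea that `pcursor` is simply the page number as a string is taken from the loop above rather than from any documentation.

```python
def build_search_variables(keyword, page, session_id):
    """Build the `variables` part of the search payload for one page.

    Hypothetical helper that mirrors the dict used in the loop above.
    """
    return {
        'keyword': keyword,
        'pcursor': str(page),  # the article sends the page number as a string
        'page': 'search',
        'searchSessionId': session_id,
    }


# One variables dict per page, pages 1 to 10 as in the loop above.
# 'demo' and 'SESSION' are placeholder values.
all_variables = [build_search_variables('demo', page, 'SESSION') for page in range(1, 11)]
print(all_variables[0]['pcursor'], all_variables[-1]['pcursor'])
```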

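1. Send the request

    response = requests.post(url=url, headers=headers, json=json)

<Response [200]>: the request succeeded

<Response [404]>: the server could not find the resource you asked for

A successful request and actually receiving the data you want are two separate things.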
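To see what a numeric status code means, the standard library's `http.HTTPStatus` can translate it into its official phrase. A minimal sketch; `describe_status` is a hypothetical helper name.

```python
from http import HTTPStatus


def describe_status(code):
    """Return 'code phrase' for a numeric HTTP status, e.g. '404 Not Found'."""
    status = HTTPStatus(code)
    return f'{status.value} {status.phrase}'


for code in (200, 400, 404):
    print(describe_status(code))
```

With requests, the number is available as `response.status_code`, and `response.raise_for_status()` raises an exception for 4xx and 5xx responses.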

2. Get the data

.text: the response body as a string

.json(): the response body parsed into a dict

    json_data = response.json()
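The difference between the two can be shown without a network call. In this sketch a hand-written string stands in for `response.text`; the sample data is made up and only imitates the shape of the search response.

```python
import json

# Stand-in for response.text: the raw body as a string (made-up sample data).
text = '{"data": {"visionSearchPhoto": {"feeds": []}}}'

# Parsing the string yields the kind of nested dict that response.json() returns.
json_data = json.loads(text)

print(type(text).__name__)       # str
print(type(json_data).__name__)  # dict
print(json_data['data']['visionSearchPhoto']['feeds'])  # []
```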

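3. Parse the data

xpath and css: can only extract data that is present in the page's HTML source

re: works on any text, so it can be used when xpath, css, and json all fail (but it is more complicated)

json: can only extract from JSON-shaped data such as {"": ""} or ["", ""]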
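The trade-off between json and re can be seen by extracting the same value both ways. This is an illustrative sketch on a made-up body; the URL and caption are placeholders, not real data.

```python
import json
import re

# Made-up sample body shaped like the search response (not real data).
body = '{"data": {"visionSearchPhoto": {"feeds": [{"photo": {"caption": "demo", "photoUrl": "https://example.com/a.mp4"}}]}}}'

# json: parse once, then index by key. Works only because the body is valid JSON.
parsed = json.loads(body)
url_from_json = parsed['data']['visionSearchPhoto']['feeds'][0]['photo']['photoUrl']

# re: pull the value straight out of the text. Works on any text, but the
# pattern is tied to the exact formatting and is harder to maintain.
url_from_re = re.findall(r'"photoUrl":\s*"(.*?)"', body)[0]

print(url_from_json == url_from_re)
```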

    feeds = json_data['data']['visionSearchPhoto']['feeds']
    for feed in feeds:
        photoUrl = feed['photo']['photoUrl']
        caption = feed['photo']['caption']
        print(caption, photoUrl)
        # Strip characters that are not allowed in file names
        caption = re.sub(r'[\\/:*?"<>|\n]', '', caption)
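The `re.sub` call above exists because captions become file names. The same idea can be wrapped in a reusable function, sketched here with a fallback for captions that end up empty; `sanitize_filename` and its `fallback` parameter are hypothetical additions, not part of the original code.

```python
import re

# Characters that Windows does not allow in file names, plus line breaks.
INVALID_CHARS = re.compile(r'[\\/:*?"<>|\r\n]')


def sanitize_filename(caption, fallback='untitled'):
    """Strip characters that are unsafe in file names.

    Hypothetical helper; `fallback` covers captions that become empty.
    """
    cleaned = INVALID_CHARS.sub('', caption).strip()
    return cleaned or fallback


print(sanitize_filename('a/b:c?\n#tag'))  # abc#tag
print(sanitize_filename('???'))           # untitled
```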

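4. Save the video

In general, video, image, and audio links on most sites can be fetched directly with a GET request.

.content: the response body as raw bytes (here, the video's binary data)

        # Still inside the loop over feeds; the video/ directory must already exist
        video_data = requests.get(photoUrl).content
        with open(f'video/{caption}.mp4', mode='wb') as f:
            f.write(video_data)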
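Reading `.content` loads the whole video into memory before writing it. For large files, one alternative is to stream the body in chunks. The sketch below is not the author's method: `save_stream` is a hypothetical helper, and it assumes only that the response object offers `iter_content(chunk_size=...)`, as `requests.Response` does.

```python
from pathlib import Path


def save_stream(response, path, chunk_size=64 * 1024):
    """Write a streamed response body to `path` chunk by chunk.

    `response` is assumed to expose iter_content(chunk_size=...), as
    requests.Response does. Returns the number of bytes written.
    """
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    written = 0
    with open(path, mode='wb') as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty chunks
                written += f.write(chunk)
    return written
```

With requests, this could be called as `save_stream(requests.get(photoUrl, stream=True), f'video/{caption}.mp4')`, where `stream=True` defers downloading the body until it is iterated.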


Automatic commenting and liking

import requests


class KuaiShou():
    def __init__(self):
        self.headers = {

'content-type': 'application/json',
'Cookie': 'kpf=PC_WEB; kpn=KUAISHOU_VISION; clientid=3; did=web_d3f9d8c2cbebafd126b80eb0b1c13360; client_key=65890b29; didv=1658130458000; userId=270932146; kuaishou.server.web_st=ChZrdWFpc2hvdS5zZXJ2ZXIud2ViLnN0EqABymzXlGDinYWz3v5NKZWKq6Ld14uOvyRNPT3Gi7uJwI8CE4aatjowKRbPtRt5YIE3s2otZdFEzL7kvW1PQuijqUT_qUe4-u0FlfN1S49mhR4QRc9YKQNObXAPYzZRWIRcrSvdohIwUW8TBTSWLUtMlMh2He2FyvNMR-JfhUHaK-YSkwqXKUj-N-zlHTCPp0z0y6cSgrR9RIdlXqIJFifSbxoSsguEA2pmac6i3oLJsA9rNwKEIiB86mXKYIgbGBbtkVuyoy8TCIwZ2uckiTnfAGZiyV9imCgFMAE; kuaishou.server.web_ph=7353170c91b8f7f05c250730c2faea5355e1',
'Host': 'www..com',
'Origin': 'https://www..com',
'Referer': 'https://www..com/search/video?searchKey=%E6%B3%B3%E8%A3%85%E5%B0%8F%E5%A7%90%E5%A7%90',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'
}
        self.url = 'https://www..com/graphql'

    def getSearch(self, keyword, page):
        """Search for videos.

        :param keyword: search keyword
        :param page: page number
        :return: json_data
        """
        json = {

'operationName': "visionSearchPhoto",
'query': "fragment photoContent on PhotoEntity {\n id\n duration\n caption\n likeCount\n viewCount\n realLikeCount\n coverUrl\n photoUrl\n photoH265Url\n manifest\n manifestH265\n videoResource\n coverUrls {\n url\n __typename\n }\n timestamp\n expTag\n animatedCoverUrl\n distance\n videoRatio\n liked\n stereoType\n profileUserTopPhoto\n __typename\n}\n\nfragment feedContent on Feed {\n type\n author {\n id\n name\n headerUrl\n following\n headerUrls {\n url\n __typename\n }\n __typename\n }\n photo {\n ...photoContent\n __typename\n }\n canAddComment\n llsid\n status\n currentPcursor\n __typename\n}\n\nquery visionSearchPhoto($keyword: String, $pcursor: String, $searchSessionId: String, $page: String, $webPageArea: String) {\n visionSearchPhoto(keyword: $keyword, pcursor: $pcursor, searchSessionId: $searchSessionId, page: $page, webPageArea: $webPageArea) {\n result\n llsid\n webPageArea\n feeds {\n ...feedContent\n __typename\n }\n searchSessionId\n pcursor\n aladdinBanner {\n imgUrl\n link\n __typename\n }\n __typename\n }\n}\n",
'variables': {
'keyword': keyword, 'pcursor': str(page), 'page': "search",
'searchSessionId': "MTRfMjcwOTMyMTQ2XzE2NTg5MjM5NDExODBf5rOz6KOF5bCP5aeQ5aeQXzE4NzQ"}
}
        json_data = requests.post(url=self.url, headers=self.headers, json=json).json()
        return json_data

    def isLike(self, photoAuthorId, photoId):
        """Like a video.

        :param photoAuthorId: id of the video's author
        :param photoId: id of the video
        :return: json_data
        """
        json = {

'operationName': "visionVideoLike",
'query': "mutation visionVideoLike($photoId: String, $photoAuthorId: String, $cancel: Int, $expTag: String) {\n visionVideoLike(photoId: $photoId, photoAuthorId: $photoAuthorId, cancel: $cancel, expTag: $expTag) {\n result\n __typename\n }\n}\n",
'variables': {

'cancel': 0,
'expTag': "1_a/2001481596260506114_xpcwebsearchxxnull0",
'photoAuthorId': photoAuthorId,
'photoId': photoId
}
}
        json_data = requests.post(url=self.url, headers=self.headers, json=json).json()
        return json_data

    def postComment(self, content, photoAuthorId, photoId):
        """Post a comment.

        :param content: comment text
        :param photoAuthorId: id of the video's author
        :param photoId: id of the video
        :return: json_data
        """
        json = {

'operationName': "visionAddComment",
'query': "mutation visionAddComment($photoId: String, $photoAuthorId: String, $content: String, $replyToCommentId: ID, $replyTo: ID, $expTag: String) {\n visionAddComment(photoId: $photoId, photoAuthorId: $photoAuthorId, content: $content, replyToCommentId: $replyToCommentId, replyTo: $replyTo, expTag: $expTag) {\n result\n commentId\n content\n timestamp\n status\n __typename\n }\n}\n",
'variables': {

'content': content,
'expTag': "1_a/2001481596260506114_xpcwebsearchxxnull0",
'photoAuthorId': photoAuthorId,
'photoId': photoId
}
}
        json_data = requests.post(url=self.url, headers=self.headers, json=json).json()
        return json_data

    def getComment(self, photoId, pcursor):
        """Fetch comments.

        :param photoId: id of the video
        :param pcursor: page cursor
        :return: json_data containing the comments
        """
        json = {

'operationName': "commentListQuery",
'query': "query commentListQuery($photoId: String, $pcursor: String) {\n visionCommentList(photoId: $photoId, pcursor: $pcursor) {\n commentCount\n pcursor\n rootComments {\n commentId\n authorId\n authorName\n content\n headurl\n timestamp\n likedCount\n realLikedCount\n liked\n status\n subCommentCount\n subCommentsPcursor\n subComments {\n commentId\n authorId\n authorName\n content\n headurl\n timestamp\n likedCount\n realLikedCount\n liked\n status\n replyToUserName\n replyTo\n __typename\n }\n __typename\n }\n __typename\n }\n}\n",
'variables': {
'photoId': photoId, 'pcursor': str(pcursor)}
}
        json_data = requests.post(url=self.url, headers=self.headers, json=json).json()
        return json_data

    def getUserInfo(self, userId):
        """Fetch a user's profile information.

        :param userId: id of the user
        :return: json_data
        """
        json = {

'operationName': "visionProfile",
'query': "query visionProfile($userId: String) {\n visionProfile(userId: $userId) {\n result\n hostName\n userProfile {\n ownerCount {\n fan\n photo\n follow\n photo_public\n __typename\n }\n profile {\n gender\n user_name\n user_id\n headurl\n user_text\n user_profile_bg_url\n __typename\n }\n isFollowing\n __typename\n }\n __typename\n }\n}\n",
'variables': {
'userId': userId}
}
        json_data = requests.post(url=self.url, headers=self.headers, json=json).json()
        return json_data

    def getUserPhoto(self, userId, pcursor):
        """Fetch a user's videos.

        :param userId: id of the user
        :param pcursor: page cursor
        :return: json_data
        """
        json = {

'operationName': "visionProfilePhotoList",
'query': "fragment photoContent on PhotoEntity {\n id\n duration\n caption\n likeCount\n viewCount\n realLikeCount\n coverUrl\n photoUrl\n photoH265Url\n manifest\n manifestH265\n videoResource\n coverUrls {\n url\n __typename\n }\n timestamp\n expTag\n animatedCoverUrl\n distance\n videoRatio\n liked\n stereoType\n profileUserTopPhoto\n __typename\n}\n\nfragment feedContent on Feed {\n type\n author {\n id\n name\n headerUrl\n following\n headerUrls {\n url\n __typename\n }\n __typename\n }\n photo {\n ...photoContent\n __typename\n }\n canAddComment\n llsid\n status\n currentPcursor\n __typename\n}\n\nquery visionProfilePhotoList($pcursor: String, $userId: String, $page: String, $webPageArea: String) {\n visionProfilePhotoList(pcursor: $pcursor, userId: $userId, page: $page, webPageArea: $webPageArea) {\n result\n llsid\n webPageArea\n feeds {\n ...feedContent\n __typename\n }\n hostName\n pcursor\n __typename\n }\n}\n",
'variables': {
'userId': userId, 'pcursor': pcursor, 'page': "profile"}
}
        json_data = requests.post(url=self.url, headers=self.headers, json=json).json()
        return json_data


if __name__ == '__main__':
    kuaishou = KuaiShou()
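Every method of the class above repeats the same `requests.post(...).json()` call with a different payload. One possible way to reduce that repetition is to route all operations through a single method. This is a sketch under assumptions, not the author's design: `GraphQLClient` and its `post` method are hypothetical names, and `session` is any object with a requests-style `post(url, headers=..., json=...)` method, such as `requests.Session()`.

```python
class GraphQLClient:
    """Sketch of a client that funnels every operation through one method.

    `session` is any object with a requests-style
    post(url, headers=..., json=...) method, for example requests.Session().
    All names here are hypothetical.
    """

    def __init__(self, session, url, headers):
        self.session = session
        self.url = url
        self.headers = headers

    def post(self, operation_name, query, variables):
        """Send one GraphQL operation and return the decoded JSON body."""
        payload = {
            'operationName': operation_name,
            'query': query,
            'variables': variables,
        }
        response = self.session.post(self.url, headers=self.headers, json=payload)
        response.raise_for_status()  # fail loudly on 4xx/5xx responses
        return response.json()
```

A method such as getComment would then shrink to a single call like `self.post('commentListQuery', QUERY, {'photoId': photoId, 'pcursor': str(pcursor)})`, where `QUERY` stands for the query string shown above. Passing the session in also makes the class easy to exercise with a fake session instead of the live site.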

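Closing words

Thank you for reading my article. This flight ends here.

I hope this article was helpful and that you learned a little something.

Even the stars that are hiding are trying hard to shine, so keep working hard too (let's do our best together).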


