MongoDB is a non-relational database written in C++. It is an open-source database system based on distributed file storage. Its documents are stored in a JSON-like form, and field values can contain other documents, arrays, and arrays of documents, which makes it very flexible. In this section, let's look at how to store data in MongoDB from Python 3.
Before we start, make sure MongoDB is installed and its service is running, and that the PyMongo library for Python is installed.
To connect to MongoDB, we use MongoClient from the PyMongo library. Generally, we pass in MongoDB's IP address and port: the first parameter is the address host, and the second parameter is the port port (if it is omitted, the port defaults to 27017):
import pymongo
client = pymongo.MongoClient(host='localhost', port=27017)
This creates a MongoDB connection object.
Alternatively, the first parameter host of MongoClient can be given a MongoDB connection string directly, which starts with mongodb, for example:
client = pymongo.MongoClient('mongodb://localhost:27017/')
This achieves the same connection.
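To see how the two forms line up, the following minimal sketch uses only Python's standard library to split a connection string into the host and port that would otherwise be passed as keyword arguments. The helper name split_mongo_uri is hypothetical and is not part of PyMongo; PyMongo does its own, more complete parsing (credentials, multiple hosts, options), which this sketch ignores.

```python
from urllib.parse import urlsplit

DEFAULT_PORT = 27017  # MongoDB's default port


def split_mongo_uri(uri):
    """Return (host, port) for a simple single-host mongodb:// URI.

    Hypothetical helper for illustration only; not a PyMongo API.
    """
    parts = urlsplit(uri)
    if parts.scheme != 'mongodb':
        raise ValueError('expected a URI starting with mongodb://')
    # Fall back to the default port when the URI does not name one
    return parts.hostname, parts.port or DEFAULT_PORT


host, port = split_mongo_uri('mongodb://localhost:27017/')
print(host, port)
```

The returned pair corresponds to the host and port keyword arguments used in the first form of the call.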
Multiple databases can be created in MongoDB, so next we need to specify which database to operate on. Here we use the test database as an example and select it in the program:
db = client.test
This accesses the test attribute of client, which returns the test database. We can also specify it like this:
db = client['test']
The two forms are equivalent.
Each MongoDB database contains many collections, which are similar to tables in a relational database.
Next, specify the collection to operate on. Here we use a collection named students. As with databases, there are two ways to specify a collection:
collection = db.students
collection = db['students']
This gives us a Collection object.
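The equivalence of attribute access and bracket access can be illustrated with a small stand-in class. This is only a sketch of the general Python mechanism (__getattr__ and __getitem__); the class name FakeClient is hypothetical, and PyMongo's real classes do considerably more. The bracket form is the one to use when a name is not a valid Python identifier, such as a name containing a hyphen.

```python
class FakeClient:
    """Hypothetical stand-in showing attribute and item access side by side."""

    def __getitem__(self, name):
        # Bracket access: works for any string name
        return 'database:' + name

    def __getattr__(self, name):
        # Attribute access: only called for names not found normally,
        # and simply delegates to bracket access
        return self[name]


fake = FakeClient()
print(fake.test == fake['test'])
print(fake['my-db'])
```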
Next, we can insert data. For the students collection, create a new student record, represented as a dictionary:
student = {
    'id': '20170101',
    'name': 'Jordan',
    'age': 20,
    'gender': 'male'
}
This specifies the student's ID, name, age, and gender. Next, call the insert() method of collection to insert the data:
result = collection.insert(student)
print(result)
In MongoDB, every document has an _id attribute that uniquely identifies it. If this attribute is not specified explicitly, MongoDB automatically generates an _id of type ObjectId. The insert() method returns the _id value after it executes.
The result is as follows:
5932a68615c2606814c91f3d
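An ObjectId is a 12-byte value whose first 4 bytes hold the creation time as seconds since the Unix epoch. The following minimal sketch decodes that timestamp from the hexadecimal string printed above, using only the standard library. The function name objectid_timestamp is hypothetical; PyMongo's own ObjectId class exposes the same information through its generation_time attribute.

```python
from datetime import datetime, timezone


def objectid_timestamp(hex_id):
    """Decode the creation time stored in the first 4 bytes of an ObjectId.

    Hypothetical helper for illustration; takes the 24-character hex form.
    """
    if len(hex_id) != 24:
        raise ValueError('an ObjectId has 24 hexadecimal characters')
    seconds = int(hex_id[:8], 16)  # first 4 bytes = 8 hex characters
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


created = objectid_timestamp('5932a68615c2606814c91f3d')
print(created.isoformat())
```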
We can also insert multiple documents at once by passing them as a list. For example:
student1 = {
    'id': '20170101',
    'name': 'Jordan',
    'age': 20,
    'gender': 'male'
}
student2 = {
    'id': '20170202',
    'name': 'Mike',
    'age': 21,
    'gender': 'male'
}
result = collection.insert([student1, student2])
print(result)
The return value is the corresponding list of _id values:
[ObjectId('5932a80115c2606a59e8a048'), ObjectId('5932a80115c2606a59e8a049')]
In fact, the insert() method is no longer officially recommended in PyMongo 3.x. It still works in that version, but it was removed in PyMongo 4.0. The officially recommended methods are insert_one() and insert_many(), which insert a single document and multiple documents respectively. For example:
student = {
    'id': '20170101',
    'name': 'Jordan',
    'age': 20,
    'gender': 'male'
}
result = collection.insert_one(student)
print(result)
print(result.inserted_id)
The result is as follows:
<pymongo.results.InsertOneResult object at 0x10d68b558>
5932ab0f15c2606f0c1cf6c5
Unlike insert(), this method returns an InsertOneResult object, and we can read its inserted_id attribute to get the _id.
For insert_many(), we pass the data as a list. For example:
student1 = {
    'id': '20170101',
    'name': 'Jordan',
    'age': 20,
    'gender': 'male'
}
student2 = {
    'id': '20170202',
    'name': 'Mike',
    'age': 21,
    'gender': 'male'
}
result = collection.insert_many([student1, student2])
print(result)
print(result.inserted_ids)
The result is as follows:
<pymongo.results.InsertManyResult object at 0x101dea558>
[ObjectId('5932abf415c2607083d3b2ac'), ObjectId('5932abf415c2607083d3b2ad')]
This method returns an InsertManyResult, and its inserted_ids attribute gives the list of _id values of the inserted documents.
After inserting data, we can query it with find_one() or find(). find_one() returns a single result, while find() returns a Cursor object that can be iterated like a generator. For example:
result = collection.find_one({'name': 'Mike'})
print(type(result))
print(result)
Here we query the document whose name is Mike. The result is a dictionary:
<class 'dict'>
{'_id': ObjectId('5932a80115c2606a59e8a049'), 'id': '20170202', 'name': 'Mike', 'age': 21, 'gender': 'male'}
Note the extra _id attribute, which MongoDB added automatically during insertion.
We can also query by ObjectId, which requires ObjectId from the objectid module of the bson library:
from bson.objectid import ObjectId
result = collection.find_one({'_id': ObjectId('593278c115c2602667ec6bae')})
print(result)
The query result is still a dictionary:
{'_id': ObjectId('593278c115c2602667ec6bae'), 'id': '20170101', 'name': 'Jordan', 'age': 20, 'gender': 'male'}
If no matching document exists, find_one() returns None.
To query multiple documents, we use the find() method. For example, to find the documents whose age is 20:
results = collection.find({'age': 20})
print(results)
for result in results:
    print(result)
The result is as follows:
<pymongo.cursor.Cursor object at 0x1032d5128>
{'_id': ObjectId('593278c115c2602667ec6bae'), 'id': '20170101', 'name': 'Jordan', 'age': 20, 'gender': 'male'}
{'_id': ObjectId('593278c815c2602678bb2b8d'), 'id': '20170102', 'name': 'Kevin', 'age': 20, 'gender': 'male'}
{'_id': ObjectId('593278d815c260269d7645a8'), 'id': '20170103', 'name': 'Harden', 'age': 20, 'gender': 'male'}
The return value is a Cursor, which behaves like a generator: we need to iterate over it to get all the results, each of which is a dictionary.
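Because a Cursor behaves like a generator, it can be consumed only once: a second loop over the same object yields nothing. The sketch below shows this with a plain Python generator standing in for the Cursor. The name fake_find is hypothetical and the sample data is made up for illustration.

```python
def fake_find(documents, age):
    """Hypothetical stand-in for find(): lazily yields matching dictionaries."""
    for doc in documents:
        if doc.get('age') == age:
            yield doc


students = [
    {'name': 'Jordan', 'age': 20},
    {'name': 'Mike', 'age': 21},
    {'name': 'Kevin', 'age': 20},
]

results = fake_find(students, 20)
first_pass = [doc['name'] for doc in results]
second_pass = [doc['name'] for doc in results]  # generator is already exhausted

print(first_pass)
print(second_pass)
```

With a real Cursor, one option is to convert it to a list when the results need to be reused.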
To query the documents whose age is greater than 20, write:
results = collection.find({'age': {'$gt': 20}})
Here the value of the query condition is no longer a plain number but a dictionary, whose key is the comparison operator $gt, meaning "greater than", and whose value is 20.
The comparison operators are summarized in the following table.
| Symbol | Meaning | Example |
| --- | --- | --- |
| $lt | Less than | {'age': {'$lt': 20}} |
| $gt | Greater than | {'age': {'$gt': 20}} |
| $lte | Less than or equal to | {'age': {'$lte': 20}} |
| $gte | Greater than or equal to | {'age': {'$gte': 20}} |
| $ne | Not equal to | {'age': {'$ne': 20}} |
| $in | In the given list of values | {'age': {'$in': [20, 23]}} |
| $nin | Not in the given list of values | {'age': {'$nin': [20, 23]}} |
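The meaning of these operators can be sketched with a small in-memory matcher. This is an illustration of the semantics only, not how MongoDB or PyMongo evaluates queries; the names COMPARISONS and matches are hypothetical and the sample data is made up. It is also simplified: here a missing field never matches, whereas in MongoDB $ne and $nin also match documents that lack the field.

```python
import operator

# Python equivalents of the comparison operators in the table
COMPARISONS = {
    '$lt': operator.lt,
    '$gt': operator.gt,
    '$lte': operator.le,
    '$gte': operator.ge,
    '$ne': operator.ne,
    '$in': lambda value, choices: value in choices,
    '$nin': lambda value, choices: value not in choices,
}


def matches(document, query):
    """Return True if the document satisfies every condition in the query.

    Simplified: top-level fields only, and a missing field never matches.
    """
    for field, condition in query.items():
        if field not in document:
            return False
        value = document[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {'$gt': 20}
            if not all(COMPARISONS[op](value, arg)
                       for op, arg in condition.items()):
                return False
        elif value != condition:
            # Plain form, e.g. {'age': 20}
            return False
    return True


students = [
    {'name': 'Jordan', 'age': 20},
    {'name': 'Mike', 'age': 21},
    {'name': 'Mark', 'age': 23},
]

older = [s['name'] for s in students if matches(s, {'age': {'$gt': 20}})]
listed = [s['name'] for s in students if matches(s, {'age': {'$in': [20, 23]}})]
print(older)
print(listed)
```

Note that $in tests membership in the list, so [20, 23] matches ages 20 and 23 only, not the values in between.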
We can also query with regular expressions. For example, to find the students whose name starts with M:
results = collection.find({'name': {'$regex': '^M.*'}})
Here $regex specifies regular-expression matching, and ^M.* is a regular expression that matches strings starting with M.
Some other functional operators are summarized in the following table.
| Symbol | Meaning | Example | Example meaning |
| --- | --- | --- | --- |
| $regex | Match a regular expression | {'name': {'$regex': '^M.*'}} | name starts with M |
| $exists | Whether the attribute exists | {'name': {'$exists': True}} | The name attribute exists |
| $type | Type check | {'age': {'$type': 'int'}} | The type of age is int |
| $mod | Modulo operation | {'age': {'$mod': [5, 0]}} | age modulo 5 equals 0 |
| $text | Text query | {'$text': {'$search': 'Mike'}} | A text-type attribute contains the string Mike |
| $where | Advanced condition query | {'$where': 'obj.fans_count == obj.follows_count'} | The number of fans equals the number of followers |
More detailed usage of these operators can be found in the official MongoDB documentation: https://docs.mongodb.com/manual/reference/operator/query/.
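As a rough guide to what three of these operators test, the sketch below writes their plain-Python counterparts against made-up sample data. It only illustrates the meaning of each condition; MongoDB evaluates them on the server with its own regular expression engine, which may differ from Python's re module for advanced patterns.

```python
import re

students = [
    {'name': 'Mike', 'age': 20},
    {'name': 'Mark', 'age': 23},
    {'name': 'Kevin', 'age': 25},
    {'age': 30},  # no name field
]

# {'name': {'$regex': '^M.*'}}: the name matches the pattern
starts_with_m = [s for s in students
                 if 'name' in s and re.search('^M.*', s['name'])]

# {'name': {'$exists': True}}: the field is present
has_name = [s for s in students if 'name' in s]

# {'age': {'$mod': [5, 0]}}: age divided by 5 leaves remainder 0
age_multiple_of_5 = [s for s in students if s['age'] % 5 == 0]

print([s['name'] for s in starts_with_m])
print(len(has_name))
print([s['age'] for s in age_multiple_of_5])
```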
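To count the number of documents in the query results, call the count() method. For example, to count all documents:
count = collection.find().count()
print(count)
Or to count the documents that meet a certain condition:
count = collection.find({'age': 20}).count()
print(count)
The result is a number, namely the count of matching documents. Note that count() on a cursor is deprecated since PyMongo 3.7 and was removed in PyMongo 4.0; in newer versions, use collection.count_documents({'age': 20}) instead, or collection.count_documents({}) to count all documents.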
To sort, call the sort() method and pass in the field to sort by and a flag for ascending or descending order. For example:
results = collection.find().sort('name', pymongo.ASCENDING)
print([result['name'] for result in results])
The result is as follows:
['Harden', 'Jordan', 'Kevin', 'Mark', 'Mike']
Here pymongo.ASCENDING specifies ascending order. For descending order, pass in pymongo.DESCENDING.
In some cases we may want only part of the results. The skip() method offsets the results by a number of positions; for example, an offset of 2 skips the first two elements and returns the third and later ones:
results = collection.find().sort('name', pymongo.ASCENDING).skip(2)
print([result['name'] for result in results])
The result is as follows:
['Kevin', 'Mark', 'Mike']
We can also use the limit() method to specify how many results to take. For example:
results = collection.find().sort('name', pymongo.ASCENDING).skip(2).limit(2)
print([result['name'] for result in results])
The result is as follows:
['Kevin', 'Mark']
Without limit(), three results would have been returned; with the limit, only the first two are returned.
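The combined effect of sort(), skip(), and limit() is the same as sorting a list and then slicing it. The following sketch reproduces the three outputs above in memory, using the five names from the sorted result; the function name sort_skip_limit is hypothetical and covers ascending order only.

```python
names = ['Mike', 'Jordan', 'Mark', 'Kevin', 'Harden']


def sort_skip_limit(values, skip=0, limit=None):
    """In-memory analogue of .sort(...).skip(skip).limit(limit), ascending."""
    ordered = sorted(values)
    end = None if limit is None else skip + limit
    return ordered[skip:end]


print(sort_skip_limit(names))
print(sort_skip_limit(names, skip=2))
print(sort_skip_limit(names, skip=2, limit=2))
```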
Note that when the database holds a very large amount of data, such as tens of millions or hundreds of millions of documents, it is best not to query with large offsets, because this is likely to cause memory overflow. Instead, a query like the following can be used:
from bson.objectid import ObjectId
collection.find({'_id': {'$gt': ObjectId('593278c815c2602678bb2b8d')}})
In this case, we need to record the _id of the last document returned by the previous query.
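The idea of paging by the last seen _id rather than by offset can be sketched in memory as follows. This is a hypothetical stand-in, not PyMongo code: fetch_page is an invented name, integers stand in for ObjectId values, and the sketch assumes each query is sorted by _id and limited to the page size.

```python
def fetch_page(documents, last_id=None, page_size=2):
    """Return the next page of documents whose _id is greater than last_id.

    Hypothetical in-memory stand-in for a query with {'_id': {'$gt': last_id}},
    assumed to be sorted by _id and limited to page_size.
    """
    ordered = sorted(documents, key=lambda doc: doc['_id'])
    if last_id is not None:
        ordered = [doc for doc in ordered if doc['_id'] > last_id]
    return ordered[:page_size]


# Made-up sample data; integers stand in for ObjectId values
sample_names = ['Harden', 'Jordan', 'Kevin', 'Mark', 'Mike']
documents = [{'_id': i, 'name': name}
             for i, name in enumerate(sample_names, start=1)]

pages = []
last_id = None
while True:
    page = fetch_page(documents, last_id)
    if not page:
        break
    pages.append([doc['name'] for doc in page])
    last_id = page[-1]['_id']  # remember where this page ended

print(pages)
```

Each request starts right after the recorded _id, so no skipped documents have to be walked through first.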
To update data, we can use the update() method, specifying the update condition and the updated data. For example:
condition = {'name': 'Kevin'}
student = collection.find_one(condition)
student['age'] = 25
result = collection.update(condition, student)
print(result)
Here we want to update the age of the document whose name is Kevin: first specify the query condition, then fetch the document, change its age, and call update() with the original condition and the modified document.
The result is as follows:
{'ok': 1, 'nModified': 1, 'n': 1, 'updatedExisting': True}
The result is returned as a dictionary: ok indicates successful execution, and nModified is the number of documents affected.
We can also use the $set operator to update the data:
result = collection.update(condition, {'$set': student})
This updates only the fields present in the student dictionary. Any other fields the document already had are neither updated nor deleted. Without $set, the whole document is replaced by the student dictionary, and any other fields it had are deleted.
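The difference between updating with $set and replacing the whole document can be sketched with plain dictionaries. The function apply_update is hypothetical and models only these two cases; the stored document is made-up sample data with an extra gender field to show what is kept and what is lost.

```python
def apply_update(document, update):
    """Return the document that would result from a simplified update.

    Only two cases are modeled: {'$set': {...}} and a plain replacement dict.
    """
    if '$set' in update:
        # $set: change or add the listed fields, keep everything else
        return {**document, **update['$set']}
    # No operator: the whole document is replaced, except for _id
    return {'_id': document['_id'], **update}


stored = {'_id': 1, 'name': 'Kevin', 'age': 20, 'gender': 'male'}

with_set = apply_update(stored, {'$set': {'age': 25}})
replaced = apply_update(stored, {'name': 'Kevin', 'age': 25})

print(with_set)
print(replaced)
```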
In addition, the update() method is also not officially recommended, and it was removed in PyMongo 4.0. It is split into update_one() and update_many(), whose usage is stricter: the second parameter must use a $-prefixed operator as the key of the dictionary. For example:
condition = {'name': 'Kevin'}
student = collection.find_one(condition)
student['age'] = 26
result = collection.update_one(condition, {'$set': student})
print(result)
print(result.matched_count, result.modified_count)
Here we call update_one(). The second parameter can no longer be the modified dictionary itself; it must take the form {'$set': student}. The return value is of type UpdateResult, and its matched_count and modified_count attributes give the number of matched documents and the number of affected documents.
The result is as follows:
<pymongo.results.UpdateResult object at 0x10d17b678>
1 0
A modified_count of 0 means that the matched document already held the values being set, so nothing changed.
Let's look at another example:
condition = {'age': {'$gt': 20}}
result = collection.update_one(condition, {'$inc': {'age': 1}})
print(result)
print(result.matched_count, result.modified_count)
Here the query condition is age greater than 20, and the update operation is {'$inc': {'age': 1}}, that is, add 1 to age. After execution, the age of the first matching document is increased by 1.
The result is as follows:
<pymongo.results.UpdateResult object at 0x10b8874c8>
1 1
The number of matched documents is 1, and the number of affected documents is also 1.
If we call update_many() instead, all matching documents are updated. For example:
condition = {'age': {'$gt': 20}}
result = collection.update_many(condition, {'$inc': {'age': 1}})
print(result)
print(result.matched_count, result.modified_count)
This time the number of matches is no longer 1. The result is as follows:
<pymongo.results.UpdateResult object at 0x10c6384c8>
3 3
As you can see, all matched documents are updated this time.
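The contrast between update_one() and update_many() can be sketched in memory: the first touches only the first match, the second touches every match. The function inc_age is a hypothetical analogue, not PyMongo code, and the sample data is made up for illustration.

```python
def inc_age(documents, min_age, many=False):
    """Add 1 to age for documents with age > min_age.

    Hypothetical in-memory analogue of update_one() (many=False) and
    update_many() (many=True) with {'$inc': {'age': 1}}.
    Returns (matched_count, modified_count).
    """
    matched = [doc for doc in documents if doc['age'] > min_age]
    if not many:
        matched = matched[:1]  # update_one stops at the first match
    for doc in matched:
        doc['age'] += 1
    return len(matched), len(matched)


students = [
    {'name': 'Jordan', 'age': 20},
    {'name': 'Kevin', 'age': 26},
    {'name': 'Mark', 'age': 21},
    {'name': 'Mike', 'age': 21},
]

one_counts = inc_age(students, 20)
many_counts = inc_age(students, 20, many=True)
ages = [s['age'] for s in students]
print(one_counts, many_counts, ages)
```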
Deletion is relatively simple: call the remove() method with the deletion condition, and all documents that meet the condition are deleted. For example:
result = collection.remove({'name': 'Kevin'})
print(result)
The result is as follows:
{'ok': 1, 'n': 1}
There are also two newer, recommended methods, delete_one() and delete_many(); remove() itself was removed in PyMongo 4.0. For example:
result = collection.delete_one({'name': 'Kevin'})
print(result)
print(result.deleted_count)
result = collection.delete_many({'age': {'$lt': 25}})
print(result.deleted_count)
The result is as follows:
<pymongo.results.DeleteResult object at 0x10e6ba4c8>
1
4
delete_one() deletes the first matching document, and delete_many() deletes all matching documents. Both return a DeleteResult, whose deleted_count attribute gives the number of deleted documents.
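The same first-match versus all-matches distinction applies to deletion. The sketch below is a hypothetical in-memory analogue with made-up sample data; delete_matching is an invented name, and a Python predicate plays the role of the query filter.

```python
def delete_matching(documents, predicate, many=False):
    """Remove matching documents in place and return the number deleted.

    Hypothetical in-memory analogue of delete_one() (many=False) and
    delete_many() (many=True); predicate plays the role of the filter.
    """
    to_delete = [doc for doc in documents if predicate(doc)]
    if not many:
        to_delete = to_delete[:1]  # delete_one removes only the first match
    for doc in to_delete:
        documents.remove(doc)
    return len(to_delete)


students = [
    {'name': 'Kevin', 'age': 26},
    {'name': 'Jordan', 'age': 20},
    {'name': 'Mark', 'age': 21},
    {'name': 'Mike', 'age': 30},
]

deleted_one = delete_matching(students, lambda d: d['name'] == 'Kevin')
deleted_many = delete_matching(students, lambda d: d['age'] < 25, many=True)
remaining = [s['name'] for s in students]
print(deleted_one, deleted_many, remaining)
```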
PyMongo also provides some combined methods, such as find_one_and_delete(), find_one_and_replace(), and find_one_and_update(), which find a document and then delete, replace, or update it. Their usage is basically the same as that of the methods above.
We can also operate on indexes, with methods such as create_index(), create_indexes(), and drop_index().
For detailed usage of PyMongo, see the official documentation: http://api.mongodb.com/python/current/api/pymongo/collection.html.
There are also operations on the database and the collection themselves, which are not covered here; see the official documentation: http://api.mongodb.com/python/current/api/pymongo/.
This section explained how to use PyMongo to insert, delete, update, and query data in MongoDB.
This article first appeared on Cui Qingcai's personal blog, Jingmi: Python 3 Web Crawler Development in Practice.