程序師世界是廣大編程愛好者互助、分享、學習的平台,程序師世界有你更精彩!
首頁
編程語言
C語言|JAVA編程
Python編程
網頁編程
ASP編程|PHP編程
JSP編程
數據庫知識
MYSQL數據庫|SqlServer數據庫
Oracle數據庫|DB2數據庫
您现在的位置: 程式師世界 >> 編程語言 >  >> 更多編程語言 >> Python

Python reads tens of millions of data and automatically writes it to MySQL database

編輯:Python

Python Read data is automatically written MySQL database , This requirement is very common in the work , Mainly involves python Operating the database , Read / write updates, etc , The database may be mongodb、 es, Their treatment ideas are similar , You only need to change the syntax of operating the database . This article will give you a systematic share of how to write tens of millions of data into mysql, It is divided into two scenarios , Two ways .

Scene one : Data does not need to be written frequently mysql

Use navicat The Import Wizard function of the tool . Support multiple file formats , You can automatically create a table according to the file fields , You can also insert data into an existing table , It's very fast and convenient .

Scene two : Data is incremental , Need to automate and write frequently mysql

Test data :csv Format , about 1200 Line ten thousand

import pandas as pd
data = pd.read_csv('./tianchi_mobile_recommend_train_user.csv')
data.shape

Print the results

Mode one :

  • python pymysql library

  • install pymysql command

pip install pymysql

Code implementation

import pymysql
#  Database connection information
conn = pymysql.connect(
       host='127.0.0.1',
       user='root',
       passwd='wangyuqing',
       db='test01', 
       port = 3306,
       charset="utf8")
#  Block processing
big_size = 100000
#  Block traversal writes to  mysql    
with pd.read_csv('./tianchi_mobile_recommend_train_user.csv',chunksize=big_size) as reader:
    for df in reader:
        datas = []
        print(' Handle :',len(df))
#         print(df)
        for i ,j in df.iterrows():
            data = (j['user_id'],j['item_id'],j['behavior_type'],
                    j['item_category'],j['time'])
            datas.append(data)
        _values = ",".join(['%s', ] * 5)
        sql = """insert into users(user_id,item_id,behavior_type
        ,item_category,time) values(%s)""" % _values
        cursor = conn.cursor()
        cursor.executemany(sql,datas)
        conn.commit()
 #  Close the service       
conn.close()
cursor.close()
print(' Deposit successful !')

Mode two :

  • pandas sqlalchemy:pandas Need to introduce sqlalchemy To support sql, stay sqlalchemy With the support of , It can realize the query of all common database types 、 Update and other operations .

Code implementation

from sqlalchemy import create_engine
engine = create_engine('mysql+pymysql://root:[email protected]:3306/test01')
data = pd.read_csv('./tianchi_mobile_recommend_train_user.csv')
data.to_sql('user02',engine,chunksize=100000,index=None)
print(' Deposit successful !')

python Study

If you want to learn Python, But we can't find the learning path and resources
Welcome to join the new exchange 【 Junyang 】:905229245
Discuss programming knowledge together , Become a great God , There are also software installation packages in the group , Practical cases 、 Learning materials

summary

pymysql The method takes 12 branch 47 second , It takes a long time , A lot of code , and pandas It takes only five lines of code to implement this requirement , It only took 4 About minutes .

Finally add , Method 1 needs to create a table in advance , Mode 2 does not require .

So I recommend you to use the second method , It is convenient and efficient . If you still feel that your speed is slow , Consider joining multiple processes 、 Multithreading .

The most complete three store data in MySQL Database method :

  • Direct deposit , utilize navicat Import Wizard function of

  • Python pymysql

  • Pandas sqlalchemy


  1. 上一篇文章:
  2. 下一篇文章:
Copyright © 程式師世界 All Rights Reserved