How to insert bulk data into MySQL with Python

Mysql insert is so slow in my Python program.

When we try to insert huge data to MySQL table with using Python code, sometimes it takes long time.

If possible, we want to insert bulk data in short time.

So today I will introduce about "How to insert bulk data into MySQL with Python".

目次

How to insert bulk data into MySQL with Python

In order to insert bulk data into MySQL with Python, we can use pymysql module.

And we can use  executemany method instead of execute method.

pymysql bulk insert

So check it with test code.

Test with pymysql

So let's try with pymysql module.

STEP
Install pymysql

First, you have to install pymysql module.

pip install pymysql
STEP
Prepare table for insert

Then prepare table for insert.

CREATE TABLE hoge (
  id int(11) NOT NULL AUTO_INCREMENT,
  name varchar(50) NOT NULL,
  value double NOT NULL,
  text text NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB AUTO_INCREMENT=98289 DEFAULT CHARSET=latin1
STEP
Insert one by one

Next, try to insert 100,000 records by execute method.

import pymysql
import datetime

print("Connect to DB")
conn = pymysql.connect(user="root", password="", host="localhost", database="test")

# Insert one by one
def insert_data(values):
    insert_sql = "INSERT INTO test.hoge (name, value, text) values (%s,%s,%s)"

    cur = conn.cursor()
    for value in values:
        cur.execute(insert_sql, value)

def main():
    # Generate data
    values = []
    print("Generate data")
    for i in range(100000):
        name = "name_{}".format(i)
        value = i
        text = "text_{}".format(i)
        values.append([name,value,text])
    print("Length of data: {}".format(len(values)))
    print()

    # Insert one by one
    print("Insert data")
    start_time = datetime.datetime.now()
    print("Start:" + str(start_time))
    insert_data(values)
    end_time = datetime.datetime.now()
    print("End:" + str(end_time))
    diff_time = end_time - start_time
    print("Diff:" + str(diff_time))
    print()

if __name__ == "__main__":
    main()

In my environment, it takes 13 seconds.

STEP
Bulk Insert

Next, try to insert 100,000 records by executemanymethod.

import pymysql
import datetime

print("Connect to DB")
conn = pymysql.connect(user="root", password="", host="localhost", database="test")

# Insert one by one
def insert_data(values):
    insert_sql = "INSERT INTO test.hoge (name, value, text) values (%s,%s,%s)"

    cur = conn.cursor()
    for value in values:
        cur.execute(insert_sql, value)

def main():
    # Generate data
    values = []
    print("Generate data")
    for i in range(100000):
        name = "name_{}".format(i)
        value = i
        text = "text_{}".format(i)
        values.append([name,value,text])
    print("Length of data: {}".format(len(values)))
    print()

    # Insert one by one
    print("Insert data")
    start_time = datetime.datetime.now()
    print("Start:" + str(start_time))
    insert_data(values)
    end_time = datetime.datetime.now()
    print("End:" + str(end_time))
    diff_time = end_time - start_time
    print("Diff:" + str(diff_time))
    print()

if __name__ == "__main__":
    main()

Now it takes 2 seconds.
The difference is executemany in insert_data_bulk function.
It is about 5 times faster than one-by-one-insert.

With using 1,000,000 records, One-by-one-insert takes 137 seconds to insert them.
And Bulk-insert takes 26 seconds.
So it is also 5 times faster than one-by-one-insert.

According to this test, we can understand that executemany is faster than execute .

If you handle bulk data, you should use executemany method.

Conclusion

Today I explained about "How to insert bulk data into MySQL with Python".

You can follow important point below.

In order to insert bulk data into MySQL with Python, try to use executemany of pymysql module.

Use bulk insert and save your time.

この記事が気に入ったら
いいね または フォローしてね!

If you like this article, please share !
  • URLをコピーしました!
  • URLをコピーしました!

Author

karasanのアバター karasan System engineer

Mid-career engineer (AI, Data science, Salesforce, etc.).
Good at Python and SQL.

目次