Mysql insert is so slow in my Python program.
When we try to insert huge data to MySQL table with using Python code, sometimes it takes long time.
If possible, we want to insert bulk data in short time.
So today I will introduce about "How to insert bulk data into MySQL with Python".
How to insert bulk data into MySQL with Python
In order to insert bulk data into MySQL with Python, we can use pymysql
module.
And we can use executemany
method instead of execute
method.
So check it with test code.
Test with pymysql
So let's try with pymysql
module.
First, you have to install pymysql
module.
pip install pymysql
Then prepare table for insert.
CREATE TABLE hoge (
id int(11) NOT NULL AUTO_INCREMENT,
name varchar(50) NOT NULL,
value double NOT NULL,
text text NOT NULL,
PRIMARY KEY (id)
) ENGINE=InnoDB AUTO_INCREMENT=98289 DEFAULT CHARSET=latin1
Next, try to insert 100,000 records by execute
method.
import pymysql
import datetime
print("Connect to DB")
conn = pymysql.connect(user="root", password="", host="localhost", database="test")
# Insert one by one
def insert_data(values):
insert_sql = "INSERT INTO test.hoge (name, value, text) values (%s,%s,%s)"
cur = conn.cursor()
for value in values:
cur.execute(insert_sql, value)
def main():
# Generate data
values = []
print("Generate data")
for i in range(100000):
name = "name_{}".format(i)
value = i
text = "text_{}".format(i)
values.append([name,value,text])
print("Length of data: {}".format(len(values)))
print()
# Insert one by one
print("Insert data")
start_time = datetime.datetime.now()
print("Start:" + str(start_time))
insert_data(values)
end_time = datetime.datetime.now()
print("End:" + str(end_time))
diff_time = end_time - start_time
print("Diff:" + str(diff_time))
print()
if __name__ == "__main__":
main()
In my environment, it takes 13 seconds.
Next, try to insert 100,000 records by executemany
method.
import pymysql
import datetime
print("Connect to DB")
conn = pymysql.connect(user="root", password="", host="localhost", database="test")
# Insert one by one
def insert_data(values):
insert_sql = "INSERT INTO test.hoge (name, value, text) values (%s,%s,%s)"
cur = conn.cursor()
for value in values:
cur.execute(insert_sql, value)
def main():
# Generate data
values = []
print("Generate data")
for i in range(100000):
name = "name_{}".format(i)
value = i
text = "text_{}".format(i)
values.append([name,value,text])
print("Length of data: {}".format(len(values)))
print()
# Insert one by one
print("Insert data")
start_time = datetime.datetime.now()
print("Start:" + str(start_time))
insert_data(values)
end_time = datetime.datetime.now()
print("End:" + str(end_time))
diff_time = end_time - start_time
print("Diff:" + str(diff_time))
print()
if __name__ == "__main__":
main()
Now it takes 2 seconds.
The difference is executemany
in insert_data_bulk
function.
It is about 5 times faster than one-by-one-insert.
With using 1,000,000 records, One-by-one-insert takes 137 seconds to insert them.
And Bulk-insert takes 26 seconds.
So it is also 5 times faster than one-by-one-insert.
According to this test, we can understand that executemany
is faster than execute
.
If you handle bulk data, you should use executemany
method.
Conclusion
Today I explained about "How to insert bulk data into MySQL with Python".
You can follow important point below.
In order to insert bulk data into MySQL with Python, try to use executemany
of pymysql
module.
Use bulk insert and save your time.