Since I’m working on a lot of CSV files, I write this article to share with you how to import CSV file into MySQL database.
Import CSV file into MySQL
As you might not know, MySQL has supports for handling CSV files natively, which means you can export and import database tables into/from a CSV-format file.
Because MariaDB is a fork from MySQL, so everything in this article will work with both MySQL and MariaDB. However, for reading convenience, I only use the term “MySQL“.
Take some time to read about CSV format if you don’t know it.
Okay, get back to our task, that is, to import CSV file into MySQL, there are basically three methods.
- Using SQL script.
- Using the package tool,
- Write the code.
Hmmm…the 3rd method looks out of scope for this article, because you can use any programming language to write code to parse CSV file and import into MySQL database. It should be an easy task for any developer.
I will talk about first two methods.
Following is the CSV file that I use in this article.
title,author,created_at "Backup and restore MySQLDB","Pete Houston","2018-08-27 12:15:32" "Quick guide to SQLite3","Pete Houston","2018-09-12 22:35:56" "Migrate to MongoDB","Pete Houston","2018-10-01 03:48:18"
The table to get imported from CSV file has following structure:
CREATE TABLE `articles` ( `id` int(11) NOT NULL AUTO_INCREMENT, `title` varchar(256) NOT NULL, `author` varchar(64) DEFAULT NULL, `created_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP, `updated_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=19 DEFAULT CHARSET=utf8mb4
Ready to get started!
1. Using SQL script
This is done by making use of LOAD DATA syntax.
Don’t get confused about the complicated syntax definition. Take a look at following example:
LOAD DATA INFILE '/home/petehouston/data/articles.csv' INTO TABLE articles FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n' IGNORE 1 ROWS (title,author,@created_at) SET created_at = STR_TO_DATE(@created_at, '%Y-%m-%d %H:%i:%s') ;
You need to pass CSV absolute file path right after
INFILE, and the name of table to get imported after
The next three lines:
FIELDS TERMINATED BY: to tell SQL the character used for field separation. If you use tab
\t, then change it.
ENCLOSED BY: to denote that value of field being put inside this pair of character. Especially, when you have string-type fields.
LINES TERMINATED BY: to specify when row is complete.
If there are headers, the first line in CSV file, then we should ignore it, and that is done through
IGNORE 1 ROWS. On the other hand, if there is no header, you don’t need to add that query part.
Next part of the query is the list of column names to import into table. If there is a sign
@ before column name, it means the value of field is saved into that variable, and it’s a variable.
By using variable, we can transform CSV field data right before inserting into table. So the next line:
SET created_at = STR_TO_DATE(@created_at, '%Y-%m-%d %H:%i:%s')is to convert time string into Date object and assign to
created_at field in table.
Phew…a lot of words just to explain a simple query!
The utility is created for one purpose, to handle import external data into MySQL.
Again, you will get confused by looking at its documentation, too many options, and which are what, you ask?
Try this one:
mysqlimport -h localhost -P3306 -u root -p testdb --ignore-lines=1 --lines-terminated-by="\n" --fields-terminated-by="," --fields-enclosed-by="\"" -c title,author,created_at "/home/petehouston/data/articles.csv"
Let’s break it down.
-h SERVER_IP: MySQL server
-P3306: port number. Default to 3306, replace it with your server configuration.
-u DB_USER: user to login. Make sure this user has
-p DB_PASS: user password.
testdb: replace it with the database you want to import CSV file into.
--ignore-lines=N: ignore first N lines in CSV file. Practically, as I explained previously, this is useful to ignore header line.
--lines-terminated-by: same as
LINES TERMINATED BY.
--fields-terminated-by: same as
FIELDS TERMINATED BY.
--fields-enclosed-by: same as
FIELDS ENCLOSED BY.
-c: list of columns to import.
- CSV file path at last position.
To tell you honestly, I don’t like the first two methods though I write about them in this article. I prefer the last method, my favorite.