Import CSV file into MySQL

0
5390
Import CSV file into MySQL

Since I’m working on a lot of CSV files, I write this article to share with you how to import CSV file into MySQL database.

Import CSV file into MySQL

As you might not know, MySQL has supports for handling CSV files natively, which means you can export and import database tables into/from a CSV-format file.

Because MariaDB is a fork from MySQL, so everything in this article will work with both MySQL and MariaDB. However, for reading convenience, I only use the term “MySQL“.

Take some time to read about CSV format if you don’t know it.

Okay, get back to our task, that is, to import CSV file into MySQL, there are basically three methods.

  1. Using SQL script.
  2. Using the package tool, mysqlimport.
  3. Write the code.

Hmmm…the 3rd method looks out of scope for this article, because you can use any programming language to write code to parse CSV file and import into MySQL database. It should be an easy task for any developer.

I will talk about first two methods.

Following is the CSV file that I use in this article.

title,author,created_at
"Backup and restore MySQLDB","Pete Houston","2018-08-27 12:15:32"
"Quick guide to SQLite3","Pete Houston","2018-09-12 22:35:56"
"Migrate to MongoDB","Pete Houston","2018-10-01 03:48:18"

The table to get imported from CSV file has following structure:

CREATE TABLE `articles` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `title` varchar(256) NOT NULL,
  `author` varchar(64) DEFAULT NULL,
  `created_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
  `updated_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=19 DEFAULT CHARSET=utf8mb4

Ready to get started!

1. Using SQL script

This is done by making use of LOAD DATA syntax.

Don’t get confused about the complicated syntax definition. Take a look at following example:

LOAD DATA INFILE '/home/petehouston/data/articles.csv'
	 INTO TABLE articles
	 FIELDS TERMINATED BY ','
	 ENCLOSED BY '"'
 	 LINES TERMINATED BY '\n'
 IGNORE 1 ROWS
 (title,author,@created_at)
 SET created_at = STR_TO_DATE(@created_at, '%Y-%m-%d %H:%i:%s')
;

You need to pass CSV absolute file path right after INFILE, and the name of table to get imported after INTO TABLE.

The next three lines:

  • FIELDS TERMINATED BY: to tell SQL the character used for field separation. If you use tab  \t , then change it.
  • ENCLOSED BY: to denote that value of field being put inside this pair of character. Especially, when you have string-type fields.
  • LINES TERMINATED BY: to specify when row is complete.

If there are headers, the first line in CSV file, then we should ignore it, and that is done through IGNORE 1 ROWS. On the other hand, if there is no header, you don’t need to add that query part.

Next part of the query is the list of column names to import into table. If there is a sign @ before column name, it means the value of field is saved into that variable, and it’s a variable.

By using variable, we can transform CSV field data right before inserting into table. So the next line:  SET created_at = STR_TO_DATE(@created_at, '%Y-%m-%d %H:%i:%s')is to convert time string into Date object and assign to created_at field in table.

Phew…a lot of words just to explain a simple query!

2. Using mysqlimport

The utility is created for one purpose, to handle import external data into MySQL.

Again, you will get confused by looking at its documentation, too many options, and which are what, you ask?

Try this one:

mysqlimport -h localhost -P3306 -u root -p testdb --ignore-lines=1 --lines-terminated-by="\n" --fields-terminated-by="," --fields-enclosed-by="\""  -c title,author,created_at "/home/petehouston/data/articles.csv"

Let’s break it down.

  • -h SERVER_IP: MySQL server
  • -P3306: port number. Default to 3306, replace it with your server configuration.
  • -u DB_USER: user to login. Make sure this user has FILE privileges.
  • -p DB_PASS: user password.
  • testdb: replace it with the database you want to import CSV file into.
  • --ignore-lines=N: ignore first N lines in CSV file. Practically, as I explained previously, this is useful to ignore header line.
  • --lines-terminated-by: same as LINES TERMINATED BY.
  • --fields-terminated-by: same as FIELDS TERMINATED BY.
  • --fields-enclosed-by: same as FIELDS ENCLOSED BY.
  • -c: list of columns to import.
  • CSV file path at last position.
NOTE:

  • Use double quote character " to wrap option value. I mean --lines-terminated-by="\n", not --lines-terminated-by='\n'.
  • If CSV file has line ending in Windows style, then you should use this --lines-terminated-by="\r\n".
  • Don’t copy paste command into terminal, type it manually to avoid Unicode encoding.
  • On command failure, see result table (output immediate when mysqlimport fails) and verify if your input values are all correct.

Conclusion

To tell you honestly, I don’t like the first two methods though I write about them in this article. I prefer the last method, my favorite.