Generate Schema and Import CSV into SQL Databases from the Terminal

The Data Scientist and several professionals in the same space work with data from several sources. It does not take long for one to face the challenge of importing data from a CSV file into an SQL database. I am a great fan of the terminal so I want to show you how to do it from the terminal.

Prerequisites

I am going to use a Python Package for this. I suggest that you work in a virtual environment. The Real Python’s article on virtual environments is a good resource if need be. I assume you are using Python3.
1. Install needed packages.

PyMySQL: This is a pure Python MySQL driver which helps us to connect to a MySQL database using Python.
psycopg2: This the most popular Postgresql database adapter for Python.
csvkit: It is a suite of command-line tools for converting to and working with CSV. It provides the csvsql module we are going to use to import the data from our CSV file.

* Download the sample CSV file,customer_info.csv.
For the sake of this tutorial, I have uploaded a CSV file which contains some customer information you can use to test this tool I am introducing. Click this link to download.

Create a new MySQL or Postgresql database. I named mine test. The data will be imported into a table I have named customer_info.

Importing Your Data

The Quick Approach.

This approach automatically generates the Schema from the CSV data, creates the table and inserts the data into the table.
db_user: This is the database user with access to the test database you created.
password: This is the password of the user.

For MySQL

For Postgresql

The Lengthier (and Safer) Approach.

  • Generate the query for creating the customer_info table from your terminal.

For MySQL

Expected Result:

The result above is the query we are going to use to create the customer_info table in the test database. You can edit the generated schema to suit your need. I will leave it as it is in this tutorial.

For Postgresql

Expected Result:

NOTE THE DIFFERENCE IN HOW EACH DATABASE EXPECTS THE QUERY TO BE. MySQL is wrapped in single quotes but Postgresql is wrapped in double-quotes.

Import your data into the table.

For MySQL

In Postgresql

SUMMARY

We just learnt how to import data from a CSV file into tables in our MySQL and Postgresql databases with the csvkit package. You can take the direct approach which automatically creates your table and inserts the data. The longer and in my opinion safer approach is to generate the query that creates the table, inspect and edit the generated schema and import your data into the table.

CSVKIT is a powerful tool every Data Scientist should have in his or her toolbox.

Originally published at https://dev.to on March 7, 2020.

Building Products and Services, and Sharing Knowledge.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store