Robin's Blog

Using SQLAlchemy to access MySQL without frustrating library installation issues

This is more a ‘note to myself’ than anything else, but I expect some other people might find it useful.

I’ve often struggled with accessing MySQL from Python, as the ‘default’ MySQL library for Python is MySQLdb. This library has a number of problems: 1) it is Python 2 only, and 2) it requires compiling against the MySQL C library and header files, and so can’t be simply installed using pip.

There is a Python 3 version of MySQLdb called mysqlclient, but this also requires compiling against the MySQL libraries and header files, so can be complicated to install.

The best library I’ve found as a replacement is PyMySQL which is a pure Python library (so no need to install MySQL libraries and header files). It’s API is basically exactly the same as MySQLdb, so it’s easy to switch across.

Right, that’s the introduction – and we’re really at the actual point of this post, which is how to go about using the PyMySQL library ‘under the hood’ when you’re accessing databases through SQLAlchemy.

The weird thing is that I’m not actually using SQLAlchemy by choice in my code – but it is used by pandas to convert between SQL and data frames.

For example, you can write code like this:

from sqlalchemy import create_engine
eng = create_engine('mysql://user:[email protected]/database')
df.to_sql('table', eng, if_exists='append', index=False)

which will append the data in df to a table in a database running on the local machine.

The create_engine call is a SQLAlchemy function which creates an engine to handle all of the complex communication to and from a specific database.

Now, when you specify a database connection string with the mysql:// prefix, SQLAlchemy tries to use the MySQLdb library to do the underlying communication with the MySQL database – and fails if it can’t be found.

So, now we’re at the actual solution: which is that you can give SQLAlchemy a ‘dialect’ to use to connect to a database – and this can be used to change the underlying library that is used to talk to the database.

So, you can change your connection string to mysql+pymysql://user:[email protected]/database and it will use the PyMySQL library. It’s as simple as that!

There are other dialects that you can use to connect to MySQL using different underlying libraries – although these aren’t recommended by the authors of SQLAlchemy. You can find a list of them here.

_I do data science work – including processing data in MySQL databases – as part of my freelance work. Please contact me for more details._


Categorised as: Programming, Python


Leave a Reply

Your email address will not be published. Required fields are marked *