import polars as pl
from pathlib import Path
import sqlite3
print(pl.__version__)1.6.0
9.1 Reading data from SQL databases
So far we’ve only talked about reading data from CSV files. That’s a pretty common way to store data, but there are many others! Polars has a number of I/O methods at its disposal (see the documentation for a full list of options). In this chapter we’ll talk about reading data from SQL databases.
You can read data from a SQL database using the pl.read_database function. read_database will automatically convert SQL column names to DataFrame column names.
read_database takes 2 arguments: a query statement and a connection URI. This is great because it means you can read from any kind of SQL database – it doesn’t matter if it’s MySQL, SQLite, PostgreSQL, or something else.
This example reads from a SQLite database, but any other database would work the same way.
read_db_path = Path('../data/weather_2012.sqlite').absolute()
read_uri = f"sqlite:////{read_db_path}"
df = pl.read_database_uri("SELECT * from weather_2012 LIMIT 3", read_uri)
df| id | date_time | temp |
|---|---|---|
| i64 | datetime[ns] | f64 |
| 1 | 2012-01-01 00:00:00 | -1.8 |
| 2 | 2012-01-01 01:00:00 | -1.8 |
| 3 | 2012-01-01 02:00:00 | -1.8 |
9.2 Writing to a SQLite database
Polars has a write_database function which creates a database table from a dataframe. Let’s use it to move our 2012 weather data into SQL.
weather_df = pl.read_csv('../data/weather_2012.csv')
write_db_path = Path('../data/test_db.sqlite').absolute()
write_uri = f"sqlite:////{write_db_path}"
con = sqlite3.connect(write_db_path)
con.execute("DROP TABLE IF EXISTS weather_2012")
weather_df.write_database("weather_2012", write_uri)8784
We can now read from the weather_2012 table in test_db.sqlite, and we see that we get the same data back:
df = pl.read_database_uri("SELECT * from weather_2012 LIMIT 3", write_uri)
df| Date/Time | Temp (C) | Dew Point Temp (C) | Rel Hum (%) | Wind Spd (km/h) | Visibility (km) | Stn Press (kPa) | Weather |
|---|---|---|---|---|---|---|---|
| str | f64 | f64 | i64 | i64 | f64 | f64 | str |
| "2012-01-01 00:00:00" | -1.8 | -3.9 | 86 | 4 | 8.0 | 101.24 | "Fog" |
| "2012-01-01 01:00:00" | -1.8 | -3.7 | 87 | 4 | 8.0 | 101.24 | "Fog" |
| "2012-01-01 02:00:00" | -1.8 | -3.4 | 89 | 7 | 4.0 | 101.26 | "Freezing Drizzle,Fog" |
The nice thing about having your data in a database is that you can do arbitrary SQL queries. This is cool especially if you’re more familiar with SQL. Here’s an example of sorting by the Weather column:
df = pl.read_database_uri("SELECT * from weather_2012 ORDER BY Weather LIMIT 3", write_uri)
df| Date/Time | Temp (C) | Dew Point Temp (C) | Rel Hum (%) | Wind Spd (km/h) | Visibility (km) | Stn Press (kPa) | Weather |
|---|---|---|---|---|---|---|---|
| str | f64 | f64 | i64 | i64 | f64 | f64 | str |
| "2012-01-03 19:00:00" | -16.9 | -24.8 | 50 | 24 | 25.0 | 101.74 | "Clear" |
| "2012-01-05 18:00:00" | -7.1 | -14.4 | 56 | 11 | 25.0 | 100.71 | "Clear" |
| "2012-01-05 19:00:00" | -9.2 | -15.4 | 61 | 7 | 25.0 | 100.8 | "Clear" |
If you have a PostgreSQL database or MySQL database, reading from it works exactly the same way as reading from a SQLite database.
9.3 Connecting to other kinds of database
To connect to a MySQL database:
Note: For these to work, you will need a working MySQL / PostgreSQL database, with the correct localhost, database name, etc. pl.read_database_uri(“select * from MY_TABLE”, “mysql://username:password@server:port/database”) To connect to a PostgreSQL database: pl.read_database_uri(“select * from MY_TABLE”, “postgresql://username:password@server:port/database”)