Python & SQL (with SQLite/MySQL): A Complete Guide for Data Engineers

In the world of data engineering, combining Python with SQL is one of the most powerful skill sets you can have. Whether you’re extracting data from a database, transforming it into a usable format, or loading it into another system (the classic ETL process), knowing how to connect Python with databases like SQLite or MySQL is essential.

In this guide, you’ll learn everything you need to know about integrating Python with SQL databases. We’ll explore:

  • The basics of SQL and relational databases

  • How to use SQLite for lightweight, file-based databases

  • Connecting Python to MySQL for more scalable database solutions

  • Performing read/write operations using Python

  • Using popular libraries like sqlite3, mysql-connector-python, and SQLAlchemy

  • Best practices and code examples

By the end, you’ll be able to set up a SQL-backed Python project, write queries, automate database workflows, and build solid foundations for any data engineering project.


Why Use SQL with Python?

SQL is the language of databases. Python is a general-purpose scripting language with powerful capabilities. Combining the two enables:

  • Data extraction from relational sources

  • Complex query handling

  • Dynamic report generation

  • Data transformation pipelines

  • Seamless integration with data science and ML workflows

SQL handles structured data beautifully, and Python adds programmability, automation, and integration.


Getting Started with SQLite in Python
What is SQLite?

SQLite is a self-contained, file-based database engine. It requires no server installation and is excellent for small projects, rapid prototyping, and applications that don’t need multi-user concurrency.

Why Use SQLite?
  • Lightweight and simple

  • No need for separate DB server

  • Great for local development and testing

  • Built-in Python support

1. Connect to SQLite Database
import sqlite3

# Connect to a local file-based database
conn = sqlite3.connect('example.db')
cursor = conn.cursor()
2. Create a Table
cursor.execute('''CREATE TABLE IF NOT EXISTS users (
                    id INTEGER PRIMARY KEY,
                    name TEXT,
                    email TEXT)''')
3. Insert Data
cursor.execute("INSERT INTO users (name, email) VALUES (?, ?)", ('Alice', 'alice@example.com'))
conn.commit()
4. Query Data
cursor.execute("SELECT * FROM users")
for row in cursor.fetchall():
    print(row)
5. Close the Connection
conn.close()

✅ Tip: Using the connection as a context manager (with conn:) automatically commits the transaction on success and rolls it back on an exception. Note that it does not close the connection, so you still need conn.close(), or you can wrap the connection in contextlib.closing().
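The steps above can be condensed into a context-managed sketch. Since sqlite3's connection context manager handles transactions rather than closing, contextlib.closing is used to guarantee the close:

```python
import sqlite3
from contextlib import closing

# closing() guarantees conn.close() on exit; the inner `with conn:`
# only commits on success or rolls back on an exception -- it does
# not close the connection.
with closing(sqlite3.connect('example.db')) as conn:
    conn.execute('''CREATE TABLE IF NOT EXISTS users (
                        id INTEGER PRIMARY KEY,
                        name TEXT,
                        email TEXT)''')
    with conn:  # one transaction: committed if the block succeeds
        conn.execute(
            "INSERT INTO users (name, email) VALUES (?, ?)",
            ('Carol', 'carol@example.com'),
        )
```

This pattern avoids forgotten commit() and close() calls, which are a common source of locked or half-written SQLite files.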


Using Python with MySQL
What is MySQL?

MySQL is an open-source, full-featured relational database management system (RDBMS). It’s widely used in enterprise applications, websites, and scalable cloud infrastructure.

Why MySQL?
  • Supports large-scale applications

  • Multi-user concurrency

  • Can be hosted locally or in the cloud (e.g., AWS RDS)

1. Install MySQL Connector

Install the MySQL client for Python:

pip install mysql-connector-python
2. Connect to a MySQL Database
import mysql.connector

conn = mysql.connector.connect(
    host="localhost",
    user="your_user",
    password="your_password",
    database="your_db"
)
cursor = conn.cursor()
3. Create Table & Insert Data
cursor.execute("""
CREATE TABLE IF NOT EXISTS employees (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100),
    department VARCHAR(50)
)
""")

cursor.execute("INSERT INTO employees (name, department) VALUES (%s, %s)", ("Bob", "Engineering"))
conn.commit()
4. Querying Data
cursor.execute("SELECT * FROM employees")
rows = cursor.fetchall()
for row in rows:
    print(row)
5. Handling Exceptions
try:
    conn = mysql.connector.connect(...)
except mysql.connector.Error as err:
    print(f"Error: {err}")

Use Case: Building a Python ETL Pipeline with SQL

Let’s say you have:

  • Source data in CSV files

  • Need to transform and load into a MySQL or SQLite database

Example:

import pandas as pd
import sqlite3

# Load data from CSV
df = pd.read_csv('data.csv')

# Connect to SQLite
conn = sqlite3.connect('etl.db')

# Write to table
df.to_sql('sales', conn, if_exists='replace', index=False)
conn.close()
print("Data loaded into SQLite successfully")

You can schedule this script with cron (Linux) or Task Scheduler (Windows), or adapt it to run as an AWS Lambda function.
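The script above only loads; a real pipeline usually transforms first. A minimal sketch of a transform-then-load step, using an inline sample frame in place of data.csv (the column names product, quantity, and unit_price are illustrative assumptions):

```python
import pandas as pd
import sqlite3

# Sample rows standing in for data.csv; the column names here
# (product, quantity, unit_price) are illustrative assumptions.
df = pd.DataFrame({
    'product': ['widget', 'gadget'],
    'quantity': [3, 5],
    'unit_price': [2.50, 4.00],
})

# Transform: derive a revenue column before loading
df['revenue'] = df['quantity'] * df['unit_price']

# Load the transformed frame into SQLite
conn = sqlite3.connect('etl.db')
df.to_sql('sales', conn, if_exists='replace', index=False)
conn.close()
```

Keeping the transform in Pandas and the storage in SQL is a common division of labor: Pandas for row-level logic, the database for durable, queryable output.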


SQLAlchemy: An Abstraction Layer

SQLAlchemy lets you run the same Python code against SQLite, MySQL, PostgreSQL, and more; typically only the database URL passed to create_engine() changes.

Install SQLAlchemy:

pip install sqlalchemy
Example: Connecting to MySQL with SQLAlchemy
from sqlalchemy import create_engine
import pandas as pd

engine = create_engine('mysql+mysqlconnector://user:password@localhost/mydb')
df = pd.read_sql("SELECT * FROM employees", engine)
print(df.head())
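The portability point is easiest to see by swapping the URL for SQLite: the code path is the same as the MySQL example above, with no server required. A self-contained sketch (table contents are illustrative):

```python
from sqlalchemy import create_engine, text
import pandas as pd

# Same code as the MySQL example -- only the URL differs.
engine = create_engine('sqlite:///portable.db')

with engine.begin() as conn:  # begin() commits automatically on success
    conn.execute(text("""CREATE TABLE IF NOT EXISTS employees (
        id INTEGER PRIMARY KEY, name TEXT, department TEXT)"""))
    conn.execute(
        text("INSERT INTO employees (name, department) VALUES (:n, :d)"),
        {"n": "Bob", "d": "Engineering"},
    )

df = pd.read_sql("SELECT * FROM employees", engine)
print(df.head())
```

Switching this script to MySQL means changing one string: the URL passed to create_engine().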

Best Practices
  1. Use parameterized queries to avoid SQL injection.

  2. Always close connections and cursors properly.

  3. Use connection pooling for production (e.g., with SQLAlchemy).

  4. Normalize your tables for better data organization.

  5. Use indexes for faster querying.
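Practice #1 is worth seeing concretely. In the sketch below (SQLite, illustrative table), a classic injection payload is harmless when bound as a parameter, because the driver treats it as data rather than SQL text:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("Alice",))

user_input = "Alice' OR '1'='1"  # a classic injection payload

# Unsafe (don't do this): the payload becomes part of the SQL text,
# and the WHERE clause would match every row:
# conn.execute(f"SELECT * FROM users WHERE name = '{user_input}'")

# Safe: the driver binds the value as data, never as SQL
rows = conn.execute("SELECT * FROM users WHERE name = ?",
                    (user_input,)).fetchall()
print(rows)  # empty: no user is literally named the payload string
conn.close()
```

The same rule applies to MySQL, where the placeholder is %s instead of ?.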


Security Considerations
  • Avoid hardcoding passwords. Use .env files or cloud secrets managers.

  • Set up user roles and permissions carefully.

  • Regularly patch your DB engine.
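A minimal sketch of the first point, reading credentials from environment variables (the variable names MYSQL_HOST, MYSQL_USER, etc. are illustrative; real deployments might populate them with python-dotenv or a cloud secrets manager):

```python
import os

# Illustrative environment variable names; fall back to dev defaults
# when unset so local runs still work.
db_config = {
    "host": os.environ.get("MYSQL_HOST", "localhost"),
    "user": os.environ.get("MYSQL_USER", "dev_user"),
    "password": os.environ.get("MYSQL_PASSWORD", ""),
    "database": os.environ.get("MYSQL_DB", "your_db"),
}

# conn = mysql.connector.connect(**db_config)
```

This keeps secrets out of source control and lets the same script run unchanged across dev, staging, and production.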


Real-World Scenario

A data engineer might:

  • Use Python to query MySQL for raw sales data

  • Transform it using Pandas

  • Store summaries back into MySQL for reporting

  • Or export to S3, then trigger further downstream pipelines

This setup enables full-cycle automation of data operations.
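The query-transform-store cycle can be sketched compactly. Here SQLite stands in for MySQL so the example is self-contained, and the table and column names are illustrative:

```python
import sqlite3
import pandas as pd

# SQLite stands in for the MySQL source in this sketch.
conn = sqlite3.connect(':memory:')
pd.DataFrame({
    'region': ['North', 'North', 'South'],
    'amount': [100.0, 150.0, 80.0],
}).to_sql('raw_sales', conn, index=False)

# Extract raw sales, transform with Pandas, store the summary back
raw = pd.read_sql("SELECT * FROM raw_sales", conn)
summary = raw.groupby('region', as_index=False)['amount'].sum()
summary.to_sql('sales_summary', conn, if_exists='replace', index=False)

print(pd.read_sql("SELECT * FROM sales_summary", conn))
```

Swapping the sqlite3 connection for a SQLAlchemy MySQL engine turns this sketch into the reporting pipeline described above.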


Conclusion

Combining Python and SQL gives you an unbeatable foundation for backend data processing and analytics. Whether you’re dealing with local files via SQLite or managing enterprise-scale data via MySQL, Python can handle it all.

Incorporate SQL into your Python scripts, build ETL workflows, automate reports, or serve clean data to your ML models. Mastering these integrations will make you a highly effective data engineer in any environment.