Python & SQL (with SQLite/MySQL): A Complete Guide for Data Engineers

In the world of data engineering, combining Python with SQL is one of the most powerful skill sets you can have. Whether you’re extracting data from a database, transforming it into a usable format, or loading it into another system (the classic ETL process), knowing how to connect Python with databases like SQLite or MySQL is essential.

In this guide, you’ll learn everything you need to know about integrating Python with SQL databases. We’ll explore:

  • The basics of SQL and relational databases

  • How to use SQLite for lightweight, file-based databases

  • Connecting Python to MySQL for more scalable database solutions

  • Performing read/write operations using Python

  • Using popular libraries like sqlite3, mysql-connector-python, and SQLAlchemy

  • Best practices and code examples

By the end, you’ll be able to set up a SQL-backed Python project, write queries, automate database workflows, and build solid foundations for any data engineering project.


Why Use SQL with Python?

SQL is the language of databases. Python is a general-purpose scripting language with powerful capabilities. Combining the two enables:

  • Data extraction from relational sources

  • Complex query handling

  • Dynamic report generation

  • Data transformation pipelines

  • Seamless integration with data science and ML workflows

SQL handles structured data beautifully, and Python adds programmability, automation, and integration.


Getting Started with SQLite in Python
What is SQLite?

SQLite is a self-contained, file-based database engine. It requires no server installation and is excellent for small projects, rapid prototyping, and applications that don’t need multi-user concurrency.

Why Use SQLite?
  • Lightweight and simple

  • No need for separate DB server

  • Great for local development and testing

  • Built-in Python support

1. Connect to SQLite Database
import sqlite3

# Connect to a local file-based database
conn = sqlite3.connect('example.db')
cursor = conn.cursor()
2. Create a Table
cursor.execute('''CREATE TABLE IF NOT EXISTS users (
                    id INTEGER PRIMARY KEY,
                    name TEXT,
                    email TEXT)''')
3. Insert Data
cursor.execute("INSERT INTO users (name, email) VALUES (?, ?)", ('Alice', 'alice@example.com'))
conn.commit()
4. Query Data
cursor.execute("SELECT * FROM users")
for row in cursor.fetchall():
    print(row)
5. Close the Connection
conn.close()

✅ Tip: Using the connection as a context manager (with conn:) automatically commits the transaction on success and rolls it back on an exception. Note that it does not close the connection, so you still need conn.close(), or you can wrap the connection in contextlib.closing().
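The steps above can be condensed into a context-managed sketch. Since sqlite3's connection context manager handles transactions rather than closing, contextlib.closing is used to guarantee the close:

```python
import sqlite3
from contextlib import closing

# closing() guarantees conn.close() on exit; the inner `with conn:`
# only commits on success or rolls back on an exception -- it does
# not close the connection.
with closing(sqlite3.connect('example.db')) as conn:
    conn.execute('''CREATE TABLE IF NOT EXISTS users (
                        id INTEGER PRIMARY KEY,
                        name TEXT,
                        email TEXT)''')
    with conn:  # one transaction: committed if the block succeeds
        conn.execute(
            "INSERT INTO users (name, email) VALUES (?, ?)",
            ('Carol', 'carol@example.com'),
        )
```

This pattern avoids forgotten commit() and close() calls, which are a common source of locked or half-written SQLite files.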


Using Python with MySQL
What is MySQL?

MySQL is an open-source, full-featured relational database management system (RDBMS). It’s widely used in enterprise applications, websites, and scalable cloud infrastructure.

Why MySQL?
  • Supports large-scale applications

  • Multi-user concurrency

  • Can be hosted locally or in the cloud (e.g., AWS RDS)

1. Install MySQL Connector

Install the MySQL client for Python:

pip install mysql-connector-python
2. Connect to a MySQL Database
import mysql.connector

conn = mysql.connector.connect(
    host="localhost",
    user="your_user",
    password="your_password",
    database="your_db"
)
cursor = conn.cursor()
3. Create Table & Insert Data
cursor.execute("""
CREATE TABLE IF NOT EXISTS employees (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100),
    department VARCHAR(50)
)
""")

cursor.execute("INSERT INTO employees (name, department) VALUES (%s, %s)", ("Bob", "Engineering"))
conn.commit()
4. Querying Data
cursor.execute("SELECT * FROM employees")
rows = cursor.fetchall()
for row in rows:
    print(row)
5. Handling Exceptions
try:
    conn = mysql.connector.connect(...)
except mysql.connector.Error as err:
    print(f"Error: {err}")

Use Case: Building a Python ETL Pipeline with SQL

Let’s say you have:

  • Source data in CSV files

  • Need to transform and load into a MySQL or SQLite database

Example:

import pandas as pd
import sqlite3

# Load data from CSV
df = pd.read_csv('data.csv')

# Connect to SQLite
conn = sqlite3.connect('etl.db')

# Write to table
df.to_sql('sales', conn, if_exists='replace', index=False)
conn.close()
print("Data loaded into SQLite successfully")

You can schedule this script with cron (Linux) or Task Scheduler (Windows), or adapt it to run as an AWS Lambda function.
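The script above only loads; a real pipeline usually transforms first. A minimal sketch of a transform-then-load step, using an inline sample frame in place of data.csv (the column names product, quantity, and unit_price are illustrative assumptions):

```python
import pandas as pd
import sqlite3

# Sample rows standing in for data.csv; the column names here
# (product, quantity, unit_price) are illustrative assumptions.
df = pd.DataFrame({
    'product': ['widget', 'gadget'],
    'quantity': [3, 5],
    'unit_price': [2.50, 4.00],
})

# Transform: derive a revenue column before loading
df['revenue'] = df['quantity'] * df['unit_price']

# Load the transformed frame into SQLite
conn = sqlite3.connect('etl.db')
df.to_sql('sales', conn, if_exists='replace', index=False)
conn.close()
```

Keeping the transform in Pandas and the storage in SQL is a common division of labor: Pandas for row-level logic, the database for durable, queryable output.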


SQLAlchemy: An Abstraction Layer

SQLAlchemy lets you run the same Python code against SQLite, MySQL, PostgreSQL, and more; typically only the database URL passed to create_engine() changes.

Install SQLAlchemy:

pip install sqlalchemy
Example: Connecting to MySQL with SQLAlchemy
from sqlalchemy import create_engine
import pandas as pd

engine = create_engine('mysql+mysqlconnector://user:password@localhost/mydb')
df = pd.read_sql("SELECT * FROM employees", engine)
print(df.head())
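The portability point is easiest to see by swapping the URL for SQLite: the code path is the same as the MySQL example above, with no server required. A self-contained sketch (table contents are illustrative):

```python
from sqlalchemy import create_engine, text
import pandas as pd

# Same code as the MySQL example -- only the URL differs.
engine = create_engine('sqlite:///portable.db')

with engine.begin() as conn:  # begin() commits automatically on success
    conn.execute(text("""CREATE TABLE IF NOT EXISTS employees (
        id INTEGER PRIMARY KEY, name TEXT, department TEXT)"""))
    conn.execute(
        text("INSERT INTO employees (name, department) VALUES (:n, :d)"),
        {"n": "Bob", "d": "Engineering"},
    )

df = pd.read_sql("SELECT * FROM employees", engine)
print(df.head())
```

Switching this script to MySQL means changing one string: the URL passed to create_engine().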

Best Practices
  1. Use parameterized queries to avoid SQL injection.

  2. Always close connections and cursors properly.

  3. Use connection pooling for production (e.g., with SQLAlchemy).

  4. Normalize your tables for better data organization.

  5. Use indexes for faster querying.
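Practice #1 is worth seeing concretely. In the sketch below (SQLite, illustrative table), a classic injection payload is harmless when bound as a parameter, because the driver treats it as data rather than SQL text:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("Alice",))

user_input = "Alice' OR '1'='1"  # a classic injection payload

# Unsafe (don't do this): the payload becomes part of the SQL text,
# and the WHERE clause would match every row:
# conn.execute(f"SELECT * FROM users WHERE name = '{user_input}'")

# Safe: the driver binds the value as data, never as SQL
rows = conn.execute("SELECT * FROM users WHERE name = ?",
                    (user_input,)).fetchall()
print(rows)  # empty: no user is literally named the payload string
conn.close()
```

The same rule applies to MySQL, where the placeholder is %s instead of ?.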


Security Considerations
  • Avoid hardcoding passwords. Use .env files or cloud secrets managers.

  • Set up user roles and permissions carefully.

  • Regularly patch your DB engine.
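A minimal sketch of the first point, reading credentials from environment variables (the variable names MYSQL_HOST, MYSQL_USER, etc. are illustrative; real deployments might populate them with python-dotenv or a cloud secrets manager):

```python
import os

# Illustrative environment variable names; fall back to dev defaults
# when unset so local runs still work.
db_config = {
    "host": os.environ.get("MYSQL_HOST", "localhost"),
    "user": os.environ.get("MYSQL_USER", "dev_user"),
    "password": os.environ.get("MYSQL_PASSWORD", ""),
    "database": os.environ.get("MYSQL_DB", "your_db"),
}

# conn = mysql.connector.connect(**db_config)
```

This keeps secrets out of source control and lets the same script run unchanged across dev, staging, and production.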


Real-World Scenario

A data engineer might:

  • Use Python to query MySQL for raw sales data

  • Transform it using Pandas

  • Store summaries back into MySQL for reporting

  • Or export to S3, then trigger further downstream pipelines

This setup enables full-cycle automation of data operations.
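The query-transform-store cycle can be sketched compactly. Here SQLite stands in for MySQL so the example is self-contained, and the table and column names are illustrative:

```python
import sqlite3
import pandas as pd

# SQLite stands in for the MySQL source in this sketch.
conn = sqlite3.connect(':memory:')
pd.DataFrame({
    'region': ['North', 'North', 'South'],
    'amount': [100.0, 150.0, 80.0],
}).to_sql('raw_sales', conn, index=False)

# Extract raw sales, transform with Pandas, store the summary back
raw = pd.read_sql("SELECT * FROM raw_sales", conn)
summary = raw.groupby('region', as_index=False)['amount'].sum()
summary.to_sql('sales_summary', conn, if_exists='replace', index=False)

print(pd.read_sql("SELECT * FROM sales_summary", conn))
```

Swapping the sqlite3 connection for a SQLAlchemy MySQL engine turns this sketch into the reporting pipeline described above.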


Conclusion

Combining Python and SQL gives you an unbeatable foundation for backend data processing and analytics. Whether you’re dealing with local files via SQLite or managing enterprise-scale data via MySQL, Python can handle it all.

Incorporate SQL into your Python scripts, build ETL workflows, automate reports, or serve clean data to your ML models. Mastering these integrations will make you a highly effective data engineer in any environment.