In this blog post, we explore essential database schema design patterns tailored for building scalable and maintainable e-commerce applications. Using SQLAlchemy code snippets, you'll learn how to model data and manage polymorphic relationships.
If you've ever worked on the backend of a real-world application, especially one with lots of users, products, or transactions, you already know that how you structure your data can either make your life easier or slowly turn into a nightmare.
At first, it’s tempting to just spin up a few tables or collections and get moving. A users
table here, a products
table there, maybe a quick orders
table when the business side asks for it. It works, until it doesn’t.
Suddenly:
That’s where schema design patterns come in.
Much like design patterns in code, schema design patterns help us make smarter decisions about how we organize data, especially when working with complex domains like e-commerce, where relationships between data are everywhere: users place orders, orders contain products, products have categories, and categories form hierarchies. Sound familiar?
In this post, we’ll walk through some of the most common and powerful schema design patterns for relational databases like PostgreSQL or MySQL. And instead of throwing abstract theory at you, we’ll stick to one clear, consistent example: an e-commerce platform.
By the end, you’ll be able to:
Let’s dive in starting with some foundational principles you’ll want in your schema design toolbox.
To keep things concrete throughout this post, we’ll ground each schema design pattern in the same example: a simplified but realistic e-commerce platform.
Imagine you're building a backend system for an online store, think something like Shopify, Amazon, or Etsy, but at a manageable scale. You’re not reinventing the entire internet, but you do need to support real-world complexity.
Here’s what our platform needs to support:
Over time, this system will evolve: we may want to support promotions, wishlists, product bundles, multi-language descriptions, or even multiple vendors. So whatever schema we design needs to handle current requirements but also stay flexible and maintainable as new features are added.
Throughout the rest of this post, we’ll refer back to this e-commerce domain to show how each schema design pattern can be applied in practice and how different approaches might work better for relational databases.
Ready? Let’s start with the foundational principles that should guide every schema design decision.
Before we dive into specific design patterns, it’s worth stepping back and asking:
What does “good” schema design actually look like?
Whether you're designing tables in PostgreSQL or documents in MongoDB, the core goals are often the same:
Let’s break down a few fundamental principles that will help guide your schema decisions.
Normalization is the process of organizing your data into separate tables or collections to reduce redundancy and ensure consistency.
For example, instead of storing a category name inside every product record, you store categories in their own table and reference them via a foreign key. That way, if the name of a category changes, you only have to update it in one place.
This helps with:
But, and this is important, over-normalization can hurt performance, especially when your app needs to read a lot of related data at once. If you're constantly joining five tables to serve a single API response, it's time to rethink.
Rule of thumb → Normalize until it hurts, then consider denormalizing, especially for read-heavy use cases.
Denormalization means intentionally duplicating data to make reads faster. This is especially common in NoSQL databases, but it can also be useful in relational systems.
In our e-commerce example, it might make sense to store a snapshot of product information like name, price, thumbnail, inside each order record. That way, even if the product changes later, the order still reflects what the customer actually bought at that time.
Benefits of denormalization:
Downside? Data duplication means you have to keep things in sync or accept some eventual inconsistency. So use denormalization when performance or business logic justifies it, and make the trade-off consciously.
Schema design isn't just about performance, it’s also about trusting your data.
Good schemas enforce rules like:
product.price
should never be null
In relational databases, constraints like NOT NULL
, UNIQUE
, and FOREIGN KEY
are your friends. In NoSQL systems, you may need to enforce these at the application layer or with validation rules.
A schema that enforces integrity is harder to break and easier to debug when things go wrong.
One of the most overlooked aspects of schema design is this simple question:
How will your application use this data?
It’s tempting to model your database like your business domain; users, orders, products, reviews. But if your frontend needs to fetch a product and its reviews and its average rating all at once, your schema should support that efficiently.
In other words:
Always consider read and write patterns before finalizing your schema structure.
Your schema will evolve. Business logic will shift. New features will be added. And you’ll need to handle it without bringing the whole system down.
Some tips:
A well-designed schema supports change, it doesn’t fight it.
With these principles in mind, you're ready to explore actual schema design patterns that help implement these ideas in real-world systems.
Relational databases like PostgreSQL and MySQL have been powering backend systems for decades. They’re stable, consistent, and designed to handle structured data and relationships incredibly well. But the way you model those relationships in your schema has a big impact on how your system performs, evolves, and stays maintainable. In this section, we’ll walk through essential relational schema patterns using SQLAlchemy ORM, our e-commerce example will guide the way.
This is the bread and butter of relational modeling.
Use case:
In SQLAlchemy:
from sqlalchemy import Column, Integer, String, DateTime, ForeignKey
from sqlalchemy.orm import relationship, declarative_base
Base = declarative_base()
class User(Base):
__tablename__ = "users"
id = Column(Integer, primary_key=True)
name = Column(String)
orders = relationship("Order", back_populates="user")
class Order(Base):
__tablename__ = "orders"
id = Column(Integer, primary_key=True)
user_id = Column(Integer, ForeignKey("users.id"))
created_at = Column(DateTime)
user = relationship("User", back_populates="orders")
What this gives you:
user.orders
and order.user
ForeignKey
Don’t forget to index foreign keys like user_id
for faster lookups!
When both sides of the relationship can have many entries, you’ll need a join table. This pattern is super common and SQLAlchemy handles it gracefully.
Use case:
In SQLAlchemy:
from sqlalchemy import Numeric
class Product(Base):
__tablename__ = "products"
id = Column(Integer, primary_key=True)
name = Column(String)
price = Column(Numeric)
order_items = relationship("OrderItem", back_populates="product")
class Order(Base):
__tablename__ = "orders"
id = Column(Integer, primary_key=True)
user_id = Column(Integer, ForeignKey("users.id"))
created_at = Column(DateTime)
user = relationship("User", back_populates="orders")
order_items = relationship("OrderItem", back_populates="order")
class OrderItem(Base):
__tablename__ = "order_items"
order_id = Column(Integer, ForeignKey("orders.id"), primary_key=True)
product_id = Column(Integer, ForeignKey("products.id"), primary_key=True)
quantity = Column(Integer)
price_at_purchase = Column(Numeric)
order = relationship("Order", back_populates="order_items")
product = relationship("Product", back_populates="order_items")
Why this pattern shines:
Some types of data don’t just relate to other tables, they relate to themselves. In our e-commerce system, a great example is product categories. A category can have subcategories, and those can have their own subcategories, and so on like a tree.
Other real-world examples of hierarchical data:
Relational databases don’t have built-in tree structures, but there are several well-known patterns we can use, each with trade-offs.
This is the simplest and most intuitive way to model a tree. Each row just points to its parent.
Use case:
Product categories with parent-child relationships.
In SQLAlchemy:
from sqlalchemy import ForeignKey
from sqlalchemy.orm import backref
class Category(Base):
__tablename__ = "categories"
id = Column(Integer, primary_key=True)
name = Column(String)
parent_id = Column(Integer, ForeignKey("categories.id"))
parent = relationship("Category", remote_side=[id], backref=backref("children"))
What this gives you:
category.children
gives you all subcategoriescategory.parent
gives you the parentBut... You’ll need recursive queries to fetch an entire tree which PostgreSQL supports with WITH RECURSIVE
, but SQLAlchemy doesn’t do natively.
So for deep category trees like full breadcrumb navigation, you might need to drop down into raw SQL:
from sqlalchemy import text
stmt = text("""
WITH RECURSIVE category_path AS (
SELECT id, name, parent_id
FROM categories
WHERE id = :start_id
UNION ALL
SELECT c.id, c.name, c.parent_id
FROM categories c
JOIN category_path cp ON cp.id = c.parent_id
)
SELECT * FROM category_path;
""")
result = session.execute(stmt, {"start_id": 42}).fetchall()
Great for basic trees, but querying deep hierarchies gets tricky.
Instead of tracking just the immediate parent, we store the full path as a string.
Example path → "Electronics > Phones > Smartphones"
becomes "1/3/8"
using category IDs
SQLAlchemy model:
class Category(Base):
__tablename__ = "categories"
id = Column(Integer, primary_key=True)
name = Column(String)
path = Column(String) # e.g. "1/3/8"
def get_depth(self):
return len(self.path.strip("/").split("/"))
Pros:
path LIKE '1/3/%'
Cons:
You can wrap path updates into SQLAlchemy events or use helper functions to keep them clean.
This one’s mathematically elegant but operationally complex. Each node has a left
and right
value, defining its place in a virtual depth-first tree traversal.
We won’t dive into full implementation here it’s tricky to maintain, but it’s worth knowing if you:
In most practical backends like our e-commerce platform the Adjacency List pattern with some raw SQL support will get you 90% of the way. If you really need fast subtree lookups, Path Enumeration can be a great middle ground.
Some data changes and that’s totally fine. But sometimes, we don’t just want to update it. We want to remember what it used to be. In our e-commerce system, a few obvious examples:
In short, we often need versioning or audit trails. Let’s walk through common schema patterns to make that happen.
Sometimes, you don’t need full version history, you just want to freeze some data at a certain point in time.
Use case:
An order should remember what the product looked like when the customer bought it.
class OrderItem(Base):
__tablename__ = "order_items"
order_id = Column(Integer, ForeignKey("orders.id"), primary_key=True)
product_id = Column(Integer, ForeignKey("products.id"), primary_key=True)
quantity = Column(Integer)
product_name = Column(String) # snapshot
price_at_purchase = Column(Numeric)
# relationships...
Why this works well:
Trade-off:
If you change your product model a lot, this snapshot structure can get messy or out of sync. It’s a lightweight fix, not a full audit solution.
What if you do want to keep a full history of how a product or user changed? Let’s say you want to see the price of a product at any point in time. You can store versions in a separate table:
class Product(Base):
__tablename__ = "products"
id = Column(Integer, primary_key=True)
current_name = Column(String)
current_price = Column(Numeric)
versions = relationship("ProductVersion", back_populates="product")
class ProductVersion(Base):
__tablename__ = "product_versions"
id = Column(Integer, primary_key=True)
product_id = Column(Integer, ForeignKey("products.id"))
name = Column(String)
price = Column(Numeric)
valid_from = Column(DateTime)
valid_to = Column(DateTime, nullable=True) # null = current
product = relationship("Product", back_populates="versions")
How it works:
ProductVersion
rowvalid_from
and valid_to
fields define the time window
# Example: get the product version on a specific date
from sqlalchemy import and_
session.query(ProductVersion).filter(
ProductVersion.product_id == 42,
and_(
ProductVersion.valid_from <= "2025-06-01",
or_(
ProductVersion.valid_to == None,
ProductVersion.valid_to > "2025-06-01"
)
)
).first()
Why this is powerful:
Sometimes you don’t need the state of a record, you need a log of actions. Think: “User X updated Product Y at time Z”. This is best modeled with an append-only audit log:
class AuditLog(Base):
__tablename__ = "audit_logs"
id = Column(Integer, primary_key=True)
user_id = Column(Integer, ForeignKey("users.id"))
entity_type = Column(String) # "product"
entity_id = Column(Integer)
action = Column(String) # "update", "delete"
timestamp = Column(DateTime)
metadata = Column(JSONB) # optional: details of change
You can hook into SQLAlchemy events like before_flush
or add logging directly in your service layer.
When this is useful:
When designing your database schema, especially in hybrid environments or NoSQL systems, you face a classic dilemma:
Should you embed related data inside a single document/row, or keep it referenced separately?
Why Does This Matter?
Embedding means storing related data inside a parent record.
Referencing means storing related data in a separate table/collection, linked by an ID.
If you know that addresses will always be loaded with the user, embedding can be simpler and faster.
class User(Base):
__tablename__ = "users"
id = Column(Integer, primary_key=True)
name = Column(String)
addresses = Column(JSON) # Embedded list of addresses as JSON
Pros:
Cons:
When you need to query, update, or manage related data independently, referencing is cleaner.
class Address(Base):
__tablename__ = "addresses"
id = Column(Integer, primary_key=True)
user_id = Column(Integer, ForeignKey("users.id"))
street = Column(String)
city = Column(String)
postal_code = Column(String)
user = relationship("User", back_populates="addresses")
class User(Base):
__tablename__ = "users"
id = Column(Integer, primary_key=True)
name = Column(String)
addresses = relationship("Address", back_populates="user")
Pros:
Cons:
Sometimes the best of both worlds! For example, embed simple address data for fast reads, but also keep a reference table for advanced queries or analytics.
Sometimes, you want a single relationship to link to multiple different types of objects. In our e-commerce app, a perfect example is comments: users can comment on products, orders, or even support tickets. Rather than creating separate comment tables for each, polymorphic associations let us keep it clean and DRY.
How Does It Work?
Instead of a plain foreign key, you store:
This way, a comment can point to any “parent” object dynamically. Let’s look at that example.
from sqlalchemy import Column, Integer, String, ForeignKey, Table
from sqlalchemy.orm import relationship, declared_attr
class Comment(Base):
__tablename__ = "comments"
id = Column(Integer, primary_key=True)
content = Column(String)
parent_id = Column(Integer)
parent_type = Column(String)
@property
def parent(self):
if self.parent_type == "product":
return session.query(Product).get(self.parent_id)
elif self.parent_type == "order":
return session.query(Order).get(self.parent_id)
# add more types as needed
Or we can use a cleaner approach with using SQLAlchemy’s Single Table Inheritance (STI). So basically, If the related types share a common interface, you can design a base class and use inheritance:
class Commentable(Base):
__tablename__ = "commentables"
id = Column(Integer, primary_key=True)
type = Column(String)
__mapper_args__ = {
'polymorphic_on': type,
'polymorphic_identity': 'commentable'
}
class Product(Commentable):
__tablename__ = 'products'
id = Column(Integer, ForeignKey('commentables.id'), primary_key=True)
name = Column(String)
__mapper_args__ = {
'polymorphic_identity': 'product'
}
class Order(Commentable):
__tablename__ = 'orders'
id = Column(Integer, ForeignKey('commentables.id'), primary_key=True)
user_id = Column(Integer, ForeignKey("users.id"))
__mapper_args__ = {
'polymorphic_identity': 'order'
}
class Comment(Base):
__tablename__ = "comments"
id = Column(Integer, primary_key=True)
content = Column(String)
commentable_id = Column(Integer, ForeignKey('commentables.id'))
commentable = relationship("Commentable", backref="comments")
Why Use Polymorphic Associations?
But remember, polymorphism adds some complexity to queries and joins. Be sure to index the parent_type
and parent_id
fields properly for performance.
Designing a database schema is never a one-size-fits-all deal. By exploring these schema design patterns, you’re better equipped to build a robust, scalable, and maintainable backend. Remember, the best pattern depends on your app’s specific needs and how your data flows. Our e-commerce example showed how these patterns come to life in a practical setting, but your mileage might vary and that’s okay! Feel free to experiment, iterate, and adapt as your application grows. And if you ever get stuck, just come back to these patterns, they’re your toolkit for crafting clean, efficient data models.