Taming Databases in Microservices
tl;dr: Microservices give flexibility and independence, but databases don’t scale as easily as code. To survive, you need sharding, replication, caching, polyglot persistence, and a healthy respect for indexes.
Why Databases Are the Hardest Part
Microservice architectures promise flexibility, scalability, and independent teams. But databases are the stubborn part. They are stateful, not stateless, which makes them hard to scale horizontally. Unlike application servers, you can’t just spin up ten more nodes and call it a day.
In microservices, each service often has its own database, which creates new challenges:
- Scaling relational databases horizontally
- Different services needing different data structures
- Guessing database needs too early
- Synchronizing data changes across services
Let’s look at these problems one by one — and the strategies that actually work.
Problem 1: Scaling Relational Databases
Relational databases (SQL) are stateful. You can throw more CPU and RAM at them (vertical scaling), but that hits limits quickly. Horizontal scaling — splitting across servers — is complex because of consistency.
Solution: Sharding and Federated Databases
Sharding means splitting the data into logical parts, each stored in its own database.
Examples:
- By geography (EU vs US customers)
- By customer ID ranges
- By product category
In SQL Server, the Federated Database approach routes inserts and queries to the right shard using filters and keys. Instead of one massive DB, you have multiple smaller ones, each responsible for part of the data.
It’s messy, but it works — and suddenly, one bottleneck DB becomes many manageable ones.
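As a rough illustration, the routing layer for range-based sharding can be as simple as a lookup from customer ID range to connection string. The shard ranges and DSNs below are made up for the example:

```python
# Hypothetical sketch: route each request to a shard by customer ID range.
# Ranges and connection strings are placeholders, not a real topology.
SHARDS = [
    (0, 999_999, "postgres://db-shard-1/app"),
    (1_000_000, 1_999_999, "postgres://db-shard-2/app"),
    (2_000_000, 2_999_999, "postgres://db-shard-3/app"),
]

def get_shard(customer_id: int) -> str:
    """Return the connection string for the shard that owns this customer."""
    for low, high, dsn in SHARDS:
        if low <= customer_id <= high:
            return dsn
    raise ValueError(f"no shard configured for customer {customer_id}")

print(get_shard(1_500_000))  # falls in the second range, so shard 2
```

In practice this mapping lives in a routing service or library shared by the data access layer, so adding a shard means updating configuration, not application code.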
Problem 2: Different Data Needs
Not every service has the same requirements.
- A payment or account balance service → must be ACID
- An analytics service → needs fast reads on denormalized data
- A recommendation engine → may prefer a graph DB
Trying to force one database type to serve them all is a recipe for pain.
Solution: Polyglot Persistence
Use the right database for the job. This is called polyglot persistence.
- SQL for transactions (payments, balances)
- NoSQL (MongoDB, DynamoDB) for analytics or documents
- Graph DBs (Neo4j) for relationships
- Time-series DBs (Timescale, InfluxDB) for metrics
Yes, it adds complexity (more databases to manage), but it avoids bending one system into something it’s not.
Problem 3: Wrong Requirements Early
At the start of a project, it’s easy to pick a database because it’s “popular” or “easy.” Later, you realize it’s not suited for the load or consistency you actually need.
Solution: Proper Requirements Analysis
Before picking a database for each microservice, check:
- Data model: relational, document, graph?
- Consistency: strong vs eventual?
- Transactions: what isolation level do you need?
- Load: mostly reads, writes, or balanced?
- Growth: how fast will data volume increase?
Asking these questions up front prevents painful migrations later.
Problem 4: Synchronizing Data
When data changes in one service, others often need to know. Without synchronization, systems drift into inconsistent states.
Example: In an e-commerce system, if a customer updates their email in the account service, the notification service also needs to know — otherwise, messages go to the wrong place.
Solution: Event-Driven Architecture & CDC
- Event-driven architecture: publish domain events (like CustomerEmailUpdated) into Kafka or RabbitMQ. Other services subscribe and react.
- Change Data Capture (CDC): track changes at the DB level (via triggers or transaction logs) and send them to other services.
Event-driven is best for workflows. CDC works well for read-heavy systems where you don’t want to add more writes.
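To make the event-driven flow concrete, here is a minimal sketch using an in-process pub/sub map as a stand-in for Kafka or RabbitMQ. The event name `CustomerEmailUpdated` comes from the example above; everything else is illustrative:

```python
from collections import defaultdict

# In-process event bus standing in for Kafka/RabbitMQ (sketch only).
subscribers = defaultdict(list)

def subscribe(event_type, handler):
    subscribers[event_type].append(handler)

def publish(event_type, payload):
    for handler in subscribers[event_type]:
        handler(payload)

# The notification service keeps its own local copy of customer emails.
notification_emails = {}

def on_email_updated(event):
    notification_emails[event["customer_id"]] = event["email"]

subscribe("CustomerEmailUpdated", on_email_updated)

# The account service publishes a domain event after committing its own write.
publish("CustomerEmailUpdated", {"customer_id": 7, "email": "new@example.com"})
print(notification_emails[7])  # new@example.com
```

The key design point is the same with a real broker: each service updates its own database and reacts to events, rather than reaching into another service's tables.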
Beyond the Core Problems: Reliability Matters Too
So far, we’ve focused on scaling and performance problems: how to split databases, use the right type, or keep services in sync. But microservices don’t just need to be fast — they need to be reliable.
A single database outage can take down a whole service (or worse, your entire system). That’s why replication, caching, and tuning are just as important as sharding or polyglot persistence.
Replication and High Availability
Replication means keeping copies of the database on multiple nodes.
Benefits:
- If one node dies, others still serve data.
- You can spread read requests across replicas.
Example: In PostgreSQL, streaming replication ships the Write-Ahead Log (WAL) to standby nodes in real time. If the primary dies, a standby is promoted to take over.
This makes the system more reliable and also helps with scaling reads.
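As a rough sketch of how streaming replication is wired up in PostgreSQL 12+ (host names, user, and values here are placeholders, not a production configuration):

```
# On the primary (postgresql.conf): allow standbys to stream WAL.
wal_level = replica
max_wal_senders = 5

# On each standby: place an empty standby.signal file in the data directory,
# and point it back at the primary (postgresql.auto.conf):
primary_conninfo = 'host=primary.example.internal port=5432 user=replicator'
```

Automatic failover (promoting a standby when the primary dies) is not built into this setup; tools such as Patroni or repmgr are typically layered on top.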
Caching: The Pressure Valve
Sometimes the best optimization is to avoid hitting the DB at all.
With Redis or Memcached, you can cache frequent queries:
- First, check the cache.
- If not found, hit the DB.
- Store the result in cache for next time.
This massively reduces DB load and speeds up responses.
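The three steps above are the classic cache-aside pattern. A minimal sketch, using an in-memory dict with TTLs as a stand-in for Redis (the `query_db` helper is a placeholder for a real query):

```python
import time

# In-memory stand-in for Redis: key -> (value, expiry timestamp). Sketch only.
cache = {}
TTL_SECONDS = 60

def query_db(user_id):
    # Placeholder for a real database query.
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id):
    """Cache-aside: check the cache first, fall back to the DB, then cache."""
    entry = cache.get(user_id)
    if entry is not None and entry[1] > time.time():
        return entry[0]                    # cache hit
    value = query_db(user_id)              # cache miss: hit the DB
    cache[user_id] = (value, time.time() + TTL_SECONDS)
    return value
```

The TTL matters: cached data is stale by definition, so pick an expiry that matches how fresh each service actually needs the data to be.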
Query and Schema Optimization
Scaling is not always about bigger hardware. Often it’s about smarter queries.
- Schema design: avoid duplicate data, plan relationships
- Indexes: index frequently filtered fields
- Queries: never SELECT *; only fetch what you need
- JOINs: keep them minimal; they get expensive on big tables
- ORM tricks: use eager loading to avoid N+1 queries
These basics often buy you months of breathing space before needing heavier solutions like sharding.
Indexes: The Secret Weapon
Indexes are where you get real performance gains. In microservices, where each service owns its own DB, indexes are often the key to keeping things fast.
Why Indexes Matter
- Cut query response time
- Reduce CPU/memory use
- Allow services to scale without extra infra
- Make inter-service calls faster
Example: Pending Orders
Imagine a SaaS system with millions of rows in an Orders table. You run a query like this:
SELECT *
FROM orders
WHERE shipped_at IS NULL
ORDER BY created_at;
It asks for all unshipped orders, sorted by creation time. As the table grows, this query gets slower and slower.
- The DB does a full table scan (20M rows)
- Filters down to 1M unshipped
- Sorts them in memory/disk
The result: UI delays, background jobs hanging, even blocked connection pools.
Why the DB Chooses Bad Plans
- If ~5%+ of rows match, Postgres prefers a sequential scan.
- Filtering on IS NULL can be indexed, but the planner often skips such an index.
- Sorting millions of rows is expensive and spills to disk.
Fixing It
1. Partial Index
CREATE INDEX idx_orders_unshipped_created
ON orders(created_at)
WHERE shipped_at IS NULL;
This indexes only the rows you care about (unshipped). Smaller, faster, and updates only happen when relevant rows change.
Result: query time dropped from ~3.3s to ~230ms.
2. Combined Index
CREATE INDEX idx_orders_shipped_created
ON orders(shipped_at, created_at);
This organizes rows by shipped_at and then by created_at. It helps, but the index is larger (covers all rows), so it’s less efficient than the partial index.
3. Extended Statistics
Postgres can better optimize if it knows the real data distribution:
ALTER TABLE orders ALTER COLUMN shipped_at SET STATISTICS 1000;
ALTER TABLE orders ALTER COLUMN created_at SET STATISTICS 1000;
ANALYZE orders;
This helps the planner make smarter choices.
Extra Microservices Considerations
- Shared databases: one service’s slow scan can hurt others. Partial indexes reduce the impact.
- Connection pools: slow queries block connections, ripple effect on APIs. Fast queries free them up.
- Materialized views: for very common queries (e.g., “today’s unshipped orders”), materialized views per service can cut load.
Best Practices
From experience, here’s what works best:
- Use partial indexes for frequent states (pending, active, unprocessed).
- Optimize each microservice’s DB individually.
- Dedicate separate DBs for critical services like checkout or billing.
- Monitor execution plans regularly, run ANALYZE often.
- Combine DB optimization with caching strategies.
- For cross-service workflows, use distributed transaction patterns like Saga.
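To sketch the Saga idea from the last point: each step has a compensating action, and on failure the completed steps are undone in reverse order. The services and failure flag below are hypothetical stand-ins for real API or event calls:

```python
# Hypothetical saga sketch: each forward step pairs with a compensation.
# In a real system these would be calls to separate services, not functions.

def reserve_stock(order):  order["stock_reserved"] = True
def release_stock(order):  order["stock_reserved"] = False

def charge_payment(order):
    if order.get("card_declined"):     # simulated failure condition
        raise RuntimeError("payment failed")
    order["paid"] = True

def refund_payment(order): order["paid"] = False

STEPS = [
    (reserve_stock, release_stock),
    (charge_payment, refund_payment),
]

def run_saga(order):
    done = []
    try:
        for action, compensation in STEPS:
            action(order)
            done.append(compensation)
        return True
    except Exception:
        # Undo completed steps in reverse order.
        for compensation in reversed(done):
            compensation(order)
        return False
```

Note that a saga gives eventual consistency, not atomicity: other services can briefly observe the reserved-but-unpaid state before compensation runs.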
Lessons Learned
- Tuning first, scaling second. Always try query/index optimization before going big with sharding.
- Replication + sharding are powerful but add complexity. Use them only when load demands it.
- Polyglot persistence is real. Different services need different DBs — embrace it, but keep it manageable.
- Event-driven sync is the key to avoiding inconsistent data.
- Indexes are gold. The right index can make a 10x–14x difference.
- Databases are never done. It’s ongoing work — like cleaning a kitchen that everyone keeps using.