DynamoDB and Its Partition Strategy for Read-Heavy Use Cases

Devarpi Sheth
4 min read · Mar 8, 2024

When you run your application in the AWS Cloud and choose a NoSQL database, there is a high chance you will end up using DynamoDB.

AWS has documented the best practices for DynamoDB partition key design.

But let's say that even after following these best practices you are still getting hot partitions in your read and/or write use cases, or you are not in a position to add more cardinality to your partition keys. What is the path forward? This post walks through patterns you can apply to solve that challenge.

What are the potential use cases that can get you into hot partitions?

  1. Typically, when you use DynamoDB as a shared cache layer and your use case is read heavy, you will run into this challenge.
  2. You do not have more cardinality available in your keys.
  3. Even after going live, once real data flows into your system, the number of keys you end up creating combined with the volume of reads can push you into the hot-key challenge.

My use case

I faced the same challenge in one of my products, and we were able to react quickly.

  • A caching table shared within the domain.
  • A number of domain tables where read volume was high relative to data size.

Key points before I share the solution: our DynamoDB tables had the following characteristics.

  1. We used the single-table pattern, with the capability of logical GSIs rather than hard-coded physical GSIs.
  2. Some of this cached data was written or changed infrequently, while reads were very heavy.
  3. We had big items that contributed to heavy RCU usage. (We solved that by using S3 as the store for large payloads and keeping a pointer in DynamoDB; even so, reads of these keys were far higher than writes. A sketch of this approach follows the list.)
  4. We had keys whose functional nature made it hard to add more cardinality. (This was the key challenge.)
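
To make point 3 concrete, here is a minimal sketch of the S3-pointer approach using the AWS SDK for JavaScript v3. The bucket, table, key layout, and attribute names are illustrative assumptions, not the exact names from my product.

const { S3Client, PutObjectCommand } = require("@aws-sdk/client-s3");
const { DynamoDBClient } = require("@aws-sdk/client-dynamodb");
const { DynamoDBDocumentClient, PutCommand } = require("@aws-sdk/lib-dynamodb");

const s3 = new S3Client({});
const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// Store the large payload in S3 and keep only a pointer in DynamoDB,
// so the DynamoDB item stays small and cheap to read.
async function putLargeValue(tableName, bucket, pk, sk, payload) {
  const s3Key = `cache/${pk}/${sk}.json`; // illustrative key layout
  await s3.send(new PutObjectCommand({
    Bucket: bucket,
    Key: s3Key,
    Body: JSON.stringify(payload),
  }));
  await ddb.send(new PutCommand({
    TableName: tableName,
    Item: { pk, sk, s3Bucket: bucket, s3Key },
  }));
}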

How did we solve these challenges?

We applied one of the three patterns explained below, where the third pattern is a combination of patterns one and two.

Pattern one: More partitions with GSIs

Write to the main table with all possible GSI keys (20 is the soft limit and 30 the hard limit at the time of writing). This creates up to 30 virtual partitions. For example, if your PK is ‘ABC’, the GSI keys could be ABC_GSI_1, ABC_GSI_2 … ABC_GSI_30.

  • *** This gives you 3,000 × 30 RCU per second. ***

When you are reading items, pick one of the GSIs at random (for example, seeded by the current time) and read from it.

/**
 * Returns a random partition number between 0 and max - 1,
 * used to pick which GSI (or replica partition) to read from.
 * @param {number} max number of partitions or GSIs
 * @returns {number} partition id
 */
function randomWithMax(max) {
  return Math.floor(Math.random() * max);
}
GSI to get more partitions
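
To make the read path concrete, here is a minimal sketch using the AWS SDK for JavaScript v3 and the randomWithMax helper above. The index naming convention (GSI_1 … GSI_30) and the GSI key attribute names are illustrative assumptions.

const { DynamoDBClient } = require("@aws-sdk/client-dynamodb");
const { DynamoDBDocumentClient, QueryCommand } = require("@aws-sdk/lib-dynamodb");

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));
const GSI_COUNT = 30; // number of GSIs created on the table

// Pick one of the GSIs at random and query it for the cached item.
// Assumes GSI "n" has a partition key attribute named "gsi_n_pk"
// whose value is `${pk}_GSI_${n}` (illustrative naming).
async function readFromRandomGsi(tableName, pk) {
  const n = randomWithMax(GSI_COUNT) + 1; // 1..30
  const result = await ddb.send(new QueryCommand({
    TableName: tableName,
    IndexName: `GSI_${n}`,
    KeyConditionExpression: "#k = :v",
    ExpressionAttributeNames: { "#k": `gsi_${n}_pk` },
    ExpressionAttributeValues: { ":v": `${pk}_GSI_${n}` },
  }));
  return result.Items;
}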

Pattern two: More partitions with replica rows carrying the same values

Use the main table and create multiple rows (replicas) for each item you are using as cache. The challenge is that if your write volume is high, you will spend more time on writes; at read time you pick a replica number with the same random-selection approach. A sketch follows the bullet points below.

The PKs are ABC#1, ABC#2 … ABC#N.

  • *** This gives you N × 3,000 RCU per second. ***
  • N is the number of partitions your use case needs (tune it to your needs).
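
Here is a minimal sketch of pattern two under the same assumptions (AWS SDK for JavaScript v3, illustrative table and attribute names): every write fans the value out to N replica rows, and reads pick one replica at random.

const { DynamoDBClient } = require("@aws-sdk/client-dynamodb");
const { DynamoDBDocumentClient, BatchWriteCommand, GetCommand } =
  require("@aws-sdk/lib-dynamodb");

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));
const REPLICAS = 5; // N, tune to your read volume

// Write: fan the same value out to N replica rows (ABC#1 .. ABC#N).
// BatchWrite supports up to 25 items per request; chunk if N is larger.
async function writeReplicas(tableName, pk, sk, value) {
  await ddb.send(new BatchWriteCommand({
    RequestItems: {
      [tableName]: Array.from({ length: REPLICAS }, (_, i) => ({
        PutRequest: { Item: { pk: `${pk}#${i + 1}`, sk, value } },
      })),
    },
  }));
}

// Read: pick one replica row at random.
async function readReplica(tableName, pk, sk) {
  const n = randomWithMax(REPLICAS) + 1; // 1..N
  const result = await ddb.send(new GetCommand({
    TableName: tableName,
    Key: { pk: `${pk}#${n}`, sk },
  }));
  return result.Item;
}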

Pattern three: Multiple partitions with GSIs and replicas

This is a combination of patterns one and two, where you use multiple GSIs and multiple partitions in the main table as well as in all GSIs.

In the main table, add more partitions and assign GSI keys with different partitions as well; this way you can spread the load across multiple partitions inside each GSI.

Pick a GSI number between 1 and 30 and a partition number between 1 and N at random (for example, based on time).

  • *** This will give you N × 30 × 3,000 RCU per second. ***
  • N is the number of partitions your use case needs (tune it to your needs).

A key point to consider here: the number of partitions determines how long it will take to update the cache keys during the update/create flow in the main table, so choose an N that fits your use case.
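
Here is a minimal sketch of the key-selection logic for pattern three, again with illustrative naming: each read picks both a replica partition (1..N) and a GSI (1..30), so traffic spreads across roughly N × 30 partitions.

const GSI_COUNT = 30; // number of GSIs on the table
const REPLICAS = 5;   // N replica partitions per item

// Decide which replica partition and which GSI a read should hit.
// Returns the index name and the GSI partition key value to query
// (naming is illustrative; adapt it to your own key scheme).
function pickReadTarget(pk) {
  const partitionNo = randomWithMax(REPLICAS) + 1; // 1..N
  const gsiNo = randomWithMax(GSI_COUNT) + 1;      // 1..30
  return {
    indexName: `GSI_${gsiNo}`,
    gsiPartitionKey: `${pk}#${partitionNo}_GSI_${gsiNo}`,
  };
}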

Key Takeaway

  • For heavy-read use cases in DynamoDB, partition key cardinality is key.
  • If you cannot get more cardinality, adding more partitions with GSIs is a good and easy fix.
  • Replicas are a good approach but slow you down on writes.
  • Combining the replica and GSI approaches multiplies your partitions further. Use it wisely.
  • Keep your item size small and use compression when you store values (see the sketch below).
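
For the compression point, here is a minimal Node.js sketch using the built-in zlib module; the encoding choices are assumptions, and you would store the compressed buffer as a binary attribute.

const zlib = require("zlib");

// Compress a JSON-serializable value before writing it to DynamoDB.
function compressValue(value) {
  return zlib.gzipSync(JSON.stringify(value)); // Buffer, stored as a binary attribute
}

// Decompress a value read back from DynamoDB.
function decompressValue(buffer) {
  return JSON.parse(zlib.gunzipSync(buffer).toString("utf8"));
}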

Disclaimer: The above pattern elaboration is based on my understanding as well as intellectual enrichment derived from my wonderful teams, to whom I am forever grateful, during our various discussions while finding solutions for performance challenges.


Devarpi Sheth

I am a technology leader with more than 20 years of experience, focused on cloud technology, with over 10 years of experience in the AWS cloud.