Slawomir Stec

Dashboard

The application dashboard is the default view presented to the user after login. It consists of two parts:

Sum of all contacts/users/to-do items in the system
Audit events for all users, the newest on the top

I decided to implement this to test out the DynamoDB model design and the PynamoDB library. So let's begin with PynamoDB.

PynamoDB

PynamoDB is a Python abstraction over the DynamoDB API.

In the official AWS Python documentation, you can find sample code using the boto3 library. Boto3 is the AWS SDK for Python. For example, to query a table like 'Movies', you need to execute code similar to the one below:
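
A minimal sketch of such a query, assuming the low-level client API and a 'Movies' table whose partition key is year. The helper name build_movies_query is hypothetical. The sketch only builds the request parameters, so it runs without boto3 or AWS credentials; the commented lines show where the boto3 call would go.

```python
from typing import Any, Dict


def build_movies_query(year: int) -> Dict[str, Any]:
    """Build keyword arguments for a low-level DynamoDB Query call.

    'year' is a DynamoDB reserved word, so the attribute name is passed
    through the #yr placeholder and the value through :yyyy.
    """
    return {
        "TableName": "Movies",  # assumed table name
        "KeyConditionExpression": "#yr = :yyyy",
        "ExpressionAttributeNames": {"#yr": "year"},
        "ExpressionAttributeValues": {":yyyy": {"N": str(year)}},
    }


params = build_movies_query(1985)

# With boto3 installed and credentials configured, the call would be:
#   import boto3
#   response = boto3.client("dynamodb").query(**params)
#   items = response["Items"]
print(params["KeyConditionExpression"])
```

Even in this small case, the attribute name, the placeholders, and the typed value all have to be kept in sync by hand.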

From what I have learnt, ORM solutions for DynamoDB are not recommended; the usual advice is to use direct SDK API calls. I found a similar comment about ORMs in Alex DeBrie's book:

https://www.dynamodbbook.com/

I think the correct approach is to avoid additional levels of abstraction for DynamoDB interactions. With direct SDK calls, however, it is repetitive to maintain mappings and to use string placeholders and different expressions/key definitions for every query. This type of code can quickly inflate the codebase, which encourages you to copy & paste between files. But there are also benefits. For example, I find the syntax very expressive and easy to use.

Designing a DynamoDB data model can be challenging for developers who are only familiar with relational databases. This approach allows beginners to focus on the design problem and not on the syntax.

However, I did not use the boto3 library in the sample application. PynamoDB is not an ORM; it is an intermediate interface between your code and the DynamoDB API. I found it easy to use, and the documentation is short and concise. So I decided to give it a try.

You can find samples in the official PynamoDB documentation. For the application dashboard, I defined two PynamoDB models. I'm using a single table design for DynamoDB, which means I must use a single table for all the application data I store. I design the PK (partition key) and SK (sort key) so that entities share the same prefix. I use prefixes like table names in relational databases. The prefix is used with the # sign to express the parent -> child relationship.

Returning to the entity models, the first model stores the count of contacts/users/to-do items. Only one row in the application table is needed for this data. The count is updated in the services responsible for creating the entities. I model this as a singleton entity whose partition key (PK) is DASHBOARD#DASHBOARD_STATS# and whose sort key (SK) is equal to the PK: DASHBOARD#DASHBOARD_STATS#

Some of the classes in the examples are part of my API, e.g., BaseSingletonEntityForPyAwsV1. They don't belong to PynamoDB; please ignore them. In short, in a single table design, sharing the PK and SK prefix is quite common. This is why you don't see PK/SK declarations in the DashboardStats model. I do this in the parent class by default, and the child only defines the prefix.

The second model stores audit events (actions executed by users). Every time an entity is created, I save one audit event. Besides the required keys, the audit event has a type and a creation date. In this case, I define the model prefix, but in the constructor, I set a different SK. I need to do this because it is not a singleton. Please ignore the GSI setters. They are not used in this example.

In conclusion, the PK (partition key) value is equal to DASHBOARD#DASHBOARD_STATS# for both models. The SK value has two formats:

For singleton: SK = PK = DASHBOARD#DASHBOARD_STATS#
For one-to-many relation: SK = #DASHBOARD#DASHBOARD_STATS#{unique-item-id}
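
The two key formats above can be sketched as plain helper functions. This is only an illustration: the function names stats_keys and audit_event_keys and the sample item id are hypothetical and are not part of the PyAws base classes.

```python
from typing import Dict

PREFIX = "DASHBOARD#DASHBOARD_STATS#"


def stats_keys() -> Dict[str, str]:
    """Keys of the singleton statistics item: SK is equal to PK."""
    return {"PK": PREFIX, "SK": PREFIX}


def audit_event_keys(item_id: str) -> Dict[str, str]:
    """Keys of one audit event: same PK, SK is '#' + prefix + unique id."""
    return {"PK": PREFIX, "SK": "#" + PREFIX + item_id}


print(stats_keys())
print(audit_event_keys("0001"))  # "0001" is a made-up item id
```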

The SK prefix for audit event items is slightly different: there is an additional # in front of the key. This is because we have one access pattern in the sample application:

Return statistics AND the latest audit events in a single DynamoDB query request.

It is essential to do this in a single call. You don't want to fetch them separately because it will cost more read capacity units.

How is this realized, and why is an additional # required for the audit event SK?

In the query result, we want the statistics record (singleton) first, followed by a list of the newest audit log events. Because we want the latest events, the DynamoDB query must scan the index backward. The # character sorts before the letter D, so prepending # to the audit event SK places all audit events before the singleton in the index and makes our access pattern possible. Please check the screenshot. A backward scan starts with the statistics singleton and then returns the most recently added audit events; that is why # is prepended to the SK.
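
The effect of the extra # can be checked with a small simulation in plain Python. It is only a sketch: query_backward is a hypothetical stand-in for a backward DynamoDB query with a row limit, and the zero-padded item ids are an assumption (any id that sorts by creation time behaves the same way).

```python
from typing import List

PREFIX = "DASHBOARD#DASHBOARD_STATS#"


def query_backward(sort_keys: List[str], limit: int) -> List[str]:
    """Mimic a query that reads sort keys in descending order, up to limit."""
    return sorted(sort_keys, reverse=True)[:limit]


ids = ["0001", "0002", "0003"]  # made-up ids, oldest to newest

# Audit events with the leading '#': they sort before the singleton.
with_hash = ["#" + PREFIX + i for i in ids] + [PREFIX]
# Audit events without it: they sort after the singleton.
without_hash = [PREFIX + i for i in ids] + [PREFIX]

good = query_backward(with_hash, limit=3)
bad = query_backward(without_hash, limit=3)

print(good)  # singleton first, then the two newest events
print(bad)   # three events, the singleton is cut off by the limit
```

Without the leading #, the singleton would be the last item in a backward scan, so a row limit could drop it from the result.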

In PynamoDB, a backward scan is requested with the query parameter scan_index_forward=False. Additionally, we must limit the number of rows to 10. To return combined data from the singleton and the audit events, I defined a new model with projection columns for both of them. The DynamoDB query is straightforward; it relies on the ordering and values of the PK/SK items. Because we are using a single table design, our keys are unique and defined for each entity, so we can be sure that DASHBOARD#DASHBOARD_STATS# represents the dashboard entity with a) stats and b) audit events. In our case, the PK/SK combination plus the explicit sort order (the # prefix on the SK) ensures that the response contains the following columns with non-null fields:

UI template

PyAws 0.2.4