Notifications for any online presence are both commonplace and paramount. They drive customer retention as they are subtle reminders for users to check in—sometimes not so subtle. I see you, Duolingo.
Every site has a different approach to notifications. They can be built within the application itself or spread further outside: email, SMS, or push notifications, for instance. Some banks even persist in sending actual letters as notifications, no matter how many times you ask them not to. It is 2024!
A little word on push notifications: While they’re okay on a mobile device, I find them wildly irritating on a desktop or laptop. There is enough interference with other annoying popups as is, so these won’t be on BetaBud’s radar.
Initially, all automated notifications will live within the application itself. I find this strikes the right balance: the user stays informed, while email updates for the user can remain manual.
It isn’t a scalable approach to reach out to each user individually. However, it is an excellent way to build a rapport with your user base, particularly if you want to fine-tune the application to them.
So, what will trigger a notification?
Every response a user submits to a form will trigger a notification, as each response is important to the form's owner.
These notifications will then be aggregated on a per-form basis. Suppose five responses have been submitted across two forms belonging to a single owner. The badge on the notification bell will display 2, as only two forms have had responses. Upon clicking the notification bell button, each form title will be displayed along with its count of new responses: perhaps 2 and 3, respectively, in this scenario.
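As a rough sketch of that rule (the shape and field names here are my own assumptions, not BetaBud's actual model), the bell's badge is the number of forms with unread notifications, not the sum of their counts:

# Hypothetical per-form notification entries, as described above.
notifications = [
    {"form_id": "form#1", "title": "Onboarding survey", "count": 2},
    {"form_id": "form#2", "title": "Pricing feedback", "count": 3},
]

# The bell's badge is the number of forms with new responses (2 here);
# each row in the dropdown shows that form's own count (2 and 3).
badge = len(notifications)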
Rather than clearing when the notifications are viewed, they will be cleared by manually selecting “Mark all as Read”. This means the clear function is a conscious control, so a user won’t inadvertently lose notifications by fat-fingering the application.
All activity currently runs through DynamoDB, as described in a previous post. This follows a single-table design, as all the data is closely related.
The beauty of using DynamoDB here is that we can easily harness DynamoDB Streams to listen to item changes: creations, updates, and deletions. This change log is stored for 24 hours and can be used to process data in near real-time.
Additionally, each record appears in the stream exactly once and in the order the changes were made, with the table's partition key dictating the shard. This is essential for notifications, as it reduces race conditions and issues with aggregations.
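For illustration, here is roughly how a stream might be enabled on the table with boto3; the table name is a placeholder, and capturing both images matches what the Lambda inspects later on:

import boto3

dynamodb = boto3.client("dynamodb")

# Enable a stream on the (hypothetical) single table, capturing both the
# old and new images of each changed item.
dynamodb.update_table(
    TableName="BetaBudTable",  # assumed table name
    StreamSpecification={
        "StreamEnabled": True,
        "StreamViewType": "NEW_AND_OLD_IMAGES",
    },
)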
A diagram demonstrating the different pieces of infrastructure and how they tie together.
So the flow is: a response is saved as an item in DynamoDB. This change is then added to the DynamoDB Stream. An AWS Lambda is then used to process the results of these changes.
The Lambda will only consume changes of type response. This is achieved by the event source mapping, which filters out every other item type. The event source mapping also dictates the size of the batches and the maximum wait time before the Lambda triggers, among other things. Together, these define how frequently the Lambda runs.
Snippet showing the properties to filter records and batch the processing, reducing invocations
"BatchSize": 100,
"MaximumBatchingWindowInSeconds": 60,
"FilterCriteria": {
    "Filters": [
        {
            "Pattern": "{\"dynamodb\": {\"NewImage\": {\"PK\": {\"S\": [{\"prefix\": \"response#\"}]}}}}"
        }
    ]
}
The frequency needs to be kept in a fine balance. If records aren't being processed quickly enough, the Iterator Age will climb as the DynamoDB Stream accumulates more records awaiting processing; remember, beyond 24 hours, these will also drop off the stream. On the opposite side, triggering too many Lambdas unnecessarily wastes compute (read: money).
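One way to keep an eye on that balance, sketched here with boto3, a hypothetical function name, and illustrative thresholds, is a CloudWatch alarm on the stream consumer's IteratorAge metric:

import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when records have been sitting in the stream too long; the
# function name, period, and threshold are placeholders.
cloudwatch.put_metric_alarm(
    AlarmName="notifications-iterator-age",
    Namespace="AWS/Lambda",
    MetricName="IteratorAge",
    Dimensions=[{"Name": "FunctionName", "Value": "notifications-processor"}],
    Statistic="Maximum",
    Period=300,               # evaluate over 5-minute windows
    EvaluationPeriods=3,
    Threshold=3_600_000,      # one hour, in milliseconds
    ComparisonOperator="GreaterThanThreshold",
)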
Another note here on triggering too many Lambdas: there is a maximum concurrency for Lambdas per account. The original quota is 50, though this can be increased with a service request. As said, this concurrency is per account, not per Lambda. Reaching it will have adverse effects on every other Lambda within the account, as only that many can run at once.
Now, a Lambda can reserve its own concurrency apart from the Unreserved Pool, meaning it won't be impacted should another Lambda become greedy. This dedicates concurrency from the Total Account Pool to that individual Lambda, subtracting it from the Unreserved Pool total. It should be noted that the Unreserved Pool cannot drop below the original quota of 50.
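A minimal sketch of reserving that concurrency, again with a hypothetical function name and an illustrative value:

import boto3

lambda_client = boto3.client("lambda")

# Carve out dedicated concurrency for the notifications Lambda so a
# greedy neighbour can't starve it; the value here is illustrative.
lambda_client.put_function_concurrency(
    FunctionName="notifications-processor",  # assumed function name
    ReservedConcurrentExecutions=10,
)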
Now that we've struck the delicate balance in the Lambda's event source mapping, how about the Lambda's implementation itself?
A diagram showing the processing of the AWS Lambda itself
First, we parse the events from the DynamoDB Stream into our response object. The events can contain the item's old and new images, depending on whether it was an update, create, or delete. Our responses are immutable and can’t be deleted, so we only care about the New image here.
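A sketch of that first step in Python; the response shape and field names are assumptions, and TypeDeserializer unwraps DynamoDB's attribute-value format:

from boto3.dynamodb.types import TypeDeserializer

deserializer = TypeDeserializer()

def parse_responses(event):
    """Turn raw stream records into plain response dicts (shape assumed)."""
    responses = []
    for record in event["Records"]:
        # Responses are immutable, so only INSERT events matter here; the
        # event source mapping has already filtered down to response items.
        if record["eventName"] != "INSERT":
            continue
        image = record["dynamodb"]["NewImage"]
        # Unwrap the DynamoDB attribute-value format, e.g. {"S": "..."}.
        responses.append({k: deserializer.deserialize(v) for k, v in image.items()})
    return responses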
Once parsed, we group the responses by their parent Form and that Form's owning User. This allows us to understand who to notify. Then we load the Users and the Forms by performing a batch get from DynamoDB.
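Roughly, with hypothetical key and field names for the single-table design:

import boto3
from collections import Counter

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("BetaBudTable")  # assumed table name

def load_parents(responses):
    """Count new responses per Form, then batch-load those Forms."""
    # Field and key names here are assumptions for the single-table design.
    counts = Counter(r["form_id"] for r in responses)

    keys = [{"PK": f"form#{form_id}", "SK": "meta"} for form_id in counts]
    result = dynamodb.batch_get_item(RequestItems={table.name: {"Keys": keys}})
    forms = result["Responses"][table.name]
    # The owning Users would be fetched the same way, keyed off each Form.
    return counts, forms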
On the User object, there may be existing notifications if they haven't yet been cleared. The parsed grouping then drives the calculation: if notifications already exist for a Form, the counts from the current batch are accumulated onto them; otherwise, new entries are added with the totals from the current batch.
This will result in each User object containing a list of notifications, each including the Form's ID, Title, and count of new responses. This allows the Front End to render the data and create the action link for each notification item.
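A sketch of that accumulation, using the same assumed field names as above:

def merge_notifications(user, form, new_count):
    """Fold this batch's count for one Form into the User's notifications."""
    # `user` and `form` are plain dicts; the field names are assumptions.
    notifications = user.setdefault("notifications", [])
    for entry in notifications:
        if entry["form_id"] == form["form_id"]:
            # Uncleared notifications already exist for this form: accumulate.
            entry["count"] += new_count
            return
    # No existing entry for this form: add one with the batch total.
    notifications.append(
        {"form_id": form["form_id"], "title": form["title"], "count": new_count}
    )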
Further, the response counts will be aggregated on the Form object itself, meaning each Form knows the total number of responses it has received. In the future, this will also allow for aggregating metrics, for instance satisfying the question: what is the average rating response for question 4?
These items can then be updated back in the DynamoDB store using a Batch Write action, which allows up to 25 items to be saved at once. As we grouped and aggregated earlier, the size of the input batch doesn't necessarily dictate how many items will be updated. We can iterate over multiple batch saves without the concern of two Lambdas trying to update the same items at once, thanks to the aforementioned sharding.
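With boto3, the Table resource's batch_writer takes care of the 25-item chunking (and retries unprocessed items); a sketch, assuming the same table as before:

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("BetaBudTable")  # assumed table name

def save_items(updated_users, updated_forms):
    # batch_writer chunks the puts into 25-item BatchWriteItem calls and
    # automatically retries any unprocessed items.
    with table.batch_writer() as batch:
        for item in updated_users + updated_forms:
            batch.put_item(Item=item)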
Using DynamoDB Streams is an alternative to the approach we'd have taken had a SQL data store been chosen initially; this trigger complexity wouldn't be required, as we could've simply written the aggregation into a query. However, I particularly like this asynchronous approach, which delegates the aggregation elsewhere. It means we maintain the extremely low latency that DynamoDB offers.
As suggested, this stream can be used to create metrics within the parent Form item. Additionally, it can be used for other functionality. Perhaps some other analysis is needed, or we need to stream data into another data store—perhaps even a relational database. It is easily possible with DynamoDB Streams. I'm a Big Fan!
1"BatchSize": 100,
2"MaximumBatchingWindowInSeconds": 60,
3"FilterCriteria": {
4 "Filters":
5 [
6 {"Pattern":
7 "{\"dynamodb\":
8 {\"NewImage\":
9 {\"PK\":
10 {\"S\":
11 [{\"prefix\": \"response#\"}]
12 }}}}
13 "}
14 ]
15}