Lab: Prime Competing Consumers

In this week’s lab, you will build an AWS app for computing whether large numbers are prime. To support a large number of customers, you will use AWS lambdas that compete to read from a queue. This allows scaling up the number of lambdas to match the problem size (up to limits set by AWS Learner Academy). You will build out a system matching the following figure:

Figure: Overview of API gateway feeding through to primality computation and storage in a DynamoDB

The pieces are as follows:

Elements of the lambdas will follow the previous lab: they will process JSON data and return results as JSON data. The difference is that you will be sending data to and from the SQS and the DynamoDB rather than just responding to PUT requests.

This lab assumes you know how to write an AWS lambda as done in the previous lab. The writeup addresses various issues you will need to solve to build a complete system. You will have to tie these steps together for the solution. A suggested order of implementation is as follows:

  1. Implement the queue-filling lambda (that is, the lambda that takes JSON requests and stores them in the SQS queue).
  2. Start the SQS and use PostMan to send data. Confirm that the data is getting into the queue.
  3. Implement a lambda that takes data from the SQS and writes the number to the Primality-storing DynamoDB. Do not try to fully implement the test for primeness; keep this step to just declaring every number is prime or composite or even flipping a coin between the two.
  4. Start the DynamoDB.
  5. Confirm data does get into the database, but there is no need to examine its contents yet.
  6. Implement the Database-dumping Lambda and use PostMan to confirm that the numbers to test are getting into the database.
  7. Implement the prime-testing lambda as directed below.
  8. Use PostMan to confirm the full system is working.

The sequence for processing a number will be as follows. As for the previous lab, the input will be JSON data.

  1. The client posts a JSON value such as
{"integer":1000000000000000}

to the route /add_prime.

  2. The queue-filling lambda will pick this up and put it in the SQS queue. Its handler class is to be named com.primecheck.QueueFillingHandler. Note that the handler puts the JSON directly into the queue without modification.
  3. The prime-testing lambda, implemented by com.primecheck.PrimeTestingHandler, will take values from the SQS queue, parse them into a com.primecheck.PrimeCandidate object, determine if the number is prime, and write the result to a primes DynamoDB table. As directed above, your first implementation of this will just declare all numbers are prime (or all are composite). After the full system is running, you will rewrite this to actually check if a number is prime or not.
  4. The database-dumping lambda allows clients to determine what numbers have been checked and the results for those numbers. It will use the route /primes. In a real system clients would query the database for a specific result, but we are not implementing that for simplicity.
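The competing-consumers idea behind this sequence can be tried locally before touching AWS. The following is only a rough model under stated assumptions: a BlockingQueue stands in for SQS, threads stand in for lambdas, and a ConcurrentHashMap stands in for the primes table. The class name CompetingConsumersSketch and the method process are hypothetical, and BigInteger.isProbablePrime is used purely as a stand-in verdict.

```java
import java.math.BigInteger;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Local model only: no AWS services are involved.
class CompetingConsumersSketch {

    // Hypothetical helper: drains the queue with `workers` competing threads
    // and returns the simulated "table" of results.
    static Map<String, Boolean> process(List<BigInteger> inputs, int workers) {
        BlockingQueue<BigInteger> queue = new LinkedBlockingQueue<>(inputs);
        Map<String, Boolean> table = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.submit(() -> {
                BigInteger n;
                // poll() hands each queued value to exactly one worker.
                while ((n = queue.poll()) != null) {
                    // Stand-in verdict (probabilistic); the lab asks for your own test.
                    table.put(n.toString(), n.isProbablePrime(30));
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return table;
    }

    public static void main(String[] args) {
        List<BigInteger> inputs = List.of(
                new BigInteger("97"), new BigInteger("100"), new BigInteger("7919"));
        System.out.println(process(inputs, 2));
    }
}
```

Each value is processed by exactly one worker, which is the property the SQS queue provides for the prime-testing lambdas.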

More Detailed Requirements

This section briefly describes the major components you must make for this lab and the required names for each component. Descriptions of how to gradually build these components come later in the lab.

All data in this system will use the Java library class BigInteger. This allows using integers larger than what will fit in int or long. This is useful since modern encryption needs to use very large numbers to ensure security. Machines like Rosie make it trivial to factor numbers with a mere ten or twenty digits.

The queue-filling lambda should listen in on the /add_prime API route. (This is the string added after the URL to get to your API gateway.) The queue-filling handler class should have the fully-qualified name com.primecheck.QueueFillingHandler. The responsibility of the queue-filling lambda is to receive JSON requests and forward them to SQS. The JSON objects should be in this format:

{"integer":100000}

where 100000 is replaced with the integer whose primality is to be tested. It should respond with a confirmation including the integer when it is successful. It does NOT need to be formatted as in the example in the previous section; this lambda does not need to parse the JSON at all.

If the JSON received is not in this format, return status code 400 with the body

{"error":"some message here"}

where you can replace “some message here” with some other string of your choice.
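One possible way to check the request format without fully parsing the JSON is a regular expression. This is a minimal sketch, not a required design: the class RequestCheckSketch, its method names, the "queued" confirmation body, and the choice to accept only unsigned digits are all assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names here are not part of the lab's requirements.
class RequestCheckSketch {

    // Accepts {"integer":<digits>} with optional whitespace around the tokens.
    private static final Pattern REQUEST =
            Pattern.compile("\\s*\\{\\s*\"integer\"\\s*:\\s*(\\d+)\\s*\\}\\s*");

    // Returns the HTTP status code the queue-filling lambda would send back.
    static int statusFor(String body) {
        return body != null && REQUEST.matcher(body).matches() ? 200 : 400;
    }

    // Returns a confirmation that includes the integer, or the error body.
    static String responseBodyFor(String body) {
        if (body != null) {
            Matcher m = REQUEST.matcher(body);
            if (m.matches()) {
                // Assumed confirmation format; any message containing the integer works.
                return "{\"queued\":" + m.group(1) + "}";
            }
        }
        return "{\"error\":\"body must be a JSON object with a single integer field\"}";
    }
}
```

In the handler, the status and body would be copied into the APIGatewayProxyResponseEvent as in last week's lab, and the original request body would be forwarded to the queue unchanged.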

The fully-qualified name of the prime-testing class should be com.primecheck.PrimeTestingHandler. The responsibility of the prime-testing lambda is to determine if the numbers it receives are prime. It should write a key-value pair to the DynamoDB table where the key is the integer (as a string) and the value is {"IsPrime":true} if the integer is prime and {"IsPrime":false} if it is not. Do not include leading zeros when converting the integer to a string. There are more details about how a lambda can write a record into a DynamoDB below.
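The leading-zeros rule is satisfied automatically if the key comes from BigInteger.toString(). The sketch below shows this with plain string maps; the class PrimeRecordSketch and the method recordFor are hypothetical, and the map is only a stand-in for the DynamoDB item built later in this lab.

```java
import java.math.BigInteger;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds the key/value pair as plain strings.
class PrimeRecordSketch {

    static Map<String, String> recordFor(BigInteger n, boolean isPrime) {
        Map<String, String> item = new LinkedHashMap<>();
        // BigInteger.toString() yields the canonical decimal form, so a value
        // parsed from "000123" is stored under the key "123".
        item.put("Integer", n.toString());
        item.put("IsPrime", Boolean.toString(isPrime));
        return item;
    }

    public static void main(String[] args) {
        System.out.println(recordFor(new BigInteger("000123"), false));
    }
}
```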

The fully-qualified name of the database-dumping class should be com.primecheck.DatabaseDumpingHandler. The database-dumping lambda should listen in on the /primes API route. The responsibility of the database-dumping lambda is to return the entire contents of the database to any client. This simplifies viewing the contents of the database.

We will interface with your application through Postman using the POST request /add_prime.

Getting Started

Create a fresh repository with your code from last week’s lab. Refactor the package to com.primecheck and introduce the classes named above. The best way to transfer code from one repository to another is to check out both repositories on your computer and then use the Windows File Manager (or the equivalent on other computers) to copy the files over. Be careful about directory structures as you do this; do not introduce new folders.

Debugging with logs

With so many pieces, you will want to use logs to track data through the system. If you submit a request to a lambda (say, through Postman), you can then visit the “Monitor” tab in the AWS console for the lambda. Next, go to “View CloudWatch logs” and click on the latest log stream in the window that opens. This should let you see the log messages you write.

A tricky issue with services like AWS Lambda is that it is easy to accidentally execute the wrong version of the code. A good practice is to change the output in some way - maybe write a version number of the code to the log - so you can confirm that the code executed is the code you expected. Note that logs include the fully-qualified name of the class writing the log message at the start of each line and that your output shows up at the end of the line.

If the logs feel like they update slowly – taking a minute or two to come through – make sure you did not include any delay when you created your queues or sent your messages. (See SQS below.)

You may notice the first runs are slower than succeeding runs. This is because of the “cold start” issue discussed in class: it takes time for AWS to start a lambda if it is not already running.

A note on working with multiple lambdas

You could create a separate package for each lambda. But this requires you to update multiple packages at the same time and to keep everything consistent. Just like you can have a single Java program with different mains in different classes, you can embed multiple lambda event handlers in a single jar file. This is why you specify the lambda code by its full path (the package, class, and method name). This considerably simplifies deployment. The cost is that the .jar file can become large, and sometimes AWS will give you a warning about moving your package to S3. You can ignore this warning; drag your jar file from IntelliJ to the upload location as usual. The warning is intended for developers working with production code where having multiple lambda entry points in the same jar file can result in additional charges.

Note that you do not need to upload the .jar to ALL the lambdas each time you make a change. If you are pretty sure a lambda already has working code, you do not need to replace it with the refreshed version. However, you might want to replace them all one more time when you are done to make sure your current code-base is indeed still correct throughout.

When creating your lambda, be sure to set the maximum number of concurrent lambdas to 2 (the minimum). Attempting to use more than 10 concurrent lambdas for AWS Academy results in your account being deactivated. Your instructor would then need to contact AWS support to ask that your account be reactivated.

Increasing Lambda Timeout

Your prime-testing lambda will need a timeout bigger than 15 seconds. On the same page where you upload the code for the lambda, go to the configuration tab, go to the general configuration section, on the far right click edit, and then set the timeout to a minute or so and save.

Parsing JSON into a Java object

In last week’s lab, a JSON object was parsed into an instance of the class UnicornLocation. This week, you will need to define your own class. It will need to match the new format of JSON requests. Give this class the fully-qualified name com.primecheck.PrimeCandidate. This class should have a single field BigInteger integer and should have a getter and setter for this field. Be sure to create a zero-argument constructor; the code which unpacks the JSON data will first create an object using this constructor and then use setter methods to fill the data appropriately. It can simplify debugging to add a second constructor that takes a BigInteger argument.

Warning: the name of your number field in PrimeCandidate must match the name of the field in the JSON object that the client will send. The beanFrom method is using an old Java “beans” convention that expects this.
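The description above corresponds to a bean along the following lines. This is a sketch rather than the required code: in your project the class would be declared public, placed in its own file, and start with the line package com.primecheck; (both are omitted here so the snippet compiles on its own).

```java
import java.math.BigInteger;

// In the lab this would be "public class PrimeCandidate" in package com.primecheck.
class PrimeCandidate {

    // The field name matches the "integer" field of the JSON request.
    private BigInteger integer;

    // Zero-argument constructor used when the JSON is unpacked.
    public PrimeCandidate() {
    }

    // Optional convenience constructor for debugging.
    public PrimeCandidate(BigInteger integer) {
        this.integer = integer;
    }

    public BigInteger getInteger() {
        return integer;
    }

    public void setInteger(BigInteger integer) {
        this.integer = integer;
    }
}
```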

SQS

Setting up the SQS Queue

You can create your queue through the AWS web console (just search for SQS and poke around a little). Do not use a FIFO queue. The default queue type and defaults will work fine. Alternatively, you can create the queue using Java code. Note the queue can be any name you like as long as you stick to standard identifier characters (letters, numbers, underscores, but with no embedded spaces or special characters).

The AWS SDK for Java 1.x documentation on SQS includes example snippets on:

Some tips on using this API: The lambda reading from the queue should ignore the APIGatewayProxyRequestEvent input and instead read from the queue by the queue’s name. It is good to follow the code examples in the documentation, but do not use DelaySeconds when creating the queue, and do not use withDelaySeconds(...) when you send your message. In addition, do not append the date to the queue name since you need just one queue. Some methods that take a queueURL require you to get the URL using String queueURL = sqs.getQueueUrl(QUEUE_NAME).getQueueUrl(); instead of just using QUEUE_NAME (the simple string name for the queue).

You do need to add the dependency

    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk-sqs</artifactId>
        <version>1.12.699</version>
    </dependency>

to your pom.xml file so that you can import the SQS classes.

As an aside, whenever you add something to your pom.xml, we recommend using IntelliJ’s File | Repair IDE feature to force the indexer to recognize the new classes these provide. Clicking through the dialog that comes up will remove the errors IntelliJ shows for portions of your code (the red text for some names).

Once your com.primecheck.QueueFillingHandler can put the message into the queue, assign it to the /add_prime route and test the route with Postman.

Receiving a message from the SQS queue

The prime-testing lambda is triggered by data being put into the SQS queue. To set this up in the AWS console, go to the “Lambda triggers” tab and click through the options there to add an already-existing lambda. Alternatively, you can go into the lambda, go to Triggers, and select the queue as a trigger for the lambda; this has the same effect.

For a lambda to receive an SQS event, the handler class needs to start with this line:

public class PrimeTestingHandler implements RequestHandler<SQSEvent, Void> {

Then the handler can use this code to process all of the messages that came to it:

    @Override
    public Void handleRequest(SQSEvent sqsEvent, Context context) {
        try {
            List<SQSEvent.SQSMessage> messages = sqsEvent.getRecords();
            if(messages.isEmpty()) {
                logger.info("No messages to receive!");
            }
            for (SQSEvent.SQSMessage m : messages) {
                logger.info("Message: " + m.getBody());
                PrimeCandidate primeCandidate = JSON.std.beanFrom(PrimeCandidate.class, m.getBody());
                // process the primeCandidate here
            }
            logger.info("Complete. Processed all the messages.");
        } catch (Exception e) {
            logger.error("Error while processing the request", e);
        }
        return null;
    }

The PrimeCandidate class is com.primecheck.PrimeCandidate, the JavaBean class you are instructed to create above.

One new import you will need is:

import com.amazonaws.services.lambda.runtime.events.SQSEvent;

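When a lambda is triggered this way, AWS automatically deletes each message from the queue once the handler returns without throwing an exception; you do not need to delete messages yourself.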

Once your com.primecheck.PrimeTestingHandler can read a message from the queue, again run your Postman check and confirm you receive the message in your logs.

Using DynamoDB to store results

Create a DynamoDB table called primes. You can do this through the web interface by searching for DynamoDB, then creating a table and filling in this name as you work your way through the defaults. Give it a partition key named “Integer”. Make this a String, not a Number.

Add the following dependency to pom.xml:

    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk-dynamodb</artifactId>
        <version>1.12.699</version>
    </dependency>

To insert a prime into the table, you can use this helper method within your com.primecheck.PrimeTestingHandler:

    private void createPrimeRecord(PrimeCandidate primeCandidate, boolean isPrime) {
        HashMap<String, AttributeValue> item_values =
                new HashMap<String,AttributeValue>();

        item_values.put("Integer", new AttributeValue(""+primeCandidate.getInteger()));
        item_values.put("IsPrime", new AttributeValue(""+isPrime));

        final AmazonDynamoDB ddb = AmazonDynamoDBClientBuilder.defaultClient();

        try {
            ddb.putItem(TABLE_NAME, item_values);
        } catch (ResourceNotFoundException e) {
            // SLF4J substitutes arguments for {} placeholders, not printf-style %s.
            logger.error("Error: The table \"{}\" cannot be found.", TABLE_NAME);
            logger.error("Be sure that it exists and that you have typed its name correctly!");
        } catch (AmazonServiceException e) {
            logger.error(e.getMessage());
        }
    }

Here are the related imports, in case you need them:

import com.amazonaws.AmazonServiceException;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.ResourceNotFoundException;
import java.util.HashMap;

Add this code to your prime-testing lambda and run it to see if anything gets added to the database. Since you will not yet have code to determine if a number is prime, pass true (or store some other placeholder) for the moment.

Note that the Items summary shown through the web console can easily be out of date. Clicking “Get live item count” will show you how many items are ACTUALLY in the table right now.

Dumping the database

To dump the entire contents of the database, use this method:

    public static void getAllRecords(Logger logger) {
        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().build();

        ScanRequest scanRequest = new ScanRequest()
                .withTableName(PrimeTestingHandler.TABLE_NAME);

        ScanResult result = client.scan(scanRequest);
        for (Map<String, AttributeValue> returned_item : result.getItems()){
            try {
                if (returned_item != null) {
                    Set<String> keys = returned_item.keySet();
                    for (String key : keys) {
                        logger.info(String.format("%s: %s\n",
                                key, returned_item.get(key).toString()));
                    }
                } else {
                    logger.error("No items found in the table "+PrimeTestingHandler.TABLE_NAME+"\n");
                }
            } catch (AmazonServiceException e) {
                logger.error(e.getErrorMessage());
            }
        }
    }

Adjust this code to print something readable and return it as text to the user as part of the /primes route. The exact format does NOT need to be JSON.
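One way to turn the scan results into readable text is to build a single string with one line per record. In this sketch the rows are plain string maps standing in for the Map<String, AttributeValue> items returned by the scan; in the handler you would read each attribute's string value (for example with getS(), since the helper method above stores both attributes as strings). The class DumpFormatSketch and the output wording are hypothetical.

```java
import java.util.List;
import java.util.Map;

// Hypothetical formatter; the rows stand in for DynamoDB scan results.
class DumpFormatSketch {

    // Produces one "<integer>: prime" or "<integer>: not prime" line per row.
    static String format(List<Map<String, String>> rows) {
        StringBuilder out = new StringBuilder();
        for (Map<String, String> row : rows) {
            out.append(row.get("Integer"))
               .append(": ")
               .append("true".equals(row.get("IsPrime")) ? "prime" : "not prime")
               .append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(format(List.of(
                Map.of("Integer", "7", "IsPrime", "true"),
                Map.of("Integer", "8", "IsPrime", "false"))));
    }
}
```

The resulting string can then be used as the body of the response for the /primes route.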

It requires these imports:

import com.amazonaws.AmazonServiceException;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.ScanRequest;
import com.amazonaws.services.dynamodbv2.model.ScanResult;
import org.slf4j.Logger;
import java.util.Map;
import java.util.Set;

Once your /primes path successfully dumps the contents of the database, hit this route with Postman (or your browser!) to see the contents of the database.

Testing primality

Write a method isPrime that takes a BigInteger and returns true if that integer is prime. Be sure to use BigIntegers throughout your method so you can handle the large numbers we will be testing.

You are encouraged to write your own Java code that tests if an integer is prime by simply checking that it has no divisors between 2 and its square root. You can also swipe some code from online, but be sure it is correct (some are not), and be sure to include a link to your source in accordance with our department’s requirements for software reuse in coding assignments.

There are several methods of BigInteger that are useful for determining primality:

You might find it helpful to write your prime detection algorithm with ints first (where you can develop and debug faster) and then translate it to BigInteger code.

Testing code through lambdas is a very slow process. Write a standard main in com.primecheck.PrimeTestingHandler so you can check isPrime works locally.
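As a starting point for the "ints first" approach, here is a minimal sketch of trial division on long values with a main for local checking. It is not the BigInteger version the lab requires; the class name TrialDivisionSketch is hypothetical, and the comments note which BigInteger methods play the same role when you translate it.

```java
// Trial division on long values; translate to BigInteger for the lab.
class TrialDivisionSketch {

    static boolean isPrime(long n) {
        if (n < 2) {
            return false;
        }
        if (n % 2 == 0) {              // BigInteger: n.mod(BigInteger.valueOf(2))
            return n == 2;             // BigInteger: n.equals(...)
        }
        // d <= n / d is the same test as d * d <= n but cannot overflow.
        // BigInteger: d.compareTo(n.divide(d)) <= 0
        for (long d = 3; d <= n / d; d += 2) {   // BigInteger: d.add(...)
            if (n % d == 0) {          // BigInteger: n.mod(d).equals(BigInteger.ZERO)
                return false;
            }
        }
        return true;
    }

    // Local check, so you do not need to deploy a lambda to test the logic.
    public static void main(String[] args) {
        long[] samples = {1, 2, 15, 97, 7919, 1_000_000_007L};
        for (long n : samples) {
            System.out.println(n + " -> " + isPrime(n));
        }
    }
}
```

Checking only odd divisors up to the square root keeps the loop short enough for quick local experiments; the values you test through the lambda will be far larger than a long can hold.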

Trying your solution on some large integers

Once all three lambdas are running, test your solution by using Postman to send the following values (quickly, in succession) to '/add_prime':

Then, use Postman to “ping” (that is, send a message to) your /primes route every few seconds. You will see the table get filled in over time; this is your application working! If you forgot to set your timeout to a minute or so earlier, you will want to revisit that step now. See “Increasing Lambda Timeout” above for details.

Problems and fixes

* For QueueFillingHandler:
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;

import com.amazonaws.services.sqs.model.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.fasterxml.jackson.jr.ob.JSON;

import com.amazonaws.services.sqs.AmazonSQS;
import com.amazonaws.services.sqs.AmazonSQSClientBuilder;

import java.math.BigInteger;
* For PrimeTestingHandler:
import com.amazonaws.services.dynamodbv2.model.*;
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.SQSEvent;
import com.amazonaws.services.dynamodbv2.model.ResourceNotFoundException;
import com.fasterxml.jackson.jr.ob.JSON;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.util.*;

import com.amazonaws.AmazonServiceException;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.ScanResult;
* For DatabaseDumpingHandler:
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.ScanRequest;
import com.amazonaws.services.dynamodbv2.model.ScanResult;
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.Map;

Acknowledgements

This lab draws from many sources online, including Stack Overflow and the AWS tutorial used in last week’s lab. It was developed by multiple people including Dr. Yoder, Prof. Lewis, Prof. Porcaro, and Dr. Hasker.

Submission

Be prepared to demo your lab when requested. See Canvas for any additional submission requirements.