Deploy the asynchronous processing infrastructure

Time Estimate: 10 - 15 minutes

In this module, you will deploy the asynchronous document processing architecture using Cloud9 and AWS CDK.

What is AWS CDK?
The AWS Cloud Development Kit (AWS CDK) is an open source software development framework to model and provision your cloud application resources using familiar programming languages.
It provides you with high-level components that preconfigure cloud resources with proven defaults, so you can build cloud applications without needing to be an expert. AWS CDK provisions your resources in a safe, repeatable manner through AWS CloudFormation.
For more information, see https://aws.amazon.com/cdk/.

Connect to Cloud9. (If you are already connected to Cloud9, skip steps 1-3 and go to step 4.)

  1. Go to the AWS Cloud9 console and click on Your environments (you may need to expand the left sidebar).

  2. Find the reInventCloud9 environment and click the Open IDE button.

    If you have trouble opening Cloud9:

    • Ensure you are using either the Chrome or Firefox browser.
    • Refer to the troubleshooting guide to ensure third-party cookies are enabled.
  3. You should now see an integrated development environment (IDE). You can view and edit files in the editor and run shell commands in the terminal section, just like you would on a local computer.
  4. In Cloud9, run the following commands in the terminal:

You will run all terminal commands for this module from within Cloud9 (not your local machine). Keep in mind that Cloud9 is a fully fledged IDE running on an Amazon EC2 instance. You can edit code and create new files via the Cloud9 editor in your browser. You can also open multiple terminals if needed.

All of the terminal commands in these modules can be copy/pasted to your terminal for execution.
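
Before running the commands below, it can help to confirm that the tools they rely on (git, node, and npm) are available in the Cloud9 terminal. The following is a minimal sketch; the function name check_tools is hypothetical and is not part of the workshop repository.

```shell
# check_tools is a hypothetical helper, not part of the workshop repository.
# It prints each tool that is missing from PATH and returns non-zero
# if at least one of the requested tools cannot be found.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, running check_tools git node npm prints nothing and returns success when all three tools are installed.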

# Clone this repository into your Cloud9 environment:
git clone https://github.com/aws-samples/amazon-textract-serverless-large-scale-document-processing.git
# Install the AWS Cloud Development Kit (CDK):
npm install -g aws-cdk@0.28.0
# Navigate to the folder below and install dependencies using the npm package manager:
cd amazon-textract-serverless-large-scale-document-processing/textract-pipeline/
npm install
# Run the AWS CDK bootstrap command: 
cdk bootstrap

Note: After executing cdk bootstrap, you should see output similar to the following:

Bootstrapping environment **********/us-east-1...
CDKToolkit: creating CloudFormation changeset...
 0/2 | 8:21:50 PM | CREATE_IN_PROGRESS   | AWS::S3::Bucket | StagingBucket 
 0/2 | 8:21:51 PM | CREATE_IN_PROGRESS   | AWS::S3::Bucket | StagingBucket Resource creation Initiated
 1/2 | 8:22:12 PM | CREATE_COMPLETE      | AWS::S3::Bucket | StagingBucket 
 2/2 | 8:22:14 PM | CREATE_COMPLETE      | AWS::CloudFormation::Stack | CDKToolkit 
Environment **********/us-east-1 bootstrapped.

# Deploy the asynchronous document processing pipeline infrastructure:
cdk deploy

Note: After executing cdk deploy, when you are asked Do you wish to deploy these changes (y/n)?, type y and then press Enter.

Note: After you confirm with y, you should see output similar to the following:

  TextractPipeline: deploying…
  Updated: lambda/helper (zip)
  Updated: lambda/textractor (zip)
  Updated: lambda/s3processor (zip)
  Updated: lambda/s3batchprocessor (zip)
  Updated: lambda/documentprocessor (zip)
  Updated: lambda/syncprocessor (zip)
  Updated: lambda/asyncprocessor (zip)
  Updated: lambda/jobresultprocessor (zip)
  TextractPipeline: creating CloudFormation changeset…
  

Wait for the command to finish; it should take a few minutes. Once it completes, proceed to the next step and test the infrastructure that you have deployed.
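
To confirm from the terminal that a deployment step finished, you could query CloudFormation for the status of the stacks named in the output above (CDKToolkit and TextractPipeline). This is a minimal sketch that assumes the AWS CLI is installed and configured in your Cloud9 environment; the helper name stack_status is hypothetical.

```shell
# stack_status is a hypothetical helper, not part of the workshop repository.
# Assumes the AWS CLI is installed and has credentials for the workshop account.
# Prints the CloudFormation status of the named stack (for example CREATE_COMPLETE).
stack_status() {
  aws cloudformation describe-stacks \
    --stack-name "$1" \
    --query 'Stacks[0].StackStatus' \
    --output text
}
```

For example, stack_status CDKToolkit and stack_status TextractPipeline should each typically print CREATE_COMPLETE (or UPDATE_COMPLETE after a redeploy) once the corresponding step has finished.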