How to Set Up Neo4j on AWS ECS (EC2)

We’ve already seen how to set up Neo4j on a public EC2 instance using the community edition AMI, but what if you want to run it in Docker?

Neo4j also has a supported Docker image that can be used to run the graph database inside a container.

AWS offers support for containers through Amazon ECS (Elastic Container Service). You can use ECS to create a Cluster, within which you can run Tasks.

Amazon currently supports two types of ECS cluster:

Fargate
EC2 with Networking

In this guide, we’ll walk through the EC2 option, but we also have a guide for Fargate, if you prefer not to manage the EC2 servers yourself.

Note: One important caveat if you choose ECS Fargate is that you can only mount your graph data to EFS. This gives you less flexibility to tune performance than an EC2 cluster backed by EBS.

With EBS volumes, you can provision higher throughput and larger storage quickly and easily. You can also stop your EC2 instance when it’s not needed, and have the Neo4j graph data persist on the EBS volume until you’re ready to start your EC2 instance again.

So, let’s get started. This article assumes you’re starting from scratch, which means we’ll create the full network first (VPC, subnets, route tables, etc.).

The main guide will create Neo4j inside a private subnet. If you need to run it in a public subnet instead, the steps are identical, except that you don’t need to create the private subnets or the NAT gateway.

Create the VPC

The first step in creating any network is to create the VPC. This will contain the subnets, within which we’ll be launching the EC2 instance that runs our Neo4j Docker container.

Open the Amazon VPC console
In the VPC Dashboard, select Your VPCs, then Create VPC
Provide a Name tag. In our example, we’ll use neo4j-vpc
For the IPv4 CIDR block, enter: 10.0.0.0/26
Click Create

Create the Subnets

The second step in our network creation is the subnets. Because we want to run our Docker container in a private subnet, but with internet access, we’ll create both private and public subnets.

Return to the Amazon VPC console
In the VPC Dashboard, select Subnets, then Create subnet
Create 4 subnets, using the following values:

Name tag	VPC	Availability Zone	IPv4 CIDR block
Neo4j Private a	neo4j-vpc	us-east-2a	10.0.0.32/28
Neo4j Private b	neo4j-vpc	us-east-2b	10.0.0.48/28
Neo4j Public a	neo4j-vpc	us-east-2a	10.0.0.0/28
Neo4j Public b	neo4j-vpc	us-east-2b	10.0.0.16/28

Because we created our VPC in the us-east-2 (Ohio) AWS region, the availability zones are also us-east-2. If you’re creating your resources in a different AWS region, please adjust your AZ choice accordingly.
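
As a quick check on the addressing above, here is a minimal Python sketch (standard library only) that splits the 10.0.0.0/26 VPC block into the four /28 subnets from the table. The figure of 5 reserved addresses per subnet reflects how AWS reserves the first four and the last address of every subnet; the variable names are ours, for illustration only.

```python
import ipaddress

# VPC CIDR block from the guide.
vpc = ipaddress.ip_network("10.0.0.0/26")

# Split the /26 into /28 blocks; a /26 holds exactly four of them.
blocks = list(vpc.subnets(new_prefix=28))

# Names follow the table above: public subnets first, then private.
names = ["Neo4j Public a", "Neo4j Public b", "Neo4j Private a", "Neo4j Private b"]
plan = dict(zip(names, blocks))

# AWS reserves 5 addresses in every subnet (the first four and the last one).
AWS_RESERVED = 5

for name, block in plan.items():
    usable = block.num_addresses - AWS_RESERVED
    print(f"{name}: {block} ({usable} usable addresses)")
```

Each /28 leaves only 11 usable addresses, which is enough for this demo but worth widening if you expect to run more instances or tasks.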

Create the Internet Gateway (IGW)

Even though we’ll be running our Neo4j container on an EC2 instance in a private subnet, we’ll still need internet access. The first step to achieve this will be creating an internet gateway and attaching it to our VPC.

Return to the Amazon VPC console
In the VPC Dashboard, select Internet Gateways, then Create internet gateway
In the Name tag field, enter: neo4j-igw
Click Create internet gateway
From the screen that now appears, click the Actions dropdown, then Attach to VPC
In the Available VPCs field, you should be able to select our neo4j-vpc VPC
Once this is selected, click Attach internet gateway

So, we should now have our VPC, subnets, and internet gateway. Next, we need to create our NAT gateway, so that our EC2 instance in the private subnet can get internet access.

Create the NAT Gateway

Return to the Amazon VPC console
In the VPC Dashboard, select NAT Gateways, then Create NAT Gateway
In the Subnet field, select subnet name: Neo4j Public a
Your NAT gateway requires an Elastic IP address. If you don’t already have one, click Allocate Elastic IP address
Click Create NAT Gateway
Edit the name of the NAT gateway to be: Neo4j NAT

Create the Route Tables

The next step for creating our network is to create the route tables to ensure our public subnets can reach the internet gateway, and the private subnets can reach the NAT gateway.

Return to the Amazon VPC console
In the VPC Dashboard, select Route Tables, then Create route table
Name tag: neo4j-private, VPC: neo4j-vpc
Select the neo4j-private route table, then click the Routes tab and Edit routes
Add route, with Destination: 0.0.0.0/0 and Target: Neo4j NAT
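Click Save routes
Open the Subnet associations tab and associate the Neo4j Private a and Neo4j Private b subnets with this route table
From the Route Tables screen, click Create route table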
Name tag: neo4j-public, VPC: neo4j-vpc
Select the neo4j-public route table, then click the Routes tab and Edit routes
Add route, with Destination: 0.0.0.0/0 and Target: Internet Gateway -> neo4j-igw
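Click Save routes
Open the Subnet associations tab and associate the Neo4j Public a and Neo4j Public b subnets with this route table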
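
To see what the two route tables do with a packet, here is a small Python sketch of longest-prefix route selection. It is an illustration of the idea rather than AWS code: the pick_route helper and the 203.0.113.10 address (taken from a documentation-only range) are hypothetical, and the "local" entry stands for the route that every VPC route table contains for the VPC’s own CIDR.

```python
import ipaddress


def pick_route(route_table, destination):
    """Return the target of the most specific route that matches destination."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if dest in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    # The route with the longest prefix (most specific CIDR) wins.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Route tables from the guide, plus the built-in local route for the VPC CIDR.
private_routes = {"10.0.0.0/26": "local", "0.0.0.0/0": "Neo4j NAT"}
public_routes = {"10.0.0.0/26": "local", "0.0.0.0/0": "neo4j-igw"}

print(pick_route(private_routes, "10.0.0.5"))      # stays inside the VPC
print(pick_route(private_routes, "203.0.113.10"))  # leaves via the NAT gateway
print(pick_route(public_routes, "203.0.113.10"))   # leaves via the internet gateway
```

Traffic between subnets never touches either gateway, which is why the bastion host can reach the private instance directly.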

Create a Bastion Host

The final step in our networking setup is the creation of a bastion host. This is a small EC2 instance in the public subnet that allows SSH access onto the EC2 instances in our private subnet.

Open the Amazon EC2 console
Click on Instances, then Launch Instance
Select an Amazon Linux 2 AMI and set the instance type as t2.micro
Launch into the Neo4j Public a subnet of our neo4j-vpc VPC
For the security group rules, just ensure that port 22 is open for SSH traffic
Launch instance

Later in the guide, we’ll show you how you can SSH onto our Neo4j EC2 server in the private subnet via this bastion host.

Create the ECS / EC2 Security Group

One of the most important steps in this guide is the creation of the security group for the EC2 instance that will run our Docker container.

Ports 7474 (HTTP), 7473 (HTTPS), and 7687 (Bolt) are opened for the Neo4j application, while port 22 is required to allow SSH, and ports 80 and 443 are required for HTTP and HTTPS traffic.

The ECS agent on the instance talks to the ECS service over HTTPS (port 443). If that traffic is blocked, the EC2 instance that the ECS cluster creates can’t register with ECS, so it won’t appear under the ECS Instances tab of the ECS cluster.

Open the Amazon EC2 console
Click on Security Groups, then Create Security Group
Set Security group name to be: neo4j
For VPC, select neo4j-vpc
Edit the inbound rules to allow ports: 22, 80, 443, 7473, 7474, and 7687 on TCP protocol
Click Create security group
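
If you would rather script the inbound rules, the sketch below builds them in the IpPermissions shape that the EC2 API expects. The guide does not say which source to allow, so the sketch assumes traffic only from inside the VPC (10.0.0.0/26); adjust that to suit your setup. The build_ingress_rules helper and YOUR_GROUP_ID are hypothetical names.

```python
# Ports from the guide, with the role each one plays.
NEO4J_SG_PORTS = {
    22: "SSH",
    80: "HTTP",
    443: "HTTPS",
    7473: "Neo4j HTTPS",
    7474: "Neo4j HTTP",
    7687: "Neo4j Bolt",
}


def build_ingress_rules(ports, source_cidr):
    """Build one TCP ingress rule per port in the EC2 API's IpPermissions shape."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": source_cidr, "Description": label}],
        }
        for port, label in sorted(ports.items())
    ]


# Assumption: only traffic from inside the VPC (10.0.0.0/26) is allowed in.
rules = build_ingress_rules(NEO4J_SG_PORTS, "10.0.0.0/26")

# With boto3 installed and credentials configured, the rules could be applied with:
#   boto3.client("ec2").authorize_security_group_ingress(
#       GroupId=YOUR_GROUP_ID, IpPermissions=rules)
for rule in rules:
    print(rule["FromPort"], rule["IpRanges"][0]["Description"])
```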

Create the ECS Cluster

Open the Amazon ECS console
Click on Clusters, then Create Cluster
Select EC2 Linux + Networking

Click Next step

For Cluster name, set the value as: neo4j-ec2

Provisioning Model can be left as On-Demand Instance
EC2 instance type was set as m5n.large for this demo

For EC2 Ami Id, select the Amazon Linux 2 AMI. This is an ECS-optimized AMI

For Key pair, select an existing one or create a new one. This is so we can SSH into the instance created by the ECS cluster
Set the VPC as neo4j-vpc
For Subnets, select the Neo4j Private b subnet
Set Security group to be our neo4j security group that we just created
For Container instance IAM role, select ecsInstanceRole. If you don’t have this as an option, you can find steps to create it in the AWS documentation for the Amazon ECS container instance IAM role
Click Create

ECS will now perform 3 tasks:

Create the ECS Cluster
Attach the ECS Instance IAM Policy for ecsInstanceRole
Create a CloudFormation Stack

The CloudFormation Stack provisions a number of resources, including an Auto Scaling group and Launch Template. You can find the Stack that was created under the CloudFormation console.

Once all 3 steps are complete, you should be able to see your EC2 instance in the ECS Instances tab of your ECS cluster.

Create the ECS Task Definition

By now we’ve created our:

Network infrastructure
Two EC2 instances (one bastion and one to run our Docker container)
Security groups
ECS cluster

The next step in getting Neo4j running on ECS EC2 is creating an ECS Task Definition. This is where we’ll specify the container-to-host port mappings, allocate memory and CPU to our container, and specify the Docker image we want to run.

Open the Amazon ECS console
Click on Task Definitions, then Create new Task Definition
Set Task Definition Name to be: neo4j_ec2
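Set Network Mode to awsvpc. This gives the task its own network interface and private IP address, which we’ll connect to later in the guide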
Task execution IAM role: ecsTaskExecutionRole
Click on Add container
Set the Container name to be: neo4j_ec2
For Image, specify: neo4j:latest. This will retrieve the latest Docker image of Neo4j Community Edition
Add Port mappings for the following Container ports: 7474, 7473, 7687

Scroll down to Environment variables, and set the Key to be NEO4J_AUTH, with a Value of neo4j/YOUR_PASSWORD, where YOUR_PASSWORD is whatever you want your password to be. This bypasses the default requirement to reset the password when you first use the Neo4j database. The full list of Neo4j Docker environment variables can be found in the Neo4j Docker documentation.
Scroll down to Log configuration, and select the Auto-configure CloudWatch Logs checkbox
Click Add
At the main Task Definition creation window, click Create

You now have an ECS Task Definition that can be run on the EC2 container instances in your ECS cluster.
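
For reference, the console settings above correspond roughly to the request below, written as a Python dictionary in the shape of the ECS RegisterTaskDefinition API. Treat it as a sketch under assumptions: the guide does not give a memory figure, so 2048 MiB is a placeholder; the role ARN uses the example account ID 123456789012; and the log group name /ecs/neo4j_ec2 is a guess at what the Auto-configure CloudWatch Logs option produces. The build_task_definition helper is a hypothetical name.

```python
import json


def build_task_definition(password, execution_role_arn, memory_mib=2048, region="us-east-2"):
    """Build a task definition request that mirrors the console settings in this guide."""
    return {
        "family": "neo4j_ec2",
        "networkMode": "awsvpc",
        "requiresCompatibilities": ["EC2"],
        "executionRoleArn": execution_role_arn,
        "containerDefinitions": [
            {
                "name": "neo4j_ec2",
                "image": "neo4j:latest",
                "memory": memory_mib,  # assumption: the guide gives no figure
                "portMappings": [
                    {"containerPort": port, "protocol": "tcp"}
                    for port in (7474, 7473, 7687)
                ],
                "environment": [
                    {"name": "NEO4J_AUTH", "value": f"neo4j/{password}"}
                ],
                "logConfiguration": {
                    "logDriver": "awslogs",
                    "options": {
                        "awslogs-group": "/ecs/neo4j_ec2",  # assumed log group name
                        "awslogs-region": region,
                        "awslogs-stream-prefix": "ecs",
                    },
                },
            }
        ],
    }


task_definition = build_task_definition(
    "YOUR_PASSWORD",
    "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",  # placeholder account ID
)

# With boto3 installed and credentials configured, this could be registered with:
#   boto3.client("ecs").register_task_definition(**task_definition)
print(json.dumps(task_definition, indent=2))
```

Keeping the definition in code like this makes it easier to review the port mappings and the NEO4J_AUTH value before anything is created.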

Running the Neo4j ECS Task

The final step of getting Neo4j set up in a Docker container running on an EC2 machine in a private subnet is to run the task.

This can be achieved by either creating an ECS Service, or simply running an ECS Task. For this guide, we’ll just perform Run Task.

Open the Amazon ECS console
Open the neo4j-ec2 cluster we created earlier
Click the Tasks tab, then Run new Task
Set Launch type as EC2
Set Task Definition Family to be neo4j_ec2
For Cluster VPC, select our neo4j-vpc VPC
For Subnets, select Neo4j Private b. This must be the same subnet that you launched the EC2 instance into when creating the cluster.
For Security groups, click Edit then Select existing security group, and choose the neo4j security group
Click Run Task

You should now be returned to the page for your ECS cluster, where the Tasks tab will show one new task with a Last status of PROVISIONING.

This should soon progress to RUNNING status.

From the Tasks tab, if you click on the ID of the task you just ran, then click the Logs tab, you should see that Neo4j has been started successfully.
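
The Run Task settings used above can also be written as a request in the shape of the ECS RunTask API. This is a sketch rather than a drop-in script: the subnet and security group IDs are made-up placeholders, and build_run_task_request is a hypothetical helper name.

```python
def build_run_task_request(subnet_id, security_group_id):
    """Build a RunTask request matching the console choices in this guide."""
    return {
        "cluster": "neo4j-ec2",
        "taskDefinition": "neo4j_ec2",
        "launchType": "EC2",
        "count": 1,
        # awsvpc tasks need a subnet and security group for their network interface.
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": [subnet_id],
                "securityGroups": [security_group_id],
            }
        },
    }


# Placeholder IDs: substitute the IDs of Neo4j Private b and the neo4j security group.
request = build_run_task_request("subnet-0123456789abcdef0", "sg-0123456789abcdef0")

# With boto3 installed and credentials configured, the task could be started with:
#   boto3.client("ecs").run_task(**request)
print(request)
```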

So, now that we have Neo4j running in a Docker container on an EC2 instance in our private subnet, what’s next?

There are two final steps to this guide:

SSH into the Docker container and check the Neo4j logs
Connect a Lambda function to Neo4j and write a Cypher query to create a node

SSH into the Neo4j Docker Container

For this you’ll need a few pieces of information:

EC2 Key Pair file you chose when creating the ECS cluster
EC2 Key Pair file you chose when creating the EC2 bastion instance
IPv4 Public IP of the EC2 bastion instance
Private IP of the ECS-optimized EC2 instance

We follow steps similar to those in this AWS guide.

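Open a new Terminal window on macOS, or a Command Prompt on Windows, then run the following commands (substitute your own values as required). The -K flag on ssh-add is specific to macOS, where it stores the key’s passphrase in the keychain; on other platforms, run ssh-add without it. If the bastion and the ECS instance use different key pairs, add both with ssh-add before connecting: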

ssh-add -K KEY_PAIR.pem
ssh -A ec2-user@BASTION_PUBLIC_IP
ssh ec2-user@ECS_EC2_PRIVATE_IP

Note: When you look at the Private IPs for your ECS EC2 instance, you’ll see two IP addresses. One is for the ECS Task that’s running on it, and one is for the EC2 instance itself. You’ll want to SSH using the one for the EC2 instance, not the one that belongs to the ECS Task.
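
As an alternative to agent forwarding, recent OpenSSH clients (7.3 and later) can hop through the bastion in a single command using the -J (ProxyJump) option. This is not the method used above, just a sketch of another route to the same instance. The Python snippet below only builds and prints the command; the two IP addresses are hypothetical examples, and build_jump_command is our own helper name.

```python
import shlex


def build_jump_command(bastion_ip, private_ip, user="ec2-user"):
    """Build an ssh command that hops through the bastion with -J (ProxyJump)."""
    return ["ssh", "-J", f"{user}@{bastion_ip}", f"{user}@{private_ip}"]


# Hypothetical addresses: a documentation-range public IP for the bastion
# and an address inside the Neo4j Private b subnet for the ECS instance.
command = build_jump_command("203.0.113.25", "10.0.0.52")
print(shlex.join(command))
```

With this approach both keys are used from your local machine, so they still need to be loaded with ssh-add first.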

You should now be on your ECS-optimized EC2 instance. You can confirm the Neo4j Docker container and ECS agent are both active by running: docker ps.

If necessary, we could then open a shell inside the container itself by running docker exec -it [container id] bash.

This allows you to look through all of the Neo4j files in the container, including debug logs.

Following the official Neo4j Docker documentation, it is at this stage that you can also access the Cypher Shell tool.

Run the following command to access the shell:
cypher-shell -u neo4j -p YOUR_PASSWORD

If you now run a Cypher query, such as MATCH (n) RETURN count(n);, you should have the result displayed in the window.

Create a Lambda to Query Neo4j

The Lambda code for this can be found in our earlier article, How to Setup Neo4j on AWS EC2. You just have to make sure that when you specify the bolt address for the uri to connect to Neo4j, you use the IP of your ECS Task.

For example, if your ECS Task IP is 10.0.0.56, set the uri to be: bolt://10.0.0.56:7687.

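If you try to use the IP of your private EC2 instance instead, the connection will fail with an error saying that it Failed to establish connection to IPv4Address. In awsvpc network mode, the container’s ports are exposed on the task’s own network interface rather than on the instance’s.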
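
The sketch below shows one way the URI handling might look, assuming a Python Lambda with the official neo4j driver packaged alongside it; it is not the code from the earlier article. The bolt_uri and create_node helpers, the Person label, and the name property are all hypothetical. The subnet check is only a sanity check: it catches an address from the wrong subnet, but it cannot tell the task IP from the instance IP, since both live in Neo4j Private b.

```python
import ipaddress

PRIVATE_SUBNET = ipaddress.ip_network("10.0.0.48/28")  # Neo4j Private b


def bolt_uri(task_ip, port=7687):
    """Build the Bolt URI for the ECS task, rejecting IPs outside the task's subnet."""
    if ipaddress.ip_address(task_ip) not in PRIVATE_SUBNET:
        raise ValueError(f"{task_ip} is not in {PRIVATE_SUBNET}")
    return f"bolt://{task_ip}:{port}"


def create_node(uri, password, name):
    """Create one node. Needs the neo4j driver, imported lazily so the rest runs without it."""
    from neo4j import GraphDatabase  # third-party package: pip install neo4j

    driver = GraphDatabase.driver(uri, auth=("neo4j", password))
    try:
        with driver.session() as session:
            # Hypothetical label and property, for illustration only.
            session.run("CREATE (p:Person {name: $name})", name=name)
    finally:
        driver.close()


print(bolt_uri("10.0.0.56"))
```

The Lambda function also needs to run inside neo4j-vpc, with a security group that is allowed to reach port 7687, for the connection to succeed.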