Accessing a Private Redshift Cluster using an SSH Tunnel through your Cage's Bastion Server

How to securely access a private Redshift cluster from outside Nucleator

 

You may wish to create Redshift clusters without public accessibility to ensure that your private data stays private.  By default, Nucleator limits access to a Redshift cluster to requests that originate inside your Cage.  Redshift cluster login names and passwords alone are often insufficient to protect sensitive data, and placing the cluster inside a VPC provides a desirable additional layer of security.  Even though direct access has been prevented, the cluster can still be reached from within the Cage and, from outside the Cage, through secure tunnels.

Private Redshift Cluster Connectivity from Within Cage

Other instances within your Cage can access such a private Redshift cluster.  You can test this connectivity with the psql command-line tool.

First, log in to an EC2 instance within the Cage (the bastion in this example):

 

ssh -F ~/.nucleator/ssh-config/<customer>/<cage> bastion-<cage>

 

For demonstration purposes, install psql:

sudo yum install postgresql postgresql-contrib


Now you can test a connection to your Redshift cluster:

psql -h {redshift-dns-name} -v schema=public -p 5439 -U {user_name} -d {db_name} -c "SELECT * from information_schema.tables limit 5;"

 

This command should return five rows of table metadata from information_schema.tables.  No user data needs to be added to Redshift to view this information.  If the command reports an error or returns no rows, the connection to the Redshift cluster is not working.
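The pass/fail rule above can be scripted. The following is a minimal sketch, not part of Nucleator: the function name check_redshift and its argument order are hypothetical, and it assumes psql is installed and that the password is supplied in a way psql accepts (for example a prompt or a ~/.pgpass file). It treats a psql error and an empty result as two distinct failures.

```shell
# Hypothetical helper (not part of Nucleator): succeeds only if psql connects
# and the metadata query returns at least one row.
check_redshift() {
  local host="$1" port="$2" user="$3" db="$4"
  local rows
  # -t prints rows only and -A disables column alignment, so each line is one row.
  if ! rows="$(psql -h "$host" -p "$port" -U "$user" -d "$db" -t -A \
      -c "SELECT table_name FROM information_schema.tables LIMIT 5;")"; then
    echo "psql failed: connection or query error" >&2
    return 1
  fi
  if [ -z "$rows" ]; then
    echo "connected, but the query returned no rows" >&2
    return 2
  fi
  printf '%s\n' "$rows"
}
```

From an instance inside the Cage it could be called as check_redshift {redshift-dns-name} 5439 {user_name} {db_name}, with the placeholders replaced by your own values.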

You can use the method above to test connections to Redshift clusters from other instances within the same Cage.  For some Redshift use models this level of connectivity may be all that is needed, but most use models will also require connectivity from outside the Cage (see below).  Either way, you should test the Redshift connection from within the Cage first to confirm that the cluster itself is working.

Private Redshift Cluster Connectivity from the Internet

Most BI (Business Intelligence) tools will not be running inside a customer Cage, and in many cases the additional layer of security provided by a VPC will still be desired.  It is possible to connect to a private Redshift cluster in one of two ways: establishing a VPN (virtual private network) connection to the Cage, or connecting through a secure ssh tunnel.  Nucleator currently provides direct support for connectivity through a secure ssh tunnel.  Each of these methods has its own pros and cons:

Establishing Internet connection to private Redshift cluster using VPN

This approach has not yet been implemented in Nucleator.  In the future, it may be supported through a vpn Stackset that introduces one or more VPN servers to the public subnets of your Cage.

Establishing Internet connection to private Redshift cluster using ssh tunnel

The remote login tool ssh, and its Windows counterparts PuTTY and Git Bash, provide secure login access and also allow port tunneling on top of the login connection.  To establish such a tunnel, an ssh session to the bastion server is required.  The ssh tunnel through the bastion can be established with:

ssh -L 15439:<redshift_dns_name>:5439 -i ~/.nucleator/<customer>-<account>-<region>.pem ec2-user@bastion.<cage>.<customer_domain>

This command assumes that the Redshift cluster is listening on the standard Redshift port, 5439.  If you are using a different port, the port number after <redshift_dns_name> must change to match.  The choice of local port 15439 is completely arbitrary; it only needs to be a port that is not already in use on the local computer.
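To make the three parts of the -L argument explicit, the tunnel command can be wrapped in a small function. This is only a sketch: the name open_redshift_tunnel and its argument order are hypothetical, and the defaults simply mirror the ports used in this document.

```shell
# Hypothetical wrapper around the tunnel command shown above.
open_redshift_tunnel() {
  local redshift_host="$1" bastion_host="$2" key_file="$3"
  local remote_port="${4:-5439}"   # port the Redshift cluster listens on
  local local_port="${5:-15439}"   # any unused port on the local computer
  # -L local_port:remote_host:remote_port forwards the local port,
  # by way of the bastion, to the Redshift cluster.
  ssh -L "${local_port}:${redshift_host}:${remote_port}" \
      -i "$key_file" "ec2-user@${bastion_host}"
}
```

Passing a fourth or fifth argument overrides the Redshift port or the local port respectively; the first three arguments take the same values as the placeholders in the command above.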

It is important to note that this tunnel will remain open as long as the ssh session is active.  If the ssh connection is closed the tunnel will also close.  
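Because the tunnel lives and dies with the ssh process, a script can open it only for the duration of one command. The sketch below is an illustration under several assumptions: the function name run_with_tunnel and the TUNNEL_WAIT_SECONDS variable are hypothetical, the key must allow login without an interactive prompt (ssh runs in the background here), and the fixed wait is a crude stand-in for checking that the tunnel is actually up. The -N option is a standard OpenSSH flag that forwards ports without starting a remote shell.

```shell
# Hypothetical helper: opens a tunnel, runs one command through it, then closes it.
run_with_tunnel() {
  local forward="$1" key_file="$2" target="$3"
  shift 3
  # -N asks ssh to forward ports only, without starting a remote shell.
  ssh -N -L "$forward" -i "$key_file" "$target" &
  local tunnel_pid=$!
  # Crude fixed wait for the tunnel to come up (an assumption; tune as needed).
  sleep "${TUNNEL_WAIT_SECONDS:-2}"
  local status=0
  "$@" || status=$?
  # Ending the ssh process closes the tunnel.
  kill "$tunnel_pid" 2>/dev/null || true
  wait "$tunnel_pid" 2>/dev/null || true
  return "$status"
}
```

The first three arguments are the forwarding spec (for example 15439:{redshift-dns-name}:5439), the key file, and the user@host of the bastion; everything after them is the command to run while the tunnel is open, such as a psql invocation against localhost.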

To connect a BI tool running outside the Cage to the Redshift cluster inside the Cage, point the tool at the local port (15439 in this example) on the computer that established the ssh connection.  You can test the tunnel by running psql against that local port:

psql -h localhost -v schema=public -p 15439 -U {user_name} -d {db_name} -c "SELECT * from information_schema.tables limit 5;"

Again, this command should produce table descriptions for the first 5 informational tables.

In the same way, other BI tools can be pointed at port 15439 on localhost to connect to the private Redshift cluster.
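How a BI tool is pointed at the tunnel depends on the tool. As a rough sketch, assuming the tool accepts either a PostgreSQL-style connection URI or a JDBC URL for the Amazon Redshift driver, the two strings would look like the output of the hypothetical helper below; mydb is a placeholder database name.

```shell
# Hypothetical helper: prints connection strings that target the local end of the tunnel.
tunnel_connection_strings() {
  local db_name="$1"
  local local_port="${2:-15439}"   # must match the local port given to ssh -L
  # URI form understood by psql and other libpq-based clients.
  printf 'postgresql://localhost:%s/%s\n' "$local_port" "$db_name"
  # JDBC URL form used by the Amazon Redshift JDBC driver.
  printf 'jdbc:redshift://localhost:%s/%s\n' "$local_port" "$db_name"
}

tunnel_connection_strings mydb
```

In both cases the host is localhost and the port is the local tunnel port, not the Redshift endpoint and port 5439.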

 

