How To: Running a cluster in a private VPC

Description

Qubole supports clusters within a VPC with private subnets. 

A VPC with a public subnet is analogous to Scenario 1 in the Amazon VPC architecture docs:
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario1.html

A VPC with private and public subnets is akin to Scenario 2 in the Amazon VPC architecture docs:
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario2.html

Running clusters in a private subnet is fully covered in the Qubole documentation.

This KB article provides a summary of the required steps.

Creating the VPC in AWS (by customer)

  • Provision a VPC with two subnets, one private and one public. The private subnet is where Qubole clusters will be launched. See the attached diagram.
  • Routes in the private subnet: all outbound traffic (0.0.0.0/0) should go to the NAT gateway set up in the VPC.
  • Routes in the public subnet: all outbound traffic (0.0.0.0/0) should go to the internet gateway that the VPC wizard sets up automatically.
  • Amazon also automatically opens routes that allow communication between all hosts in the private and public subnets.
  • We recommend using a NAT gateway instead of a NAT instance set up by the customer, for higher reliability.
  • Create a VPC endpoint that allows direct access to the S3 object store in the VPC's region, and attach it to the private subnet.
  • Create only a Bastion Security Group:
    - Allow inbound SSH access from the Qubole tunnel server in qubole/us-east-1 (ask Qubole Support for the tunnel server IP).
    - Open port 7000 outbound to the private subnet.
    - Allow outbound traffic to everywhere (the default Amazon behaviour).
  • EnableDnsHostnames should be set to yes for the VPC.
  • Bring up the bastion host in the public subnet using the Amazon community image labelled "qubole-pv-bastion-ami", whose SSH configuration (/etc/ssh/sshd_config) has been altered to allow gateway ports (GatewayPorts yes).
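
The routing requirements above can be sanity-checked with a small script. The following is only a sketch, not a Qubole tool: it assumes each route table has already been fetched as a list of dictionaries in the shape returned by the EC2 DescribeRouteTables API (keys such as DestinationCidrBlock, GatewayId and NatGatewayId). The function names and all resource IDs are hypothetical.

```python
def default_route_target(routes):
    """Return 'nat', 'igw', or None for the 0.0.0.0/0 route in a route list."""
    for route in routes:
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue
        if route.get("NatGatewayId"):
            return "nat"
        if (route.get("GatewayId") or "").startswith("igw-"):
            return "igw"
    return None


def check_subnet_routes(private_routes, public_routes):
    """Return a list of problems; an empty list means the layout matches."""
    problems = []
    if default_route_target(private_routes) != "nat":
        problems.append("private subnet: 0.0.0.0/0 does not point to a NAT gateway")
    if default_route_target(public_routes) != "igw":
        problems.append("public subnet: 0.0.0.0/0 does not point to an internet gateway")
    # A gateway VPC endpoint (such as the S3 one) shows up as a route whose
    # GatewayId starts with "vpce-".
    if not any((r.get("GatewayId") or "").startswith("vpce-") for r in private_routes):
        problems.append("private subnet: no VPC endpoint route found")
    return problems


# Hypothetical sample data; every ID below is made up for illustration.
private_routes = [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
    {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-0abc"},
    {"DestinationPrefixListId": "pl-0abc", "GatewayId": "vpce-0abc"},
]
public_routes = [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
    {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0abc"},
]

# An empty list means no problems were found.
print(check_subnet_routes(private_routes, public_routes))
```

The check works on plain data so it can be run offline; how the route tables are retrieved (console export, CLI, or SDK) is left open here.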

Qubole side changes (request to Qubole Support)

  • Make sure Qubole's public SSH key is present on the bastion node. Otherwise, SSH authentication from our tiers will not work.
  • Update the cluster configuration to register the tunnel server used for the bastion server setup.
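
Before raising the request, the two bastion-host requirements mentioned in this article (the Qubole public key and GatewayPorts yes) can be checked locally. This is a minimal sketch under stated assumptions: the file contents are passed in as strings, the helper names are hypothetical, the sample key is a made-up placeholder rather than Qubole's real key, and Match blocks in sshd_config are not interpreted.

```python
def key_is_authorized(authorized_keys_text, public_key):
    """True if the key type and base64 body appear together on a non-comment line."""
    wanted = public_key.split()[:2]  # [key type, base64 body]; trailing comment ignored
    if len(wanted) < 2:
        return False
    for line in authorized_keys_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Options may precede the key type, so look for the pair anywhere on the line.
        for i in range(len(fields) - 1):
            if fields[i:i + 2] == wanted:
                return True
    return False


def gateway_ports_enabled(sshd_config_text):
    """True if the first GatewayPorts directive in the text is set to yes."""
    for line in sshd_config_text.splitlines():
        fields = line.split("#", 1)[0].split()
        # sshd keywords are case-insensitive and the first value found is used.
        if len(fields) >= 2 and fields[0].lower() == "gatewayports":
            return fields[1].lower() == "yes"
    return False


# Hypothetical sample inputs, for illustration only.
sample_key = "ssh-rsa AAAAHYPOTHETICALKEYBODY qubole-placeholder"
sample_authorized_keys = "# keys for the bastion user\nssh-rsa AAAAHYPOTHETICALKEYBODY some-comment\n"
sample_sshd_config = "Port 22\n#GatewayPorts no\nGatewayPorts yes\n"

print(key_is_authorized(sample_authorized_keys, sample_key))
print(gateway_ports_enabled(sample_sshd_config))
```

On a real bastion host, the inputs would typically be the login user's authorized_keys file and /etc/ssh/sshd_config, with the actual public key supplied by Qubole Support.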

 
