Chapter 4. Deploy Presto Cluster in AWS using Marketplace Starburst Presto Image

This lab focuses on deploying a Presto cluster using the Starburst Presto image from AWS Marketplace. You can subscribe to a free trial for the first 15 days, during which you pay only for the servers you use.

Subscribe to Starburst Presto, and we are good to go for deploying our first Presto cluster on AWS.

Prerequisite:

Create an IAM role for creating the Presto cluster, and attach the policy below to it.

https://github.com/kshivani123/presto_template/blob/main/policy_for_presto_role

Steps to follow:

1. Go to the AWS console, open the CloudFormation service, and click on Create stack.

2. Now download the file below:
https://github.com/kshivani123/presto_template/blob/main/persto_cloudformation_template

Upload this file as the template while creating the CloudFormation stack; it defines the Presto cluster.

3. Once the file is uploaded, click on Next.

Select the VPC, subnet, security groups, instance types, number of worker nodes, and number of master (coordinator) nodes as per your requirements.
The remaining settings can be left at their defaults, except "Hive Connector Options", where you need to select the AWS Glue connector.

4. Select the role that we created in the prerequisite section.

Also, create a Name tag here and give the cluster a suitable name.

Now click on Next, check the capabilities acknowledgement box, and then hit the Create stack button.
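If you would rather script steps 2 to 4 than click through the console, the sketch below assembles the arguments for boto3's CloudFormation `create_stack` call. It is only an illustration, not part of the original lab: the helper names `build_create_stack_request` and `create_stack` are hypothetical, the capability value `CAPABILITY_IAM` is an assumption about what the capabilities checkbox corresponds to for this template, and the parameter keys you pass must be copied from the template itself.

```python
from typing import Dict


def build_create_stack_request(
    stack_name: str,
    template_body: str,
    role_arn: str,
    parameters: Dict[str, str],
    capability: str = "CAPABILITY_IAM",  # assumption: the template may need CAPABILITY_NAMED_IAM instead
) -> dict:
    """Return keyword arguments for boto3's CloudFormation create_stack call."""
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        # CloudFormation expects a list of ParameterKey/ParameterValue pairs.
        "Parameters": [
            {"ParameterKey": key, "ParameterValue": value}
            for key, value in sorted(parameters.items())
        ],
        "Capabilities": [capability],
        # The IAM role from the prerequisite section.
        "RoleARN": role_arn,
        # The Name tag from step 4; here it simply reuses the stack name.
        "Tags": [{"Key": "Name", "Value": stack_name}],
    }


def create_stack(request: dict) -> str:
    """Submit the request to AWS. Needs boto3 installed and AWS credentials configured."""
    import boto3  # imported lazily so the builder above works without boto3

    response = boto3.client("cloudformation").create_stack(**request)
    return response["StackId"]
```

The builder is kept separate from the AWS call so that you can print and review the request before anything is created in your account.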

5. Once the stack is created, go to the Auto Scaling groups page. You should now have 2 Auto Scaling groups running: one for the coordinator and the other for the worker nodes.
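The same check can be sketched in code. The example below takes one page of a boto3 `describe_auto_scaling_groups` response and picks out the groups that belong to the stack, using the `aws:cloudformation:stack-name` tag that CloudFormation normally adds to the resources it creates. The function names are hypothetical, and the sketch assumes that tag is present on both groups.

```python
from typing import Dict

# Tag that CloudFormation normally attaches to resources it creates.
STACK_TAG = "aws:cloudformation:stack-name"


def groups_for_stack(response: dict, stack_name: str) -> Dict[str, int]:
    """Map each Auto Scaling group of the stack to its count of InService instances."""
    result: Dict[str, int] = {}
    for group in response.get("AutoScalingGroups", []):
        tags = {tag["Key"]: tag["Value"] for tag in group.get("Tags", [])}
        if tags.get(STACK_TAG) != stack_name:
            continue  # group belongs to some other stack, or to none
        in_service = sum(
            1
            for instance in group.get("Instances", [])
            if instance.get("LifecycleState") == "InService"
        )
        result[group["AutoScalingGroupName"]] = in_service
    return result


def fetch_groups(stack_name: str) -> Dict[str, int]:
    """Query AWS (first page of results only). Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so groups_for_stack works without boto3

    response = boto3.client("autoscaling").describe_auto_scaling_groups()
    return groups_for_stack(response, stack_name)
```

For this lab you would expect two entries in the result, one for the coordinator group and one for the worker group.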

6. Now log in to the coordinator node, open the file
/var/lib/presto/data/etc/catalog/hive.properties, and append the line below at the end.

hive.non-managed-table-writes-enabled=true

Save the file, and we are good to execute queries.
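If you rebuild the cluster often, editing the file by hand gets repetitive. Below is a minimal sketch of a helper that appends the property only when the key is not already set, so it is safe to run more than once. The function name `ensure_property` is hypothetical, and the sketch assumes the file uses plain `key=value` lines with `#` comments.

```python
from pathlib import Path


def ensure_property(path: str, key: str, value: str) -> bool:
    """Append key=value to a properties file unless the key is already set.

    Returns True if the file was changed. An existing setting for the key is
    left untouched, even if its value differs.
    """
    file = Path(path)
    text = file.read_text()
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip comment lines
        if stripped.split("=", 1)[0].strip() == key:
            return False  # key already present, nothing to do
    # Make sure the new entry starts on its own line.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    file.write_text(f"{text}{separator}{key}={value}\n")
    return True
```

On the coordinator you would call it as `ensure_property("/var/lib/presto/data/etc/catalog/hive.properties", "hive.non-managed-table-writes-enabled", "true")`, run with enough privileges to write the file. Presto normally reads catalog files at startup, so a restart of the Presto service may be needed before the change is picked up.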

presto-cli --server localhost:8080 --catalog hive
presto> use default
     -> ;
USE
presto:default> select * from system.runtime.nodes;
       node_id       |            http_uri             | node_version | coordinator | state
---------------------+---------------------------------+--------------+-------------+--------
 i-0fd58357899ce5ce6 | http://<IP_of_Worker>:8080      | 350-e.1      | false       | active
 i-00f57d90fd4161d3c | http://<IP_of_Worker>:8080      | 350-e.1      | false       | active
 i-0fcd54f5259429156 | http://<IP_of_Coordinator>:8080 | 350-e.1      | true        | active
(3 rows)

Query 20210415_104909_00393_zk47q, FINISHED, 2 nodes
Splits: 17 total, 17 done (100.00%)
0.21 [3 rows, 170B] [14 rows/s, 806B/s]
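The node listing above is also a handy health check to automate. The sketch below parses the output of the same query when it is run non-interactively, for example with `presto-cli --server localhost:8080 --execute "select * from system.runtime.nodes" --output-format CSV`, and confirms that there is exactly one coordinator, the expected number of workers, and that every node is active. The function names are hypothetical, and the column order is assumed to match the table shown above.

```python
import csv
import io
from typing import Dict, List

# Column order as it appears in the system.runtime.nodes output above.
COLUMNS = ["node_id", "http_uri", "node_version", "coordinator", "state"]


def parse_nodes(csv_text: str) -> List[Dict[str, str]]:
    """Parse header-less CSV rows from the CLI into one dict per node."""
    reader = csv.reader(io.StringIO(csv_text))
    return [dict(zip(COLUMNS, row)) for row in reader if row]


def cluster_is_healthy(nodes: List[Dict[str, str]], expected_workers: int) -> bool:
    """True when there is one coordinator, the expected workers, and all nodes are active."""
    coordinators = [n for n in nodes if n["coordinator"] == "true"]
    workers = [n for n in nodes if n["coordinator"] == "false"]
    all_active = all(n["state"] == "active" for n in nodes)
    return len(coordinators) == 1 and len(workers) == expected_workers and all_active
```

For the cluster in this lab, `cluster_is_healthy(parse_nodes(output), expected_workers=2)` should return True once both workers have registered with the coordinator.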

The image above gives a high-level overview of your Presto cluster. For a complete list of resources, refer to the Resources section of the CloudFormation stack created for the Presto cluster.

Note: To run more DML and DDL queries in this setup, you can follow Chapter 3.

Hope this was helpful!
See you in the next chapter!
Happy Learning!
Shivani S.
