This lab is all about executing queries in PrestoDB with Amazon S3 as the back end. We will be executing DDL (Data Definition Language) and DML (Data Manipulation Language) statements. For configuring the Presto cluster, follow Chapter 2.

Scenario 1:
In this scenario, we will create a table that acts as a logical separation for the data, insert some records, define the format in which we want to store the data, and finally query the data.
Create a table:
presto-cli --server localhost:8080 --catalog hive

presto> use default;
USE
presto:default> CREATE TABLE presto_table (
             ->   name varchar,
             ->   age varchar
             -> )
             -> WITH (external_location = 's3a://<s3Bucket_name>/<folder_name>', format = 'CSV');
CREATE TABLE
presto:default> desc presto_table;
 Column |  Type   | Extra | Comment
--------+---------+-------+---------
 name   | varchar |       |
 age    | varchar |       |
(2 rows)

Query 20210414_022835_00009_zk47q, FINISHED, 2 nodes
Splits: 19 total, 19 done (100.00%)
0.35 [2 rows, 162B] [5 rows/s, 460B/s]
Now, let’s insert some data into this table using the INSERT command:
presto> use default;
USE
presto:default> insert into presto_table (name, age) values ('Jack', '25');
INSERT: 1 row
presto:default> insert into presto_table (name, age) values ('Deniel', '25');
INSERT: 1 row
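Each INSERT writes one or more new data files under the table's external_location. If you want to verify this, you can list the S3 prefix; the bucket and folder names below are the same placeholders used in the CREATE TABLE statement:

aws s3 ls s3://<bucket_name>/<folder_name>/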
Let us now fetch these records using the SELECT command:
presto:default> select count(*) from presto_table;
 _col0
-------
     2
(1 row)

Query 20210414_023336_00011_zk47q, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
0.41 [2 rows, 1.03KB] [4 rows/s, 2.52KB/s]
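To see the rows themselves rather than just the count, you can run a plain SELECT. The exact timing statistics and row order will differ on your cluster, but the result should contain the two rows we just inserted:

presto:default> select * from presto_table;
  name  | age
--------+-----
 Jack   | 25
 Deniel | 25
(2 rows)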
This is one way to insert data into S3 using the INSERT command in PrestoDB. However, we can also upload a CSV file directly and query it the same way. Presto supports the Delimited, CSV, RCFile, JSON, SequenceFile, ORC, Avro, and Parquet file formats.
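As a sketch of how another format would look, the same kind of table could be declared over Parquet files instead of CSV simply by changing the format table property. The table name, folder name, and column types below are illustrative (unlike the CSV format, Parquet is not limited to varchar columns):

presto:default> CREATE TABLE presto_table_parquet (
             ->   name varchar,
             ->   age integer
             -> )
             -> WITH (external_location = 's3a://<s3Bucket_name>/<parquet_folder>', format = 'PARQUET');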
Scenario 2:
In this scenario, we will create a table that acts as a logical separation to query data. Here we will first upload a CSV file to S3 that contains some records to query.
Upload a .csv file
You can use the command below to upload your CSV file, or go directly to the AWS console and upload your CSV file(s):
aws s3 cp /path_of_csv_file/presto.csv s3://<bucket_name>/directory_name/
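Since the table declares both columns as varchar and the Hive CSV format treats every line as data (there is typically no header row), a matching presto.csv could look like the following; the values are just illustrative:

Jack,25
Deniel,25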
Create a Table:
presto-cli --server localhost:8080 --catalog hive

presto> use default;
USE
presto:default> CREATE TABLE presto_table_csv (
             ->   name varchar,
             ->   age varchar
             -> )
             -> WITH (external_location = 's3a://<s3Bucket_name>/<folder_name>', format = 'CSV');
CREATE TABLE
presto:default> desc presto_table_csv;
 Column |  Type   | Extra | Comment
--------+---------+-------+---------
 name   | varchar |       |
 age    | varchar |       |
(2 rows)

Query 20210414_022835_00009_zk47q, FINISHED, 2 nodes
Splits: 19 total, 19 done (100.00%)
0.35 [2 rows, 162B] [5 rows/s, 460B/s]
Let's now run the SELECT command on this table:
presto:default> select count(*) from presto_table_csv;
 _col0
-------
     2
(1 row)

Query 20210414_023336_00011_zk47q, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
0.41 [2 rows, 1.03KB] [4 rows/s, 2.52KB/s]
You can run your queries against one CSV file or many; just upload all the CSVs to the path mentioned in the CREATE TABLE command.
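Once the files are in place, any SQL that works on a regular table works here too. For example, a simple filter on the columns we defined (remember that age is stored as varchar, so the comparison is on strings):

presto:default> select name from presto_table_csv where age = '25';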
Note: Your DML queries won't work if you haven't configured the below flag in your hive.properties file:
hive.non-managed-table-writes-enabled=true
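For context, a minimal hive.properties for this setup might look something like the following. The metastore URI and the S3 credential properties are assumptions based on the cluster configured in Chapter 2, so adjust them to your environment:

connector.name=hive-hadoop2
hive.metastore.uri=thrift://localhost:9083
hive.s3.aws-access-key=<your_access_key>
hive.s3.aws-secret-key=<your_secret_key>
hive.non-managed-table-writes-enabled=true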
For more flags that can be used in Presto, you can visit:
https://prestodb.io/docs/current/connector/hive.html#hive-configuration-properties
Hope this was helpful!
See you in next Chapter!
Happy Learning!
Shivani S.