Postgres Configuration (AWS)
For very large or long-running queries in Postgres, processing may take several minutes to start. To expedite this process, you can enable the "Export Data to S3 to Support Larger Data Loads" option. When selected, the query output is sent to S3, from where Qrvey retrieves the data. This option is available when the Host URL points to an AWS RDS Postgres instance located in the same AWS region as Qrvey.
Please note that this method requires additional configuration to enable Postgres to write to a Qrvey S3 bucket.
Note: For additional information, you may also refer to the official AWS documentation on Importing Amazon S3 data into an RDS for PostgreSQL DB instance.
Steps
Many of the steps below are necessary only when the Postgres RDS and Qrvey deployment are on different AWS accounts.
-
Configure Postgres for the export to S3 extension.
- a. Using the Postgres query tool, send the following psql command:
CREATE EXTENSION aws_s3 CASCADE;
- b. Grant access to queries used for monitoring and canceling other queries, replacing
myUser
with the Postgres Username that will be used for Qrvey Postgres Connections:GRANT SELECT ON pg_stat_activity TO myUser;
GRANT SELECT ON pg_terminate_backend TO myUser; - c. Grant access to
myUser
to use theaws_s3
schema andquery_export_to_s3
functions:GRANT USAGE ON SCHEMA aws_s3 TO "myUser";
GRANT EXECUTE ON FUNCTION aws_s3.query_export_to_s3(querytext, s3_info aws_commons._s3_uri_1, optionstext, OUT rows_uploaded bigint, OUT files_uploaded bigint, OUT bytes_uploaded bigint) TO "myUser";
GRANT EXECUTE ON FUNCTION aws_s3.query_export_to_s3(querytext, buckettext, file_pathtext, regiontext, optionstext, OUT rows_uploaded bigint, OUT files_uploaded bigint, OUT bytes_uploaded bigint) TO "myUser";
- a. Using the Postgres query tool, send the following psql command:
-
Create a new Policy.
- a. Open the AWS console.
- b. Switch to the Postgres RDS’s AWS account.
- c. Open AWS Identity and Access Management (IAM).
- d. Click Policies.
- e. Click Create policy.
- f. Select the JSON tab.
- g. Paste the policy below, replacing
QRVEY_DEPLOYMENT
with the Qrvey deployment name:{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:GetObject",
"s3:AbortMultipartUpload",
"s3:ListBucket",
"s3:PutObjectAcl"
],
"Resource": [
"arn:aws:s3:::QRVEY_DEPLOYMENT-dataload-drdatacommons/*",
"arn:aws:s3:::QRVEY_DEPLOYMENT-dataload-drdatacommons"
]
}
]
} - h. Click Next: Tags.
- i. Click Next: Review.
- j. Name the new Policy. For example:
PostgresDataAccessPolicy
.
-
Create a new Role.
- a. Continue in the Postgres AWS account, IAM.
- b. Click Roles.
- c. Click Create Role.
- d. Select AWS account.
- e. Use the filter to find the policy created above.
- f. Check its box.
- g. Click Next.
- h. Enter the Role name. For example:
PostgresDataAccessRole
. - i. Click Create Role.
- j. Note this Role’s ARN string for later use.
-
Add a trust relationship.
- a. Click Roles.
- b. Locate the newly created Role.
- c. Click the Trust Relationships tab.
- d. Click Edit trust relationship.
- e. Paste the trust relationship shown below, replacing
AWS_ACCOUNT_QRVEY
with the AWS account number for the Qrvey deployment:{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::AWS_ACCOUNT_QRVEY:root"
},
"Action": "sts:AssumeRole"
}
]
} - f. Click Update Trust Policy.
-
Set inline policies for the Kubernetes cluster.
- a. Use the Qrvey AWS account.
- b. Open "Amazon Elastic Kubernetes Service".
- c. Click on "Clusters".
- d. Select the Qrvey Kubernetes cluster name.
- e. Go to the "Access" tab.
- f. Go to the "Pod Identity associations" section.
- g. Open the IAM role associated.
- h. Click Add inline policy.
- i. Click the JSON tab.
- j. Paste the policy, replacing
AWS_ACCOUNT_POSTGRES
with the AWS account number for the Postgres cluster:{
"Version": "2012-10-17",
"Statement": {
"Effect": "Allow",
"Action": [
"sts:AssumeRole",
"sts:TagSession"
],
"Resource": [
"arn:aws:iam::AWS_ACCOUNT_POSTGRES:role/PostgresDataAccessRole"
]
}
} - k. Click Review policy.
- l. Name the policy. For example:
PostgresDataAccessAssumeRole
. - m. Click Create Policy. Verify that the new policy has been added.
-
Set policy for commons bucket.
- a. Use the Qrvey AWS account.
- b. Open the AWS S3 service and open the bucket called
QRVEY_DEPLOYMENT-dataload-drdatacommons
. - c. Click the Permissions tab, scroll down to the Bucket policy section, and add the following policy. Replace
POSTGRES_DATA_ACCESS_ROLE_ARN
with the access role ARN created in step #3 and replaceQRVEY_DEPLOYMENT
with the Qrvey deployment name:{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "Statement1",
"Effect": "Allow",
"Principal": {
"AWS": "POSTGRES_DATA_ACCESS_ROLE_ARN"
},
"Action": [
"s3:PutObject",
"s3:GetBucketLocation"
],
"Resource": [
"arn:aws:s3:::QRVEY_DEPLOYMENT-dataload-drdatacommons/*",
"arn:aws:s3:::QRVEY_DEPLOYMENT-dataload-drdatacommons"
]
}
]
}
-
Associate the Postgres cluster with the new Role.
- a. Go to AWS RDS.
- b. Click the desired Postgres cluster.
- c. Click the Connectivity and security tab.
- d. Go to the Manage IAM roles section.
- e. Select the role in the Add IAM roles to this instance combo box.
- f. Select s3Export in the feature combo box.
- g. Click Add Role.
-
Configure a Qrvey Connection.
- a. Open the Qrvey Composer application.
- b. Click Data.
- c. Click Connections.
- d. Create a new Connection for Postgres.
- e. Enter the Host URL, Username, and Password.
- f. Enter the Role ARN created above. (Different accounts only.)
- g. Test the new Connection.
- h. Create a Dataset using the Connection.
- i. Load the Dataset data from Postgres.