Redshift setup
profiles.yml
file is for dbt Core users onlyIf you're using dbt Cloud, you don't need to create a profiles.yml
file. This file is only for dbt Core users. To connect your data platform to dbt Cloud, refer to About data platforms.
- Maintained by: dbt Labs
- Authors: core dbt maintainers
- GitHub repo: dbt-labs/dbt-redshift
- PyPI package:
dbt-redshift
- Slack channel: #db-redshift
- Supported dbt Core version: v0.10.0 and newer
- dbt Cloud support: Supported
- Minimum data platform version: n/a
Installing dbt-redshift
Use pip
to install the adapter. Before 1.8, installing the adapter would automatically install dbt-core
and any additional dependencies. Beginning in 1.8, installing an adapter does not automatically install dbt-core
. This is because adapters and dbt Core versions have been decoupled from each other so we no longer want to overwrite existing dbt-core installations.
Use the following command for installation:
python -m pip install dbt-core dbt-redshift
Configuring dbt-redshift
For Redshift-specific configuration, please refer to Redshift configs.
Configurations
Profile field | Example | Description |
---|---|---|
type | redshift | The type of data warehouse you are connecting to |
host | hostname.region.redshift.amazonaws.com or workgroup.account.region.redshift-serverless.amazonaws.com | Host of cluster |
port | 5439 | |
dbname | my_db | Database name |
schema | my_schema | Schema name |
connect_timeout | None or 30 | Number of seconds before connection times out |
sslmode | prefer | optional, set the sslmode to connect to the database. Default prefer, which will use 'verify-ca' to connect. For more information on sslmode , see Redshift note below |
role | None | Optional, user identifier of the current session |
autocreate | false | Optional, default false. Creates user if they do not exist |
db_groups | ['ANALYSTS'] | Optional. A list of existing database group names that the DbUser joins for the current session |
ra3_node | true | Optional, default False. Enables cross-database sources |
autocommit | true | Optional, default True. Enables autocommit after each statement |
retries | 1 | Number of retries |
Authentication Parameters
The authentication methods that dbt Core supports on Redshift are:
Database
— Password-based authentication (default, will be used ifmethod
is not provided)IAM User
— IAM User authentication via AWS Profile
Click on one of these authentication methods for further details on how to configure your connection profile. Each tab also includes an example profiles.yml
configuration file for you to review.
- Database
- IAM User via AWS Profile (Core)
The following table contains the parameters for the database (password-based) connection method.
Profile field | Example | Description |
---|---|---|
method | database | Leave this parameter unconfigured, or set this to database |
user | username | Account username to log into your cluster |
password | password1 | Password for authentication |
Example profiles.yml for database authentication
company-name:
target: dev
outputs:
dev:
type: redshift
host: hostname.region.redshift.amazonaws.com
user: username
password: password1
dbname: analytics
schema: analytics
port: 5439
# Optional Redshift configs:
sslmode: prefer
role: None
ra3_node: true
autocommit: true
threads: 4
connect_timeout: None
The following table lists the authentication parameters to use IAM authentication.
To set up a Redshift profile using IAM Authentication, set the method
parameter to iam
as shown below. Note that a password is not required when using IAM Authentication. For more information on this type of authentication, consult the Redshift Documentation and boto3 docs on generating user credentials with IAM Auth.
If you receive the "You must specify a region" error when using IAM Authentication, then your aws credentials are likely misconfigured. Try running aws configure
to set up AWS access keys, and pick a default region. If you have any questions, please refer to the official AWS documentation on Configuration and credential file settings.
Profile field | Example | Description |
---|---|---|
method | IAM | use IAM to authenticate via IAM User authentication |
iam_profile | analyst | dbt will use the specified profile from your ~/.aws/config file |
cluster_id | CLUSTER_ID | Required for IAM authentication only for provisoned cluster, not for Serverless |
user | username | User querying the database, ignored for Serverless (but field still required) |
region | us-east-1 | Region of your Redshift instance |
Example profiles.yml for IAM
my-redshift-db:
target: dev
outputs:
dev:
type: redshift
method: iam
cluster_id: CLUSTER_ID
host: hostname.region.redshift.amazonaws.com
user: alice
iam_profile: analyst
region: us-east-1
dbname: analytics
schema: analytics
port: 5439
# Optional Redshift configs:
threads: 4
connect_timeout: None
retries: 1
role: None
sslmode: prefer
ra3_node: true
autocommit: true
autocreate: true
db_groups: ['ANALYSTS']
Specifying an IAM Profile
When the iam_profile
configuration is set, dbt will use the specified profile from your ~/.aws/config
file instead of using the profile name default
Redshift notes
sslmode
change
Before dbt-redshift 1.5, psycopg2
was used as the driver. psycopg2
accepts disable
, prefer
, allow
, require
, verify-ca
, verify-full
as valid inputs of sslmode
, and does not have an ssl
parameter, as indicated in PostgreSQL doc.
In dbt-redshift 1.5, we switched to using redshift_connector
, which accepts verify-ca
, and verify-full
as valid sslmode
inputs, and has a ssl
parameter of True
or False
, according to redshift doc.
For backward compatibility, dbt-redshift now supports valid inputs for sslmode
in psycopg2
. We've added conversion logic mapping each of psycopg2
's accepted sslmode
values to the corresponding ssl
and sslmode
parameters in redshift_connector
.
The table below details accepted sslmode
parameters and how the connection will be made according to each option:
sslmode parameter | Expected behavior in dbt-redshift | Actions behind the scenes |
---|---|---|
disable | Connection will be made without using ssl | Set ssl = False |
allow | Connection will be made using verify-ca | Set ssl = True & sslmode = verify-ca |
prefer | Connection will be made using verify-ca | Set ssl = True & sslmode = verify-ca |
require | Connection will be made using verify-ca | Set ssl = True & sslmode = verify-ca |
verify-ca | Connection will be made using verify-ca | Set ssl = True & sslmode = verify-ca |
verify-full | Connection will be made using verify-full | Set ssl = True & sslmode = verify-full |
When a connection is made using verify-ca
, will look for the CA certificate in ~/redshift-ca-bundle.crt
.
For more details on sslmode changes, our design choices, and reasoning — please refer to the PR pertaining to this change.
autocommit
parameter
The autocommit mode is useful to execute commands that run outside a transaction. Connection objects used in Python must have autocommit = True
to run operations such as CREATE DATABASE
, and VACUUM
. autocommit
is off by default in redshift_connector
, but we've changed this default to True
to ensure certain macros run successfully in your dbt project.
If desired, you can define a separate target with autocommit=True
as such:
profile-to-my-RS-target:
target: dev
outputs:
dev:
type: redshift
...
autocommit: False
profile-to-my-RS-target-with-autocommit-enabled:
target: dev
outputs:
dev:
type: redshift
...
autocommit: True
To run certain macros with autocommit, load the profile with autocommit using the --profile
flag. For more context, please refer to this PR.
Deprecated profile
parameters in 1.5
-
iam_duration_seconds
-
keepalives_idle
sort
and dist
keys
Where possible, dbt enables the use of sort
and dist
keys. See the section on Redshift specific configurations.
retries
If dbt-redshift
encounters an operational error or timeout when opening a new connection, it will retry up to the number of times configured by retries
. If set to 2+ retries, dbt will wait 1 second before retrying. The default value is 1 retry. If set to 0, dbt will not retry at all.