Encountered this bug trying to integrate Tableau with Amazon Redshift the other day, figured I should note it down somewhere…
Root cause:
Basically stated here:
If your client application uses an ODBC connection and your query creates a result set that is too large to fit in memory, you can stream the result set to your client application by using a cursor. When you use a cursor, the entire result set is materialized on the leader node, and then your client can fetch the results incrementally.
So, the bigger your Result Set Size per Cursor
is, the more time you save retrieving the query result. However, the smaller your cluster is, the lower Result Set Size
, lower number of concurrent query it can support. It’s basically a trade-off between speed and convenience. Different Redshift cluster type has different constraints on Result Set
, as described here:
Node Type Max RS per Cluster Max RS per Cursor Concurrent Cursors DW1 XL single node 64000 32000 2 DW1 XL multiple nodes 1800000 450000 4 DW1 8XL multiple nodes 14400000 960000 15 DW2 Large single node 16000 16000 1 DW2 Large multiple nodes 384000 192000 2 DW2 8XL multiple nodes 3000000 750000 4
Solution:
- Go to your Redshift cluster setting and edit your
Parameter Group
(You will need to create a new one if you haven’t done so before):
- Update
max_cursor_result_set_size
to a smaller number depending on your usage and cluster size:
- Modify your cluster - change to newly created
Parameter Group
and reboot: