Connecting AWS Lambda to Redshift - Times out after 60 seconds

I created an AWS Lambda function that:

  • logs onto Redshift via JDBC URL
  • runs a query

Locally, using Node, I can successfully connect to the Redshift instance via JDBC, and execute a query.

var pg = require('pg');

// USER_NAME, PASSWORD and JDBC_URL are placeholders.
var conString = "postgresql://USER_NAME:PASSWORD@JDBC_URL";
var client = new pg.Client(conString);
client.connect(function(err) {
  if (err) {
    console.log('could not connect to redshift', err);
    return;
  }
  // omitted due to above error
});

However, when I execute the function on AWS Lambda (where it's wrapped in an async#waterfall block), the AWS CloudWatch logs tell me that the Lambda function timed out after 60 seconds.

Any ideas on why my function is not able to connect?

3 Answers

    Best answer
  1. As far as I can tell, you either open your Redshift security group to all sources or to none. A Lambda function doesn't run on a fixed address, or even a fixed range of IP addresses; where it runs is hidden from users (that is what "serverless" means), so there is no specific source to whitelist.

    I just saw that Amazon announced VPC support for Lambda yesterday. I guess that if the Redshift cluster runs in a VPC, this could solve the problem.
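Until the network path is fixed, a blocked connection attempt simply hangs until Lambda kills the function. Below is a minimal sketch of one way to surface that as an explicit error instead. It assumes a driver whose `connect()`, `query()` and `end()` return promises; the names `buildRedshiftConfig`, `withTimeout`, `makeHandler` and the `REDSHIFT_*` environment variables are hypothetical and not part of the original question.

```javascript
// Hypothetical helper: builds a client config from environment variables.
// The variable names (REDSHIFT_HOST, etc.) are assumptions for this sketch.
function buildRedshiftConfig(env) {
  return {
    host: env.REDSHIFT_HOST,
    port: Number(env.REDSHIFT_PORT || 5439), // 5439 is Redshift's default port
    database: env.REDSHIFT_DB,
    user: env.REDSHIFT_USER,
    password: env.REDSHIFT_PASSWORD,
  };
}

// Rejects if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise(function (resolve, reject) {
    timer = setTimeout(function () {
      reject(new Error(label + ' timed out after ' + ms + ' ms'));
    }, ms);
  });
  return Promise.race([promise, timeout]).finally(function () {
    clearTimeout(timer);
  });
}

// Returns a Lambda-style async handler. `createClient` is injected, for
// example `function (cfg) { return new pg.Client(cfg); }`, so the sketch
// does not depend on a particular driver version.
function makeHandler(createClient) {
  return async function handler(event) {
    const client = createClient(buildRedshiftConfig(process.env));
    try {
      await withTimeout(client.connect(), 5000, 'Redshift connect');
      const result = await withTimeout(client.query('SELECT 1'), 10000, 'Redshift query');
      return result.rows;
    } finally {
      // Best-effort cleanup; ignore errors from a connection that never opened.
      await withTimeout(Promise.resolve(client.end()), 1000, 'Redshift end')
        .catch(function () {});
    }
  };
}
```

With this in place, a security group or VPC problem would show up in the CloudWatch logs as "Redshift connect timed out" after a few seconds rather than as a generic 60-second function timeout. The timeout values are arbitrary.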

  2. Answer 2
  3. If you are using serverless-framework v1.5.0, you should add:

    iamRoleStatements:
      - Effect: Allow
        Action:
          - ec2:CreateNetworkInterface
        Resource: '*'
      - Effect: Allow
        Action:
          - ec2:DeleteNetworkInterface
          - ec2:DescribeNetworkInterfaces
        Resource: 'arn:aws:ec2:${self:provider.region}:*:network-interface/*'

    You should also add all of the securityGroupIds to the inbound rules.

    More info: https://serverless.com/framework/docs/providers/aws/guide/functions/#vpc-configuration

  4. Answer 3
  5. Taking a step back, I would recommend using Kinesis Firehose[1] to connect Lambda and Redshift. This is the better approach, as suggested in the docs[2].

    Firehose can use S3 as intermediate storage and then load the data into Redshift automatically with the COPY command.

    "A COPY command is the most efficient way to load a table. You can also add data to your tables using INSERT commands, though it is much less efficient than using COPY"
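Firehose issues the COPY for you, but to illustrate what such a statement looks like compared with row-by-row INSERTs, here is a rough sketch of a helper that builds one for JSON files in S3. The function name `buildCopyStatement` and the table, bucket and role values in the usage example are hypothetical; which format options fit depends on how the data is actually stored.

```javascript
// Hypothetical helper: builds a Redshift COPY statement that loads JSON
// objects from S3. Table name, S3 URI and IAM role ARN are caller-supplied.
function buildCopyStatement(table, s3Uri, iamRoleArn) {
  // Allow only plain (optionally schema-qualified) identifiers as table names.
  if (!/^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$/.test(table)) {
    throw new Error('invalid table name: ' + table);
  }
  if (String(s3Uri).indexOf('s3://') !== 0) {
    throw new Error('expected an s3:// URI, got: ' + s3Uri);
  }
  // Escape single quotes by doubling them inside SQL string literals.
  function quote(value) {
    return "'" + String(value).replace(/'/g, "''") + "'";
  }
  return (
    'COPY ' + table +
    ' FROM ' + quote(s3Uri) +
    ' IAM_ROLE ' + quote(iamRoleArn) +
    " FORMAT AS JSON 'auto';"
  );
}

// Usage with made-up values:
var sql = buildCopyStatement(
  'events',
  's3://my-bucket/prefix/',
  'arn:aws:iam::123456789012:role/MyRedshiftRole'
);
console.log(sql);
```

A single COPY like this loads every file under the prefix in one statement, which is the efficiency the quoted documentation refers to.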

    Footnotes:

    [1] http://docs.aws.amazon.com/firehose/latest/dev/what-is-this-service.html

    [2] http://docs.aws.amazon.com/redshift/latest/dg/t_Loading_data.html