CloudFormation AutoScalingGroup does not wait for a signal when updating / scaling

I am working with a CloudFormation template that launches as many instances as I request, and I want the stack creation / update to wait until their initialization (via user data) is complete before it is considered finished.

Expectation

Creating or updating a stack should wait for signals from all newly created instances to ensure that their initialization is complete.

I do not want the creation or updating of the stack to be considered successful if any of the created instances are not initialized.

Reality

CloudFormation only seems to wait for signals from instances when the stack is first created. Updating the stack to increase the number of instances seems to ignore signaling. The update operation succeeds very quickly, while the instances are still initializing.

Instances created as a result of updating the stack may not be initialized, but the update action is already considered successful.

Question

Using CloudFormation, how can I make reality live up to expectations?

I want the same behavior when the stack is updated as when it is created.

Related questions

I found only one question that matches my problem: UpdatePolicy in Autoscaling group does not work correctly for updating CloudFormation AWS

It has been open for a year and has not received an answer.

I am creating a separate question because I have more information to add, and I am not sure that my details match those of the asker in that question.

Reproducing

To demonstrate the problem, I created a template based on the example under the Auto Scaling Group heading on this AWS documentation page that includes signaling.

The created template was adapted as follows:

  • It uses an Ubuntu AMI (in the ap-northeast-1 region). The cfn-signal command is bootstrapped and called as needed to account for this change.
  • A new parameter determines how many instances are launched in the Auto Scaling group.
  • A sleep of 2 minutes was added before signaling to simulate the time spent on initialization.

Here is the template, saved in template.yml:

    Parameters:
      DesiredCapacity:
        Type: Number
        Description: How many instances would you like in the Auto Scaling Group?
    Resources:
      AutoScalingGroup:
        Type: AWS::AutoScaling::AutoScalingGroup
        Properties:
          AvailabilityZones: !GetAZs ''
          LaunchConfigurationName: !Ref LaunchConfig
          MinSize: !Ref DesiredCapacity
          MaxSize: !Ref DesiredCapacity
        CreationPolicy:
          ResourceSignal:
            Count: !Ref DesiredCapacity
            Timeout: PT5M
        UpdatePolicy:
          AutoScalingScheduledAction:
            IgnoreUnmodifiedGroupSizeProperties: true
          AutoScalingRollingUpdate:
            MinInstancesInService: 1
            MaxBatchSize: 2
            PauseTime: PT5M
            WaitOnResourceSignals: true
      LaunchConfig:
        Type: AWS::AutoScaling::LaunchConfiguration
        Properties:
          ImageId: ami-b7d829d6
          InstanceType: t2.micro
          UserData:
            'Fn::Base64': !Sub |
              #!/bin/bash -xe
              sleep 120
              apt-get -y install python-setuptools
              TMP=`mktemp -d`
              curl https://s3.amazonaws.com/cloudformation-examples/aws-cfn-bootstrap-latest.tar.gz | \
                tar xz -C $TMP --strip-components 1
              easy_install $TMP
              /usr/local/bin/cfn-signal -e $? \
                --stack ${AWS::StackName} \
                --resource AutoScalingGroup \
                --region ${AWS::Region}

Now I create a single-instance stack using:

    $ aws cloudformation create-stack \
        --region=ap-northeast-1 \
        --stack-name=asg-test \
        --template-body=file://template.yml \
        --parameters ParameterKey=DesiredCapacity,ParameterValue=1

After waiting a few minutes for creation to complete, let's look at a few key stack events:

    $ aws cloudformation describe-stack-events \
        --region=ap-northeast-1 \
        --stack-name=asg-test

    ...
    {
        "Timestamp": "2017-02-03T05:36:45.445Z",
        ...
        "LogicalResourceId": "AutoScalingGroup",
        ...
        "ResourceStatus": "CREATE_COMPLETE",
        ...
    },
    {
        "Timestamp": "2017-02-03T05:36:42.487Z",
        ...
        "LogicalResourceId": "AutoScalingGroup",
        ...
        "ResourceStatusReason": "Received SUCCESS signal with UniqueId ...",
        "ResourceStatus": "CREATE_IN_PROGRESS"
    },
    {
        "Timestamp": "2017-02-03T05:33:33.274Z",
        ...
        "LogicalResourceId": "AutoScalingGroup",
        ...
        "ResourceStatusReason": "Resource creation Initiated",
        "ResourceStatus": "CREATE_IN_PROGRESS",
        ...
    }
    ...

You can see that the Auto Scaling group started being created at 05:33:33. At 05:36:42 (about 3 minutes after initiation) it received a SUCCESS signal. This allowed the Auto Scaling group to reach CREATE_COMPLETE only a few seconds later, at 05:36:45.

It's awesome - it works like a charm.

Now try increasing the number of instances in this auto-scaling group to 2 by updating the stack:

    $ aws cloudformation update-stack \
        --region=ap-northeast-1 \
        --stack-name=asg-test \
        --template-body=file://template.yml \
        --parameters ParameterKey=DesiredCapacity,ParameterValue=2

After a much shorter wait for the update to complete, let's look at some of the new stack events:

    $ aws cloudformation describe-stack-events \
        --region=ap-northeast-1 \
        --stack-name=asg-test

    {
        "ResourceStatus": "UPDATE_COMPLETE",
        ...
        "ResourceType": "AWS::CloudFormation::Stack",
        ...
        "Timestamp": "2017-02-03T05:45:47.063Z"
    },
    ...
    {
        "ResourceStatus": "UPDATE_COMPLETE",
        ...
        "LogicalResourceId": "AutoScalingGroup",
        "Timestamp": "2017-02-03T05:45:43.047Z"
    },
    {
        "ResourceStatus": "UPDATE_IN_PROGRESS",
        ...,
        "LogicalResourceId": "AutoScalingGroup",
        "Timestamp": "2017-02-03T05:44:20.845Z"
    },
    {
        "ResourceStatus": "UPDATE_IN_PROGRESS",
        ...
        "ResourceType": "AWS::CloudFormation::Stack",
        ...
        "Timestamp": "2017-02-03T05:44:15.671Z",
        "ResourceStatusReason": "User Initiated"
    },
    ....

Now you can see that the Auto Scaling group started updating at 05:44:20 and finished at 05:45:43. That is less than a minute and a half from start to completion, which should not be possible given the 120-second sleep in the user data.
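As a quick check on the arithmetic, the elapsed times can be recomputed from the AutoScalingGroup event timestamps above. This is a minimal sketch in POSIX shell; the helper names to_seconds and elapsed are made up for illustration, and fractional seconds are dropped.

```sh
# Convert an HH:MM:SS time of day to seconds since midnight.
# A leading zero is stripped from each field so that values such as "08"
# are not rejected as invalid octal numbers.
to_seconds() {
    IFS=: read -r h m s <<EOF
$1
EOF
    echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

# Print the number of seconds between two times on the same day.
# $1 = start time, $2 = end time
elapsed() {
    echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

elapsed 05:33:33 05:36:42   # create: initiated -> SUCCESS signal
elapsed 05:44:20 05:45:43   # update: UPDATE_IN_PROGRESS -> UPDATE_COMPLETE
```

The two calls print 189 and 83. The create path took longer than the 120-second sleep, as expected, while the update finished in less time than the sleep alone.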

The stack update then ends without the auto-scaling group receiving any signals.

A new instance does exist.

In my real use case, I SSHed into one of these new instances and found that it was still initializing even after the stack update had completed.

What I tried

I read and re-read the documentation on CreationPolicy and UpdatePolicy, but could not determine what I was missing.

Looking at the update policy above, I don't understand what it actually does. Why does the update not wait when WaitOnResourceSignals is true? Does the property serve some other purpose?

Or are these new instances not covered by the rolling update policy? If they are not, I would expect them to fall under the creation policy, but that does not seem to be the case either.

So I really don't know what else to try.

I have a sneaking feeling that this is functioning as designed / expected, but if so, what is the point of the WaitOnResourceSignals property, and how can I meet the expectation set out above?

2 answers

The AutoScalingRollingUpdate policy handles rotating out the entire set of instances in an Auto Scaling group in response to changes to the underlying LaunchConfiguration. It does not apply to individual changes to the number of instances in an existing group. According to the UpdatePolicy Attribute documentation:

The AutoScalingReplacingUpdate and AutoScalingRollingUpdate policies apply only when you do one or more of the following:

  • Change the Auto Scaling group's AWS::AutoScaling::LaunchConfiguration.
  • Change the Auto Scaling group's VPCZoneIdentifier property.
  • Update an Auto Scaling group that contains instances that don't match the current LaunchConfiguration.

A change to the Auto Scaling group's DesiredCapacity is not in this list, so the AutoScalingRollingUpdate policy does not apply to this type of change.

As far as I know, it is not possible (using standard AWS CloudFormation resources) to delay the completion of a stack update that modifies DesiredCapacity until all new instances added to the Auto Scaling group are fully provisioned.

Here are some alternatives:

  • Instead of modifying only DesiredCapacity, modify a property of your LaunchConfiguration at the same time. This will trigger an AutoScalingRollingUpdate to the desired capacity (the downside is that it will also update existing instances, which may not actually need to be modified).
  • Add an AWS::AutoScaling::LifecycleHook resource to your Auto Scaling group and call aws autoscaling complete-lifecycle-action in addition to cfn-signal to signal lifecycle completion. This will not delay the CloudFormation stack update as desired, but it will keep individual auto-scaled instances from entering the InService state until the lifecycle signal is received. (For more information, see the Lifecycle Hooks documentation.)
  • As an extension of the second option, it should be possible to add a Lifecycle Hook to your Auto Scaling group together with a Custom Resource that polls the Auto Scaling group and completes only when the group contains DesiredCapacity instances, all in the InService state.
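The second and third options could be sketched roughly as follows. This is only a sketch under stated assumptions: the function names and their arguments are hypothetical, an AWS::AutoScaling::LifecycleHook for instance launch is assumed to exist on the group already, and the polling loop runs client-side after the stack update rather than inside a Custom Resource, so it approximates the idea without delaying the stack update itself.

```sh
# Tell Auto Scaling that this instance has finished (or failed) initializing.
# Intended to be called at the end of user data, next to cfn-signal.
# $1 = lifecycle hook name, $2 = Auto Scaling group name,
# $3 = instance ID, $4 = region, $5 = CONTINUE or ABANDON
complete_launch_hook() {
    aws autoscaling complete-lifecycle-action \
        --region "$4" \
        --lifecycle-hook-name "$1" \
        --auto-scaling-group-name "$2" \
        --instance-id "$3" \
        --lifecycle-action-result "$5"
}

# Print how many instances in the group are currently InService.
# $1 = Auto Scaling group name, $2 = region
count_in_service() {
    aws autoscaling describe-auto-scaling-groups \
        --region "$2" \
        --auto-scaling-group-names "$1" \
        --query "length(AutoScalingGroups[0].Instances[?LifecycleState=='InService'])" \
        --output text
}

# Poll until at least $3 instances are InService, or give up.
# $1 = group name, $2 = region, $3 = expected count,
# $4 = maximum attempts, $5 = seconds between attempts
wait_for_in_service() {
    attempt=1
    while [ "$attempt" -le "$4" ]; do
        if [ "$(count_in_service "$1" "$2")" -ge "$3" ]; then
            return 0
        fi
        sleep "$5"
        attempt=$((attempt + 1))
    done
    return 1
}
```

For example, running wait_for_in_service with the group's physical name, ap-northeast-1, an expected count of 2, 30 attempts, and a 10-second delay right after update-stack would wait up to about five minutes, and a non-zero exit status could then be treated as a failed rollout.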

The rolling update only applies to existing instances. The documentation states:

Rolling updates enable you to specify whether AWS CloudFormation updates instances that are in an Auto Scaling group in batches or all at once.

So, to verify this, create a stack based on your template. Then make a small modification to the launch configuration (for example, change sleep 120 to sleep 121) and update the stack. You should now see a rolling update.
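That experiment might look like the sketch below. It assumes the template.yml file and the asg-test stack from the question; the function names and the template-v2.yml file name are made up for illustration, and UsePreviousValue=true keeps the existing DesiredCapacity so that only the launch configuration changes.

```sh
# Write a copy of the template with the user-data sleep changed from 120 to 121.
# This changes the LaunchConfiguration, which is what the rolling update reacts to.
# $1 = source template, $2 = output template
bump_sleep() {
    sed 's/sleep 120/sleep 121/' "$1" > "$2"
}

# Update the stack with the modified template and block until the update ends.
# $1 = template file
update_and_wait() {
    aws cloudformation update-stack \
        --region=ap-northeast-1 \
        --stack-name=asg-test \
        --template-body="file://$1" \
        --parameters ParameterKey=DesiredCapacity,UsePreviousValue=true &&
    aws cloudformation wait stack-update-complete \
        --region=ap-northeast-1 \
        --stack-name=asg-test
}
```

Running bump_sleep template.yml template-v2.yml followed by update_and_wait template-v2.yml, and then describe-stack-events as in the question, should show whether the Auto Scaling group now waits for signals during the update.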


Source: https://habr.com/ru/post/1263767/

