AWS Best Practices: Part 4

In this post, we will cover Auto Scaling, CloudWatch, and Route53 best practices. You'll learn about scaling down on INSUFFICIENT_DATA, using ELB health checks for better instance management, and configuring Availability Zones so the load balancer can see every instance. We'll also highlight the importance of avoiding multiple scaling triggers in the same group and of using free and custom CloudWatch metrics for effective monitoring. Finally, we'll discuss the benefits of using ALIAS records in Route53 to save costs and efficiently map domains to AWS resources.

Auto Scaling

Scale Down : Scale down on INSUFFICIENT_DATA as well as ALARM.

For your scale-down action, make sure to trigger a scale-down event when there’s no metric data, as well as when your trigger goes off. For example, if you have an app which usually has very low traffic, but experiences occasional spikes, you want to be sure that it scales down once the spike is over and the traffic stops. If there’s no traffic at all, the alarm on your low traffic threshold goes to INSUFFICIENT_DATA instead of ALARM, so an action attached only to ALARM will never trigger the scale-down.
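
As a rough sketch of this tip, the function below builds alarm parameters that attach the same scale-down policy to both the ALARM and INSUFFICIENT_DATA states. The keys follow the parameter names of the CloudWatch PutMetricAlarm API; the function name, the choice of the ELB RequestCount metric, the threshold, and the policy ARN are assumptions made for illustration.

```python
def scale_down_alarm(load_balancer_name, policy_arn, threshold=100.0):
    """Build PutMetricAlarm parameters for a low-traffic scale-down alarm.

    The metric, threshold and periods are illustrative assumptions.
    """
    return {
        "AlarmName": f"{load_balancer_name}-low-traffic",
        "Namespace": "AWS/ELB",
        "MetricName": "RequestCount",
        "Dimensions": [{"Name": "LoadBalancerName", "Value": load_balancer_name}],
        "Statistic": "Sum",
        "Period": 300,
        "EvaluationPeriods": 2,
        "Threshold": threshold,
        "ComparisonOperator": "LessThanThreshold",
        # Scale down when traffic is below the threshold...
        "AlarmActions": [policy_arn],
        # ...and also when there is no traffic data at all.
        "InsufficientDataActions": [policy_arn],
    }


# Hypothetical placeholder ARN, not a real policy.
params = scale_down_alarm("my-elb", "arn:aws:autoscaling:example-scale-down-policy")
print(params["InsufficientDataActions"])
```

If boto3 is installed and credentials are configured, a dictionary like this could be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`.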

ELB Health Checks : Use ELB health checks instead of EC2 health checks.

This is a configuration option when creating your scaling group: you can specify whether to use the standard EC2 checks (is the instance connected to the network), or to use your ELB health check. The ELB health check offers far more flexibility. If your health check fails and the instance gets taken out of the load balancing pool, you’re pretty much always going to want auto-scaling to kill that instance and launch a fresh one in its place. If you don’t set up your scaling group to use the ELB checks, then that won’t necessarily happen. The AWS documentation on adding the health check has all the information you need to set this up.
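
A minimal sketch of what that option looks like in practice: the dictionary below uses the parameter names of the Auto Scaling CreateAutoScalingGroup API, with HealthCheckType set to "ELB". The function name, group sizes, grace period, and resource names are hypothetical values chosen for the example.

```python
def asg_params_with_elb_health_check(group_name, launch_config, elb_name,
                                     zones, grace_seconds=300):
    """Build CreateAutoScalingGroup parameters that use the ELB health check."""
    return {
        "AutoScalingGroupName": group_name,
        "LaunchConfigurationName": launch_config,
        "MinSize": 1,   # illustrative sizes, adjust for your workload
        "MaxSize": 4,
        "AvailabilityZones": list(zones),
        "LoadBalancerNames": [elb_name],
        # "ELB" makes Auto Scaling replace instances that fail the
        # load balancer's health check; the default is "EC2".
        "HealthCheckType": "ELB",
        # Seconds to wait after launch before health checks count.
        "HealthCheckGracePeriod": grace_seconds,
    }


params = asg_params_with_elb_health_check(
    "web-asg", "web-launch-config", "web-elb", ["us-east-1a", "us-east-1b"]
)
print(params["HealthCheckType"])
```

The grace period matters here: without it, an instance that is still booting could fail the ELB check and be replaced before it ever serves traffic.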

Configured AZs : Only use the availability zones (AZs) your ELB is configured for.

If you add your scaling group to multiple AZs, make sure your ELB is configured to use all of those AZs. Otherwise your capacity will scale up in a zone the load balancer doesn’t cover, and it won’t be able to see the new instances.
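
One way to catch this before it bites is a small check that compares the two zone lists. The helper below is a hypothetical sketch; it assumes you have already fetched the zone names for the scaling group and the load balancer by whatever means you prefer.

```python
def zones_missing_from_elb(asg_zones, elb_zones):
    """Return the AZs the scaling group uses that the ELB does not cover."""
    return sorted(set(asg_zones) - set(elb_zones))


missing = zones_missing_from_elb(
    ["us-east-1a", "us-east-1b", "us-east-1c"],  # scaling group zones
    ["us-east-1a", "us-east-1b"],                # load balancer zones
)
if missing:
    print("Instances in these zones would be invisible to the ELB:", missing)
```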

Multiple Scaling Triggers : Don’t use multiple scaling triggers on the same group.

If you have multiple CloudWatch alarms which trigger scaling actions for the same auto-scaling group, it might not work the way you expect. For example, let’s say you add a trigger to scale up when CPU usage gets too high, or when the inbound network traffic gets high, and your scale-down actions are the opposite. You might get an increase in CPU usage while your inbound network traffic is fine. So the high CPU trigger causes a scale-up action, but the low inbound traffic alarm immediately triggers a scale-down action. Depending on how you’ve set your cooldown period, this can cause quite a problem as the two triggers just fight against each other. If you want multiple triggers, you can use multiple auto-scaling groups.
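
The tug-of-war described above can be illustrated with a toy simulation. This is a deliberately simplified model, not how Auto Scaling is implemented: it ignores cooldown periods and assumes CPU and network load stay constant. The thresholds and the function name are made up for the example.

```python
def simulate(cpu_percent, network_in_bytes, capacity, steps=3):
    """Apply a high-CPU scale-up trigger and a low-network scale-down
    trigger to the same group, recording capacity after each action."""
    history = [capacity]
    for _ in range(steps):
        if cpu_percent > 80:                # scale-up trigger: high CPU
            capacity += 1
            history.append(capacity)
        if network_in_bytes < 1_000_000:    # scale-down trigger: low traffic
            capacity = max(1, capacity - 1)
            history.append(capacity)
    return history


# High CPU but low inbound traffic: both triggers fire on every step.
print(simulate(cpu_percent=90, network_in_bytes=500_000, capacity=2))
```

The capacity just bounces between two values, which in a real group would mean instances being launched and terminated repeatedly without the load ever being addressed.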

CloudWatch

CLI Tools : Use the CLI tools.

It can become extremely tedious to create alarms using the web console, especially if you’re setting up a lot of similar alarms, as there’s no ability to “clone” an existing alarm while making a minor change elsewhere. Scripting this using the CLI tools can save you lots of time.
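
As an illustration of the kind of scripting meant here, the snippet below generates one `aws cloudwatch put-metric-alarm` command per instance, varying only the instance ID. The instance IDs, alarm naming scheme, and threshold are hypothetical; the script only prints the commands so you can review them before running anything.

```python
import shlex


def alarm_command(instance_id, threshold=80):
    """Build an AWS CLI command that creates a high-CPU alarm for one instance."""
    args = [
        "aws", "cloudwatch", "put-metric-alarm",
        "--alarm-name", f"high-cpu-{instance_id}",
        "--namespace", "AWS/EC2",
        "--metric-name", "CPUUtilization",
        "--dimensions", f"Name=InstanceId,Value={instance_id}",
        "--statistic", "Average",
        "--period", "300",
        "--evaluation-periods", "2",
        "--threshold", str(threshold),
        "--comparison-operator", "GreaterThanThreshold",
    ]
    # shlex.join quotes arguments safely for a POSIX shell.
    return shlex.join(args)


# Placeholder instance IDs for the example.
for instance_id in ["i-0123456789abcdef0", "i-0fedcba9876543210"]:
    print(alarm_command(instance_id))
```

"Cloning" an alarm with a minor change then becomes a one-line edit to the loop or the function arguments.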

Free Metrics : Use the free metrics.

CloudWatch monitors all sorts of things for free (bandwidth, CPU usage, etc.), and it keeps the history for you: 1-minute data points for 15 days, 5-minute data points for 63 days, and 1-hour data points for 15 months. This saves you having to use your own tools to monitor your systems. If you need fine-grained data for longer than that, unfortunately you’ll need to use a third-party or custom-built monitoring solution.

Custom Metrics : Use custom metrics.

If you want to monitor things not covered by the free metrics, you can send your own metric information to CloudWatch and make use of the alarms and graphing features. This is useful for things like tracking disk space usage, and for custom application metrics too. The AWS page on publishing custom metrics has more information.
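
To make the disk space case concrete, here is a small sketch that measures disk usage locally and shapes it into the payload expected by the CloudWatch PutMetricData API. The namespace, metric name, and dimension are assumptions picked for this example, not names AWS defines.

```python
import shutil


def disk_usage_metric(path="/", namespace="Custom/System"):
    """Build PutMetricData parameters reporting the percentage of disk used."""
    usage = shutil.disk_usage(path)
    # Guard against a zero-sized filesystem to avoid dividing by zero.
    percent_used = 100.0 * usage.used / usage.total if usage.total else 0.0
    return {
        "Namespace": namespace,
        "MetricData": [{
            "MetricName": "DiskSpaceUsedPercent",
            "Dimensions": [{"Name": "Path", "Value": path}],
            "Value": round(percent_used, 2),
            "Unit": "Percent",
        }],
    }


payload = disk_usage_metric("/")
print(payload["MetricData"][0])
```

With boto3 installed, the payload could be sent with `boto3.client("cloudwatch").put_metric_data(**payload)`, typically from a cron job or a small agent on each instance.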

Detailed Monitoring : Use detailed monitoring.

It’s ~$3.50 per instance/month, and well worth the extra cost for the extra detail. 1-minute granularity is much better than 5-minute. A problem can be hidden in the 5-minute breakdown, but show itself quite clearly in the 1-minute graphs. This may not be useful for everyone, but it’s made investigating some issues much easier for me.
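
A quick worked example shows why the coarser view can hide problems. The CPU samples below are invented for illustration: one reading per minute, with a single one-minute spike.

```python
from statistics import mean

# Hypothetical CPU readings, one per minute, with one short spike.
one_minute_cpu = [20, 22, 95, 21, 20, 19, 21, 20, 22, 20]


def five_minute_averages(samples):
    """Average the 1-minute samples in consecutive 5-minute windows."""
    return [mean(samples[i:i + 5]) for i in range(0, len(samples), 5)]


print("1-minute peak:", max(one_minute_cpu))
print("5-minute peak:", max(five_minute_averages(one_minute_cpu)))
```

The 1-minute series peaks at 95%, while the highest 5-minute average stays below 40%, so an alarm threshold of, say, 80% would never fire on the 5-minute data.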

Route53

ALIAS Records : Use ALIAS records.

An ALIAS record links your record set directly to a particular AWS resource (e.g. you can map a domain to an S3 bucket), but the key is that you don’t get charged for ALIAS lookups that point at AWS resources. So whereas a CNAME entry would cost you money, an ALIAS record won’t. Also, unlike a CNAME, you can use an ALIAS on your zone apex. You can read more about this on the AWS page for creating alias resource record sets.
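
For reference, an ALIAS record is an ordinary record set (type A here) with an AliasTarget in place of a TTL and values. The sketch below builds a change batch in the shape used by the Route 53 ChangeResourceRecordSets API; the domain, DNS name, and hosted zone ID are placeholders, and the function name is an assumption for this example.

```python
def alias_change_batch(domain, target_dns_name, target_hosted_zone_id):
    """Build a Route 53 change batch that points `domain` at an AWS resource."""
    return {
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": domain,  # may be the zone apex, e.g. "example.com."
                "Type": "A",
                # Alias records carry an AliasTarget and no TTL.
                "AliasTarget": {
                    # Hosted zone ID of the target resource, not of your domain.
                    "HostedZoneId": target_hosted_zone_id,
                    "DNSName": target_dns_name,
                    "EvaluateTargetHealth": False,
                },
            },
        }]
    }


# Placeholder values for illustration only.
batch = alias_change_batch(
    "example.com.",
    "my-elb-1234567890.us-east-1.elb.amazonaws.com.",
    "ZEXAMPLEZONEID",
)
print(batch["Changes"][0]["ResourceRecordSet"]["AliasTarget"]["DNSName"])
```

With boto3, this could be applied via `boto3.client("route53").change_resource_record_sets(HostedZoneId=..., ChangeBatch=batch)`, where the HostedZoneId argument is the ID of your own hosted zone.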

Source: https://roadmap.sh/best-practices/aws
