Monitoring on AWS with CloudWatch Agent and Procstat

Objective: Install CloudWatch Agent with procstat on an EC2 instance and configure a metric alarm in CloudWatch

One of the first issues I ran into was with IAM policies, or rather the lack thereof. Specifically, the managed policy CloudWatchAgentServerPolicy needed to be attached to the instance role. The telltale sign that you forgot to attach this policy is an error message in the Agent logs, seen below:

2020-08-17T22:46:18Z E! refresh EC2 Instance Tags failed: NoCredentialProviders: no valid providers in chain
caused by: EnvAccessKeyNotFound: failed to find credentials in the environment.

The procstat plugin is fortunately already part of the Agent at install time, but it still needs to be configured. To do this you add a configuration file specific to your monitoring needs. For old-school admins, the easiest way to think of procstat is that it basically ties into the ps tool: it's like doing a `ps -ef | grep` to find something about a running process.

[root@lab-master amazon-cloudwatch-agent.d]# pwd
/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d
[root@lab-master amazon-cloudwatch-agent.d]# cat processes
{
    "metrics": {
        "metrics_collected": {
            "procstat": [
                {
                    "pattern": "nginx: master process /usr/sbin/nginx",
                    "measurement": [
                        "pid_count"
                    ]
                }
            ]
        }
    }
}
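To make the ps analogy concrete: with the native pid_finder, procstat's pid_count measurement essentially counts the processes whose command line matches a pattern. A rough, Linux-only Python sketch of that idea (the function name is mine, not part of the Agent):

```python
import os
import re

def pid_count(pattern):
    """Count running processes whose command line matches a regex,
    roughly what procstat's native pid_finder does under the hood."""
    count = 0
    for pid in os.listdir('/proc'):
        if not pid.isdigit():
            continue  # skip non-process entries like /proc/meminfo
        try:
            with open('/proc/%s/cmdline' % pid, 'rb') as f:
                # cmdline args are NUL-separated; join them with spaces
                cmdline = f.read().replace(b'\x00', b' ').decode(errors='replace')
        except OSError:
            continue  # process exited between listing and reading
        if re.search(pattern, cmdline):
            count += 1
    return count

print(pid_count(r'nginx: master process'))
```

If that prints 0, the alarm below would fire, which is exactly the behavior we want.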

This gets us far enough that we can now see values in the Metrics view of CloudWatch. Once we have data there, it's time to construct a metric alarm. My goal was to do this in Terraform, although it's less painful to do in the AWS console.

resource "aws_cloudwatch_metric_alarm" "nginx-master" {
  alarm_name = "nginx master alarm"
  comparison_operator = "LessThanThreshold"
  evaluation_periods = 1
  datapoints_to_alarm = 1
  metric_name = "procstat_lookup_pid_count"
  namespace = "CWAgent"
  period = "300"
  statistic = "Average"
  threshold = "1"
  alarm_description = "Checks for the presence of an nginx-master process"
  alarm_actions = [aws_sns_topic.pagerduty_standard_alarms.arn]
  insufficient_data_actions = []
  treat_missing_data = "missing"
  dimensions = {
    "AutoScalingGroupName" = "some-ASG-YXI8VDT6MBE3"
    "ImageId"       = "some-ami"
    "InstanceId"    = "some-instance-id"
    "InstanceType"  = "t3a.large"
    "pattern"       = "nginx: master process /usr/sbin/nginx"
    "pid_finder"    = "native"
  }
}

The alarm creation proved to be a lot harder than I had expected, taking up several hours. I had to re-create things in my lab setup twice and do a Terraform import. The problem turned out to be that the dimensions {} block is not optional, despite what the Terraform docs say. Had the docs said those fields were required, I probably would have saved days of time.

Polish Work

In the process of working things out I hard-coded a lot of values in the dimensions {} block. Naturally that is not good practice, especially with infrastructure-as-code, so I will need to rework it to use variables instead. The alarm names should also utilize the Terraform workspace values for better naming.

Hockey-Info Release v1.0

https://gitlab.com/dword4/hockey-info

This is probably as “complete” as the Hockey-Info project may ever get, so I figured it's worth an actual blog post (and a git tag at v1.0 too). A little over a year ago I was fed up with the NHL's mobile website and decided I wanted to create something that gave me all the information without the associated fluff and annoyance. I had already spent time documenting the NHL API and using that knowledge to build smaller things like a Sopel plugin for hockey stats. So I started throwing things together with Flask, and suddenly hockey-info.online came to be!

Features

  • News Headlines
  • Scores view (only shows today’s games/scores)
  • Standings (grouped by Conference/Division and sometimes showing Wildcards)
  • Schedule (day, week and month views, team specific or league wide)
  • Team Views
    • Regular Season
    • Playoffs
    • Previous Games
  • Game details
    • Box Score information
    • Goals by period with details
    • Shots on goal by period
    • Penalties by period with details

Important Details

Built in Python 3 with Flask, Requests, and various other bits and bobs. It runs either by itself with the usual process for launching a Flask application, or, if you are so inclined, there is a Dockerfile that can be used to launch it with a little less pain.

There is some caching going on with requests_cache, however this is by no means optimal; I would like to eventually do more work with a proper web cache, but for now this works since the site is so lightweight. I make use of CDNs for all the JavaScript, which also helps speed things up (and, more importantly, moves the burden off of wherever you choose to run it in the first place).

Timezone awareness is non-existent: I basically convert everything from what the NHL stores (UTC) to Eastern Time, since that is where I live. I try to be very privacy-conscious and couldn't justify the time spent on methods of determining user location for time conversion; if someone wants to suggest or contribute an approach, PRs are welcome.
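As a sketch of the sort of conversion involved: with the zoneinfo module from Python 3.9+, the UTC-to-Eastern step looks roughly like this (the function name is hypothetical, not from the Hockey-Info codebase):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def utc_to_eastern(timestamp):
    """Convert an ISO-8601 UTC timestamp (the 'Z'-suffix style) to
    US Eastern time; DST is handled by the tz database."""
    # fromisoformat in older Pythons does not accept 'Z', so swap in an offset
    utc_dt = datetime.fromisoformat(timestamp.replace('Z', '+00:00'))
    return utc_dt.astimezone(ZoneInfo('America/New_York'))

# A midnight-UTC game time is still the previous evening in Eastern time
print(utc_to_eastern('2020-02-01T00:00:00Z'))
```

Hard-coding 'America/New_York' is exactly the single-timezone shortcut described above; making it user-selectable would be the obvious PR.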

Legal Considerations

I have zero affiliation with the NHL beyond sometimes buying their branded merchandise and viewing their games when I get the chance. There are no ads served by the application (or even any decent way to add them without altering all the templates) so I make no money from anything (not even the blog).

Quick Code: Repo List

So I ran into an interesting problem over the weekend, I forgot my 2FA token for Gitlab at home while I was away. My laptop’s SSH key was already loaded into Gitlab so I knew I could clone any of my repositories if only I could remember the exact name. That of course turned out to be the problem: I couldn’t remember the name of a specific repository that I wanted to work on. I even tried throwing a bunch of things at git clone to try to guess it and still had no luck. Enter the Gitlab API:

#!/usr/bin/env python3
import requests
from tabulate import tabulate

personal_token = 'asdfqwerzxcv1234'
user_id = 'dword4'

base_url = 'https://gitlab.com/api/v4/'
repo_url = 'users/' + user_id + '/projects'

full_url = base_url + repo_url + '?private_token=' + personal_token

res = requests.get(full_url).json()
table = []
for project in res:
    name = project['name']
    name_spaced = project['name_with_namespace']
    path = project['path']
    path_spaced = project['path_with_namespace']
    if project['description'] is None:
        description = ''
    else:
        description = project['description']
    #print(name, '|', description)
    table.append([name, description])

print(tabulate(table, headers=["name", "description"]))

This is of course super simplistic and does virtually no error checking, fancy formatting, etc. However now with a quick alias I can get a list of my repositories even when I do flake out and forget my token at home.
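One wrinkle worth knowing if you have more than a screenful of repos: the GitLab projects endpoint is paginated (20 results per page by default), so a listing like the one above can silently truncate. Building the URL with explicit page/per_page parameters can be sketched with a hypothetical helper:

```python
from urllib.parse import urlencode

def project_list_url(user_id, token, page=1, per_page=100):
    # per_page caps at 100 on gitlab.com; to fetch every repository,
    # loop over increasing page values until an empty list comes back.
    base_url = 'https://gitlab.com/api/v4/'
    query = urlencode({'private_token': token, 'page': page, 'per_page': per_page})
    return base_url + 'users/' + user_id + '/projects?' + query

print(project_list_url('dword4', 'asdfqwerzxcv1234'))
```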

Terraform – Reference parent resources

Sometimes things get complicated in Terraform, like when I touch it and make a proper mess of the code. Here is a fairly straightforward example of how to reference parent resources from a child.

├── Child
│   └── main.tf
└── main.tf

1 directory, 2 files
$ pwd
/Users/dword4/Terraform

First, let's look at what should be in the top-level main.tf file; the substance is not super important, other than to give a rough idea of what you want/need:

provider "aws" {
  region = "us-east-2"
  profile = "lab-profile"
}

terraform {
  backend "s3" {}
}

# lets create an ECS cluster

resource "aws_ecs_cluster" "goats" {
  name = "goat-herd"
}

output "ecs_cluster_id" {
  value = aws_ecs_cluster.goats.id
}

What this does is simply create an ECS cluster named “goat-herd” in us-east-2 and then output ecs_cluster_id, which contains the ID of the cluster. While we don't necessarily need to see the value ourselves, we need the output because it makes the data available to other modules, including child objects. Now let's take a look at what should be in Child/main.tf:

provider "aws" {
  region = "us-east-2"
  profile = "lab-profile"
}

terraform {
  backend "s3" {}
}
module "res" {
  source = "../../Terraform"
}
output "our_cluster_id" {
  value = "${module.res.ecs_cluster_id}"
}

What is going on in this file is that it creates a module called res and sources it from the parent directory where the other main.tf file resides. This allows us to reference the module and the outputs it houses, enabling us to access the ecs_cluster_id value and use it within other resources as necessary.

Managing a Growing Project

I am no Project Manager in even the loosest sense of the word. Despite that, I find myself learning more and more of the processes of PM, especially as projects start to expand and grow. Specifically I am speaking about the NHL API project I started almost two years ago. This led me down the rabbit hole of permissions and how to manage the project overall going forward. The project's roots are very rough; even today I still generally commit directly to master. Now the repository has grown to over 70 commits, two distinct files and 17 contributors.

Balance

One thing I am constantly trying to be cognizant of is becoming overly possessive of the project. While it may have started as a one-man show, I want and enjoy contributions from others. The flip side of worrying about possessiveness is that there are times when steering is necessary. One instance that comes to mind is the suggestion of including example code. The goal of the project is documentation, so I declined such suggestions; unmaintained code becomes a hindrance over time, and I don't want to add that complexity to the project.

Growth

There is often pressure to grow projects, to make them expand and change over time. It's common for businesses to always want growth, and it seems that mentality has spread to software. Something like the NHL API is a very slow-changing thing; just looking at the commit history shows this. Weeks and months go by without new contributions or even me looking at the API itself. I dabbled with ideas such as using Swagger to generate more appealing documentation, but every time I tried to add something new and unique it felt forced. This ultimately forced me to accept that growth will not be happening; the project has likely reached its zenith.

Looking Forward

The next steps are likely small quality-of-life things such as the recent Gitter.im badge. Things that make it easier for people to interact but don’t change the project overall. My knowledge of the API makes for fast answers so I try to help out when I am able.

Home Gardening in the Apocalypse

So if you listen to the news and social media, we are in a very slow collapse, it seems: things are never going back to normal, but we totally shouldn't panic just yet because they are going to devalue the living hell out of our currency with multiple massive multi-trillion-dollar stimulus packages. This got me thinking that if things were to go as badly as that nagging little voice says, then perhaps it's time to actively start sustaining myself with food. This leads me to a home garden in my super-limited space at the townhouse. Between my fiancée and I, we love eggs, so we had about 3 cartons lying around that we repurposed into vessels for starting our seeds.

Redundancy is key, 3 pods of each seed type and multiple seeds per pod

This is only a portion of what we plan to plant, basically a phase one with the seeds we were able to source locally; the larger shipment of seeds has been slowly winding its way to us from across the US and should be here within a day or so. The overall plan includes a mix of common herbs such as rosemary, cilantro and basil alongside edibles like kale, cucumbers and tomatoes to help reduce our costs at the grocery store. Less time spent at the store means less potential exposure, and it saves us money while giving a freshness not really possible from a grocery store.

The current layout of the seed starting tray

As an aside to the garden project, there is my long-term fruit tree effort. Last year my fiancée and I bought a Key Lime tree and a Meyer Lemon tree at the local garden store and put them out front of my townhouse. They flourished, since it was the middle of summer with tons of light and regular rains, but when I moved them inside the Meyer tree took a turn for the worse, losing a lot of its leaves. I tried more intensive watering in case the dry conditions of the house were evaporating more water than I realized. I rotated it a few times in hopes that the sunlight coming through the window would pull it back to a normal vertical position, but that also failed to improve its condition. Eventually I even resorted to a boost from some fertilizer stakes a few weeks ago, but those failed to really change things. Finally I stumbled upon the plant light and timer that I had packed up when I moved, so I relocated the tree, pointed the light at it, and set the timer for about a 12-hour sun cycle. Within a few days I was greeted with unmistakably fresh shoots in that vibrant green you can't mistake, as well as possibly more fruit developing!

It is amazing what a little supplemental sunlight can do, and I am hoping that the 12h cycle I have the seeds on ushers forth even more green in the house, so that eventually we will have fresh herbs, veggies and fruits in a few months.

Selecting an AWS subnet by name in Terraform

One of my recent challenges has been to write Terraform code that selects existing subnets and uses them in new blocks of code (specifically, in this case, to create a Directory, Workspaces and a few Security Group entries). Since I am relatively new to using Terraform this way, it took far longer to figure out than I would care to admit, so I figured it would be best to document what finally worked and made the concept click for me.

provider "aws" {
  region = "us-east-1"
}
variable "subnet_name" {
  default = "workspaces-private-us-east-1c"
}
data "aws_subnet" "selected" {
  filter {
    name = "tag:Name"
    values = ["${var.subnet_name}"]
  }
}

output "vpcid" {
  value = "${data.aws_subnet.selected.vpc_id}"
}

output "subnet_name" {
  value = "${var.subnet_name}"
}
output "subnet_id" {
  value = "${data.aws_subnet.selected.id}"
}

This will look up the named subnet “workspaces-private-us-east-1c” and obtain not only the VPC ID associated with it but the unique subnet ID as well. Provided the name you are looking up is unique, the output should look something like the sample below:

data.aws_subnet.selected: Refreshing state...

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

Outputs:

subnet_id = subnet-0299e079c90b20ea6
subnet_name = workspaces-private-us-east-1c
vpcid = vpc-04066bef0a56ebcc2

This is of course specific to Terraform 0.12.20 and provider.aws 2.48.0, so naturally things may change over time; however, this will get you close and provide enough of a starting point to use these subnets in other things.

Youtube Essential Ripping Platform

So I have been a longtime user of youtube-dl to archive some things (obscure music, recordings of tech talks) and figured it was worth taking some time to build a simple, easy-to-use way to do this that others could benefit from. More simply put, I created a front-end with Python and Flask that sits on top of youtube-dl and makes the process easy enough for non-technical people to use. Thus YERP was born (https://gitlab.com/dword4/yerp) to fill that role. I know there are tons of other competing projects out there doing the exact same thing, but I wanted to take a crack at it for my own home network and simplify it to the point that all you had to do was run a Dockerfile and it would spring into existence without configuration.

The project is VERY green right now and things are moving around and changing a lot (even in my head, before code is committed to the repository), so don't bank on things staying how they are. There are tons of little features I want to put in, like folder organization, backups and flags for filetypes, which will take quite a while to figure out how I would like to implement. So if you do run the program, beware; and if you find something that can be done better, feel free to submit a PR and I will gladly bring other code into the project, since I am only one person and not exactly a professional at this to begin with.

Apple 7th Gen iPad + Smart Cover Fix

Perhaps the most frustrating part of using the Apple Smart Cover with the 7th Gen iPad is that the keyboard just randomly stops working from time to time, with no real rhyme or reason as to why. After lots of searching around and not getting anywhere, I finally found a solution that works like a charm:

  1. Physically disconnect the cover from the iPad
  2. Reconnect the cover
  3. Use Cmd + Tab to switch through a few apps
  4. Resume typing, because Apple hasn’t fixed this in the software yet

Considering I have seen articles from two years ago where this was an issue, it really feels like Apple should have fixed it by now, but nope: an up-to-date 7th generation iPad in 2020 still has this problem. Makes me wonder what the premium price tag on most Apple goods goes toward, if not fixing/refining the products.

Winter Improvements for Hockey-Info

Finally got around to a rather large update for this project: fixed some small bugs, such as the L10 data being way wrong (it was showing win-loss-OT for the entire season), added in missing stats for goalies, and made the display of previous game results more sensible. I also redid about 95% of the interface to use Bootstrap 4, which has made the look more uniform throughout. If you are interested in seeing the code itself you can see that here, or if you just want to check out the live site, which I also host, head over to http://hockey-info.online.
