Environment, system details, and tools:
- AWS EC2
- EC2 Instance Metadata service (i.e. http://169.254.169.254/latest/meta-data/)
- CloudWatch
- wget
- bash
- Ubuntu 14.04
Has anyone ever seen this? We have a cron job that pushes CloudWatch metrics from inside an instance by basically doing these steps:
- Get instanceId by running "wget -q -O - http://169.254.169.254/latest/meta-data/instance-id"
- Collect some metric or other and build an AWS CLI call using aws cloudwatch put-metric-data ...
- Repeat
The weird thing we are seeing is that, very infrequently, one of these runs dies right after the wget query with no output, as if the metadata service just failed to respond.
Example end-of-script (we set bash -e and -x to die and to gather debug output):
    ++ wget -q -O - http://169.254.169.254/latest/meta-data/instance-id
    + INSTANCE_ID=
The script ends there and exits, presumably because wget exited with a non-zero exit status.
This is not reproducible, but it happens on the order of once every 2 weeks.
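For what it's worth, the abort itself is what -e is supposed to do: when the command substitution feeding an assignment exits non-zero, bash stops right after the assignment. A minimal sketch of that behavior, with `false` standing in for a failing wget (the stand-in and the variable names `status` and `output` are made up for illustration, not taken from the real script):

```bash
# 'false' stands in for a wget call that exits non-zero.
# A child bash is used so that -e applies the same way it would in the cron script.
status=0
output=$(bash -e -x -c 'INSTANCE_ID=$(false); echo "not reached"' 2>/dev/null) || status=$?

# The echo inside the child never runs, so stdout is empty and the status is non-zero.
echo "exit status: $status, stdout: '$output'"
```

Dropping the `2>/dev/null` shows the -x trace on stderr, which should resemble the tail of the log above.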
Drop the -q. The output from -O - goes to STDOUT, but all the other nonsense wget normally spews on a successful request goes to STDERR, so there isn't an actual need to suppress it when you're capturing its output.
You can filter the STDERR stream with something like INSTANCE_ID=$(wget -O - http://169.254.169.254/latest/meta-data/instance-id 2> >(perl -pe 'undef $_ if /^Length/ || /^Saving\ to/ || /written\ to\ stdout/ || /100%/ || /^$/' >&2))
... or just use INSTANCE_ID=$(ec2metadata --instance-id), which might also provide a useful error -- not sure, since I've never encountered this problem.
Good point about -q, so we'll probably do that. Beyond that though, I'm going to wrap this call in a retry loop and call it a day. Thanks
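The retry loop mentioned above could look something like this. It is only a sketch under a few assumptions: the function names `get_instance_id` and `fetch_instance_id`, the default of 5 attempts, and the 2-second delay are all made up for illustration. The wget options used (-nv for less noise without hiding errors, -T for a timeout in seconds, -t for the number of tries) are standard ones.

```bash
# Hypothetical helper: one metadata lookup, no -q so errors still reach STDERR.
fetch_instance_id() {
    wget -nv -T 2 -t 1 -O - http://169.254.169.254/latest/meta-data/instance-id
}

# Hypothetical helper: retry the lookup up to $1 times, sleeping $2 seconds in between.
# Prints the instance ID on success; returns 1 if every attempt fails or comes back empty.
get_instance_id() {
    local max_attempts=${1:-5} delay=${2:-2} attempt id
    for (( attempt = 1; attempt <= max_attempts; attempt++ )); do
        # Capture per attempt so output from a failed try never leaks into the result.
        if id=$(fetch_instance_id) && [[ -n $id ]]; then
            printf '%s\n' "$id"
            return 0
        fi
        echo "metadata lookup failed (attempt $attempt of $max_attempts)" >&2
        if (( attempt < max_attempts )); then
            sleep "$delay"
        fi
    done
    return 1
}

# Usage in the cron script (still safe under bash -e, since only the final failure propagates):
#   INSTANCE_ID=$(get_instance_id 5 2)
```

Because the failing wget runs inside an `if` condition, -e does not kill the script on an individual failed attempt; it only aborts if `get_instance_id` itself gives up.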