For a SaaS scenario where separate instances exist per customer, we already use AWS tags to mark instances by customer and other metadata. How would we make this instance metadata in AWS tags visible from Splunk? Ideally we'd be able to see these values as fields. It looks like the Splunk for AWS app has this concept. Any advice for making it happen for app instances with forwarders installed?
Yes. The tags are really what differentiate instances. If they were visible as metadata about the source, for instance, we could share identical forwarder setups and use instance tags to separate reports per client.
You've got a couple of challenges to deal with. First, there isn't anything native in the forwarder to retrieve the tags. Second, the tags aren't available via the EC2 instance metadata by default.
That being said, there are probably a couple of ways you could add the tag data.
You can give your EC2 instances access to an IAM role that allows them to query EC2 data (i.e., ec2:Describe*). Then you can run a script on the EC2 instance using the AWS CLI or API and get the tag values. There's a discussion on how to do that at the link below (and plenty of other resources available online).
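As a rough sketch of what such a script could look like, here is a Python version that shells out to the AWS CLI. It assumes the `aws` command is installed on the instance, a default region is configured, and the instance role allows `ec2:DescribeTags`. The function names `parse_tags` and `fetch_tags` are made up for illustration.

```python
import json
import subprocess


def parse_tags(describe_tags_json: str) -> dict:
    """Turn the JSON printed by `aws ec2 describe-tags` into a {key: value} dict."""
    payload = json.loads(describe_tags_json)
    return {tag["Key"]: tag["Value"] for tag in payload.get("Tags", [])}


def fetch_tags(instance_id: str) -> dict:
    """Ask the AWS CLI for the tags attached to a single instance."""
    result = subprocess.run(
        [
            "aws", "ec2", "describe-tags",
            "--filters", f"Name=resource-id,Values={instance_id}",
            "--output", "json",
        ],
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI call fails (e.g. missing permissions)
    )
    return parse_tags(result.stdout)
```

Calling `fetch_tags("i-0123456789abcdef0")` (a placeholder instance ID) would return something like a dict of tag names to values, which you can then print in whatever format you want Splunk to index.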
So now you have a script that retrieves the tag values. Next you need to get that into Splunk. A simple solution would be to run the script periodically as a scripted input.
Splunk can run the script on a schedule (say, every 5 minutes) and index the script's standard output. Ideally this data would be formatted so the instance ID and the tag names/values are automatically extracted (i.e., name=value pairs, JSON, etc.). You can run this on every forwarder and have each forwarder query its own tag data, or run it on a single forwarder/Splunk server and retrieve all of the tags for all of the instances.
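To illustrate the output formatting, here is a minimal Python sketch that turns an instance ID and a dict of tags into either a name=value line or a JSON event. The helper names (`to_kv_event`, `to_json_event`) and the choice to replace non-alphanumeric characters in tag names with underscores are my own assumptions, not anything Splunk requires.

```python
import json
import re


def to_kv_event(instance_id: str, tags: dict) -> str:
    """Render one line of name="value" pairs for automatic field extraction."""
    parts = [f'instance_id="{instance_id}"']
    for key in sorted(tags):
        # Tag names can contain spaces or punctuation; keep field names simple.
        field = re.sub(r"[^A-Za-z0-9_]", "_", key)
        # Avoid breaking the quoting if a tag value contains a double quote.
        value = str(tags[key]).replace('"', "'")
        parts.append(f'{field}="{value}"')
    return " ".join(parts)


def to_json_event(instance_id: str, tags: dict) -> str:
    """Render the same data as a single JSON object."""
    return json.dumps({"instance_id": instance_id, "tags": tags}, sort_keys=True)


if __name__ == "__main__":
    # Placeholder values; a real script would look these up first.
    print(to_kv_event("i-0abc", {"env": "prod", "Customer Name": "acme"}))
```

The scripted input would then just print one of these lines to standard output each time it runs, with the schedule set by the `interval` setting in the input's stanza.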
Once you have the tag values in Splunk, you can populate a lookup file using a scheduled search. Here's an example of using inputlookup and outputlookup:
Then you apply this lookup table automatically, so the tag values are available when you search.
If you change your instance tags frequently, you could add a time field to your lookup. Then your search results would show the tag value at the time of the event, rather than the most recent tag value. For example, if your instance was tagged with env=test on Monday and then changed to env=prod on Tuesday, the events from Monday would have the tag value env=test and the events from Tuesday would have env=prod. Depending on how many instances and tags you have, how frequently they change, and how much historical tag data you need to keep, this file could grow pretty large.
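To make the matching rule concrete, here is a small Python illustration of the idea: for each event, pick the most recent tag value recorded at or before the event's timestamp. This is only a sketch of the concept, not how Splunk implements time-based lookups internally, and the `tag_at` name and the sample timestamps are invented.

```python
from bisect import bisect_right


def tag_at(history, event_time):
    """Return the tag value in effect at event_time.

    history is a list of (epoch_seconds, value) tuples sorted by time.
    Returns None if the event predates the first recorded value.
    """
    times = [t for t, _ in history]
    idx = bisect_right(times, event_time)  # entries with time <= event_time
    return history[idx - 1][1] if idx else None


if __name__ == "__main__":
    # Hypothetical history: env=test recorded at t=100, env=prod at t=200.
    env_history = [(100, "test"), (200, "prod")]
    print(tag_at(env_history, 150))
    print(tag_at(env_history, 250))
```

Every tag change adds a row to the history, which is why the lookup can get large if tags change often.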
Instead of populating a lookup with the instance data, you could use a script to pull the tags and modify your inputs.conf file to add them as indexed fields. This is probably something you'd do a single time when you spin up the instance. See the article below:
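As a rough illustration of the indexed-field approach (not taken from the linked article), here is a Python sketch that builds a `_meta` line for inputs.conf from a dict of tags. As I understand it, `_meta` takes space-separated `field::value` pairs, so the sketch lowercases field names and replaces whitespace in values with underscores; that normalization and the `render_meta_line` name are my assumptions. You would likely also need matching fields.conf entries on the search side so the fields are treated as indexed.

```python
import re


def render_meta_line(tags: dict) -> str:
    """Build an inputs.conf `_meta` line of field::value pairs from tags."""
    pairs = []
    for key in sorted(tags):
        # Keep field names to letters, digits and underscores.
        field = re.sub(r"[^A-Za-z0-9_]", "_", key).lower()
        # Pairs are space-separated, so values must not contain whitespace.
        value = re.sub(r"\s+", "_", str(tags[key]).strip())
        pairs.append(f"{field}::{value}")
    return "_meta = " + " ".join(pairs)


if __name__ == "__main__":
    # Placeholder tags; a real script would fetch these at instance start-up.
    print(render_meta_line({"customer": "acme corp", "Env": "prod"}))
```

A start-up script could write this line into the relevant stanza of inputs.conf before the forwarder starts, so every event from that instance carries the tag values.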
There are some caveats about creating new indexed fields: