Splunk AWS Best Practices & Naming Conventions

thomastaylor
Communicator

As we integrate our AWS logs into Splunk, I'd like to ask what others consider best practice going forward. At the moment, we plan to use the index naming convention companyname_aws_(prod|dev|qa); however, we also have a cloudtrail sourcetype that ingests information from all three environments, so it isn't obvious where that data belongs.

We also have the AWS Add-on for Splunk: do we use the search provided with that app, or should we create our knowledge objects in Search & Reporting? Is it better practice to limit users to specific apps, or limit users to specific indexes?

Any thoughts?

zonistj
Path Finder

I work with AWS data in Splunk, and I have a few tips.

First, I agree with everything that Rich Galloway said. That's great advice for all Splunk apps.

AWS can be a massive amount of data depending on how many accounts you have, what services you use, and other specifics for your use case. Here are my suggestions:

Account Lookup Table

One thing that I've found incredibly useful is to use a lookup table for human-friendly names for the accounts. I don't know what account ID 123456789 is just by the number, so I use the lookup table to enrich the account ID with something like "prod_application_stack." This is very useful for creating metrics and dashboards that you can present to people, but mostly my infrastructure admins are more responsive to a human-friendly name than an account ID.
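
As a rough sketch of how that enrichment could be wired up so it happens automatically at search time: this assumes a hypothetical CSV named aws_account_names.csv with the columns account_id and account_name (for example, a row like 123456789,prod_application_stack). The file name, the lookup definition name, and the account_name column are made up for illustration; account_id is the field the description data already carries.

```ini
# transforms.conf (in your own app) -- lookup name and file name are assumptions
[aws_account_names]
filename = aws_account_names.csv

# props.conf (in your own app) -- adds account_name to every aws:description event
[aws:description]
LOOKUP-aws_account_name = aws_account_names account_id OUTPUT account_name
```

With something like this in place, searches and dashboards can group by account_name directly instead of the raw account ID.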

AWS Description Data

When you set up ingestion for AWS Description data (sourcetype=aws:description), make sure to configure aws_description_tasks.conf to pull ALL available services for ALL available regions, not just the ones you think you use. I've identified instances where developers forgot to switch to the correct region and ended up putting resources somewhere they shouldn't. You could also detect account compromise this way (if you have a security inclination), because bad guys will sometimes create resources where they think you aren't looking.

I think the default collection interval for description data is five minutes, which we throttled back to reduce Splunk license utilization. We just don't need the data updated that often. Figure out what's right for you to balance your license usage.

If you wanted it to pull every 15 minutes (900 seconds), your stanza should look like this:

[desc:account_id]
account = Instance profile for your heavy-weight forwarder
aws_iam_role = IAM role that has access to the account (if using multiple accounts)
apis = ec2_instances/900,ec2_reserved_instances/900,ebs_snapshots/900,ec2_volumes/900,ec2_security_groups/900,ec2_key_pairs/900,ec2_images/900,ec2_addresses/900,elastic_load_balancers/900,classic_load_balancers/900,application_load_balancers/900,vpcs/900,vpc_subnets/900,vpc_network_acls/900,cloudfront_distributions/900,rds_instance/900,lambda_functions/900,s3_buckets/900,iam_users/900
index = AWS INDEX
regions = us-west-2,us-west-1,us-east-1,us-east-2,eu-central-1,ap-northeast-1,ap-northeast-2,ap-northeast-3,ap-south-1,ap-southeast-1,ap-southeast-2,ca-central-1,cn-north-1,cn-northwest-1,eu-west-1,eu-west-2,eu-west-3,sa-east-1
sourcetype = aws:description

Field Parsing

The field parsing out of the box with the app is okay, but I've had to customize it a bit for my purposes. I would validate that the fields work the way you want them to work. If they don't, there's no real harm in creating your own props.conf file and sticking it in a "local" directory in the AWS app.

GuardDuty

If you're not planning to use AWS GuardDuty, you should. AWS can detect things that you can't because they have access to underlying data that they don't expose to you. This will be a real lifesaver if anything funky happens. Trust me.

CloudTrail

CloudTrail is awesome, but it's also riddled with lots of nested JSON. This is one of the big areas that I forked from the main AWS app. I don't like working with dot notation in my field names, and I don't want to impose that on my users. It makes SPL unnecessarily long and ugly. At a minimum, I recommend using props.conf to field alias out of the JSON dot notation if you can.
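
For illustration only, here is a minimal sketch of what such aliases might look like. The field names on the left (userIdentity.arn, userIdentity.accountId, requestParameters.bucketName) are standard CloudTrail JSON paths; the alias names on the right are invented for this example, and I'm assuming the data uses the add-on's aws:cloudtrail sourcetype.

```ini
# local/props.conf in your own AWS app -- alias names on the right are hypothetical
[aws:cloudtrail]
FIELDALIAS-cloudtrail_flat = userIdentity.arn AS user_arn userIdentity.accountId AS user_account_id requestParameters.bucketName AS bucket_name
```

Field aliases are applied at search time, so nothing has to be re-indexed; run a quick search afterwards to confirm the new names resolve the way you expect.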

Miscellany

I've found creating asset inventories to be invaluable. I'm currently using a method where I have a scheduled search that runs every few hours, and outputs my EC2 inventory data to a lookup table. I'm in the process of re-writing that to utilize the Inventory datamodel instead. I recommend you do something similar.
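
A minimal sketch of that kind of scheduled inventory search, written as a savedsearches.conf stanza. The stanza name, the four-hour schedule, the lookup file name, and the companyname_aws_* index pattern are all assumptions, and it assumes the EC2 description events carry the same account_id, region, and id fields that the security group events do.

```ini
# savedsearches.conf in your own app -- names and schedule are placeholders
[AWS - Build EC2 inventory lookup]
search = index=companyname_aws_* sourcetype=aws:description source=*ec2_instances | stats latest(_time) AS last_seen BY account_id, region, id | outputlookup aws_ec2_inventory.csv
enableSched = 1
cron_schedule = 0 */4 * * *
dispatch.earliest_time = -4h@h
dispatch.latest_time = now
```

Because outputlookup overwrites the file on each run, the lookup reflects only what was described in the last four hours. Add whatever other columns you need to the stats clause.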

If you know for a fact that you only use certain regions, create some dashboards (or, even better, scheduled searches that notify someone) that flag new resources or CloudTrail API activity in the regions you don't use. Some things happen there by default, but you should be able to tune out the noise and identify when something that shouldn't be there gets created.
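
Something along these lines could serve as a starting point for the alerting version. This is a sketch under assumptions: the stanza name, the two "approved" regions, the hourly schedule, the email address, and the companyname_aws_* index pattern are placeholders; awsRegion, eventName, and recipientAccountId are standard CloudTrail fields.

```ini
# savedsearches.conf in your own app -- region list and recipient are placeholders
[AWS - CloudTrail activity in unused regions]
search = index=companyname_aws_* sourcetype=aws:cloudtrail NOT awsRegion IN ("us-east-1", "us-west-2") | stats count AS events values(eventName) AS event_names BY awsRegion, recipientAccountId
enableSched = 1
cron_schedule = 0 * * * *
dispatch.earliest_time = -60m@m
dispatch.latest_time = now
# Trigger whenever the search returns at least one row
counttype = number of events
relation = greater than
quantity = 0
action.email = 1
action.email.to = aws-admins@example.com
```

Expect some noise at first from the default activity mentioned above; exclude those event names once you know what they are, and widen the time window if your CloudTrail data arrives late.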

Audit your security groups. It's kind of a pain because of the nested JSON. This will give you a good view of your ingress security group rules:

index=* sourcetype=aws:description source=*ec2_security_groups 
| spath path=rules{} output=a 
| table account_id, description, id, name, owner_id, region, vpc_id, a 
| fields - _raw 
| mvexpand a 
| spath input=a 
| where from_port!="null" AND to_port!="null" AND 'grants{}.cidr_ip'!="null"
| fields - a

This will give you the same view of your egress security group rules:

index=* sourcetype=aws:description source=*ec2_security_groups
| spath path=rules_egress{} output=a 
| table account_id, description, id, name, owner_id, region, vpc_id, a 
| fields - _raw 
| mvexpand a 
| spath input=a 
| where from_port!="null" AND to_port!="null" AND 'grants{}.cidr_ip'!="null"
| fields - a

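And if you want to look specifically for potentially insecure rules, I'd add a match for "/0" to the final where clause. The example below modifies the egress query; use path=rules{} instead to apply the same check to ingress rules: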

index=* sourcetype=aws:description source=*ec2_security_groups
| spath path=rules_egress{} output=a 
| table account_id, description, id, name, owner_id, region, vpc_id, a 
| fields - _raw 
| mvexpand a 
| spath input=a 
| where from_port!="null" AND to_port!="null" AND 'grants{}.cidr_ip'!="null" AND match('grants{}.cidr_ip', "/0")
| fields - a

That will check for rules that are set with "0.0.0.0/0", which means any IP address. If you see that and it's not something like port 443 or port 80, you might want to investigate why that rule exists.

That's off the top of my head. I'll comment again if I think of anything else.

thomastaylor
Communicator

Thank you for all of this great insight! I wanted to ask you: do you still use the AWS app, or do you mainly use your own app that you ported? Also, have you created a company app where you search for AWS resources, or do you have your own exclusive app for AWS? Although the AWS app out of the box includes a lot of features, some of it is unnecessary for our purposes and we'd like to just have our own non-cluttered app.

It's interesting that @richgalloway stated to not use the S&R. Following that advice, I'm going to create a company app where the general searching will be mandated (should we create another company app for AWS?). I want to ensure that the best practices are met now.

Also another thing: What do you name your AWS Indexes? We're stuck on that.
We want to do something like this:

companyname_aws_prod
companyname_aws_dev
companyname_aws_qa
companyname_aws_security

However, we have it set up so that cloudtrail logs come from one environment, aws_security, which pulls them in from all the different environments. That makes it confusing to decide where those logs belong, since aws_security acts as a repo for all the environments.

Any response would be greatly appreciated!

zonistj
Path Finder

do you still use the AWS app, or do you mainly use your own app that you ported

Both, but for different purposes. The AWS app is great for getting the data into Splunk in a clean way. Specifically, "inputs.conf", "aws_description_tasks.conf", "aws_account_ext.conf", and the other data ingestion configs are what we use.

We created our own app for reports, dashboards, field aliasing, and scheduled searches. Each AWS customer has a different setup and different things they want to do with the data, so it's hard for Splunk to develop one central app that can account for everyone's needs. As you astutely pointed out, the AWS app comes with a lot of content and not all of it is needed for you. For a different customer, they might want to use the stuff that you don't want to use. I think that's part of @richgalloway's logic for creating your own app for that stuff.

we'd like to just have our own non-cluttered app.

Exactly. You should go make your own that has exactly what you need / want.

Following that advice, I'm going to create a company app where the general searching will be mandated

This is a great idea. We do that too. We actually have two different company apps: One for the operations team (support, developers, infrastructure admins, etc...) and one for the security team. This lets us segregate content out in a clean way.

should we create another company app for AWS?

I would recommend that. Think of it like this: you can use the Splunk AWS app to handle ingestion of the data, your company app to handle user knowledge objects like scheduled searches, dashboards, and reports, and your own company AWS app to handle any field aliasing or other data normalization that meets your specific needs. You just have to make sure the permissions are set up correctly so that the apps can all use the necessary resources across apps.

One major benefit of this setup is that it limits the blast radius of changes to each app's configurations and knowledge objects. Say you pivot away from AWS and start using Azure instead: you can just disable / delete / archive the AWS apps and not worry about having to surgically remove that content from your company app. Among other benefits... 🙂

What do you name your AWS Indexes?

We do something similar to what you are describing. We don't aggregate CloudTrail into a security index though. We just send those to the appropriate prod / QA / dev indexes, and the security team has access to those indexes. I don't consider CloudTrail to be security content per se, so there's no need for us to put that in the security index.

I think that answers all of your questions, but if I've missed something, or you have additional questions, please let me know.

Edit to add: Depending on your needs, make use of data models and/or accelerated reports for high-volume CloudTrail and ELB access log data. This might require some tweaking to get exactly what you need, but it's worth it.

thomastaylor
Communicator

Thanks to both of you for your great answers! @richgalloway & @zonistj !

Is there a way I can accept multiple answers?

richgalloway
SplunkTrust

There can be only one accepted answer.

---
If this reply helps you, Karma would be appreciated.

zonistj
Path Finder

Thanks for accepting my answer, but feel free to accept Rich's answer if you can. Sharing knowledge with and helping other Splunk users is what's important to me. I don't pay attention to the Internet points.

richgalloway
SplunkTrust

I recommend creating another company app for AWS. This practice is tidier, makes it easier to find objects, and makes it much easier to replace if a new app comes along or you leave AWS.

Index names beginning with _ are reserved for use by Splunk.

Your CloudTrail logs should have your AWS account ID in them, which you might be able to use to distinguish environments. If account ID doesn't help, then try another field like instance ID. You may need a lookup file to map instance to environment.

thomastaylor
Communicator

Thanks for your quick input! I have edited my original question to include the full index name. I didn't mean for it to start with an underscore.

Secondly, the cloudtrail logs definitely have an accountId in them, which is how we differentiate them now; however, we're confused as to what sort of index naming convention we should use. Following what I stated earlier, the indexes we're planning on using:

companyname_aws_prod
companyname_aws_qa
companyname_aws_dev

The only tricky part is where to ingest the cloudtrail logs, since they contain information from all three of the different environments.

richgalloway
SplunkTrust

Where you put the cloudtrail logs is up to you. I would put it where it will be used/needed most, which probably is Prod.

richgalloway
SplunkTrust

I'll let those more knowledgeable about AWS answer those parts of your question. I'll comment in general.

You should use the add-on if it meets your needs. Don't re-invent the wheel if you don't need to.
Don't create local knowledge objects in the S&R app. Create your own companyname app instead.
The only sure way to limit access to data is to put the data into its own index and limit access to the index. Access rights and retention time are the two chief reasons for creating a new index.
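
To make the index point concrete, here is a minimal sketch that uses the companyname_aws_prod index name from the question. The role name and the one-year retention are placeholders made up for this example, not recommendations.

```ini
# indexes.conf -- retention value is an example only (365 days in seconds)
[companyname_aws_prod]
homePath = $SPLUNK_DB/companyname_aws_prod/db
coldPath = $SPLUNK_DB/companyname_aws_prod/colddb
thawedPath = $SPLUNK_DB/companyname_aws_prod/thaweddb
frozenTimePeriodInSecs = 31536000

# authorize.conf -- hypothetical role that can search only the prod AWS index
[role_aws_prod_reader]
search = enabled
srchIndexesAllowed = companyname_aws_prod
srchIndexesDefault = companyname_aws_prod
```

Be careful with role inheritance here: if this role imports another role that allows a wider set of indexes, the user gets the wider set, so check what any parent role grants before relying on the restriction.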