How to search using a set of values?

olawalePS
Path Finder

Hello All,

I am relatively new to Splunk and I am trying to search using sets. A set here refers to a group of values that I import into Splunk and then match against the logs from a data source, returning events that contain any of the values in the set. It is something like a reference set in QRadar.

The use case I am trying to implement is an alert for blacklisted applications. I have a .csv file that contains two columns, application name and application category. I want to import this data into Splunk and then use the values in the application name column to search against the processName field of the logs from the endpoint security solution.

How do I achieve this in Splunk? I have read through the documentation for lookup but I did not understand how it would help me achieve my objective.


bowesmana
SplunkTrust

There are two ways to do this. Which one to choose depends on data volume and field cardinality.

1. Filter with subsearch

your_search [
  | inputlookup your_lookup.csv 
  | fields application_name
  | rename application_name as processName ]

This filters the raw data coming from the index down to the events whose processName matches an application_name value in the lookup. If the lookup is large, this may be slower, because the subsearch runs first and the values it returns are then added to the outer search as a constraint of the form (A=1 OR A=2 OR A=3).
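
To see the constraint the subsearch generates, one option is to run the subsearch body on its own and append the format command, which renders the rows as a search expression. This is only a sketch, using the same placeholder lookup and field names as above:

```spl
| inputlookup your_lookup.csv
| fields application_name
| rename application_name as processName
| format
```

The result is a single field named search that holds an OR-ed list of processName="..." terms. Subsearches are also capped on result count and runtime (configurable in limits.conf), so a very large lookup can be silently truncated. In that case the lookup approach in option 2 is the safer choice.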

 

2. Using a lookup

your_search 
| lookup your_lookup.csv application_name as processName OUTPUT application_name as foundProcessName
| where isnotnull(foundProcessName)

This pulls all the data from the index and then looks up each processName against the application_name column of the lookup. The OUTPUT clause creates a new field, foundProcessName, only when the processName exists in the lookup, so isnotnull then keeps only the events that were found.

Performance will vary between the two, so look at the job inspector to see which one is the right one for your data.

 

olawalePS
Path Finder

[Attached screenshots: Screen Shot 2022-09-20 at 10.34.14.png, Screen Shot 2022-09-20 at 12.32.43.png]


@bowesmana I still do not understand how to run the search. The images above are a sample log from the EDR and the csv table. I want a search that will use wildcards to check whether the process_cmdline value in the EDR logs matches any of the values in the application name column of the csv. I can then save it as an alert to notify me if a user has one of those applications running.

Will the search query you provided achieve that?


bowesmana
SplunkTrust

Yes.

However, if you need to handle wildcards in the lookup, then you will need to create a "lookup definition" that is based on your lookup .csv file. In that definition, under the advanced options, set the match type to WILDCARD(ApplicationName). I would suggest that you remove spaces from the field names to make this easier to do.

Then you will also need to add a leading and trailing * character to each of the application names, so that the lookup can match them as wildcards.

Then either of the solutions I gave should work - removing spaces from the field names will make life easier in general.
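
As a rough sketch of those steps, assume the column has already been renamed to ApplicationName and that a lookup definition has been created over the rewritten file with match type WILDCARD(ApplicationName). The definition name blacklisted_apps and the file name your_lookup_wildcard.csv are made up for illustration. The wildcards could be added once with a search like this:

```spl
| inputlookup your_lookup.csv
| eval ApplicationName="*".ApplicationName."*"
| outputlookup your_lookup_wildcard.csv
```

The lookup command then has to reference the lookup definition rather than the .csv file, because the match type is a property of the definition. Using the process_cmdline field from the EDR logs:

```spl
your_search
| lookup blacklisted_apps ApplicationName as process_cmdline OUTPUT ApplicationName as matchedApplication
| where isnotnull(matchedApplication)
```

Lookup matching is case-sensitive by default, so the case-sensitive option on the definition may need to be turned off if the command lines vary in case.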

Try the options and let me know how it goes.

 
