Searching a Field with a Large Number of Selectors - NOT

tom_porter
Explorer

I have a search for which I need to tune out a large number of values (about 25) in a proctitle command field. I am currently using:

NOT proctitle IN ("*<proc1>*", "*<proc2>*", ......., "*<proc25>*")

I'm worried about performance on the search head and am looking for ways to lower the CPU and memory burden.

I have two possible solutions:

1) Create a data model and place this search as a constraint.

2) Tag events on ingest with proctitle IN ("*<proc1>*", "*<proc2>*", ......., "*<proc25>*") and use this tag as a constraint in the data model.

I've played with #1.  Is #2 possible, and is there a more efficient way to do this?

Thanks in advance.


bowesmana
SplunkTrust

This will partly depend on what proportion of the total data you are looking to exclude. If the excluded proctitles make up a significant proportion of the data, then filtering with a post-process where or regex clause may not perform well, because those events still have to be retrieved before they are discarded. You will have to experiment with that.

Setting tags will still involve a search-time extraction to evaluate the tag, so under the hood the same search is still being run.

You might want to look at the TERM directive; see this link:

https://conf.splunk.com/files/2020/slides/PLA1089C.pdf

You will need to understand what constitutes a TERM in your data and whether it will work for your use case, but where it applies it can significantly improve performance.

When you are looking at this type of performance issue, check the job properties in the Job Inspector and look at the scan count values: the more events you scan, the more data you are having to check.

You could go down the indexed extraction route, where you set a field at index time. That will improve search performance at the cost of indexing performance and disk space, but it is somewhat static: if you later need to exclude a new proctitle, it won't help.


ITWhisperer
SplunkTrust

Double wild-carded strings are not very efficient. Could you perhaps extract the "proc" values into a field and then use a where command to exclude the events with the undesired values?
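
A minimal sketch of that suggestion, assuming the executable path is the first whitespace-delimited token of proctitle. The field name proc and the <...> placeholders are made up for illustration, and the exclusion list would need all of the real values:

```spl
index=<your_index> sourcetype=<your_sourcetype>
| rex field=proctitle "^(?<proc>\S+)"
| where NOT in(proc, "<proc1>", "<proc2>")
```

Because the filtering happens after the events have been retrieved, this avoids the double wildcards but still scans every event returned by the base search. Events where the rex does not match will have no proc value and may be dropped by the where clause as well, so it is worth checking that against your data.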


tom_porter
Explorer

Hmm, here's my dilemma. My field called proctitle has the entire command in it. One example is where I have proctitle="/bin/chmod 440 /etc/sudoers" and I want to exclude the chmod term. I have 32 such terms I need to exclude.

I'll share with you that I am attempting to develop a Linux auditd detection for Account Manipulation per the MITRE ATT&CK framework (https://attack.mitre.org/techniques/T1098/). This search will look for attempts to modify the sshd_config, passwd, groups, shadow and sudoers files. In examining existing data, I have determined that there are legitimate processes (the 32 terms mentioned) in the proctitle field of the event data that will trigger this alert. (It was a tedious effort, but I traced through the parent process IDs to justify this list.) If I eliminate these 32, 99% of my noise is filtered out.

Most of my terms are bounded by major breaks. The example I used is not, but if I use /bin/chmod instead of chmod, it would work. Let me try this and report back.
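
For illustration, the change described above might look roughly like this for the chmod case. This is only a sketch: <your_index> and <your_sourcetype> are placeholders, and it assumes the raw event really contains the text proctitle="/bin/chmod 440 /etc/sudoers", so that /bin/chmod sits between two major breakers (the double quote and the space).

```spl
index=<your_index> sourcetype=<your_sourcetype> NOT TERM(/bin/chmod)
```

Using /bin/chmod rather than chmod keeps the whole term between major breakers, since the / in front of chmod is a minor breaker by default. The Job Inspector is a reasonable place to compare how this version and the wildcard version behave.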


tom_porter
Explorer

This worked. I was able to develop a data model that included the following as a constraint:

NOT (TERM(proc1) OR TERM(proc2) OR ... OR TERM(procn))

Thanks,

Tom
