Splunk Search

Search set of strings from lookup and list count of occurance

ma_anand1984
Contributor

Hi,

How can i search several string got as input from look-up and display a table with number of occurrence of each string

I'm trying to find this out for quite some time
Thank you,

1 Solution

sideview
SplunkTrust
SplunkTrust

Restating the goal ( to check that I understood it right)

given a big set of simple terms like "fred" and "mildred"
-- search a bunch of indexed events for all of these terms at once
-- match events that contain one or more of the terms
-- end up with a table showing the number of occurrences of each term in those events (assuming some events match more than one term)

Here's my nutty idea. First, in english:

a) use inputlookup in a subsearch to generate the searchterms
b) use a second inputlookup command in a second subsearch to actually glue ALL of the terms from the entire lookup onto each row of matched events, as a field called foo, with each set of terms separated from the others by some safe character..
c) back in the outer search, use the eval command to split the giant superset of all foo values, by our safe character, generating a multivalue field called foo.
c) then use mvexpand foo and now we instead of N lookup rows and M events, we have N*M events. (It becomes important that your number of terms doesn't get terribly large. 100 should be fine)
d) use a where clause to filter it down to just the rows where the _raw value contains the foo value. (We'll have to lowercase both sides because the like() function is case-sensitive)
e) stats count by foo.

Here it is again, in the search language. Note you'll have to change the name of your lookup and the name of the field in your lookup.

[| inputlookup your_lookup_here | rename yourStringFieldName as search | fields search | format] 
| rename _raw as rawText
eval foo=[| inputlookup your_lookup_here | stats values(yourStringFieldName) as query | eval query=mvjoin(query,",") | fields query | format "" "" "" "" "" ""] 
| eval foo=split(foo,",") 
| mvexpand foo 
| eval foo=lower(foo)     
| eval rawText=lower(rawText) 
| where like(rawText,"%"+foo+"%")
| stats count by foo

View solution in original post

sideview
SplunkTrust
SplunkTrust

Restating the goal ( to check that I understood it right)

given a big set of simple terms like "fred" and "mildred"
-- search a bunch of indexed events for all of these terms at once
-- match events that contain one or more of the terms
-- end up with a table showing the number of occurrences of each term in those events (assuming some events match more than one term)

Here's my nutty idea. First, in english:

a) use inputlookup in a subsearch to generate the searchterms
b) use a second inputlookup command in a second subsearch to actually glue ALL of the terms from the entire lookup onto each row of matched events, as a field called foo, with each set of terms separated from the others by some safe character..
c) back in the outer search, use the eval command to split the giant superset of all foo values, by our safe character, generating a multivalue field called foo.
c) then use mvexpand foo and now we instead of N lookup rows and M events, we have N*M events. (It becomes important that your number of terms doesn't get terribly large. 100 should be fine)
d) use a where clause to filter it down to just the rows where the _raw value contains the foo value. (We'll have to lowercase both sides because the like() function is case-sensitive)
e) stats count by foo.

Here it is again, in the search language. Note you'll have to change the name of your lookup and the name of the field in your lookup.

[| inputlookup your_lookup_here | rename yourStringFieldName as search | fields search | format] 
| rename _raw as rawText
eval foo=[| inputlookup your_lookup_here | stats values(yourStringFieldName) as query | eval query=mvjoin(query,",") | fields query | format "" "" "" "" "" ""] 
| eval foo=split(foo,",") 
| mvexpand foo 
| eval foo=lower(foo)     
| eval rawText=lower(rawText) 
| where like(rawText,"%"+foo+"%")
| stats count by foo

ma_anand1984
Contributor
0 Karma

ma_anand1984
Contributor

Full query looks like



[| inputlookup your_lookup_here | rename yourStringFieldName as search | fields search | format]
| rename _raw as rawText
eval foo=[| inputlookup your_lookup_here | stats values(yourStringFieldName) as query | eval query=mvjoin(query,",") | fields query eval query = "\"".query."\""]
| eval foo=split(foo,",")
| mvexpand foo
| eval foo=lower(foo)

| eval rawText=lower(rawText)
| where like(rawText,"%"+foo+"%")
| stats count by foo

0 Karma

ma_anand1984
Contributor

please check this post, the issue i had is fixed in the post http://splunk-base.splunk.com/answers/31842/why-cant-i-use-subsearch-in-the-case-function-in-the-eva...

Eval part should be like below



eval foo=[| inputlookup your_lookup_here | stats values(yourStringFieldName) as query | eval query=mvjoin(query,",") | fields query | eval query = "\"".query."\""]

0 Karma

ma_anand1984
Contributor

For those who look at this question

I get error in EVAL command. But it works fine in Nick's system. He tried a great deal in helping me, but could not succeed. If i succeed, I will drop a note. If others got some error with the above command and fixed it, please post the same here for the benefit of the community

0 Karma

ma_anand1984
Contributor

@sideview
All simple strings without any space OR NOT

0 Karma

sideview
SplunkTrust
SplunkTrust

Are the strings just simple strings with no spaces, or are some of them multiple search terms (ie each one has spaces in it), or are they complex search expressions with parentheses, OR's and NOT's in them. And as a followup, if they do contain spaces, but not complex logic (parens, OR's, NOT's), then do you want the terms searched for as a single string, ie "foo bar baz", where it matches only if the exact string "foo bar baz" is in the raw event? or do you want it to match any event that has "foo", "bar" and "baz" as indexed terms anywhere in the event.

dart
Splunk Employee
Splunk Employee

Does
|inputlookup my_lookup | stats count by field
do what you need?

SarahBOA
Path Finder

eventtype won't work unfortunately. Due to permissions on the splunk instance and server etc, we need to be able to pull the list (which will be changed by non-splunk admin users) in from a csv file.

0 Karma

dart
Splunk Employee
Splunk Employee

Could you define the things you are searching for as eventtypes? Then count by eventtype

0 Karma

SarahWKarvenz
Path Finder
  1. The other issue is one of performance. It may be possible to have over 100 strings that we are searching for and wanting to count....but also that this query has to be run every 2 minutes on a large set of data - looping through the results set for each string is not possible.

I am wondering how the splunk highlighting works - essentially we would like to make those highlighted words into a single multivalued field which we could then count.

We would like the final output to look like:
search_string count_of_times_in_last_two_minutes

0 Karma

SarahWKarvenz
Path Finder

No, it doesn't. There are several issues that we are dealing with:

  1. First, the problem is that the search strings are not a field.
  2. It would be great if we could make the search strings display as a field or set of fields, but again, a few issues that we need to resolve first. a. The logs are free form and there is no standard format in order to find the specified text. b. It is possible to have more than one matching string in an event.
0 Karma

ma_anand1984
Contributor

I didnt get the answer i need. Thought my question was not clear enough. I closed the other question

0 Karma

Ayn
Legend
0 Karma
Get Updates on the Splunk Community!

.conf24 | Day 0

Hello Splunk Community! My name is Chris, and I'm based in Canberra, Australia's capital, and I travelled for ...

Enhance Security Visibility with Splunk Enterprise Security 7.1 through Threat ...

(view in My Videos)Struggling with alert fatigue, lack of context, and prioritization around security ...

Troubleshooting the OpenTelemetry Collector

  In this tech talk, you’ll learn how to troubleshoot the OpenTelemetry collector - from checking the ...