Splunk Search

Why is search not working properly on duplicate inner records?

user9025
Path Finder

I have a Splunk query whose intent is to get every IpAddress for which "Event A" occurred between 24 hours and 4 hours ago, but for which "Event B" did not occur at any point in the last 24 hours.

It is known that "Event A" will occur at most once per IP address, whereas "Event B" can occur multiple times.

Following is the query:

 

 

index=prod-* sourcetype="kube:service" "Event A"  earliest=-24h latest=-4h  |table IpAddress | search NOT [search index=prod-* sourcetype="kube:service" AND ("Event B")  earliest=-24h latest=-0h |table IpAddress ]

 

 

Why is the first query not working?

It still returns an IP address even when that address has an "Event A" and multiple "Event B" events.

But if I add dedup IpAddress to the inner NOT subsearch, it works fine.

Updated query:

 

 

index=prod-* sourcetype="kube:service" "Event A"  earliest=-24h latest=-4h  |table IpAddress | search NOT [search index=prod-* sourcetype="kube:service" AND ("Event B")  earliest=-24h latest=-0h |dedup IpAddress|table IpAddress ]

 

 

1 Solution

jdunlea
Contributor

If you have a lot of events with "Event B" in your data, then you might be hitting the result limit for the subsearch (10k results by default). In that case the subsearch returns only the first 10k events, which might cover only a small number of IP addresses (if many events have the same IP address), so some addresses that do have an "Event B" never reach the NOT filter.

Using dedup makes the result count much smaller, probably fewer than 10k IP addresses, so the subsearch can return all of the IP addresses to the outer search, which then does the filtering.
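
One way to check whether the limit is really being hit is to run the inner search on its own and compare how many rows it would hand back with how many distinct IP addresses it contains. This is only a sketch: event_b_rows and distinct_ips are hypothetical names chosen for illustration, and the 10k figure is the default, which your deployment may have changed.

```spl
index=prod-* sourcetype="kube:service" "Event B" earliest=-24h latest=now
| stats count AS event_b_rows dc(IpAddress) AS distinct_ips
```

If event_b_rows is well above 10,000 while distinct_ips is below it, truncation is the likely cause, and collapsing the subsearch to one row per address (with dedup IpAddress, or with stats count by IpAddress | fields IpAddress) should let every address through.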

 

Side note: You might be able to do this using a single search (no subsearch) by doing something like the following (please note: you will need to create the event_flag field yourself using your own regex/match)

 

index=prod-* sourcetype="kube:service" ("Event A"  earliest=-24h latest=-4h) OR ("Event B" earliest=-24h latest=-0h)  | eval event_flag=if(match(_raw,"Event A"),"Event_A","Event_B")
| stats values(event_flag) as event_flag dc(event_flag) as event_count by IpAddress
| search event_count=1 event_flag="Event_A"

 

 

 

johnhuang
Motivator

Subsearches have limitations, including a 10k-result cap and a 60-second runtime limit. The dedup reduces the number of results to fewer than 10k.

Subsearches are also inefficient compared to other methods. You should write a primary search that includes both event types and use stats, etc. to filter. If you need help with this, you should provide the actual search terms/fields for Event A and B.
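
As a rough sketch of that approach, assuming the literal strings "Event A" and "Event B" appear in _raw and that IpAddress is already extracted, a single search can count both event types per address and keep only the addresses that have an Event A older than 4 hours and no Event B at all. The field names is_a, is_b, a_count, and b_count are hypothetical, and the string matches are placeholders for your real search terms.

```spl
index=prod-* sourcetype="kube:service" ("Event A" OR "Event B") earliest=-24h latest=now
| eval is_a=if(like(_raw, "%Event A%") AND _time < relative_time(now(), "-4h"), 1, 0)
| eval is_b=if(like(_raw, "%Event B%"), 1, 0)
| stats sum(is_a) AS a_count sum(is_b) AS b_count by IpAddress
| where a_count > 0 AND b_count = 0
```

Because everything runs in one pass with stats, no subsearch result cap or subsearch runtime limit applies to the Event B side.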
