Solved: How to create an hourly alert when never seen befo...

gesman · ‎02-13-2015

I need to build an hourly alert that fires when never-before-seen events (with certain unique characteristics) appear in an index.
In essence, this is the logic:

1. Get the list of fresh events (with three fields of interest: field1, field2, field3): index=mydata earliest=-1h@h | dedup field1, field2, field3
2. For each event found, run this pseudo-search: index=mydata latest=-1h@h field1 (field2 OR field3) | stats count
3. Alert if count=0. In other words, alert if no earlier event matches field1 AND (field2 OR field3) of an event found in step 1.
4. Return the original event that generated the alert.
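
To make the intended matching rule concrete, here is a minimal sketch of the steps above in plain Python rather than SPL. It assumes events are already loaded as dictionaries with the keys field1, field2 and field3; the function name never_seen_before and the in-memory lists are hypothetical and only illustrate the logic, not how Splunk would execute it.

```python
from typing import Dict, Iterable, List

Event = Dict[str, str]


def never_seen_before(fresh: Iterable[Event], history: Iterable[Event]) -> List[Event]:
    """Return fresh events with no earlier match on field1 AND (field2 OR field3)."""
    history = list(history)
    seen_keys = set()
    alerts = []
    for event in fresh:
        # Imitates `dedup field1, field2, field3` on the last hour of events.
        key = (event["field1"], event["field2"], event["field3"])
        if key in seen_keys:
            continue
        seen_keys.add(key)
        # any() stops at the first earlier match, like taking only one result.
        matched = any(
            old["field1"] == event["field1"]
            and (old["field2"] == event["field2"] or old["field3"] == event["field3"])
            for old in history
        )
        if not matched:
            # count=0 in the pseudo-search: keep the original event for the alert.
            alerts.append(event)
    return alerts
```

Calling never_seen_before(last_hour_events, older_events) returns the events that should trigger the alert, in the order they were found.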

Suggestions are welcome.

gesman · ‎02-24-2015

I found the solution to the task.

First, to simplify the task definition: be able to execute a fully custom query for each event found and collect the results.

Lookups allow you to find matches within the lookup source based on field1 AND field2 AND field3 logic.
Subsearches allow you to pull subsearch results and then run the outer query in either a field1 AND field2 AND field3 or a field1 OR field2 OR field3 manner. You can control the "outer" behavior of subsearch results somewhat with the ... | format ... parameters, but not to the extent of having a custom-crafted outer query based on the returned results.

So the solution is based on the ability to craft a fully custom search as a string and then return it to the outer search as a single field named search.
The outer search takes it as is and executes it.

The trick was to prevent Splunk from post-tweaking the search and getting confused by some elements of it.
For example, Splunk would not allow the search string to contain ...earliest=... latest=... elements, and it would get confused if the returned search string contained aliases.
The solution to both was to code them within macros and then include the macro within the search= string returned to the outer search.
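
The string mechanics can be sketched in Python as follows. This is only an illustration under assumptions: the macro name previous_window is hypothetical and stands in for a macro whose definition holds the time bounds, and the per-event string simply reuses the pseudo-search from the question. The per-event query actually used in the solution is more elaborate and is not reproduced here. The combine step imitates the stats values(...) and mvjoin(all_searches, " ") lines of the blueprint further down.

```python
from typing import Dict, List

# Hypothetical macro name. It stands in for a macro whose definition holds
# the time bounds (such as latest=-1h@h) that cannot appear literally.
TIME_MACRO = "previous_window"


def craft_search(event: Dict[str, str]) -> str:
    """Render the question's pseudo-search for one event, time bounds via macro."""
    return (
        f"index=mydata `{TIME_MACRO}` "
        f"{event['field1']} ({event['field2']} OR {event['field3']})"
    )


def combine(events: List[Dict[str, str]]) -> str:
    """Glue the per-event strings into one string for the `search` field.

    The set plus sorted() imitates `stats values(...)`, which keeps distinct
    values in sorted order; " ".join imitates `mvjoin(all_searches, " ")`.
    """
    return " ".join(sorted({craft_search(e) for e in events}))
```

How each per-event string has to be shaped so that the joined result is a valid search is specific to the real query and is left out of this sketch.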

This is the blueprint of the solution.
Note: index=NONEXISTENT is, again, a trick to prevent Splunk from getting confused. Without it, Splunk would try to search for everything before running the main subsearch. index=NONEXISTENT solves this by making Splunk quickly return zero results and move on to the subsearch.

Below you'll notice that for each pre-found event a mind-boggling custom search query is crafted that itself contains double-nested subsearches. While it looks a bit scary, what it does is first create an empty event with previous_match_found=0. Then it runs the custom search, and if any results are found (it grabs only the first one to save time), previous_match_found becomes 1.

Once the whole monster executes, it's easy to filter on ... | where previous_match_found=0 ... to accomplish the final task: finding unique events that never occurred before.

`index=NONEXISTENT
[search ...searching for special events... | fields field1, field2, field3

| eval COMMENT="Here we got events. Now per each event - craft custom search query:"

| eval COMMENT="Glue all queries together into the single one:"

| stats values(search_this) AS all_searches
| eval search=mvjoin(all_searches, " ")

| eval COMMENT="Finally, return main combined search query back to the outer search :"

| fields search
]
| where previous_match_found=0
...
`
On a final note, this search apparently runs pretty fast considering the volume of data.

