Trying to see when this search would've triggered an alert over the last few hours. The search normally runs every 10 minutes. It is analyzing standard Windows perfmon data.
The goal is to see when a server had ((Available_MBytes < 100 or Committed_Bytes_In_Use > 80) and Pages_sec > 5000) more than 3 times in a 10 minute bucket. I've reviewed many ask Splunk pages on stats, bin, and bucket, and they all point to using bucket the way it is used below, but this search only ever returns the last 1 or 2 time buckets.
sourcetype="Perfmon:Memory" counter="Pages/sec" | rename Value as v2 | where v2>5000
| join host,_time [search sourcetype="Perfmon:Memory" counter="Available MBytes" | rename Value as v1]
| join host,_time [search sourcetype="Perfmon:Memory" counter="% Committed Bytes In Use" | rename Value as v3]
| where (v1<100 OR v3>80) AND v2>5000
| bucket _time span=10m
| stats count values(v1) as Available_MBytes values(v2) as Pages_sec values(v3) as Committed_Bytes_In_Use by _time host | where count>3
What am I doing wrong?
A new day, a fresh set of eyes. So it appears that I was running into a limit and that my original search does work once I specify one host.
[subsearch]: Subsearch produced 50000 results, truncating to maxout 50000.
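For anyone hitting the same truncation, here is a rough sketch of the single-host variant described above. The host name SERVER01 is a placeholder and not a name from this thread, and whether one host stays under the 50000-result subsearch limit depends on the time range and the perfmon sampling interval, so treat it as an untested illustration.

```spl
sourcetype="Perfmon:Memory" host="SERVER01" counter="Pages/sec" | rename Value as v2 | where v2>5000
| join host,_time [search sourcetype="Perfmon:Memory" host="SERVER01" counter="Available MBytes" | rename Value as v1]
| join host,_time [search sourcetype="Perfmon:Memory" host="SERVER01" counter="% Committed Bytes In Use" | rename Value as v3]
| where (v1<100 OR v3>80) AND v2>5000
| bucket _time span=10m
| stats count values(v1) as Available_MBytes values(v2) as Pages_sec values(v3) as Committed_Bytes_In_Use by _time host | where count>3
```

The host filter is repeated inside each subsearch because the truncation warning applies to the subsearch result sets, not to the outer search.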
Is the _time value the same for all three counters at a given instant?
Yes, because of the join. If you run this search without the bucket, it shows the values lined up in a table.
sourcetype="Perfmon:Memory" counter="Pages/sec" | rename Value as v2 | where v2>5000
| join host,_time [search sourcetype="Perfmon:Memory" counter="Available MBytes" | rename Value as v1]
| join host,_time [search sourcetype="Perfmon:Memory" counter="% Committed Bytes In Use" | rename Value as v3]
| where (v1<100 OR v3>80) AND v2>5000
| stats count values(v1) as Available_MBytes values(v2) as Pages_sec values(v3) as Committed_Bytes_In_Use by host | where count>3
Try this:
sourcetype="Perfmon:Memory" counter="Pages/sec" OR counter="Available MBytes" OR counter="% Committed Bytes In Use"
| eval v1=if(match(counter, "Pages/sec"), Value, null())
| eval v2=if(match(counter, "Available MBytes"), Value, null())
| eval v3=if(match(counter, "% Committed Bytes In Use"), Value, null())
| bucket _time span=10m
| search (v1<100 OR v3>80 OR v2>5000)
| stats values(v1) AS Available_MBytes values(v2) AS Pages_sec values(v3) AS Committed_Bytes_In_Use by _time host
| where (isnotnull(Available_MBytes) OR isnotnull(Pages_sec)) AND isnotnull(Committed_Bytes_In_Use)
Thank you. I modified the search to correct the v1, v2, v3 mapping.
sourcetype="Perfmon:Memory" counter="Pages/sec" OR counter="Available MBytes" OR counter="% Committed Bytes In Use"
| eval v1=if(match(counter, "Available MBytes"), Value, null())
| eval v2=if(match(counter, "Pages/sec"), Value, null())
| eval v3=if(match(counter, "% Committed Bytes In Use"), Value, null())
| bucket _time span=10m
| search (v1<100 OR v2>5000 OR v3>80)
| stats values(v1) AS Available_MBytes values(v2) AS Pages_sec values(v3) AS Committed_Bytes_In_Use by _time host
| where (isnotnull(Available_MBytes) OR isnotnull(Committed_Bytes_In_Use)) AND isnotnull(Pages_sec)
It returns results and got me a lot closer to solving this, but because of the v1 OR v3, most of the alerts are when it trips v3 (Committed_Bytes_In_Use), so the values for v1 are not present. It's a minor thing, but my original search with the join shows all 3 values regardless of whether it was triggered by v1 or v3 when the search is run as an active alert search.
This search also needs one more thing: the where count>3. I can't do a count on the current stats line because the count is not accurate at that point. It needs to be done after the (v1 OR v3) AND v2 logic, but if I do another stats it will remove data.
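One possible way to get both things (all three values shown, and a count taken after the threshold logic) is to line the counters up per timestamp before filtering, then bucket and count. This is only a sketch: it assumes each host reports exactly one sample per counter at a given _time (as the join-based search implies) and that Value is numeric. It has not been run against real perfmon data.

```spl
sourcetype="Perfmon:Memory" (counter="Pages/sec" OR counter="Available MBytes" OR counter="% Committed Bytes In Use")
| eval v1=if(counter="Available MBytes", Value, null())
| eval v2=if(counter="Pages/sec", Value, null())
| eval v3=if(counter="% Committed Bytes In Use", Value, null())
| stats values(v1) AS v1 values(v2) AS v2 values(v3) AS v3 by _time host
| where (v1<100 OR v3>80) AND v2>5000
| bucket _time span=10m
| stats count values(v1) AS Available_MBytes values(v2) AS Pages_sec values(v3) AS Committed_Bytes_In_Use by _time host
| where count>3
```

Because no counter is filtered out before the first stats, v1 is still present on rows that only tripped v3. The count in the second stats is the number of matching samples per host in each 10 minute bucket. The trade-off is that the base search reads every sample of the three counters, so it will be slower than a search that filters on Value up front.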
Give this a try
sourcetype="Perfmon:Memory" counter="Pages/sec" OR counter="Available MBytes" OR counter="% Committed Bytes In Use"
| where (counter="Pages/sec" AND Value>5000) OR (counter="Available MBytes" AND Value<100) OR (counter="% Committed Bytes In Use" AND Value>80)
| eval counter=case(counter="Pages/sec","v2",counter="Available MBytes","v1",1=1,"v3") | eval temp=_time."-".host
| chart values(Value) over temp by type | where (v1<100 OR v3>80) AND v2>5000
| eval _time=mvindex(split(temp,"-"),0) | eval host=mvindex(split(temp,"-"),1)
| bucket _time span=10m
| stats count values(v1) as Available_MBytes values(v2) as Pages_sec values(v3) as Committed_Bytes_In_Use by _time host | where count>3
#Update
try this
sourcetype="Perfmon:Memory" counter="Pages/sec" OR counter="Available MBytes" OR counter="% Committed Bytes In Use"
| where (counter="Pages/sec" AND Value>5000) OR (counter="Available MBytes" AND Value<100) OR (counter="% Committed Bytes In Use" AND Value>80)
| eval v1=if(match(counter, "Available MBytes"), Value, null())
| eval v2=if(match(counter, "Pages/sec"), Value, null())
| eval v3=if(match(counter, "% Committed Bytes In Use"), Value, null())
| stats count values(v1) as V1 values(v2) as V2 values(v3) as V3 by _time host
| where (V1<100 OR V2>5000 OR V3>80)
| bucket _time span=10m
| stats count values(v1) as Available_MBytes values(v2) as Pages_sec values(v3) as Committed_Bytes_In_Use by _time host | where count>3
updated:
sourcetype="Perfmon:Memory" ((counter="Pages/sec" AND Value>5000) OR (counter="Available MBytes" AND Value<100) OR (counter="% Committed Bytes In Use" AND Value>80))
| eval v1=if(match(counter, "Available MBytes"), Value, null())
| eval v2=if(match(counter, "Pages/sec"), Value, null())
| eval v3=if(match(counter, "% Committed Bytes In Use"), Value, null())
| stats values(v1) as V1 values(v2) as V2 values(v3) as V3 values(_time) as When by _time host | eval When=strftime(When,"%m/%d/%y %H:%M:00")
| where (V1<100 OR V3>80) AND V2>5000
| bucket _time span=10m
| stats count values(V1) as Available_MBytes values(V2) as Pages_sec values(V3) as Committed_Bytes_In_Use values(When) as When by _time host | where count>3
Thank you. The updated search works well across multiple servers because it lets me know which ones would have alerted at that time, but because of the second stats, the values for Available_MBytes, Pages_sec, and Committed_Bytes_In_Use are shown as blank columns in the Statistics tab.
Thank you, but this search doesn't return any data for me, not sure how to troubleshoot it.
Check whether the query up to | where (v1<100 OR v3>80) AND v2>5000 returns anything.
My search has triggered as an ongoing alerting search (without the bucket), so I have a server and a time window that should return results when I run this search historically.
If I run your search without the bucket, there are events that match but no statistics tab.
Getting the obvious out of the way - Are you running the search over the correct time range? Also, you don't really need AND v2>5000 later in the search (I'm guessing it is just an artifact of trial and error).
Yep, I've been testing the search with the time window set to the last 4 hours. The second v2>5000 was actually the original one; it was then also added to the first part of the search to try to speed it up. You're correct that it doesn't need to be in both places, but it shouldn't affect the bucket functionality. Also, I've been running this on 6.3.3 and just tested it on 6.2.5, and it returns the same results on both versions.