I have a search:
index=_internal sourcetype=scheduler | eval window_time = if(isnotnull(window_time), window_time, 0) | eval execution_latency = max(dispatch_time - (scheduled_time + window_time), 0) | timechart avg(execution_latency) AS avg_exec_latency, count(eval(status=="completed" OR status=="skipped")) AS total_exec, count(eval(status=="skipped")) AS skipped_exec | eval skip_ratio = round(skipped_exec / total_exec * 100, 2)
It works perfectly in search, but when I use this to create KPI inside ITIS service to set "skipped ratio" number.
It doesn't give the result running.
Do anyone have explanation or solution?
You need to select the service, open the KPI, edit the KPI, scroll to the bottom and open the "Generated Search" box. Open this up to see how ITSI is appending aggregation functions and macro to your search. Then hit ctrl + shift + E
to expand the macros, copy and paste. Look at the first stats command which ITSI applied and see whats causing it not to return values
You need to select the service, open the KPI, edit the KPI, scroll to the bottom and open the "Generated Search" box. Open this up to see how ITSI is appending aggregation functions and macro to your search. Then hit ctrl + shift + E
to expand the macros, copy and paste. Look at the first stats command which ITSI applied and see whats causing it not to return values
I find out the macro that doesn't work inside the query. After I update, how can I fix back to the query?. I cannot copy and past back to "General Search"
Thanks
I don't understand, please elaborate
like the long query back end: (Sorry it is long)
index=_internal sourcetype=scheduler
| eval window_time = if(isnotnull(window_time), window_time, 0)
| eval execution_latency = max(dispatch_time - (scheduled_time + window_time), 0)
| timechart avg(execution_latency) AS avg_exec_latency, count(eval(status=="completed" OR status=="skipped")) AS total_exec, count(eval(status=="skipped")) AS skipped_exec
| eval skip_ratio = round(skipped_exec / total_exec * 100, 2)
| eval avg_exec_latency = round(avg_exec_latency, 2)
| stats latest(skip_ratio) AS alert_value by host
| addinfo
| eval info_max_time=if(info_max_time="+Infinity",now() + 315569260,info_max_time)
| eval _time=info_max_time
| fields - info_min_time info_max_time info_search_time info_sid
| eval sec_grp = "default_itsi_security_group"
| lookup itsi_entities identifier.values as host, sec_grp as sec_grp OUTPUT title as entity_title, _key as entity_key, services._key as serviceid, sec_grp as entity_sec_grp
| eval maintenance_object_type = "entity", maintenance_object_key = entity_key
| lookup operative_maintenance_log maintenance_object_type, maintenance_object_key OUTPUT _key as maintenance_log_key
| eval in_maintenance = if(IsNull(maintenance_log_key), 0, 1)
| fields - maintenance_object_key, maintenance_object_type, maintenance_log_key
| eval is_entity_defined=if(isnull(entity_key), "0", "1"), entity_key=if(isnull(entity_key), "N/A", entity_key), entity_title=coalesce(entity_title,'host'), is_service_aggregate="0", is_entity_in_maintenance = in_maintenance
| fields - host, in_maintenance, entity_sec_grp
| eval serviceid = "a3b8a294-60e8-4517-9311-b89040cfa8bb"
| appendpipe
[ stats max(alert_value) AS alert_value by serviceid, is_entity_in_maintenance
| presort 0 auto(serviceid) auto(is_entity_in_maintenance)
| sort 0 serviceid is_entity_in_maintenance
| dedup consecutive=t serviceid
| eval is_all_entities_in_maintenance=is_entity_in_maintenance, is_service_aggregate="1", is_entity_defined="0", entity_key="service_aggregate", entity_title="service_aggregate"]
| addinfo
| eval info_max_time=if(info_max_time="+Infinity",now() + 315569260,info_max_time)
| eval _time=info_max_time
| fields - info_min_time info_max_time info_search_time info_sid
| eval maintenance_service_id = "a3b8a294-60e8-4517-9311-b89040cfa8bb"
| eval maintenance_object_type = "service", maintenance_object_key = maintenance_service_id
| lookup operative_maintenance_log maintenance_object_type, maintenance_object_key OUTPUT _key as maintenance_log_key
| eval in_maintenance = if(IsNull(maintenance_log_key), 0, 1)
| fields - maintenance_object_key, maintenance_object_type, maintenance_log_key
| eval is_service_in_maintenance = in_maintenance
| fields - in_maintenance, maintenance_service_id
| setseverityfields serviceid="a3b8a294-60e8-4517-9311-b89040cfa8bb", kpiid="8209f5d9ed8cd590a79e7174" handle_no_data=true generate_max_severity_event=true fill_data_gaps=true
| fields - is_all_entities_in_maintenance
| eval kpi="Skipped 2", urgency="5", alert_period="5", serviceid="a3b8a294-60e8-4517-9311-b89040cfa8bb"
| eval urgency = if (is_service_in_maintenance == 1, 0, urgency)
There are two lines that I bold need to fix. After I fix them, How can I bring it back to ITSI? I am confused
You have two issues here.. First one is your round, you're 100 is in the wrong spot, it should look like this
| eval skip_ratio = round(skipped_exec / total_exec, 2)*100
Next up, you're entity is host
but your timechart command doesn't pass host. ITSI best practices call for not using transformational commands, but rather let ITSI do all of that. If you still need to use timechart and think eventstats
wont work, then ditch timechart and pass host through your stats like this
| stats avg(execution_latency) AS avg_exec_latency, count(eval(status=="completed" OR status=="skipped")) AS total_exec, count(eval(status=="skipped")) AS skipped_exec by _time, host
Heres your total query which will work
index=_internal sourcetype=scheduler
| eval window_time = if(isnotnull(window_time), window_time, 0)
| eval execution_latency = max(dispatch_time - (scheduled_time + window_time), 0)
| stats avg(execution_latency) AS avg_exec_latency, count(eval(status=="completed" OR status=="skipped")) AS total_exec, count(eval(status=="skipped")) AS skipped_exec by _time, host
| eval skip_ratio = round(skipped_exec / total_exec, 2)*100
| eval avg_exec_latency = round(avg_exec_latency, 2)
| stats latest(skip_ratio) AS alert_value by host
Thank you for helping me figure out.
I just don't agree with round: I believe the correct I use is
| eval skip_ratio = round((skipped_exec / total_exec)*100 , 2)
and I also try with eventstats and split by host field inside, took out the last line about
| stats latest(skip_ratio) AS alert_value by host
and let ITSI do it (as you said)
index=_internal sourcetype=scheduler (status="completed" OR status="skipped" OR status="deferred")
| eval window_time = if(isnotnull(window_time), window_time, 0)
| eval execution_latency = max(dispatch_time - (scheduled_time + window_time), 0)
| eventstats avg(execution_latency) AS avg_exec_latency
count(eval(status=="skipped")) AS skipped_exec , count(eval(status=="completed" OR status=="skipped")) AS total_exec
| eval skip_ratio = round((skipped_exec / total_exec)*100,2)
| eval avg_exec_latency = round(avg_exec_latency, 2)
Your original round looks like this, which is incorrect
| eval skip_ratio = round(skipped_exec / total_exec 100, 2)
The way you wrote it, is correct along with mine.
Can you accept/upvote the answer since your issue is resolved?
| stats latest(skip_ratio) AS alert_value by host
+++++++
| presort 0 auto(serviceid) auto(is_entity_in_maintenance)*
these two lines I will fix and delete
anyone have thought. Can we replace timechart with any command? with Timechart, it doesn't work when place in ITSI