Splunk IT Service Intelligence

itsi_event_grouping job always fails, itsirulesengine looks involved

TorbjörnP
Engager

Hi Team,

Splunk Enterprise 8.0.6 on Windows 10 (Swedish OS), with ITSI 4.4.5 on top of that.
I have checked the Known Issues in the release notes for 4.4.5.

Background:
Looking at the ITSI Health Check dashboard, I noticed that the itsi_event_grouping search always fails. (It starts to run but then fails.)

After some troubleshooting, I came across a Java exception in itsi_rules_engine.log:

2020-10-15 09:59:30,365 INFO [itsi_re(reId=KJo1,reMode=RealTime)] [main] RulesEngineSearch:52 - RulesEngineTask=RealTimeSearch, Status=Stopped, FunctionMessage="java.lang.NumberFormatException: For input string: "1602698533,696"
at sun.misc.FloatingDecimal.readJavaFormatString(Unknown Source)
at sun.misc.FloatingDecimal.parseDouble(Unknown Source)
at java.lang.Double.parseDouble(Unknown Source)
at com.splunk.itsi.rule.engine.core.utils.CommonUtils.createGroup(CommonUtils.java:747)
at com.splunk.itsi.rule.engine.core.utils.CommonUtils.getRestorableGroupsFromEvents(CommonUtils.java:705)
at com.splunk.itsi.rule.engine.core.TaskManager.restoreGroupState(TaskManager.java:1199)
at com.splunk.itsi.rule.engine.core.TaskManager.preProcessing(TaskManager.java:1285)
at com.splunk.itsi.rule.engine.core.TaskManager.startStreaming(TaskManager.java:1329)
at com.splunk.itsi.search.chunk.RulesEngineSearch.main(RulesEngineSearch.java:50)
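
For context: the stack trace ends in java.lang.Double.parseDouble, which accepts only "." as the decimal separator, so a timestamp written with a decimal comma cannot be parsed. Python's float() is equally strict, so the minimal sketch below reproduces the same kind of failure outside ITSI. The helper names parse_epoch_strict and parse_epoch_tolerant are hypothetical and are not part of ITSI.

```python
def parse_epoch_strict(text: str) -> float:
    """Parse an epoch timestamp; only '.' is accepted as the decimal separator."""
    return float(text)


def parse_epoch_tolerant(text: str) -> float:
    """Hypothetical helper: accept a single decimal comma by converting it to '.'."""
    if text.count(",") == 1 and "." not in text:
        text = text.replace(",", ".")
    return float(text)


if __name__ == "__main__":
    value = "1602698533,696"  # the value from the exception message
    try:
        parse_epoch_strict(value)
    except ValueError as err:
        print("strict parse failed:", err)
    print("tolerant parse:", parse_epoch_tolerant(value))
```

This only illustrates why the string is rejected; it does not change how the rules engine itself parses the field.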

OK, so the next step was to find out where the input string "1602698533,696" comes from.

Back in the itsi_rules_engine.log file, a few lines above the exception, a "groupInfoSearch" is started:

2020-10-15 09:59:29,954 INFO [itsi_re(reId=1zMs,reMode=RealTime)] [main] TaskManager:344 - FunctionName=RunSplunkSearch, SearchName=groupInfoSearch, Status=Started (Full SearchQueryText below) 

After stripping down the search query, I found events from KPI alerts that contain this value in the itsi_first_event_time field:

itsi_first_event_time = 1602698533,696

Question: How can I get rid of this value, or work around it so that the job can complete successfully?

Since the value is stored in an event and itsi_event_grouping runs over All time (real-time), my conclusion is that the job will fail every time it encounters this itsi_first_event_time value.
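
To get an idea of how many events are affected, something like the following could be run over exported search results. This is a rough sketch that assumes the events have been exported (for example as JSON) and loaded as a list of dictionaries; the function name find_bad_time_fields is hypothetical, and the field names are the time fields that appear in the search query.

```python
import re

# Epoch seconds written with a decimal comma, e.g. "1602698533,696"
DECIMAL_COMMA = re.compile(r"^\d+,\d+$")

# Time fields taken from the groupInfoSearch query text
TIME_FIELDS = (
    "itsi_first_event_time",
    "itsi_last_event_time",
    "itsi_earliest_event_time",
)


def find_bad_time_fields(events):
    """Yield (event_index, field, value) for each time field with a decimal comma."""
    for index, event in enumerate(events):
        for field in TIME_FIELDS:
            value = event.get(field)
            if isinstance(value, str) and DECIMAL_COMMA.match(value):
                yield index, field, value


if __name__ == "__main__":
    # Made-up sample data for illustration only
    sample = [
        {"itsi_group_id": "a", "itsi_first_event_time": "1602698533,696"},
        {"itsi_group_id": "b", "itsi_first_event_time": "1602698533.696"},
    ]
    for hit in find_bad_time_fields(sample):
        print(hit)
```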

Grateful for any input on this.

Kind Regards

TobbeP

 

---------------------

This is the SearchQueryText="earliest=-24h latest=now _index_earliest=null _index_latest=null allow_partial_results=false search `itsi_event_management_group_index_with_close_events` | stats max(itsi_group_count) as itsi_group_count values(itsi_is_last_event) as itsi_is_last_event max(itsi_last_event_time) as itsi_last_event_time first(itsi_parent_group_id) as itsi_parent_group_id first(itsi_policy_id) as itsi_policy_id first(itsi_split_by_hash) as itsi_split_by_hash first(itsi_first_event_id) as itsi_first_event_id min(itsi_first_event_time) as itsi_first_event_time min(itsi_earliest_event_time) as itsi_earliest_event_time latest(itsi_group_assignee) as itsi_group_assignee latest(itsi_group_description) as itsi_group_description latest(itsi_group_severity) as itsi_group_severity latest(itsi_group_status) as itsi_group_status latest(itsi_group_ace_template_id) as itsi_group_ace_template_id latest(itsi_group_title) as itsi_group_title by itsi_group_id | where itsi_is_last_event!="true" | sort 0 -itsi_last_event_time | lookup itsi_notable_group_user_lookup _key AS itsi_group_id OUTPUT owner severity status | lookup itsi_notable_group_system_lookup _key AS itsi_group_id OUTPUT is_active | where is_active=1 | eval itsi_group_assignee=coalesce(owner, itsi_group_assignee), itsi_group_severity=coalesce(severity, itsi_group_severity), itsi_group_status=coalesce(status, itsi_group_status)"

 


TorbjörnP
Engager

Just wanted to give an update on this.

I reconfigured the server and the clients running the universal forwarder to use en_US formatting and en_US as the location, then waited for the old data to age out. That seems to have done the trick.
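
As a quick sanity check on a host, the decimal separator that the C runtime reports for the configured locale can be printed with a few lines of Python. This is only a sketch: the function name decimal_point_for is hypothetical, and the result reflects what Python sees for the current user, which is not necessarily what Splunk or ITSI uses internally.

```python
import locale


def decimal_point_for(locale_name: str = "") -> str:
    """Return the decimal separator reported for a locale.

    An empty name means "use whatever the environment is configured for".
    The previous LC_NUMERIC setting is restored before returning.
    """
    previous = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
    try:
        locale.setlocale(locale.LC_NUMERIC, locale_name)
        return locale.localeconv()["decimal_point"]
    finally:
        locale.setlocale(locale.LC_NUMERIC, previous)


if __name__ == "__main__":
    separator = decimal_point_for("")
    print("decimal separator for this environment:", repr(separator))
    if separator != ".":
        print("decimal values are formatted with a separator other than '.' here")
```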
