Splunk IT Service Intelligence

itsi_event_grouping job always fails, itsirulesengine looks involved

TorbjörnP
Engager

Hi Team,

Splunk Enterprise 8.0.6 on Windows 10 (Swedish OS), with ITSI 4.4.5 on top of that.
I have checked the Known Issues in the release notes for 4.4.5.

Background:
Looking at the ITSI Health Check dashboard, I noticed that the itsi_event_grouping search always fails. (It starts to run but then fails.)

After some troubleshooting I came across a Java exception in itsi_rules_engine.log:

2020-10-15 09:59:30,365 INFO [itsi_re(reId=KJo1,reMode=RealTime)] [main] RulesEngineSearch:52 - RulesEngineTask=RealTimeSearch, Status=Stopped, FunctionMessage="java.lang.NumberFormatException: For input string: "1602698533,696"
at sun.misc.FloatingDecimal.readJavaFormatString(Unknown Source)
at sun.misc.FloatingDecimal.parseDouble(Unknown Source)
at java.lang.Double.parseDouble(Unknown Source)
at com.splunk.itsi.rule.engine.core.utils.CommonUtils.createGroup(CommonUtils.java:747)
at com.splunk.itsi.rule.engine.core.utils.CommonUtils.getRestorableGroupsFromEvents(CommonUtils.java:705)
at com.splunk.itsi.rule.engine.core.TaskManager.restoreGroupState(TaskManager.java:1199)
at com.splunk.itsi.rule.engine.core.TaskManager.preProcessing(TaskManager.java:1285)
at com.splunk.itsi.rule.engine.core.TaskManager.startStreaming(TaskManager.java:1329)
at com.splunk.itsi.search.chunk.RulesEngineSearch.main(RulesEngineSearch.java:50)
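
For context, the same exception can be reproduced outside ITSI. This is only a minimal sketch in plain Java, not the ITSI code: the class and method names are made up for illustration. It shows that Double.parseDouble only accepts a dot as the decimal separator, so the comma form from the log is rejected, while a parser for a Swedish locale accepts it.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical demo class, not part of ITSI.
class CommaDecimalDemo {

    // Returns true when Double.parseDouble accepts the string.
    static boolean parsesAsDouble(String s) {
        try {
            Double.parseDouble(s);
            return true;
        } catch (NumberFormatException e) {
            // Same exception type as in the itsi_rules_engine.log stack trace.
            return false;
        }
    }

    // Parses a number using the decimal separator of the given locale.
    static double parseWithLocale(String s, Locale locale) {
        try {
            return NumberFormat.getInstance(locale).parse(s).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number in " + locale + ": " + s, e);
        }
    }

    public static void main(String[] args) {
        String fromLog = "1602698533,696";   // value from the log in this post
        String withDot = "1602698533.696";   // same value with a dot separator

        System.out.println("comma form parses: " + parsesAsDouble(fromLog));
        System.out.println("dot form parses:   " + parsesAsDouble(withDot));
        System.out.println("Swedish-locale parse: "
                + parseWithLocale(fromLog, Locale.forLanguageTag("sv-SE")));
    }
}
```

The stack trace shows the rules engine calling Double.parseDouble directly, so any stored value with a comma decimal separator would stop it in the same way.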

OK, so the next step was to find out where the input string "1602698533,696" comes from.

Back to the itsi_rules_engine.log file.
A few lines above the exception, a search named "groupInfoSearch" is started:

2020-10-15 09:59:29,954 INFO [itsi_re(reId=1zMs,reMode=RealTime)] [main] TaskManager:344 - FunctionName=RunSplunkSearch, SearchName=groupInfoSearch, Status=Started (Full SearchQueryText below) 

Stripping down the search query, I found events from KPI alerts that carry this value in the itsi_first_event_time field:

itsi_first_event_time = 1602698533,696

Question: How can I get rid of this value, or work around it so the job can complete successfully?

Since the value is stored in an event and itsi_event_grouping runs over All time (real-time), my conclusion is that this job will always fail when it encounters this itsi_first_event_time value.

Grateful for any input on this.

Kind Regards

TobbeP

 

---------------------

This is the SearchQueryText="earliest=-24h latest=now _index_earliest=null _index_latest=null allow_partial_results=false search `itsi_event_management_group_index_with_close_events` | stats max(itsi_group_count) as itsi_group_count values(itsi_is_last_event) as itsi_is_last_event max(itsi_last_event_time) as itsi_last_event_time first(itsi_parent_group_id) as itsi_parent_group_id first(itsi_policy_id) as itsi_policy_id first(itsi_split_by_hash) as itsi_split_by_hash first(itsi_first_event_id) as itsi_first_event_id min(itsi_first_event_time) as itsi_first_event_time min(itsi_earliest_event_time) as itsi_earliest_event_time latest(itsi_group_assignee) as itsi_group_assignee latest(itsi_group_description) as itsi_group_description latest(itsi_group_severity) as itsi_group_severity latest(itsi_group_status) as itsi_group_status latest(itsi_group_ace_template_id) as itsi_group_ace_template_id latest(itsi_group_title) as itsi_group_title by itsi_group_id | where itsi_is_last_event!="true" | sort 0 -itsi_last_event_time | lookup itsi_notable_group_user_lookup _key AS itsi_group_id OUTPUT owner severity status | lookup itsi_notable_group_system_lookup _key AS itsi_group_id OUTPUT is_active | where is_active=1 | eval itsi_group_assignee=coalesce(owner, itsi_group_assignee), itsi_group_severity=coalesce(severity, itsi_group_severity), itsi_group_status=coalesce(status, itsi_group_status)"

 

1 Solution

TorbjörnP
Engager

Just wanted to give an update on this.

I reconfigured the server and the clients running the universal forwarder to use en_US for both regional formatting and location, then waited for the old data to age out. That seems to have done the trick.
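
As a rough illustration of why the locale change would help, here is a small Java sketch. It assumes the timestamp was written with locale-dependent number formatting somewhere along the way; which component did that is not confirmed in this thread. The class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical demo class, not part of ITSI or the universal forwarder.
class LocaleFormatDemo {

    // Formats an epoch time with three decimals using the given locale.
    static String formatEpoch(double epochSeconds, Locale locale) {
        return String.format(locale, "%.3f", epochSeconds);
    }

    public static void main(String[] args) {
        double epoch = 1602698533.696;

        // Swedish formatting uses a comma as the decimal separator.
        System.out.println("sv-SE: " + formatEpoch(epoch, Locale.forLanguageTag("sv-SE")));

        // en_US formatting uses a dot, which Double.parseDouble accepts.
        System.out.println("en-US: " + formatEpoch(epoch, Locale.US));
    }
}
```

Under that assumption, new events get a dot separator after the switch to en_US, and the job recovers once the older comma-formatted events fall outside the search window.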
