ARIMA convert integer machine learning

nsantiago17
Explorer

I'm trying to run this query below:

(index=A sourcetype=jobs_info JOB_NAME IN (ACQUA)) OR (index=B sourcetype=FIRE) OR (index=C sourcetype=EARTH)

| eval _time = strftime(_time, "%Y-%m-%d")
| eval START_TIME = strptime(START_TIME,"%Y%m%d%H%M%S")
| eval END_TIME = strptime(END_TIME,"%Y%m%d%H%M%S")
| eval EXECUTION_TIME = END_TIME-START_TIME

| eventstats avg(EXECUTION_TIME) as avg stdev(EXECUTION_TIME) as stdev

| eval lowerBound=(avg-stdev*exact(1.5)), upperBound=(avg+stdev*exact(1.5))
| eval isOutlier=if(EXECUTION_TIME < lowerBound OR EXECUTION_TIME > upperBound, 1, 0)

| stats values(EXECUTION_TIME) as EXECUTION_TIME sum(TNeg) as neg by _time
| where isnotnull(EXECUTION_TIME)
| table _time neg EXECUTION_TIME
| sort - _time

| fit RandomForestRegressor EXECUTION_TIME from "_time" "neg" n_estimators=15 into "teste"
| apply "teste"
| eval predicted(EXECUTION_TIME) = round('predicted(EXECUTION_TIME)', 2)

| stats values(neg) as neg, values(EXECUTION_TIME) as REALEXEC, values(predicted(EXECUTION_TIME)) as EXEC by _time
| eval erro = round(((EXEC/REALEXEC)-1)*100, 2)
| eval _time = tonumber(_time)
| table _time neg REALEXEC EXEC
| sort _time
| fit ARIMA _time EXEC holdback=3 conf_interval=95 order=12-0-1 forecast_k=5 as prediction | forecastviz(5, 3, "EXEC", 95)

And I'm having this error: Error in 'fit' command: Error while fitting "ARIMA" model: cannot convert float NaN to integer.
How can I fix it, and is there an easier way to run my code?


quincybatten
New Member

The error "ValueError: cannot convert float NaN to integer" is raised because pandas cannot store NaN values in a NumPy integer column. Starting with v0.24, pandas provides nullable integer data types, which allow integers to coexist with missing values. These are pandas extension types (for example "Int32"), not NumPy integers.

df['column_name'].astype(float).astype("Int32")
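To illustrate where the message comes from, here is a minimal sketch in plain Python (no pandas needed). It only shows that calling int() on a NaN float produces the same error text, plus one way to map NaN to a missing value instead. The helper name to_int_or_none and the sample values are made up for this example; they are not part of the original search or of any library.

```python
import math


def to_int_or_none(value):
    """Convert a float to int, returning None for NaN instead of raising."""
    if isinstance(value, float) and math.isnan(value):
        return None
    return int(value)


# A plain int() call on NaN raises the same message reported by the fit command.
try:
    int(float("nan"))
except ValueError as exc:
    error_message = str(exc)

# Hypothetical sample values: NaN becomes None, the rest become integers.
values = [12.0, float("nan"), 7.0]
converted = [to_int_or_none(v) for v in values]
```

The nullable "Int32" type above does the same kind of thing at the column level: the missing entry is kept as a missing value instead of being forced into an integer.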


hkeswani_splunk
Splunk Employee

Either your _time or EXEC field could be in float format, in which case it would need to be converted to an integer type.
Could you show the table for _time and EXEC just before the fit ARIMA command?
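If it helps when inspecting that table, here is a rough sketch of the kind of check that could be run on exported rows to spot values that cannot be used as numbers. It assumes the rows are available as Python dictionaries; the function name find_bad_rows and the sample rows are hypothetical and are not taken from the original results.

```python
import math


def find_bad_rows(rows, fields=("_time", "EXEC")):
    """Return (row_index, field) pairs whose value is missing, non-numeric, or NaN."""
    bad = []
    for i, row in enumerate(rows):
        for field in fields:
            try:
                number = float(row.get(field))
            except (TypeError, ValueError):
                # None raises TypeError, a non-numeric string raises ValueError.
                bad.append((i, field))
                continue
            if math.isnan(number):
                bad.append((i, field))
    return bad


# Hypothetical sample rows, for illustration only.
sample_rows = [
    {"_time": 1700000000, "EXEC": 42.5},
    {"_time": "2023-11-15", "EXEC": 40.1},  # date string, not a number
    {"_time": 1700172800, "EXEC": None},    # missing value
]
bad_rows = find_bad_rows(sample_rows)
```

Any row flagged this way would be a candidate for the NaN in the error. For example, a _time value that has been formatted as a date string is not numeric.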
