ARIMA convert integer machine learning

nsantiago17
Explorer

I'm trying to run the query below:

(index=A sourcetype=jobs_info JOB_NAME IN (ACQUA)) OR (index=B sourcetype=FIRE) OR (index=C sourcetype=EARTH)

| eval _time = strftime(_time, "%Y-%m-%d")
| eval START_TIME = strptime(START_TIME,"%Y%m%d%H%M%S")
| eval END_TIME = strptime(END_TIME,"%Y%m%d%H%M%S")
| eval EXECUTION_TIME = END_TIME-START_TIME

| eventstats avg(EXECUTION_TIME) as avg stdev(EXECUTION_TIME) as stdev

| eval lowerBound=(avg-stdev*exact(1.5)), upperBound=(avg+stdev*exact(1.5))
| eval isOutlier=if(EXECUTION_TIME < lowerBound OR EXECUTION_TIME > upperBound, 1, 0)

| stats values(EXECUTION_TIME) as EXECUTION_TIME sum(TNeg) as neg by _time
| where isnotnull(EXECUTION_TIME)
| table _time neg EXECUTION_TIME
| sort - _time

| fit RandomForestRegressor EXECUTION_TIME from "_time" "neg" n_estimators=15 into "teste"
| apply "teste"
| eval predicted(EXECUTION_TIME) = round('predicted(EXECUTION_TIME)', 2)

| stats values(neg) as neg, values(EXECUTION_TIME) as REALEXEC, values(predicted(EXECUTION_TIME)) as EXEC by _time
| eval erro = round(((EXEC/REALEXEC)-1)*100, 2)
| eval _time = tonumber(_time)
| table _time neg REALEXEC EXEC
| sort _time
| fit ARIMA _time EXEC holdback=3 conf_interval=95 order=12-0-1 forecast_k=5 as prediction | forecastviz(5, 3, "EXEC", 95)

I'm getting this error: Error in 'fit' command: Error while fitting "ARIMA" model: cannot convert float NaN to integer.
How can I fix it, and is there an easier way to run my code?

quincybatten
New Member

The "ValueError: cannot convert float NaN to integer" is raised because pandas' default integer dtypes are NumPy integers, which cannot store NaN values. Since pandas v0.24 there are nullable integer data types (for example "Int32", with a capital I) that allow integers to coexist with missing values. These are pandas integer dtypes, not NumPy integer dtypes, so the conversion looks like this:

df['column_name'].astype(float).astype("Int32")
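
A minimal pandas sketch of the behaviour described above, with a made-up column name (exec_time) and made-up values. Casting a float column that contains NaN to a NumPy integer dtype fails, while the nullable "Int32" dtype keeps the gap as a missing value:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing value in an otherwise whole-number column.
df = pd.DataFrame({"exec_time": [120.0, np.nan, 95.0]})

# NumPy-backed integers have no representation for NaN, so this cast raises
# a ValueError (recent pandas versions raise a ValueError subclass).
try:
    df["exec_time"].astype("int32")
    numpy_cast_error = None
except ValueError as exc:
    numpy_cast_error = exc

# The pandas nullable dtype (capital "I") stores the gap as <NA> instead.
nullable = df["exec_time"].astype("Int32")
print(nullable)
```

Note that the nullable cast only succeeds when the non-missing values are whole numbers; fractional values would need to be rounded first.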


hkeswani_splunk
Splunk Employee

Either your _time or your EXEC field could be in float format, in which case it needs to be converted to an integer type.
Could you show the table for _time and EXEC just before the fit ARIMA command?
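
If the results are exported from Splunk (for example to CSV) and loaded into pandas, a quick check like the one below may show which of the two fields holds values that cannot become integers. This is only a sketch: summarize_columns and the sample rows are hypothetical and not part of Splunk or the ML Toolkit. The sample keeps _time as a date string, which is what strftime(_time, "%Y-%m-%d") in the query above produces; a string like that does not parse as a number and shows up as missing.

```python
import pandas as pd

def summarize_columns(df, columns):
    """Report dtype, missing-value count, and fractional-value count per column."""
    report = {}
    for name in columns:
        # Values that cannot be parsed as numbers become NaN.
        numeric = pd.to_numeric(df[name], errors="coerce")
        report[name] = {
            "dtype": str(df[name].dtype),
            "missing": int(numeric.isna().sum()),
            "fractional": int((numeric.dropna() % 1 != 0).sum()),
        }
    return report

# Hypothetical rows: _time as a date string, EXEC as a float.
sample = pd.DataFrame({"_time": ["2024-01-01", "2024-01-02"], "EXEC": [12.5, 13.0]})
report = summarize_columns(sample, ["_time", "EXEC"])
print(report)
```

A non-zero "missing" count points at values that would turn into NaN, and a non-zero "fractional" count points at floats that would need rounding before an integer cast.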
