Splunk Search

ARIMA convert integer machine learning

nsantiago17
Explorer

I'm trying to run this query below:

(index=A sourcetype=jobs_info JOB_NAME IN (ACQUA)) OR (index=B sourcetype=FIRE) OR (index=C sourcetype=EARTH)

| eval _time = strftime(_time, "%Y-%m-%d")
| eval START_TIME = strptime(START_TIME,"%Y%m%d%H%M%S")
| eval END_TIME = strptime(END_TIME,"%Y%m%d%H%M%S")
| eval EXECUTION_TIME = END_TIME-START_TIME

| eventstats avg(EXECUTION_TIME) as avg stdev(EXECUTION_TIME) as stdev

| eval lowerBound=(avg-stdev*exact(1.5)), upperBound=(avg+stdev*exact(1.5))
| eval isOutlier=if(EXECUTION_TIME < lowerBound OR EXECUTION_TIME > upperBound, 1, 0)

| stats values(EXECUTION_TIME) as EXECUTION_TIME sum(TNeg) as neg by _time
| where isnotnull(EXECUTION_TIME)
| table _time neg EXECUTION_TIME
| sort - _time

| fit RandomForestRegressor EXECUTION_TIME from "_time" "neg" n_estimators=15 into "teste"
| apply "teste"
| eval predicted(EXECUTION_TIME) = round('predicted(EXECUTION_TIME)', 2)

| stats values(neg) as neg, values(EXECUTION_TIME) as REALEXEC, values(predicted(EXECUTION_TIME)) as EXEC by _time
| eval erro = round(((EXEC/REALEXEC)-1)*100, 2)
| eval _time = tonumber(_time)
| table _time neg REALEXEC EXEC
| sort _time
| fit ARIMA _time EXEC holdback=3 conf_interval=95 order=12-0-1 forecast_k=5 as prediction | forecastviz(5, 3, "EXEC", 95)

And I'm having this error: Error in 'fit' command: Error while fitting "ARIMA" model: cannot convert float NaN to integer.
How can I can fix it and is there some easier way to run my code?

0 Karma

quincybatten
New Member

The ValueError: cannot convert float NaN to integer raised because of Pandas doesn't have the ability to store NaN values for integers. From Pandas v0.24, introduces Nullable Integer Data Types which allows integers to coexist with NaNs. This does allow integer NaNs . This is the pandas integer, instead of the numpy integer.

df['column_name'].astype(np.float).astype("Int32")

 

0 Karma

hkeswani_splunk
Splunk Employee
Splunk Employee

Either your _time or EXEC could be in float format which needs to be changed to the integer type.
Could you show the table for _time and EXEC just before the fit ARIMA command?

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Quantify Your Splunk Investment Impact: Introducing Savings Metrics to Value Insights

Building on the foundation established in our initial Value Insights releases, we are introducing the Savings ...

Event Series: Telemetry Pipeline Management

Balancing Scale and Spend: Gaining Control Over High-Volume Metrics in Splunk Observability Cloud As ...

Kick the Tires Before You Commit: A Hands-On Tour of the Splunk Observability Cloud ...

Evaluating an enterprise observability platform usually goes like this: fill out a form, get a free trial with ...