Splunk Search

ARIMA convert integer machine learning

nsantiago17
Explorer

I'm trying to run the query below:

(index=A sourcetype=jobs_info JOB_NAME IN (ACQUA)) OR (index=B sourcetype=FIRE) OR (index=C sourcetype=EARTH)

| eval _time = strftime(_time, "%Y-%m-%d")
| eval START_TIME = strptime(START_TIME,"%Y%m%d%H%M%S")
| eval END_TIME = strptime(END_TIME,"%Y%m%d%H%M%S")
| eval EXECUTION_TIME = END_TIME-START_TIME

| eventstats avg(EXECUTION_TIME) as avg stdev(EXECUTION_TIME) as stdev

| eval lowerBound=(avg-stdev*exact(1.5)), upperBound=(avg+stdev*exact(1.5))
| eval isOutlier=if(EXECUTION_TIME < lowerBound OR EXECUTION_TIME > upperBound, 1, 0)

| stats values(EXECUTION_TIME) as EXECUTION_TIME sum(TNeg) as neg by _time
| where isnotnull(EXECUTION_TIME)
| table _time neg EXECUTION_TIME
| sort - _time

| fit RandomForestRegressor EXECUTION_TIME from "_time" "neg" n_estimators=15 into "teste"
| apply "teste"
| eval predicted(EXECUTION_TIME) = round('predicted(EXECUTION_TIME)', 2)

| stats values(neg) as neg, values(EXECUTION_TIME) as REALEXEC, values(predicted(EXECUTION_TIME)) as EXEC by _time
| eval erro = round(((EXEC/REALEXEC)-1)*100, 2)
| eval _time = tonumber(_time)
| table _time neg REALEXEC EXEC
| sort _time
| fit ARIMA _time EXEC holdback=3 conf_interval=95 order=12-0-1 forecast_k=5 as prediction | forecastviz(5, 3, "EXEC", 95)

I'm getting this error: Error in 'fit' command: Error while fitting "ARIMA" model: cannot convert float NaN to integer.
How can I fix it, and is there an easier way to run my code?

quincybatten
New Member

The ValueError "cannot convert float NaN to integer" is raised because pandas' default (NumPy-backed) integer types cannot store NaN values. pandas v0.24 introduced nullable integer data types, which allow integers to coexist with missing values. These are the pandas integer types (for example "Int32", with a capital I), not the NumPy integer types.

df['column_name'].astype(float).astype("Int32")
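
To make the point above concrete, here is a minimal sketch in plain pandas, outside Splunk. The DataFrame and its values are made up for illustration; only the column name is taken from the line above. It shows where the error message comes from, why a NumPy-backed integer cast fails, and what the nullable "Int32" type does instead.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing value, so pandas stores the column as float64.
df = pd.DataFrame({"column_name": [1.0, 2.0, np.nan]})

# Plain Python raises the exact message from the question when NaN is cast to int.
try:
    int(float("nan"))
    message = ""
except ValueError as exc:
    message = str(exc)

# A NumPy-backed integer dtype cannot hold NaN either, so this cast fails too.
try:
    df["column_name"].astype("int64")
    numpy_cast_failed = False
except ValueError:
    numpy_cast_failed = True

# The nullable pandas dtype keeps the missing value as <NA> instead of failing.
nullable = df["column_name"].astype(float).astype("Int32")

print(message)
print(numpy_cast_failed)
print(nullable)
```

Note that this only changes how the missing value is stored. If the downstream model needs a real number in every row, the missing rows still have to be filled or dropped first.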

 

hkeswani_splunk
Splunk Employee

Either your _time or EXEC field could be in float format, in which case it needs to be converted to an integer type.
Could you show the table for _time and EXEC just before the fit ARIMA command?
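
If it is easier to inspect the table offline, the following is a rough sketch of that check in pandas. It assumes the rows just before the fit ARIMA command have been exported and loaded into a DataFrame; the helper name summarize_columns and the sample values are hypothetical, not taken from the original search.

```python
import pandas as pd

def summarize_columns(df, columns):
    """Return the dtype and missing-value count for each requested column."""
    return {
        col: {"dtype": str(df[col].dtype), "missing": int(df[col].isna().sum())}
        for col in columns
    }

# Hypothetical rows standing in for the table just before `fit ARIMA`.
table = pd.DataFrame({
    "_time": [float("nan"), float("nan"), float("nan")],  # a conversion that yielded no numbers
    "EXEC": [120.5, 98.0, 101.25],
})

report = summarize_columns(table, ["_time", "EXEC"])
for name, info in report.items():
    print(name, info)
```

A column that reports a non-zero missing count is the one that would trigger a NaN-to-integer conversion error.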
