I've got this search:
index=vpn sourcetype=vpn_device login
| bin _time span=1h
| stats count AS logins by _time
The output looks like this:
2018-02-03 09:00 65
2018-02-03 10:00 123
2018-02-03 11:00 92
What I want to see is this instead:
1512813600 65
1512817200 123
1512820800 92
I want to take the output of the search and feed it into Forecast Time Series in the Splunk Machine Learning Toolkit. I know I can use this:
index=vpn sourcetype=vpn_device login
| bin _time AS "Time" span=1h
| stats count AS logins by "Time"
to get the second output, but the model requires the fields _time and count to work.
Forecast Time Series
Predict likely future values given past values of a metric (numerical time series).
Choose an example dataset or enter a search (should contain "_time" field with unix timestamp values)
So any thoughts on how I can do this?
Hi,
You don't need to convert "_time" values to Unix timestamps in order to use Time Series Forecasting in the Machine Learning Toolkit.
If you use the "predict" command, it needs to be preceded by the "timechart" command. See: https://docs.splunk.com/Documentation/Splunk/7.0.2/SearchReference/Predict
There is no such requirement for the ARIMA algorithm: https://docs.splunk.com/Documentation/MLApp/3.1.0/User/ForecastTimeSeries
You can always modify the format of the time values after you perform the forecasting.
Please refer to the time-series forecasting showcases available in the app for more examples.
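As a rough sketch of changing the time format after forecasting, assuming the index, sourcetype, and logins field from the question: the epoch value can be copied into a separate field once the forecast has run. The field name epoch and the future_timespan value are illustrative choices, not something from this thread.

```spl
index=vpn sourcetype=vpn_device login
| timechart span=1h count AS logins
| predict logins future_timespan=24
| eval epoch=_time
```

Here _time is left untouched for the forecast, and the copy in epoch is only there for display.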
Hi @davpx - A few things are kind of confusing here; the points below might help clear things up...
Splunk will magically display an epoch time (e.g. 1512813600) as a formatted string (e.g. 2018-02-03 09:00) when the field is named _time.
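A minimal way to see this for yourself, assuming nothing beyond a working search bar: the sketch below builds one dummy event and shows the same value three ways. The field names epoch and formatted are made up for illustration.

```spl
| makeresults
| eval epoch=_time
| eval formatted=strftime(_time, "%Y-%m-%d %H:%M")
| table _time epoch formatted
```

The _time column should come back as a formatted date, while epoch holds the same value as a plain number.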
Secondly, the MLTK is looking specifically for a field called _time, so if you didn't rename your field in the bin command, it probably would've worked. However...
The makecontinuous command is needed after bin + stats because otherwise any empty bins are left out of the results (which can be misleading if they are not re-added).
These two things are for all practical purposes equivalent:
| timechart count span=1h
and
| bin _time span=1h
| stats count by _time
| makecontinuous _time
The timechart command does add in some hidden fields and helps with formatting. So if you don't need to use stats in this scenario, timechart is probably a lot easier.
Lastly, if you do have a string that needs to become an epoch time, check out the strptime eval function:
http://docs.splunk.com/Documentation/Splunk/7.0.1/SearchReference/DateandTimeFunctions
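For example, a small sketch of strptime on a hard-coded string (the field names time_string and epoch are hypothetical, and the format string has to match however your own string is laid out):

```spl
| makeresults
| eval time_string="2018-02-03 09:00"
| eval epoch=strptime(time_string, "%Y-%m-%d %H:%M")
| table time_string epoch
```

The resulting number depends on the time zone the search runs in, since the string itself carries no zone information.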
@aljohnson - Thanks for the additional information. That did clear up a few things in my mind.
Oops I tagged the other answer-er - Sorry @Davpx - I meant to tag you @jwhughes58. Glad it helped 🙂
Hi Akim,
Thanks. I was misreading that section as epoch time. All it has to be is a "Unix Timestamp." When I changed my search to
index=vpn sourcetype=juniper:sslvpn login
| timechart count as logins span=1h
the prediction worked correctly.
Regards,
Joe
_time is actually an epoch value that the UI translates for you. An easy way to strip the translation is to rename _time to something else or use an alias:
| eval epoch=_time
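Applied to the search from the question, that might look like the sketch below. It assumes the original index, sourcetype, and field names. The extra epoch column is only there for viewing the raw value, and _time stays available for the model.

```spl
index=vpn sourcetype=vpn_device login
| bin _time span=1h
| stats count AS logins by _time
| eval epoch=_time
| table _time epoch logins
```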
Hi Davpx,
I appreciate the reply, but as I wrote in the original question, the model requires _time in epoch format. I'm trying to find out if I can get the epoch time value without having it translated to my TZ in the UI.
Regards,
Joe
Right, and _time is actually in epoch format under the covers, but it sounds like you got it resolved.