Why is Splunk time format different in CSV passed to R app than in exported CSV?

Explorer

Splunk shows the _time field as a string form of the date. For example:

"2014-11-25T16:23:49.535-05:00"

And when the results are exported to CSV, my R script sees the X_time field as the value above. But when Splunk passes the CSV into the R App, my R script sees the X_time field as the string form of a floating-point value. For example:

"1416950629.535"

There are other barriers to making an R script that is truly generic so that it can be run and tested from the command line from a repeatable CSV export file, and then called from the R App to process the results in the Splunk pipeline. But this one really makes the R script much more complex.

I would appreciate knowing the reasons why the CSV data the R app sees differs from the CSV exported from Splunk when the same R script processes both. The _time / X_time field is one difference; there are others, such as floating-point values that appear without quotes in the exported CSV but within quotes as seen by the R App.

Thank you in advance for any answers or suggestions.

The Splunk _time field is in the following format:

"2014-11-25T16:23:49.535-05:00"

But the Splunk R app sees the X_time field (name change is OK) in the following format:

"1416950629.535"

It would be much nicer if the same R script could see the same CSV data from a Splunk export (in the first format) as it does from the CSV presented to the R app. But alas, I need to make every one of my R scripts even more complex by testing to see whether the X_time field is from a Splunk export or the Splunk R app.

Thank you in advance!

Re: Why is Splunk time format different in CSV passed to R app than in exported CSV?

Splunk Employee

X in field names

All internal fields (fields that begin with "_" underscore) are prefixed with an X. This is the default behaviour of the CSV reader:
http://stackoverflow.com/questions/9098245/r-why-are-xs-added-to-the-names-of-variables-in-my-data-f...

That's why I just uploaded a new version (0.3.10) of the R app which leaves the field names as they are in Splunk.
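
For anyone who wants to see the renaming outside Splunk, here is a small R sketch. The sample CSV text is made up for illustration; the point is only that `read.csv` runs the header through `make.names()` by default (`check.names = TRUE`), and that `check.names = FALSE` keeps the names as written.

```r
# Made-up two-line CSV with a Splunk-style and an ELK-style field name.
csv_text <- "_time,@timestamp,count\n1416950629.535,2014-11-25T21:23:49.535Z,3\n"

# Default behaviour: check.names = TRUE rewrites names that are not
# syntactically valid R identifiers.
mangled <- read.csv(text = csv_text)
print(names(mangled))

# check.names = FALSE leaves the header exactly as it is in the file.
verbatim <- read.csv(text = csv_text, check.names = FALSE)
print(names(verbatim))
```

With `check.names = FALSE` the columns have to be accessed with backticks or string indexing, for example `verbatim[["_time"]]`.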

Time format

Internally (in Splunk) the _time field is represented by a number, which is the number of seconds since epoch. The visual representation (in a Splunk search result table) of the _time field is just to make it human readable.

If you rename the _time field to time like this:

index=_internal | head 5 | table _time | r "output=input" | rename _time AS time

... the values are shown in seconds since epoch.
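
To make the link between the two forms concrete, here is a short R sketch (not part of the R app) that turns the epoch string from the question back into a readable timestamp. It assumes the value is seconds since 1970-01-01 UTC, as described above, and formats to whole seconds only.

```r
# Epoch value as the R app delivers it: a string holding fractional seconds.
epoch_str <- "1416950629.535"

# Interpret it as seconds since the Unix epoch, in UTC.
t_utc <- as.POSIXct(as.numeric(epoch_str), origin = "1970-01-01", tz = "UTC")

# Render in ISO-8601 style; %S drops the fractional part.
iso_utc <- format(t_utc, "%Y-%m-%dT%H:%M:%S", tz = "UTC")
print(iso_utc)
```

The result is 21:23:49 UTC on 2014-11-25, which is the same instant as 16:23:49 at the -05:00 offset shown in the exported CSV.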

Thanks for your feedback!

Re: Why is Splunk time format different in CSV passed to R app than in exported CSV?

Explorer

Thanks for your answer. My question is only why the Splunk CSV export emits the _time field in "human-readable" format (which is also a deterministic ISO-8601 format), but the Splunk-R table export emits the time as a floating-point value (within a string, no less).

I don't mind the field renaming. In standard R, the CSV reader converts the Splunk _time to X_time, and converts an Elasticsearch ELK stack's @timestamp to X.timestamp. So with that in mind, I would much rather automate the detection of the time field and avoid the extra typing of options in the r command pipeline. To that end, my conversion from the Splunk/Elasticsearch ELK time field is:

time_fmt <- "%Y-%m-%dT%H:%M:%OS"
string_to_time <- function(v)
{
  return(strptime(v, time_fmt, tz="UTC"))
}

But if this function returns NA (as it will when processing the "123456.789" style of time exported by the Splunk R app), then I redefine this function thus:

string_to_time <- function(str)
{
  # Treat the value as seconds since the Unix epoch, in UTC.
  return(as.POSIXlt(as.numeric(str), origin="1970-01-01", tz="UTC"))
}

And all is well again.
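
For readers who want a single function instead of redefining one at run time, here is a rough sketch of a variant. The name `parse_event_time` is hypothetical, and the approach differs from the two functions above in two ways: it picks the format per value with a regular expression, and it applies the trailing UTC offset (such as -05:00), which the `strptime` call above reads past. It assumes the only inputs are plain decimal epoch seconds or ISO-8601 strings with an optional `+HH:MM` / `-HH:MM` suffix.

```r
# Hypothetical helper: accepts epoch-second strings or ISO-8601 strings
# and returns POSIXct values in UTC.
parse_event_time <- function(v) {
  v <- as.character(v)
  secs <- rep(NA_real_, length(v))

  # Epoch values look like plain decimal numbers, e.g. "1416950629.535".
  is_epoch <- grepl("^[0-9]+(\\.[0-9]+)?$", v)
  secs[is_epoch] <- as.numeric(v[is_epoch])

  # ISO-8601 values: read the wall-clock part as if it were UTC
  # (strptime ignores the trailing offset text).
  iso <- v[!is_epoch]
  wall <- as.numeric(as.POSIXct(strptime(iso, "%Y-%m-%dT%H:%M:%OS", tz = "UTC")))

  # Work out the offset in seconds from a trailing "+HH:MM" or "-HH:MM".
  off <- rep(0, length(iso))
  has_off <- grepl("[+-][0-9]{2}:?[0-9]{2}$", iso)
  off_txt <- sub("^.*([+-])([0-9]{2}):?([0-9]{2})$", "\\1\\2\\3", iso[has_off])
  off[has_off] <- ifelse(substr(off_txt, 1, 1) == "-", -1, 1) *
    (as.numeric(substr(off_txt, 2, 3)) * 3600 +
       as.numeric(substr(off_txt, 4, 5)) * 60)

  # Local wall-clock time minus its offset gives UTC.
  secs[!is_epoch] <- wall - off
  as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
}

# Both forms of the timestamp from the question.
x <- parse_event_time(c("2014-11-25T16:23:49.535-05:00", "1416950629.535"))
print(x)
```

Both inputs should come out as the same instant, so a script can take a Splunk CSV export or R app input without branching on where the data came from.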

I also automatically reverse the rows of the input data frame so that the earliest row is first, which is usually better for plotting. It remains to be seen whether the Splunk chart command is OK with this, or whether I also need to automate reversing the rows back.
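
As a minimal sketch of that reversal step, with a made-up data frame in newest-first order (the column names `time` and `value` are only for illustration):

```r
# Made-up sample, newest event first.
events <- data.frame(
  time = as.POSIXct(c(1416950631, 1416950630, 1416950629),
                    origin = "1970-01-01", tz = "UTC"),
  value = c(30, 20, 10)
)

# Option 1: flip the row order as-is.
reversed <- events[rev(seq_len(nrow(events))), ]

# Option 2: sort by the time column, which also copes with unsorted input.
sorted <- events[order(events$time), ]

print(sorted)
```

Sorting by time is probably the safer of the two if the input order is not guaranteed.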

This all sounds complex, but it really simplifies my R scripts. They can now accept data from ELK, Splunk CSV export, and the Splunk R app with no duplication of code!

Re: Why is Splunk time format different in CSV passed to R app than in exported CSV?

Splunk Employee

Nice conversion functions!

But the reason for showing the time in seconds since epoch and not in ISO-8601 is the name of the field.
So if the name is "_time" (and not X_time), then it's shown in ISO-8601 (internally, the representation is still in seconds since epoch).
As a result, your workaround shows a nice time format, but for some of Splunk's functionality it's key to have the "_time" field in seconds since epoch.

Re: Why is Splunk time format different in CSV passed to R app than in exported CSV?

Explorer

Thanks for the explanations! We have installed the 0.3.10 version. And I now realize that I won't need to convert the _time field into POSIXlt; that was only for the base plot and the ggplot2 functions but isn't required by the Splunk chart command.

An awesome plug-in! Thanks again! I'm relatively new to R, and it just keeps looking better every day.