Why is Splunk time format different in CSV passed to R app than in exported CSV?

brian_from_fl
Explorer

Splunk shows the _time field as a string form of the date. For example:

"2014-11-25T16:23:49.535-05:00"

And when the results are exported to CSV, my R script sees the X_time field as the value above. But when Splunk passes the CSV into the R App, my R script sees the X_time field as the string form of a floating point value. For example:

"1416950629.535"

There are other barriers to making an R script that is truly generic so that it can be run and tested from the command line from a repeatable CSV export file, and then called from the R App to process the results in the Splunk pipeline. But this one really makes the R script much more complex.

I would appreciate the reasons behind the differing CSV data format that the R app sees as compared to when the same R script is processing the results that are exported as CSV from Splunk. The _time / X_time field is one; there are others such as floating point values seen without quotes in the exported CSV but within quotes as seen by the R App.

Thank you in advance for any answers or suggestions.

The Splunk _time field is in the following format:

"2014-11-25T16:23:49.535-05:00"

But the Splunk R app sees the X_time field (name change is OK) in the following format:

"1416950629.535"

It would be much nicer if the same R script could see the same CSV data from a Splunk export (in the first format) as it does from the CSV presented to the R app. But alas, I need to make every one of my R scripts even more complex by testing to see whether the X_time field is from a Splunk export or the Splunk R app.

Thank you in advance!

1 Solution

rfujara_splunk
Splunk Employee

X in field names

All internal fields (fields that begin with an underscore, "_") are prefixed with an X. This is the default behaviour of R's CSV reader:
http://stackoverflow.com/questions/9098245/r-why-are-xs-added-to-the-names-of-variables-in-my-data-f...
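For readers testing a script outside Splunk, the renaming can be reproduced and switched off in base R. The following is a minimal sketch; the CSV text and the `count` column are made up for illustration, and it shows plain `read.csv` behaviour, not how the R app itself reads its input.

```r
# Illustrative CSV text; the column names mimic a Splunk export.
csv_text <- "_time,count\n1416950629.535,3\n"

# Default: headers are made syntactically valid, so "_time" becomes "X_time".
default_names <- names(read.csv(text = csv_text))

# check.names = FALSE keeps the header exactly as written in the file.
raw_names <- names(read.csv(text = csv_text, check.names = FALSE))

print(default_names)
print(raw_names)
```

With `check.names = FALSE` the column has to be accessed as `df[["_time"]]` or with backticks, since `_time` is not a syntactically valid R name.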

That's why I just uploaded a new version (0.3.10) of the R app which leaves the field names as they are in Splunk.

Time format

Internally (in Splunk) the _time field is represented by a number, which is the number of seconds since epoch. The visual representation (in a Splunk search result table) of the _time field is just to make it human readable.

If you rename the _time field to time like this:

index=_internal | head 5 | table _time | r "output=input" | rename _time AS time

... the values are shown in seconds since epoch.
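On the R side, the epoch value can be turned back into a readable timestamp with base R. This is a small sketch using the example value from the question; the variable names are illustrative only.

```r
# Epoch seconds as the R app delivers them (a string holding a number).
epoch_string <- "1416950629.535"

# Interpret the number as seconds since 1970-01-01 00:00:00 UTC.
t_utc <- as.POSIXct(as.numeric(epoch_string), origin = "1970-01-01", tz = "UTC")

# Render it as an ISO-8601 string in UTC (whole seconds only).
iso_utc <- format(t_utc, "%Y-%m-%dT%H:%M:%SZ")
print(iso_utc)
```

The result is the same instant as "2014-11-25T16:23:49.535-05:00", expressed in UTC rather than with a -05:00 offset.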

Thanks for your feedback!

brian_from_fl
Explorer

Thanks for your answer. My question is only why the Splunk CSV export emits the _time field in "human-readable" format (which is also a deterministic ISO-8601 format), but the Splunk-R table export emits the time as a floating-point value (within a string, no less).

I don't mind the field renaming. In standard R, the CSV reader converts the Splunk _time to X_time, and converts an Elasticsearch ELK stack's @timestamp to X.timestamp. So with that in mind, I would much rather automate the detection of the time field and avoid the extra typing of options in the r command pipeline. To that end, my conversion from the Splunk/Elasticsearch ELK time field is:

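# Note: strptime() ignores trailing characters beyond the format, so the
# UTC offset (e.g. "-05:00") is dropped and the local time is labelled UTC.
time_fmt <- "%Y-%m-%dT%H:%M:%OS"
string_to_time <- function(v)
{
  return(strptime(v, time_fmt, tz="UTC"))
}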

But if this function returns NA (as it will when processing the "123456.789" style of time exported by the Splunk R app), then I redefine the function as follows:

string_to_time <- function(str)
{
  return(as.POSIXlt(as.numeric(str), origin="1970-01-01", tz="UTC"))
}

And all is well again.
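The two cases can also be folded into a single function that decides per value and honours the UTC offset in the ISO-8601 form. This is only a sketch: `parse_splunk_time` is a hypothetical name, and it assumes the ISO strings end in "Z" or a "+hh:mm" / "-hh:mm" offset, as in the example from the question.

```r
# Hypothetical helper: parse a time column that may hold either ISO-8601
# strings (Splunk CSV export) or epoch seconds as strings (Splunk R app).
parse_splunk_time <- function(v) {
  v <- as.character(v)
  # Epoch values look like plain decimal numbers, e.g. "1416950629.535".
  is_epoch <- grepl("^[0-9]+(\\.[0-9]+)?$", v)
  secs <- rep(NA_real_, length(v))
  secs[is_epoch] <- as.numeric(v[is_epoch])
  # Rewrite "Z" as "+0000" and "-05:00" as "-0500" so that %z can read it.
  iso <- sub("Z$", "+0000", v[!is_epoch])
  iso <- sub("([+-][0-9]{2}):([0-9]{2})$", "\\1\\2", iso)
  parsed <- strptime(iso, "%Y-%m-%dT%H:%M:%OS%z", tz = "UTC")
  secs[!is_epoch] <- as.numeric(as.POSIXct(parsed))
  # Return everything as POSIXct in UTC.
  as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
}

# Both representations of the example timestamp map to the same instant.
x <- parse_splunk_time(c("2014-11-25T16:23:49.535-05:00", "1416950629.535"))
print(x)
```

Returning POSIXct rather than POSIXlt keeps the result a plain vector, which is usually easier to store in a data frame column.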

I also automatically reverse the rows of the input data frame so that the earliest row is first, which is usually better for plotting. It remains to be seen whether the Splunk chart command is OK with this, or whether I also need to automate reversing the rows back.
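For reference, putting the earliest row first can be done in base R in either of two ways. The data frame below is invented for illustration and assumes the input arrives newest-first.

```r
# Illustrative data frame with the newest row first.
df <- data.frame(time = c(1416950631, 1416950630, 1416950629),
                 value = c(30, 20, 10))

# Option 1: reverse the row order as-is.
reversed <- df[rev(seq_len(nrow(df))), ]

# Option 2: sort by the time column, which also works for unordered input.
sorted <- df[order(df$time), ]

print(sorted)
```

Sorting by the time column is the more defensive choice, since it does not depend on the order in which the rows were delivered.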

This all sounds complex, but it really simplifies my R scripts. They can now accept data from ELK, Splunk CSV export, and the Splunk R app with no duplication of code!

brian_from_fl
Explorer

Thanks for the explanations! We have installed the 0.3.10 version. And I now realize that I won't need to convert the _time field into POSIXlt; that was only for the base plot and the ggplot2 functions but isn't required by the Splunk chart command.

An awesome plug-in! Thanks again! I'm relatively new to R, and it just keeps looking better every day.

rfujara_splunk
Splunk Employee

Nice conversion functions!

But the reason for showing the time in seconds since epoch and not in ISO-8601 is the name of the field.
So if the name is "_time" (and not X_time), then it's shown in ISO-8601 (internally, the representation is still in seconds since epoch).
As a result, your workaround shows a nice time format, but for some of Splunk's functionality it's key to have the "_time" field in seconds since epoch.
