Getting Data In

CSV date being replaced/not parsed correctly

jaware_splunk
Splunk Employee
Splunk Employee

Let's say I have a CSV with the following spanning 10 years:

Date | Time | Value
2020-05-01 4:00:00 PM 49.88

If I try to do a timechart it works fine for the last several years but if I select All Time then it incorrectly parses the timestamp and groups multiple days worth of values in a single day:

_time | values(Close)
2014-11-12 | 1.86
1.87
1.88
1.92

If I view the events, the parsed timestamp is incorrect now, but only for really old events:

Time (Splunk parsed): 11/12/14 4:00:00.000 PM

Full Event: 2010-05-04,4:00:00 PM,8.68,46458590,9.08,9.08,8.54

Time (Splunk parsed): 11/12/14 4:00:00.000 PM

Full Event: 2010-05-26,4:00:00 PM,8.22,37479000,8.39,8.59,8.18

I did this with the built-in CSV sourcetype as well as custom. Thanks for any help!

EDIT: Here's an example. Download the Max dataset from here: https://www.nasdaq.com/market-activity/stocks/amd/historical

Note it doesn't have the timestamp, so a new column was added with 16:00:00 (end of market close) called Time.

I used the default CSV sourcetype as a test and same issue.

Test search (All time):

source="filename.csv" index="test"
| timechart values("Close/Last") span=1d

Around 2014 starts mis-parsing (Statistics tab -> click on date -> view events -> _time is different than the event date).

0 Karma

to4kawa
SplunkTrust
SplunkTrust

yours:
TIME_FORMAT = %Y-%m-%d %H:%M:%S

but
2010-05-26,4:00:00 PM is %Y-%m-%d,%H:%M:%S %p

and your setting is INDEXED_EXTRACTIONS = csv
try TIMESTAMP_FIELDS

props.conf

0 Karma

jaware_splunk
Splunk Employee
Splunk Employee

Same result. It parses from Nov 2014 - today properly. But for some reason prior to Nov 2014 it doesn't.

0 Karma

to4kawa
SplunkTrust
SplunkTrust

https://wiki.splunk.com/Deploy:BucketRotationAndRetention

These are too old for index, I guess.

0 Karma

richgalloway
SplunkTrust
SplunkTrust

What props.conf settings did you specify for the sourcetype?

---
If this reply helps you, an upvote would be appreciated.
0 Karma

jaware_splunk
Splunk Employee
Splunk Employee

So this happens when using the default csv sourcetype as well as a custom one. Here's the custom one:

BREAK_ONLY_BEFORE_DATE =
DATETIME_CONFIG =
INDEXED_EXTRACTIONS = csv
KV_MODE = none
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y-%m-%d %H:%M:%S
category = Structured
description = Comma-separated value format. Set header and other settings in "Delimited Settings"
disabled = false
pulldown_type = true

0 Karma

richgalloway
SplunkTrust
SplunkTrust

The TIME_FORMAT setting does not match the sample data. It should be %Y-%m-%d %I:%M:%S %p. Also, add the TIMESTAMP_FIELDS attribute. Set MAX_DAYS_AGO = 4000.

---
If this reply helps you, an upvote would be appreciated.

jaware_splunk
Splunk Employee
Splunk Employee

Here's an example. Download the Max dataset from here: https://www.nasdaq.com/market-activity/stocks/amd/historical

Note it doesn't have the timestamp, so a new column was added with 16:00:00 (end of market close) called Time.

I used the default CSV sourcetype as a test and same issue.

Test search (All time):

source="filename.csv" index="test"
| timechart values("Close/Last") span=1d

Around 2014 starts mis-parsing (Statistics tab -> click on date -> view events -> _time is different than the event date).

0 Karma
.conf21 CFS Extended through 5/20!

Don't miss your chance
to share your Splunk
wisdom in-person or
virtually at .conf21!

Call for Speakers has
been extended through
Thursday, 5/20!