Solved: Why is CSV Timestamp recognition not working with ...

pkeller · 10-17-2016

I have 3 environments:

Laptop - Splunk 6.5.0
Test - Splunk 6.4.3
Prod - Splunk 6.3.2

In the first two environments, I am able to pull in a CSV nightly and grab the timestamp from the first comma-separated field (in epoch form).

My props.conf:

[status_csv]
HEADER_FIELD_LINE_NUMBER = 1
INDEXED_EXTRACTIONS = csv
TIME_FORMAT =  %s
TIMESTAMP_FIELDS = collection_time
MAX_TIMESTAMP_LOOKAHEAD = 11
KV_MODE = none
SHOULD_LINEMERGE = false

Sample data:

collection_time,src_host,APstat,def_date,def_version,foo,bar,foobar
1476691203,xxx-osx1010-3,On,2016-10-16 00:00:00.000,2016-10-16 rev. 022,No,local,Not installed
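As a quick sanity check of the sample row, here is a minimal sketch in Python (run outside Splunk). It only mimics what TIME_FORMAT = %s and TIMESTAMP_FIELDS = collection_time are meant to do: read the named column as epoch seconds. The helper name event_time is made up for illustration.

```python
import csv
import io
from datetime import datetime, timezone

# Header and data row copied from the sample above.
SAMPLE = (
    "collection_time,src_host,APstat,def_date,def_version,foo,bar,foobar\n"
    "1476691203,xxx-osx1010-3,On,2016-10-16 00:00:00.000,"
    "2016-10-16 rev. 022,No,local,Not installed\n"
)

def event_time(row, field="collection_time"):
    """Interpret the named field as epoch seconds (hypothetical helper)."""
    return datetime.fromtimestamp(int(row[field]), tz=timezone.utc)

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
first = rows[0]
stamp = event_time(first)

# The epoch value is 10 characters, so it fits within MAX_TIMESTAMP_LOOKAHEAD = 11.
print(len(first["collection_time"]))  # 10
print(stamp.isoformat())              # 2016-10-17T08:00:03+00:00
```

So the expected _time for this event is 2016-10-17 08:00:03 UTC, not the time of indexing.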

And yet when I push these configs to our PROD indexer cluster, the extractions are created, but Splunk always stamps _time with the time the event was indexed. In both my Splunk Free environment on my laptop and our UAT environment (similar to Prod, just smaller and now running 6.4.3), the timestamp is correctly extracted from the 'collection_time' field in the CSV.

Either something must be overriding the props I've pushed, or something in the configuration is wrong.

acharlieh · 10-17-2016

I believe you need to distribute your props.conf to your UFs. I'm guessing that in your first two environments you are loading the file on the Splunk server itself, as opposed to feeding it in using a UF?

While you are correct that timestamp parsing usually happens on the indexer or HWF tier, INDEXED_EXTRACTIONS is a huge exception, because the UF sends fully parsed events. It builds the index-time extracted fields using the field names from the header of the file, and that header stays on the source node and is not forwarded. If the UF were blindly sending the file, the header would only go with the first chunk and would not be available for later lines that come in. For a detailed diagram of the parsing steps, check out this wiki page or Amrit's perennial .conf talk on How Splunkd Works. You'll see that with structured parsing, the aggregator processor (which is responsible for timestamp assignment) runs on the UF.

One interesting side effect of all this is that you can actually nullQueue on the UF when using INDEXED_EXTRACTIONS, because the regexreplacement processor also runs as part of the structured parsing pipeline on the UF.
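The header argument above can be illustrated with a toy model in Python. This is not Splunk's actual code: the rows after the first, the host names, the chunk boundaries, and the function name timestamps_from_chunk are all invented for illustration. It only shows why a parser that sees a later chunk without the header cannot locate a field by name.

```python
import csv
import io

# Hypothetical CSV file: one header line followed by three events.
LINES = [
    "collection_time,src_host,APstat",
    "1476691203,host-a,On",
    "1476691263,host-b,Off",
    "1476691323,host-c,On",
]

def timestamps_from_chunk(chunk_lines, field="collection_time"):
    """Return epoch values for `field`, or None when the chunk's first
    line is not a header that names the field."""
    rows = list(csv.reader(io.StringIO("\n".join(chunk_lines))))
    if not rows or field not in rows[0]:
        return None
    idx = rows[0].index(field)
    return [int(row[idx]) for row in rows[1:]]

# Parsing where the whole file (header included) is visible works.
print(timestamps_from_chunk(LINES))

# A first raw chunk still carries the header, so it works too.
print(timestamps_from_chunk(LINES[:2]))

# A later raw chunk has no header, so the field cannot be found by name.
print(timestamps_from_chunk(LINES[2:]))
```

In this model the only place every event can be matched to its field names is where the header is read, which corresponds to the UF in the setup described above.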

pkeller · 10-18-2016

Thank you. Yes. This issue is resolved now after deploying the INDEXED_EXTRACTIONS props to the UF. The latest ingestion of the CSV was properly timestamped. Appreciate the insight from everyone.

dsmc_adv · 05-05-2017

Worked for me too. It was turning month/day in the _time field and I didn't understand why

thisissplunk · 06-25-2018

Ugh, but it literally says INDEX in its name! Thank you guys.

lquinn · 10-17-2016

Have you had a look at the _internal index for any timestamp parsing errors?

pkeller · 10-17-2016

Thank you very much for your reply. Yes. I've done extensive searching in the splunkd logs and am not finding anything relevant to this datasource showing up on the indexers.

lquinn · 10-17-2016

Have you run a btool to check for configuration conflicts in props.conf?

pkeller · 10-17-2016

Yes, thank you. Running sudo /opt/splunk/bin/splunk btool props list --debug status_csv returns the properties I'd expect to see:

[Paul_Keller@Hostname ~]$ sudo /opt/splunk/bin/splunk btool props list --debug status_csv | grep local
/opt/splunk/etc/slave-apps/TA-gso_props/local/props.conf [status_csv]
/opt/splunk/etc/slave-apps/TA-gso_props/local/props.conf HEADER_FIELD_LINE_NUMBER = 1
/opt/splunk/etc/slave-apps/TA-gso_props/local/props.conf INDEXED_EXTRACTIONS = csv
/opt/splunk/etc/slave-apps/TA-gso_props/local/props.conf KV_MODE = none
/opt/splunk/etc/slave-apps/TA-gso_props/local/props.conf MAX_TIMESTAMP_LOOKAHEAD = 11
/opt/splunk/etc/slave-apps/TA-gso_props/local/props.conf SHOULD_LINEMERGE = false
/opt/splunk/etc/slave-apps/TA-gso_props/local/props.conf TIMESTAMP_FIELDS = collection_time
/opt/splunk/etc/slave-apps/TA-gso_props/local/props.conf TIME_FORMAT = %s

lquinn · 10-17-2016

Are you using a UF or HF to forward the data?

pkeller · 10-17-2016

This issue would be relevant to a UF.

pkeller · 10-17-2016

hmmm ... should the TIMESTAMP_FIELDS and TIME_FORMAT assignments be levied on the Universal Forwarder?

maciep · 10-17-2016

I think at least INDEXED_EXTRACTIONS needs to be on the UF, and so the remaining parsing settings might need to be there as well. Not sure if I'm reading the caveats section correctly, but the answer is probably in here somewhere:

http://docs.splunk.com/Documentation/Splunk/6.5.0/Data/Extractfieldsfromfileswithstructureddata

Why is CSV Timestamp recognition not working with my current props.conf for our production 6.3.2 indexer cluster?