Getting Data In

Why is CSV Timestamp recognition not working with my current props.conf for our production 6.3.2 indexer cluster?

Contributor

I have 3 environments:

Laptop - Splunk 6.5.0
Test - Splunk 6.4.3
Prod - Splunk 6.3.2

In the first two environments, I am able to pull in a CSV nightly and extract the timestamp from the first comma-separated field (in epoch form).

My props.conf:

[status_csv]
HEADER_FIELD_LINE_NUMBER = 1
INDEXED_EXTRACTIONS = csv
TIME_FORMAT = %s
TIMESTAMP_FIELDS = collection_time
MAX_TIMESTAMP_LOOKAHEAD = 11
KV_MODE = none
SHOULD_LINEMERGE = false

Sample data:

collection_time,src_host,APstat,def_date,def_version,foo,bar,foobar
1476691203,xxx-osx1010-3,On,2016-10-16 00:00:00.000,2016-10-16 rev. 022,No,local,Not installed

And yet when I push these configs to our Prod indexer cluster, the field extractions are created, but Splunk always stamps _time with the time the event was indexed. In both the Splunk Free environment on my laptop and our UAT environment (similar to Prod, just smaller and now running 6.4.3), the timestamp is correctly extracted from the 'collection_time' field in the CSV.

Either something must be overriding the props I've pushed, or something in the configuration is wrong.

1 Solution

Influencer

I believe you need to distribute your props.conf to your UFs. I'm guessing that in your first two environments you are loading the file on the Splunk server itself, as opposed to feeding it in through a UF?

While you are correct that timestamp parsing usually happens on the indexer or HWF tier, INDEXED_EXTRACTIONS is a huge exception: the UF sends fully parsed events. It has to, because it builds the index-time extracted fields using the field names from the file's header, and that header stays on the source node and is not forwarded. If the UF were blindly sending the file, the header would only go with the first chunk and would not be available for later lines. For a detailed diagram of the parsing steps, check out this wiki page or Amrit's perennial .conf talk on How Splunkd Works. You'll see that with structured parsing, the aggregator processor (which is responsible for timestamp assignment) runs on the UF.
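
As a rough sketch of what that means in practice, the same stanza can be shipped to each UF that monitors the file, in an app alongside the input. The app name and monitored path below are hypothetical; the stanza settings are copied from the question.

```ini
# $SPLUNK_HOME/etc/apps/TA-status_csv_uf/local/inputs.conf
# (app name and monitored path are hypothetical)
[monitor:///var/data/status/status.csv]
sourcetype = status_csv

# $SPLUNK_HOME/etc/apps/TA-status_csv_uf/local/props.conf
# (same settings as the stanza in the question)
[status_csv]
INDEXED_EXTRACTIONS = csv
HEADER_FIELD_LINE_NUMBER = 1
TIMESTAMP_FIELDS = collection_time
TIME_FORMAT = %s
```

KV_MODE is a search-time setting, so it would still belong on the search head rather than on the forwarder.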

One interesting side effect of all this is that you can actually route events to the nullQueue on the UF when using INDEXED_EXTRACTIONS, because the regexreplacement processor also runs as part of the structured parsing pipeline on the UF.
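
A minimal sketch of that, assuming you wanted to drop rows whose last column is "Not installed" (a made-up filter for illustration; the transform name is hypothetical too). Both files would live on the UF, next to the INDEXED_EXTRACTIONS props:

```ini
# props.conf on the UF
[status_csv]
TRANSFORMS-drop_not_installed = status_csv_drop_not_installed

# transforms.conf on the UF
[status_csv_drop_not_installed]
# Matched against the raw line; drops rows ending in ",Not installed"
REGEX = ,Not installed$
DEST_KEY = queue
FORMAT = nullQueue
```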

Contributor

Thank you. Yes. This issue is resolved now after deploying the INDEXED_EXTRACTIONS props to the UF. The latest ingestion of the CSV was properly timestamped. Appreciate the insight from everyone.

Path Finder

Worked for me too. It was swapping month and day in the _time field, and I didn't understand why.

Builder

Ugh, but it literally says INDEX in its name! Thank you guys.

Contributor

Have you had a look at the _internal index for any timestamp parsing errors?

Contributor

Thank you very much for your reply. Yes. I've done extensive searching in the splunkd logs and am not finding anything relevant to this datasource showing up on the indexers.

Contributor

Have you run a btool to check for configuration conflicts in props.conf?

Contributor

Yes, thank you. Running sudo /opt/splunk/bin/splunk btool props list --debug status_csv returns the same properties that I'd expect to see:

[Paul_Keller@Hostname ~]$ sudo /opt/splunk/bin/splunk btool props list --debug status_csv | grep local
/opt/splunk/etc/slave-apps/TA-gso_props/local/props.conf [status_csv]
/opt/splunk/etc/slave-apps/TA-gso_props/local/props.conf HEADER_FIELD_LINE_NUMBER = 1
/opt/splunk/etc/slave-apps/TA-gso_props/local/props.conf INDEXED_EXTRACTIONS = csv
/opt/splunk/etc/slave-apps/TA-gso_props/local/props.conf KV_MODE = none
/opt/splunk/etc/slave-apps/TA-gso_props/local/props.conf MAX_TIMESTAMP_LOOKAHEAD = 11
/opt/splunk/etc/slave-apps/TA-gso_props/local/props.conf SHOULD_LINEMERGE = false
/opt/splunk/etc/slave-apps/TA-gso_props/local/props.conf TIMESTAMP_FIELDS = collection_time
/opt/splunk/etc/slave-apps/TA-gso_props/local/props.conf TIME_FORMAT = %s

Contributor

Are you using a UF or HF to forward the data?

Contributor

This issue would be relevant to a UF.

Contributor

Hmmm ... should the TIMESTAMP_FIELDS and TIME_FORMAT settings be applied on the Universal Forwarder?

Champion

I think at least INDEXED_EXTRACTIONS needs to be on the UF, so the remaining parsing settings might need to be there as well. I'm not sure I'm reading the caveats section correctly, but the answer is probably in here somewhere:

http://docs.splunk.com/Documentation/Splunk/6.5.0/Data/Extractfieldsfromfileswithstructureddata
