Getting Data In

Why are timestamps parsed correctly for only one of two inputs? Do TIME_FORMAT & TIME_PREFIX work on a Universal Forwarder?

flle
Path Finder

I stumbled across an interesting issue and need some advice / hints here.

I have two sourcetypes where I need some TIME_FORMAT and TIME_PREFIX mangling to correctly parse the timestamps.
When setting up the new data input (a batch input for files), I put the configuration in props.conf on the universal forwarder for the first input, and after some regex tuning of TIME_PREFIX, it worked fine.
For the second input, however, I could not get it to work. The difference is that I have INDEXED_EXTRACTIONS for the first input.
I then remembered that TIME_FORMAT & TIME_PREFIX are applied in the parsing phase and thus can only be done on the indexers or heavy forwarders (see also: http://wiki.splunk.com/Where_do_I_configure_my_Splunk_settings).

So now I am confused: do TIME_FORMAT & TIME_PREFIX also work on a universal forwarder and I am just overlooking an error in my second props.conf, or did Splunk miraculously fix the timestamp extraction on its own, regardless of my props.conf changes? :-)
Or does TIME_* only work in conjunction with INDEXED_EXTRACTIONS?

Log sample input 1: (desired timestamp is bold)

"hostname_10.1.2.3_**2015-07-22T15_01_43Z**","HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Run","2014-05-28T10:28:33Z","expand_sz","SysTrayApp","C:\Program Files\IDT\WDM\sttray64.exe","TRUE","FALSE","2013-11-06T22:07:24Z","2014-05-28T09:03:46Z","2014-05-28T09:03:46Z","C:\Program Files\IDT\WDM\sttray64.exe","1703424","IDT, Inc.","IDT PC Audio","1.0.6496.0","IDT PCA","Copyright © 2004 - 2009 IDT, Inc.","sttray64.exe","IDT PC Audio","1.0.6496.0","FALSE","FALSE","TRUST_E_NOSIGNATURE","The file is not signed","","","1f918ddae59e246b8f48ce5aa400b3aa","8896809e855ae08b43e41b25a6bdca8ed1905bbfc59e7b779070eaa0bbc1b319"

Log sample input 2: (desired timestamp is bold)

"05.08.2015 10:22:36";"3";"Network connection detected:;SequenceNumber: 161522;UtcTime: **05.08.2015 08:17:49.149 AM**;ProcessGuid: {6B887E38-96AB-55AC-0000-0010EB030000};ProcessId: 4;Image: System;User: NT-AUTORIT\xC4T\SYSTEM;Protocol: udp;Initiated: false;SourceIsIpv6: false;SourceIp: 10.1.2.3;SourceHostname: ;SourcePort: 137;SourcePortName: netbios-ns;DestinationIsIpv6: false;DestinationIp: 10.1.2.3;DestinationHostname: myhostnamet;DestinationPort: 137;DestinationPortName: netbios-ns"

inputs.conf
[batch://d:\Splunk\sysmon\]
disabled = 0
sourcetype = sysmon
move_policy = sinkhole
index=testing

[batch://d:\Splunk\regdump\]
disabled = 0
sourcetype = regdump
move_policy = sinkhole
crcSalt = <SOURCE>
index=testing

props.conf
[regdump]
INDEXED_EXTRACTIONS = CSV
TIME_FORMAT = %Y-%m-%dT%H_%M_%S%Z
TIME_PREFIX = ^([^_]*_){2}

[sysmon]
TIME_FORMAT = %d.%m.%Y %I:%M:%S.%3N %p
TIME_PREFIX = ^([^;]*;){4}UtcTime:\s+

[source::...sysmon*csv]
TZ = UTC
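The two TIME_PREFIX regexes and the sysmon TIME_FORMAT can be sanity-checked outside Splunk. The sketch below uses Python's `re` and `datetime.strptime` against truncated copies of the sample events; note that Splunk's `%3N` (milliseconds) has no direct strptime equivalent, so Python's `%f` (fractional seconds) is used as the closest analogue — this is an illustration, not Splunk's actual parser.

```python
import re
from datetime import datetime

# Truncated sample events from the question (full events shortened here)
regdump = '"hostname_10.1.2.3_2015-07-22T15_01_43Z","HKLM...'
sysmon = ('"05.08.2015 10:22:36";"3";"Network connection detected:;'
          'SequenceNumber: 161522;UtcTime: 05.08.2015 08:17:49.149 AM;...')

# TIME_PREFIX for regdump: skip everything up to and including the second '_'
m = re.match(r'^([^_]*_){2}', regdump)
print(regdump[m.end():m.end() + 20])   # 2015-07-22T15_01_43Z

# TIME_PREFIX for sysmon: skip four ';'-delimited fields, then 'UtcTime: '
m = re.match(r'^([^;]*;){4}UtcTime:\s+', sysmon)
raw_ts = sysmon[m.end():m.end() + 26]  # 05.08.2015 08:17:49.149 AM

# Splunk's %3N -> Python's %f (closest analogue for the fractional part)
print(datetime.strptime(raw_ts, '%d.%m.%Y %I:%M:%S.%f %p'))
```

Both prefixes land exactly on the desired timestamps, so the regexes themselves are not the problem — which points at where the settings are applied rather than how they are written.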

The regdump sourcetype was the first one I integrated, and by default Splunk extracted a timestamp from further down in the event (2013-11-06T22:07:24Z). After I configured a matching TIME_PREFIX regex and TIME_FORMAT, the desired timestamp is now extracted.
For the sysmon sourcetype, however, it does not work.

So what is the deal here? What am I missing?

Thanks for any hints.

1 Solution

flle
Path Finder

woodcock, thanks for the update. I figured out the issue in the meantime and there was more to it, hence I am adding an answer myself 🙂
If you forward structured data from a forwarder to an indexer, the indexer does NOT parse those events again (the parsing, aggregation, and typing queues are skipped). See the "Caveats" section here: http://docs.splunk.com/Documentation/Splunk/6.2.6/Data/Extractfieldsfromfileheadersatindextime

In my case, INDEXED_EXTRACTIONS on the Universal Forwarder turns the data into structured data, so the indexer ignores any props or transforms on the indexers for this data.
For the timestamp issue I could actually work around that by adding TIMESTAMP_FIELDS on the forwarder, but only if Splunk can auto-identify the time format.
Since the parsing capabilities of a universal forwarder are limited to the parsing functions of the INPUT phase (see http://wiki.splunk.com/Where_do_I_configure_my_Splunk_settings%3F), and the data is not parsed again on the indexer when using indexed extractions on the UF, this puts some constraints on my overall parsing capabilities. I basically lose all capabilities of the PARSING phase.

Conclusion: When using INDEXED_EXTRACTIONS on a UF, be sure that you can achieve all desired parsing with the capabilities of the INPUT phase. Otherwise you have to use a Heavy Forwarder and do all the parsing there, or forward unparsed data from the UF to the indexer and do all the parsing on the indexer.
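The TIMESTAMP_FIELDS workaround mentioned above could look roughly like this on the Universal Forwarder. This is a sketch, not a tested config: the sample CSV has no header row, so the column names here (and the FIELD_NAMES line, shortened to the first few columns) are hypothetical.

```
# props.conf on the Universal Forwarder -- a sketch, not a tested config.
# FIELD_NAMES and the "key_name_ts" column name are assumed/hypothetical;
# a real config would list every column of the CSV.
[regdump]
INDEXED_EXTRACTIONS = CSV
FIELD_NAMES = key_name_ts, registry_path, last_write_ts
TIMESTAMP_FIELDS = key_name_ts
# Per the findings above, this only worked when Splunk could
# auto-identify the time format in the chosen field.
```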

woodcock
Esteemed Legend

In addition to the other 2 answers (both important), I do not see that you have a TIMESTAMP_FIELDS= line. You need to add this to the [regdump] stanza (and deploy to the forwarder and restart Splunk there) and it should work fine.


woodcock
Esteemed Legend

When you use INDEXED_EXTRACTIONS, your Universal Forwarder acts more like a Heavy Forwarder for this input, in that some of the indexing work is now done on the forwarder instead of the indexers, which necessitates that you deploy your props.conf file to your forwarder. But you still also need to deploy it to your indexers so that your normal timestamping functions (which have not moved) can be done properly. So put your props.conf on both your forwarders and your indexers, restart all the Splunk instances there, and it should work.


somesoni2
Revered Legend

Yes, the event breaking and the timestamp parsing happen only on indexers/heavy forwarders. Your first log got lucky, as its date format matches one of Splunk's default timestamp formats. Move the configuration to your indexers and try again.
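Splunk's automatic timestamp recognition is internal to Splunk, but the "got lucky" contrast can be illustrated with Python's ISO 8601 parser as a rough analogy (not Splunk's actual logic): the timestamp Splunk picked up by default in log 1 is ISO-like, while the sysmon one is not.

```python
from datetime import datetime

# The timestamp auto-extracted from log 1 is ISO 8601-like,
# so a generic parser recognizes it without a format string:
print(datetime.fromisoformat('2013-11-06T22:07:24'))

# The sysmon timestamp is not in a standard layout, so automatic
# recognition fails and an explicit format is required:
try:
    datetime.fromisoformat('05.08.2015 08:17:49.149 AM')
except ValueError:
    print('needs an explicit TIME_FORMAT')
```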
