Timestamping difficulties on files with the same names but with different formats

Hey, we are having difficulty getting accurate timestamps on files with the same names that are being forwarded from multiple servers to a single indexer. Files with the same names (stderr.log, stdout.log, etc.) are scattered across the servers, and their timestamps are formatted differently from one server to the next. They are all being recognized as dates in the future (i.e. 7/1/11).

For example (each line is from a differently formatted stdout.log located on a different machine):

<Wednesday January  5 17:15:20 EST 2011> : 05/01/2011 17:15:20 <DispatchThread> DEBUG [MarketImpl] Processing BI9 heartbeat
<Wednesday January  5 17:18:03 EST 2011> : <5/01/2011 05:18:03 PM EST> <Debug> <HTTP> <BEA-101147> <HttpServer(12101878,n……..
1/05/11 04:40:37 PM EST [INFO] [IntroscopeAgent] Activating JMX Data Collection
<5/01/2011 04:40:23 PM EST> <Notice> <Security> <BEA-090082> <Security initializing using security realm myrealm.>

I have a feeling that if I could ignore the “EST” string it would work: Splunk thinks EST is US EST, not Australian EST (where we are located). Do you have any ideas how I could accomplish this, given that all the logs have the same file name?

I noticed that there are options for a prefix and a lookahead. Is there any way to do a postfix (i.e. a max position)?




First, can you set a different sourcetype for each of these inputs? On the production servers, set the sourcetype for the inputs. For example, if you had the following in inputs.conf on a forwarder:
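A monitor stanza of roughly this shape would fit the description. This is only a sketch: the directory path is a placeholder and not taken from the original post.

```ini
# inputs.conf on the forwarder
# /opt/app/myLogFolder is a hypothetical path; substitute the real log directory.
[monitor:///opt/app/myLogFolder]
disabled = false
```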


Then Splunk would automatically monitor and forward data from all the files in myLogFolder. You could set the sourcetype for individual files in props.conf on the forwarder.
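As a rough sketch of such an override, assuming the same hypothetical path as above and an illustrative sourcetype name, a source-based stanza could look like this:

```ini
# props.conf on the forwarder
# Path and sourcetype name are placeholders for illustration only.
[source::/opt/app/myLogFolder/stdout.log]
sourcetype = mySourceType-1
```

Because every server uses the same file names, each forwarder would carry its own copy of this stanza with the sourcetype that matches the timestamp format used on that machine.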


(Do this for each individual file that needs a new sourcetype.)

Now, on the indexer(s), where timestamps are actually parsed, you can specify timestamp processing based on the sourcetype. Again, use props.conf:

TIME_FORMAT=%a %b %d %H:%M:%S

The way I specified this matches the first line of your example. It skips the < and then looks at the timestamp, but skips the timezone (and year). Splunk will take the current year if not specified in the timestamp, so this is probably okay if you are not loading older data. You would need to make a similar stanza for each sourcetype.
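Putting that together for the sample lines in the question, a set of per-sourcetype stanzas might look like the sketch below. The sourcetype names are hypothetical, the date order of each format (month-first or day-first) is inferred from the samples all being from 5 January 2011, and none of this has been tested against a live indexer. The sample line spells out the day and month names in full, so the sketch uses %A and %B; the abbreviated %a and %b from the line above may behave the same way.

```ini
# props.conf on the indexer (sourcetype names are placeholders)

# Matches: <Wednesday January  5 17:15:20 EST 2011> : ...
# TIME_PREFIX skips the leading "<"; the format stops before "EST" and the year.
[mySourceType-1]
TIME_PREFIX = ^<
TIME_FORMAT = %A %B %d %H:%M:%S
# Optional, and an assumption about your location: since the format no longer
# reads the zone from the event, it can be set explicitly.
# TZ = Australia/Sydney

# Matches: 1/05/11 04:40:37 PM EST [INFO] ...
# Assumed to be month/day/two-digit year.
[mySourceType-2]
TIME_FORMAT = %m/%d/%y %I:%M:%S %p

# Matches: <5/01/2011 04:40:23 PM EST> <Notice> ...
# Assumed to be day/month/four-digit year.
[mySourceType-3]
TIME_PREFIX = ^<
TIME_FORMAT = %d/%m/%Y %I:%M:%S %p
```

In each case the format string ends before the "EST" token, which is what keeps Splunk from reading it as a US time zone.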

See the manual page "Configure timestamp recognition" for more timestamp settings.

Finally, if all these inputs are actually variations of the same type of data, you could create individual sourcetypes but name them similarly, for example mySourceType-1, mySourceType-2, etc. Then you could easily search them all by specifying sourcetype=mySourceType* in your search string.
