Timestamping difficulties on files with the same names but with different formats

pl123
Path Finder

Hey, we are having some difficulty getting accurate timestamps on files with the same names, which are being forwarded from multiple servers to a single indexer. The files (stderr.log, stdout.log, etc.) are scattered across the servers, and the same file name can have a differently formatted timestamp on each server. They are all being recognized as dates in the future (e.g. 7/1/11).

For example (each line comes from a differently formatted stdout.log located on a different machine):

<Wednesday January  5 17:15:20 EST 2011> : 05/01/2011 17:15:20 <DispatchThread> DEBUG [MarketImpl] Processing BI9 heartbeat
<Wednesday January  5 17:18:03 EST 2011> : <5/01/2011 05:18:03 PM EST> <Debug> <HTTP> <BEA-101147> <HttpServer(12101878,n……..
1/05/11 04:40:37 PM EST [INFO] [IntroscopeAgent] Activating JMX Data Collection
<5/01/2011 04:40:23 PM EST> <Notice> <Security> <BEA-090082> <Security initializing using security realm myrealm.>

I have a feeling that if I could ignore the “EST” string it would work: Splunk thinks EST is US EST, not Australian EST (where we are located). Do you have any ideas how I could accomplish this, given that all the logs have the same file name?

I noticed that there are options for a prefix and a lookahead. Is there any way to do a postfix (i.e. a max position)?

Cheers

lguinn2
Legend

First, can you set a different sourcetype for each of these inputs? On the production servers, set the sourcetype for the inputs. For example, if you had the following in inputs.conf on a forwarder:

#inputs.conf
[monitor:///opt/myLogFolder]

Then Splunk would automatically monitor and forward data from all the files in myLogFolder. You could set the sourcetype for individual files in props.conf on the forwarder:

#props.conf
[source::/opt/myLogFolder/stdout.log]
sourcetype=mySourceType

(Do this for each individual file that needs a new sourcetype.)
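
As an alternative sketch, you could also assign the sourcetype directly in inputs.conf by monitoring the individual files instead of the whole folder. Since each forwarder has its own inputs.conf, the same file name can map to a different sourcetype on each server. The server labels, path and sourcetype names below are placeholders, not taken from your environment:

```ini
# inputs.conf on server A (path and sourcetype names are hypothetical)
[monitor:///opt/myLogFolder/stdout.log]
sourcetype = mySourceType-1

# inputs.conf on server B, where stdout.log uses a different timestamp format
[monitor:///opt/myLogFolder/stdout.log]
sourcetype = mySourceType-2
```

Use one approach or the other for a given file, not both.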

Now, on the indexer(s), where timestamps are actually parsed, you can specify timestamp processing based on the sourcetype. Again, use props.conf:

#props.conf
[mySourceType]
TIME_PREFIX=\<
TIME_FORMAT=%A %B %e %H:%M:%S

The way I specified this matches the first line of your example. It skips the < and then looks at the timestamp, but skips the timezone (and year). Splunk will take the current year if not specified in the timestamp, so this is probably okay if you are not loading older data. You would need to make a similar stanza for each sourcetype.
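
A similar stanza for the last line of your example (the one starting with <5/01/2011 04:40:23 PM EST>) might look roughly like this. It is an untested sketch: the sourcetype name is a placeholder, I am assuming the date is day/month/year, and Australia/Sydney is only a guess at your location. MAX_TIMESTAMP_LOOKAHEAD limits how many characters after TIME_PREFIX Splunk examines, which is the closest thing to the "max position" you asked about. TZ should tell Splunk which zone to apply, since the format never reads the EST string.

```ini
# props.conf on the indexer (sourcetype name is hypothetical)
[mySourceType-2]
# Timestamp starts right after the opening < at the start of the line
TIME_PREFIX = ^<
# Assumes day/month/year with a 12-hour clock, e.g. 5/01/2011 04:40:23 PM
TIME_FORMAT = %d/%m/%Y %I:%M:%S %p
# Stop before the EST string (longest expected timestamp is 22 characters)
MAX_TIMESTAMP_LOOKAHEAD = 22
# Assumed zone for Australian Eastern time; change to match your servers
TZ = Australia/Sydney
```

Check the result against a few known events before relying on it, especially around the 12-hour AM/PM values.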

Look at the manual page "Configure timestamp recognition" for more timestamp settings.

Finally, if all these inputs are actually variations of the same type of data, you could create individual sourcetypes, but name them similarly. For example, mySourceType-1, mySourceType-2, etc. Then you could easily search them all by specifying sourcetype=mySourceType* in your search string.
