We are having difficulty getting accurate timestamps on files with the same names that are being forwarded from multiple servers to a single indexer. Files with the same name (stderr.log, stdout.log, etc.) are scattered across the servers, and their timestamps are formatted differently from machine to machine. They are all being recognized as dates in the future (e.g. 7/1/11).
For example (each line is from a differently formatted stdout.log located on a different machine):
<Wednesday January 5 17:15:20 EST 2011> : 05/01/2011 17:15:20 <DispatchThread> DEBUG [MarketImpl] Processing BI9 heartbeat
<Wednesday January 5 17:18:03 EST 2011> : <5/01/2011 05:18:03 PM EST> <Debug> <HTTP> <BEA-101147> <HttpServer(12101878,n……..
1/05/11 04:40:37 PM EST [INFO] [IntroscopeAgent] Activating JMX Data Collection
<5/01/2011 04:40:23 PM EST> <Notice> <Security> <BEA-090082> <Security initializing using security realm myrealm.>
I have a feeling that if I could ignore the “EST” string it would work: Splunk thinks EST is US EST, not Australian EST (where we are located). Do you have any ideas how I could accomplish this, given that all the logs have the same file name?
I noticed that there are options for a prefix and a lookahead. Is there any way to do a postfix (i.e. a max position)?
The way I specified this matches the first line of your example. It skips the < and then looks at the timestamp, but skips the timezone (and year). Splunk will take the current year if not specified in the timestamp, so this is probably okay if you are not loading older data. You would need to make a similar stanza for each sourcetype.
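As a rough sketch of what such a stanza might look like in props.conf on the indexer, assuming the first example line belongs to a sourcetype named mySourceType-1 (a hypothetical name) and that the servers are on Sydney time (also an assumption; substitute your own zone). TIME_PREFIX, TIME_FORMAT, MAX_TIMESTAMP_LOOKAHEAD and TZ are standard props.conf settings, but the values here are illustrative and should be checked against your own data:

```
# props.conf (sketch; stanza name and all values are assumptions)
[mySourceType-1]
# Start timestamp extraction right after the leading "<"
TIME_PREFIX = ^<
# Match "Wednesday January 5 17:15:20"; the format stops before "EST 2011",
# so the timezone string and the year are not parsed
TIME_FORMAT = %A %B %d %H:%M:%S
# Upper bound on how many characters past TIME_PREFIX Splunk scans
# for the timestamp; 32 covers the longest weekday and month names
MAX_TIMESTAMP_LOOKAHEAD = 32
# Interpret the parsed time in this zone instead of guessing from "EST"
TZ = Australia/Sydney
```

MAX_TIMESTAMP_LOOKAHEAD is the closest thing to the "max position" asked about: it is counted from the end of the TIME_PREFIX match, so keeping it small stops Splunk from reading further into the line.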
Finally, if all these inputs are actually variations of the same type of data, you could create individual sourcetypes, but name them similarly. For example, mySourceType-1, mySourceType-2, etc. Then you could easily search them all by specifying sourcetype=mySourceType* in your search string.
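Since the files share the same name on every machine, one way to give each format its own sourcetype is to set it in inputs.conf on each forwarder. This is only a sketch: the monitor path below is a placeholder, not your actual log location, and the sourcetype names are the hypothetical ones from above.

```
# inputs.conf on the first forwarder (path is a placeholder)
[monitor:///path/to/logs/stdout.log]
sourcetype = mySourceType-1
```

```
# inputs.conf on the second forwarder (path is a placeholder)
[monitor:///path/to/logs/stdout.log]
sourcetype = mySourceType-2
```

Each sourcetype would then get its own timestamp stanza in props.conf on the indexer.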