A WebSphere server, in particular the websphere_trlog, appears to be getting over-indexed by a huge amount.
Checking http://server:port/en-US/app/search/indexing_volume shows 30 GB worth of data on a single file, /ntfs/kahobtwas39Jlog/PROD/XAG_3_1/SystemOut.log. The logs in that directory add up to about 30 MB, yet Splunk appears to have collected more than 30 GB.
Logs are rotating based on time, but they should still not be anywhere near 30 GB, and the rotated logs are not whitelisted. inputs.conf settings:
[monitor:///ntfs/kahobtwas38Jlog]
disabled = false
crcSalt = <SOURCE>
host = kahobtwas38.kah.unitrininc.com
sourcetype = websphere_trlog_sysout
whitelist = SystemOut\.log|SystemErr\.log
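One way to sanity-check the on-disk side of this comparison is to total the size of the files the whitelist would actually match. The following is a minimal Python sketch, not anything Splunk ships: the function name `whitelisted_size` is hypothetical, and it assumes the whitelist regex is applied as an unanchored match against each file's full path.

```python
import os
import re

def whitelisted_size(root, whitelist):
    """Return (total_bytes, matched_paths) for files under root whose
    full path matches the whitelist regex (unanchored search)."""
    pattern = re.compile(whitelist)
    total = 0
    matched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if pattern.search(path):
                total += os.path.getsize(path)
                matched.append(path)
    return total, sorted(matched)

if __name__ == "__main__":
    # Path and regex copied from the inputs.conf stanza above.
    total, paths = whitelisted_size(
        "/ntfs/kahobtwas38Jlog", r"SystemOut\.log|SystemErr\.log"
    )
    for p in paths:
        print(p)
    print("total MB: %.1f" % (total / (1024.0 * 1024.0)))
```

If the total printed here is in the tens of megabytes while the indexing volume page reports gigabytes, the extra volume is more likely to come from the same files being read repeatedly than from extra files being picked up.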
They are getting a lot of DateParserVerbose errors, so is it possible that events are getting over-indexed because date extraction is failing?
07-01-2010 15:50:03.880 WARN DateParserVerbose - Time parsed (Sat Dec 1 14:50:15 2007) is too far away from the previous event's time (Thu Jul 1 15:50:15 2010) to be accepted. If this is a correct time, MAX_DIFF_SECS_AGO (3600) or MAX_DIFF_SECS_HENCE (604800) may be overly restrictive. Context="source::/ntfs/kahobtwas39Jlog/PROD/XAG_3_1/SystemOut.log|host::kahobtwas39.kah.unitrininc.com|websphere_trlog_sysout|"
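To make the warning concrete, here is a rough Python sketch of the window check it describes, using the two limits and the two timestamps quoted in the message. This is a simplified model only: the function name `within_window` is hypothetical, and the real date parser may apply further rules before accepting or rejecting a timestamp.

```python
from datetime import datetime

MAX_DIFF_SECS_AGO = 3600       # value reported in the warning
MAX_DIFF_SECS_HENCE = 604800   # value reported in the warning

def within_window(parsed, previous,
                  max_ago=MAX_DIFF_SECS_AGO,
                  max_hence=MAX_DIFF_SECS_HENCE):
    """True if the parsed time is no more than max_ago seconds before,
    and no more than max_hence seconds after, the previous event's time."""
    delta = (parsed - previous).total_seconds()
    return -max_ago <= delta <= max_hence

previous = datetime(2010, 7, 1, 15, 50, 15)   # previous event's time
parsed = datetime(2007, 12, 1, 14, 50, 15)    # time parsed from the event

print(within_window(parsed, previous))  # False
```

A parsed time more than two and a half years behind the previous event falls far outside the one-hour look-back window, which is why the parser refuses it.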
Perhaps just turning off date extraction would help resolve this, along the lines of the following? Or any other ideas?
/opt/splunk/etc/system/local/props.conf
[websphere_trlog_sysout]
DATETIME_CONFIG = CURRENT
I've tried monitoring with file inputs set to DEBUG, but I'm not seeing anything useful.
Please indicate Splunk version of forwarder and indexer, if applicable, as well as type of indexer. Also indicate if there is a disparity between metrics logging and license volume.
Sorry for the delay; I was trying to recommend that the client use the WebSphere app. It turns out it won't work for them.
The indexer is 4.1.3. It's monitoring network shares (CIFS/NTFS mounts) such as:
[monitor:///ntfs/kahobtwas39Jlog]
disabled = false
crcSalt = < SOURCE >
host = kahobtwas39.kah.unitrininc.com
sourcetype = websphere_trlog_sysout
_whitelist = (SystemOut\.log$|SystemErr\.log$)
blacklist = (SystemOut_\d+.*|SystemErr_\d+.)
I tried adding those whitelist/blacklist entries to filter out the rotated logs. Still the same behavior.
The first thing to look for in a case like this is duplicate events. If there are no duplicate events, where is the volume coming from? If there are, take a look at when these events are indexed by looking at _indextime to see when the data was indexed.
As an aside, why is the crcSalt set? Also, setting DATETIME_CONFIG here is a bad idea. The root problem is that event breaking isn't working properly, and we need better configurations there.
The crcSalt was set because the WebSphere logs all have a really big header which is identical in all the rotated logs, and Splunk wouldn't index the next SystemOut.log when it rotated.
I'll check on the duplicate events with the _indextime value. Thanks.
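The crcSalt explanation above can be illustrated with a small Python sketch. It assumes the file monitor identifies a file by a checksum over its first 256 bytes and that crcSalt = <SOURCE> mixes the file's full path into that checksum. `zlib.crc32` is only a stand-in for Splunk's internal checksum, and the `head_crc` helper, the header bytes, and the two paths are hypothetical.

```python
import zlib

def head_crc(data, source=None, length=256):
    """Checksum of the first `length` bytes, optionally salted with the
    source path (a stand-in for crcSalt = <SOURCE>)."""
    head = data[:length]
    if source is not None:
        head = source.encode("utf-8") + head
    return zlib.crc32(head) & 0xFFFFFFFF

# Two files that share an identical header longer than 256 bytes.
header = b"*" * 300
file_a = header + b"event from the first log\n"
file_b = header + b"event from the second log\n"

# Without a salt the two files are indistinguishable by checksum.
same_unsalted = head_crc(file_a) == head_crc(file_b)

# With the path as salt, the checksums differ even though the headers match.
same_salted = (head_crc(file_a, "/logs/app1/SystemOut.log")
               == head_crc(file_b, "/logs/app2/SystemOut.log"))

print(same_unsalted)  # True
print(same_salted)    # False
```

Note that the salt only distinguishes files by path. A file that keeps the same path and the same header after rotation still produces the same salted checksum in this model, so how it is treated depends on other checks that this sketch does not cover.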