I'm running Splunk version 4.1.5, build 85165 on a Win2003 32-bit server with a dual-core CPU and 4GB RAM. I realize Win2003 32-bit is not recommended, but I'm guessing it might not be the root cause of this problem.
I have around 50 hosts sending syslog messages to it on UDP:514 and I realized that the server is not indexing around 50% of data with the right timestamp. There are arbitrary gaps of several minutes after which it starts indexing properly again.
Here are two samples of the tons of errors I'm seeing in splunkd.log. Interestingly, they all have the same incorrectly parsed timestamp, 'Sat Nov 27 00:02:19 2010'. What could possibly be causing this? I've compared Wireshark captures of the affected messages with captures of messages that are indexed properly and I don't see any differences. It's making the server unusable for the monitoring I need it to do.
12-01-2010 19:44:42.449 WARN DateParserVerbose - The TIME_FORMAT specified is matching timestamps (Sat Nov 27 00:02:19 2010) outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE.
12-01-2010 19:44:42.449 WARN DateParserVerbose - Failed to parse timestamp for event. Context="source::udp:514|host::Splunk-Server|syslog|" Text="I:..."
I think there's a configuration problem with how your data is being fed to Splunk.
Your data's sourcetype is set to syslog, which is a single-line sourcetype (meaning one event per line, in this case with a timestamp on each line). But if you look at your error message, you see 'Text="I:..."', which means that the event's text starts with "I:", which is definitely not a syslog event with a timestamp! Splunk is rightfully complaining.
It seems that perhaps you have different streams of data, non-syslog data, coming in over udp:514.
If you were to store a sample of your syslog events as a file, and have Splunk index that, I believe it would process it rather uneventfully, so to speak.
Hope that helps.
Thanks for the pointer. It turns out each syslog message has one or more newlines (apparently not uncommon) and Splunk is treating them as separate events. Is there any way to make it ignore newlines or replace them with a different character that results in a single line message? Here's a sample message:
Dec 2 10:42:40 10.21.0.161 MGSIG:3318 <<< Recv from 188.8.131.52:2727 ---\nAUEP 337325746 00125116D951@[184.108.40.206] MGCP 1.0\nK: 337325674\nF: N,I,ES
I'm not sure whether storing the syslog events as a file will make this problem go away.
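To make the behavior described above concrete, here is a minimal Python sketch of the difference between breaking on every newline and breaking only where a line starts with a syslog-style timestamp. It is not Splunk's implementation; the function names (split_per_line, merge_continuations), the timestamp regex, and the choice of a space as the replacement character are all assumptions made for illustration. The sample payload is the message quoted above, with the "\n" sequences treated as real newlines.

```python
import re

# Assumed pattern for a classic syslog timestamp at the start of a line,
# e.g. "Dec 2 10:42:40 " or "Dec  2 10:42:40 ". Illustration only.
SYSLOG_TS = re.compile(r"^[A-Z][a-z]{2} {1,2}\d{1,2} \d{2}:\d{2}:\d{2} ")


def split_per_line(payload):
    """One event per non-empty line (the single-line behavior)."""
    return [line for line in payload.split("\n") if line]


def merge_continuations(payload, joiner=" "):
    """Start a new event only at a line that begins with a timestamp.

    Lines without a leading timestamp are appended to the previous
    event, with the newline replaced by `joiner`.
    """
    events = []
    for line in split_per_line(payload):
        if SYSLOG_TS.match(line) or not events:
            events.append(line)
        else:
            events[-1] += joiner + line
    return events


sample = (
    "Dec 2 10:42:40 10.21.0.161 MGSIG:3318 <<< Recv from 188.8.131.52:2727 ---\n"
    "AUEP 337325746 00125116D951@[184.108.40.206] MGCP 1.0\n"
    "K: 337325674\n"
    "F: N,I,ES"
)

naive = split_per_line(sample)        # four fragments, three without a timestamp
merged = merge_continuations(sample)  # one event, newlines replaced by spaces

for fragment in naive:
    print("per-line:", fragment)
for event in merged:
    print("merged:  ", event)
```

With per-line splitting, fragments such as "K: 337325674" have no timestamp of their own, which matches the kind of text the DateParserVerbose warning reports. In Splunk itself this is normally controlled through line-breaking settings in props.conf (for example SHOULD_LINEMERGE and BREAK_ONLY_BEFORE_DATE); whether and how those apply to a udp:514 input on 4.1.5 should be checked against the documentation for that version.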