I have several GroupWise servers running forwarders that send to a single index server. For the most part the data is arriving and being indexed, but it is not being segmented properly. The entries are being indexed in groups of 100+ lines, and lines are being split across indexed entries. For instance, this entry:
7/12/11 8:43:57.000 AM
/mslocal/mshold/po2be3/4/00065cef.001
08:43:57 496 MTP: panthera.macewan.ca: Transmitting file /gw/podom1/mslocal/mshold/po2be3/4/00065cef.001, Size: 22748
08:43:57 496 MTP: panthera.macewan.ca: End-of-file confirmation packet received
08:43:59 304 MTP: po5.podom1: Returning acknowledge (11)
08:43:59 304 MTP: po5.podom1: Returning acknowledge (11)
Show all 108 lines
The first line is the date/timestamp associated with the entry.
The next line (with no timestamp) comes from the end of the previous entry and has been split off from its timestamp.
The next 4 lines are properly formatted. However, if I were to "show all 108 lines", the final line would be broken at a random spot and its remainder added at the start of the next entry.
It appears that Splunk is not recognizing the timestamp as the start of an entry and is just grouping the data as it receives it from the forwarder into a single indexed entry.
How do I solve this???
It looks like you need to tell Splunk more about how to recognize the timestamp. It isn't uncommon for Splunk to need explicit instruction in order to improve timestamp recognition. This configuration is done on the indexer, where the data from your forwarder is parsed.
http://www.splunk.com/base/Documentation/latest/Data/Configuretimestamprecognition
You should probably use TIME_FORMAT and TIME_PREFIX in props.conf. Since this looks like multi-line data, you'd probably use a caret (^) as the prefix so that the timestamp is only matched at the start of a line. Something like this should work:
[mysourcetype]
TIME_PREFIX = ^
TIME_FORMAT = %m/%d/%y %I:%M:%S.%3N %p
MAX_TIMESTAMP_LOOKAHEAD = 22
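One thing to double-check: the 7/12/11 8:43:57.000 AM line in the question looks like the timestamp Splunk displays for the event, while the raw lines themselves start with a bare time such as 08:43:57. If that is the case, a variant along these lines may fit better. This is only a sketch; [mysourcetype] is a placeholder and the values assume every raw line begins with HH:MM:SS:

```ini
[mysourcetype]
# Assumption: each raw log line starts with a 24-hour time, e.g. "08:43:57 496 MTP: ..."
TIME_PREFIX = ^
TIME_FORMAT = %H:%M:%S
# "08:43:57" is 8 characters, so there is no need to look further into the line
MAX_TIMESTAMP_LOOKAHEAD = 8
```

With no date in the event text, Splunk has to take the date from somewhere else (for example the source name or the file modification time), so it is worth checking that the indexed dates come out right.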
You'll also probably need to use BREAK_ONLY_BEFORE and MUST_BREAK_AFTER in props.conf to define the beginning and end of your events. This will ensure that everything you'd like to be captured in a single event will be contained within that event.
http://www.splunk.com/base/Documentation/latest/admin/Propsconf
BREAK_ONLY_BEFORE = <regular expression>
* When set, Splunk creates a new event only if it encounters a new line that matches the
regular expression.
* Defaults to empty.
MUST_BREAK_AFTER = <regular expression>
* When set and the regular expression matches the current line, Splunk creates a new event for
the next input line.
* Splunk may still break before the current line if another rule matches.
* Defaults to empty.
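As a rough illustration of how BREAK_ONLY_BEFORE might be applied to the sample above, the stanza below starts a new event whenever a line begins with a time such as 08:43:57. The regular expression is my guess based on the four sample lines, and [mysourcetype] is a placeholder:

```ini
[mysourcetype]
# BREAK_ONLY_BEFORE is only consulted when line merging is enabled
SHOULD_LINEMERGE = true
# Assumption: every new entry begins with HH:MM:SS followed by whitespace
BREAK_ONLY_BEFORE = ^\d{2}:\d{2}:\d{2}\s
```

Lines without a leading time would then stay attached to the event above them.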
There may be other line-breaking settings that work better in your case, but this should give you a good place to start.
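For example, if every line in the log carries its own timestamp, it may be simpler to turn line merging off and let each line become its own event. A sketch, again assuming lines start with HH:MM:SS and using a placeholder sourcetype name:

```ini
[mysourcetype]
# Treat each chunk produced by LINE_BREAKER as a complete event
SHOULD_LINEMERGE = false
# Break on newlines only when the next line starts with a time (assumed format)
LINE_BREAKER = ([\r\n]+)(?=\d{2}:\d{2}:\d{2}\s)
```

The lookahead keeps continuation lines without a leading time in the same event as the line before them.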