Issue with INDEXED_EXTRACTIONS for Microsoft Exchange Server Message Tracking log files

oshirnin · ‎03-11-2020

Hello, everybody!

I have a Splunk Enterprise 7.3.2 infrastructure with Splunk UFs deployed, in particular, to our corporate on-premises Microsoft Exchange Server 2016 servers. We have 40 Exchange servers with a really heavy mail flow, and we want to collect all Message Tracking logs into Splunk for future investigations. We do not have a license for the Splunk App for Microsoft Exchange / Splunk Add-on for Microsoft Exchange, so I wrote a simple custom app for the UFs with inputs.conf / props.conf to reach my goal.

Microsoft Exchange Server Message Tracking log files are CSV files. The first 4 lines are # comments, the 5th line is also commented and holds the field headers prefixed with #Fields:, and the data lines follow. A detailed format description can be found here: https://docs.microsoft.com/en-us/exchange/mail-flow/transport-logs/message-tracking?view=exchserver-...
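To rule out malformed source lines, the layout described above can be checked offline. The following is only a minimal sketch: it assumes one record per physical line, and both the function name check_tracking_log and the shortened sample log are made up for illustration. It reads the #Fields: header and reports every data row whose column count differs from it, which is the kind of row that would be expected to end up with EXTRA_FIELD_## names.

```python
import csv

# Hypothetical, shortened sample; real Message Tracking logs have many more fields.
SAMPLE = """\
#Software: Microsoft Exchange Server
#Version: 15.01.0000.000
#Log-type: Message Tracking Log
#Date: 2020-03-11T09:00:00.000Z
#Fields: date-time,client-ip,event-id,message-subject
2020-03-11T09:00:01.000Z,10.0.0.1,RECEIVE,"Hello, world"
2020-03-11T09:00:02.000Z,10.0.0.2,SEND,Hello, world
"""


def check_tracking_log(lines):
    """Return (header_fields, bad_rows).

    bad_rows holds (line_number, column_count) for every data row whose
    column count does not match the #Fields: header, or that appears
    before any header was seen.
    """
    header = None
    bad_rows = []
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if line.startswith("#Fields:"):
            # The header is itself comma-separated, so parse it as CSV too.
            header = next(csv.reader([line[len("#Fields:"):].strip()]))
            continue
        if not line or line.startswith("#"):
            continue  # skip blank lines and the other commented preamble lines
        row = next(csv.reader([line]))
        if header is None or len(row) != len(header):
            bad_rows.append((line_no, len(row)))
    return header, bad_rows


if __name__ == "__main__":
    fields, bad = check_tracking_log(SAMPLE.splitlines())
    print(len(fields), "header fields; mismatched rows:", bad)
```

In the sample, line 7 contains an unquoted comma in the subject, so it parses into 5 columns against a 4-field header and is reported. If a real "problem" file passes this check cleanly, that supports the suspicion that the source data is not at fault.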

I deployed the following inputs.conf to my UFs:

[monitor://C:\Queue\TransportLogs\MessageTracking]
disabled = 0
time_before_close = 30
sourcetype = my_exchange_logs_message_tracking
ignoreOlderThan = 1d
crcSalt = <SOURCE>
whitelist = \.log$|\.LOG$
# blacklist =

I deployed the following props.conf to my UFs:

[my_exchange_logs_message_tracking]
disabled = false
SHOULD_LINEMERGE = false
MAX_TIMESTAMP_LOOKAHEAD = 25
INDEXED_EXTRACTIONS = CSV
FIELD_HEADER_REGEX = ^#Fields:\s*(.*)
# HEADER_FIELD_LINE_NUMBER = 5
PREAMBLE_REGEX = ^#.*
HEADER_FIELD_DELIMITER = ,
FIELD_DELIMITER = ,
FIELD_QUOTE = "
TRUNCATE = 0

Everything seems to work fine, but sometimes INDEXED_EXTRACTIONS does not work for some source log file lines. I see two problems when looking at the events from the SH:

For some source lines, no fields are extracted and indexed according to the headers.
For some other source lines, fields are extracted and indexed, but under a set of EXTRA_FIELD_## names instead of the header line field names.

Moreover, once one row fails the configured extractions in either of these ways, so do all subsequent lines until EOF. I carefully checked the "problem" lines and found nothing wrong with them! I also put the same inputs.conf / props.conf configuration on another test server and brought the "problem" source files there, and they were indexed with no problems.

Given that, I suspect my UFs on the production Exchange servers sometimes run into a problem that prevents the configured field extractions from being processed. I looked into _internal but did not notice anything suspicious related to field extractions.

Does anybody have any ideas about what I should check next to catch and fix the problem?

saad_siddiqi · ‎09-08-2020

Hi There,

Were you able to get this sorted?

I have a similar issue where my CSVs are getting translated into a bunch of EXTRA_FIELD_## fields.

oshirnin · ‎03-12-2020

Hello, again!

I just checked today's incoming data and see both problems again. Again I took one of the problematic files, re-indexed it on another server as a test, and it worked like a charm. So I see no problems with the source data.

Does anyone have an idea what I should do?

oshirnin · ‎03-11-2020

I just found some interesting entries in splunkd.log and metrics.log on several UFs. For example:

metrics.log

03-11-2020 09:22:28.044 +0000 INFO  Metrics - group=queue, name=parsingqueue, blocked=true, max_size_kb=512, current_size_kb=511, current_size=1336, largest_size=1336, smallest_size=0

splunkd.log

03-11-2020 09:22:32.247 +0000 WARN  TailReader - Could not send data to output queue (parsingQueue), retrying...
03-11-2020 09:30:30.390 +0000 WARN  TailReader - Could not send data to output queue (structuredParsingQueue), retrying...

These events are recorded quite often on several UFs, but I cannot say that they correlate well with the previously described Exchange Server Message Tracking log file lines that are not indexed correctly. After some reading I increased the parsingQueue size to 10MB on my UFs, which significantly reduced the parsingQueue fill ratio:

server.conf

[queue=parsingQueue]
maxSize = 10MB

I have not changed the structuredParsingQueue maxSize yet. It seems I should change one setting at a time and wait 🙂 I am watching the incoming Message Tracking events; no problems so far.
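To compare the fill ratio before and after a change like this, the group=queue lines can be parsed directly. This is a rough sketch, not an official Splunk tool: the helper name queue_fill_ratio is made up, and it assumes the key=value layout of the metrics line quoted above, with the fill ratio taken as current_size_kb / max_size_kb.

```python
import re

# The queue metrics line quoted earlier in this thread.
SAMPLE_LINE = (
    "03-11-2020 09:22:28.044 +0000 INFO  Metrics - group=queue, "
    "name=parsingqueue, blocked=true, max_size_kb=512, current_size_kb=511, "
    "current_size=1336, largest_size=1336, smallest_size=0"
)


def queue_fill_ratio(line):
    """Return (queue_name, fill_ratio, blocked) for a group=queue line, else None."""
    # Collect every key=value pair; values end at a comma or whitespace.
    fields = dict(re.findall(r"(\w+)=([^,\s]+)", line))
    if fields.get("group") != "queue":
        return None
    ratio = int(fields["current_size_kb"]) / int(fields["max_size_kb"])
    # Treat a missing blocked key as "not blocked".
    blocked = fields.get("blocked") == "true"
    return fields["name"], ratio, blocked


if __name__ == "__main__":
    name, ratio, blocked = queue_fill_ratio(SAMPLE_LINE)
    print(f"{name}: {ratio:.1%} full, blocked={blocked}")
```

For the sample line this gives 511 / 512, so the queue was essentially full when it was reported as blocked. Running the same function over the lines for structuredParsingQueue would show whether that queue needs the same treatment.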

Here is what bothers me: if blocked UF queues really are the root cause, is it possible that some source events do not enter the parsingQueue and are passed to the Heavy Forwarder without INDEXED_EXTRACTIONS being applied (raw, not cooked)? Since I do not have the INDEXED_EXTRACTIONS configuration for the my_exchange_logs_message_tracking sourcetype anywhere except on the UFs, such events would definitely reach the Indexers without fields extracted. If so, we could face the problem again as the mail flow grows. Should I also place my INDEXED_EXTRACTIONS configuration on the Heavy Forwarder and Indexer servers?

Or, if the queue is blocked, do the original events wait in a queue (which queue?) and get delayed on the UF, but still pass through the parsingQueue eventually?
