How to configure Splunk to properly parse logs that contain one or more values with double quotes?

msboers
Engager

Hello Splunk community,

I am currently an intern at a government agency, researching whether the logging of their Windows services, written in C#, can be sent to a Splunk environment. All of these services use the Microsoft Enterprise Library, whose logging can easily be adapted to the Splunk logging best practices through configuration files alone. After changing these, I ended up with the following format:

2016-10-27 14:10:28.41 TZ DST, type=trace, level=Information, category=IN, threadid=12732, servicenaam="LogExample v2.vshost", machinenaam="DPCV74", berichttype="Berichttype x", eventid="10", bericht="Dit is een log met een ander bericht type en bericht id en logtype"

The problem is that the 'bericht' field, and possibly other fields, may contain double quotes in the value, for example from a logged XML message or an exception trace. This breaks Splunk's parsing: the wrong fields are detected and line breaking does not work properly.
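
To make the failure concrete, here is a minimal Python sketch of how a simple key="value" matcher reacts to an embedded quote. The pattern and the `naive_kv` helper are hypothetical and only approximate the idea; this is not Splunk's actual extraction logic, and the sample lines are shortened versions of the format above.

```python
import re

# Naive key="value" matcher: a value ends at the first double quote it meets.
# Illustration only; this is NOT Splunk's real field extraction.
KV_PATTERN = re.compile(r'(\w+)="([^"]*)"')

def naive_kv(line: str) -> dict:
    """Return the key/value pairs found by the naive pattern."""
    return dict(KV_PATTERN.findall(line))

clean = 'eventid="10", bericht="Dit is een log"'
dirty = 'eventid="10", bericht="<msg id="42">fout</msg>"'

print(naive_kv(clean))  # bericht holds the full text
print(naive_kv(dirty))  # bericht is cut off at the first inner quote
```

With the embedded quote, the value of bericht stops at `<msg id=` and the rest of the message is lost.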

I've tried the following things:

  • Changing the quotes to double dollar signs for the 'bericht' variable. (eventid=x, bericht=$$testvalue with quotes " and single ' $$)
  • Overriding the Enterprise Library TextFormatter and stripping out quotes before writing to the log
  • Writing the log in JSON format

With the first solution I successfully extracted the fields using the 'Extract new fields' function and regular expressions in Splunk Web, but I don't think this is the best way to solve the problem. The second solution would be a lot cleaner, but it requires many more changes to the system and therefore has a bigger impact (it would require code changes for 100+ services).
The third solution is impossible with the version of Enterprise Library that is being used (JsonLogFormatter was introduced in version 6+).

If the code has to be changed anyway, would using the Splunk C# HTTP Event Collector help with this issue?
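
For context on why a JSON-based route such as the HTTP Event Collector might sidestep the quoting problem, here is a small Python sketch (not C#, and nothing is sent over the network). It only builds a request body with the `event` and `sourcetype` keys used by the HEC JSON event format; the `build_hec_event` helper and the sample values are assumptions for illustration.

```python
import json

def build_hec_event(fields: dict, sourcetype: str) -> str:
    """Serialize one event as a JSON body in the HEC event format."""
    return json.dumps({"sourcetype": sourcetype, "event": fields})

fields = {
    "eventid": "10",
    "bericht": 'XML fragment <msg id="42"> with " and \' inside',
}
body = build_hec_event(fields, "yoursourcetype")
print(body)

# Round trip: the quotes survive because JSON escapes them in the body.
restored = json.loads(body)["event"]["bericht"]
print(restored == fields["bericht"])
```

The serializer escapes the inner double quotes, so the value's boundaries stay unambiguous without stripping anything from the message.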

Sadly, I didn't find a lot on this issue other than 'just strip the quotes', so hopefully someone can help me.

Thanks,
Martijn

somesoni2
SplunkTrust

You reported two issues that occur when the bericht field contains quotes.
1) Splunk is not breaking the logs properly: What line-breaking configuration do you have? I believe something like this in props.conf on the Indexer/Heavy Forwarder would work even if there is an additional quote.

[yoursourcetype]
SHOULD_LINEMERGE=false
LINE_BREAKER=([\r\n]+)(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)
TIME_PREFIX=^
TIME_FORMAT=%Y-%m-%d %H:%M:%S.%N
MAX_TIMESTAMP_LOOKAHEAD=22
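
As a rough check of the idea behind that LINE_BREAKER, the following Python sketch splits raw text only where a newline is followed by a timestamp. It is an approximation: Splunk discards the first capture group as the delimiter, while here `re.split` drops a non-capturing separator. The sample text is made up from the format in the question.

```python
import re

# Break on newlines only when a timestamp follows (same lookahead as above).
BREAK = re.compile(r'[\r\n]+(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)')

raw = (
    '2016-10-27 14:10:28.41 TZ DST, eventid="10", bericht="first line\n'
    'second line with a " quote"\n'
    '2016-10-27 14:10:29.02 TZ DST, eventid="11", bericht="next event"'
)

events = BREAK.split(raw)
for event in events:
    print(repr(event))
```

The newline inside the first bericht value does not start a new event, because no timestamp follows it; quotes play no role in the break decision.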

2) The field bericht is not extracted correctly: If bericht is the very last field (as shown in your sample event), you could create a custom extraction like this (props.conf on the Search Head) to capture everything up to the end of the event. Would something like that work for you?

[yoursourcetype]
EXTRACT-bericht=bericht=\"(?<bericht>.+)\"$
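
The effect of the greedy `.+` anchored at the end of the event can be tried out in Python. Note that Python spells named groups `(?P<name>...)`, so the pattern is adapted accordingly; the sample event is a shortened, made-up line.

```python
import re

# Same idea as the EXTRACT above, in Python's named-group syntax.
EXTRACT = re.compile(r'bericht="(?P<bericht>.+)"$')

event = 'eventid="10", bericht="<msg id="42">fout</msg>"'
match = EXTRACT.search(event)
print(match.group("bericht"))
```

Because `.+` is greedy and the closing quote must sit at the end of the event, inner quotes stay part of the value. This only holds while bericht is the last field.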

msboers
Engager

Changing the bericht (renamed to message for language consistency) regex to EXTRACT-message=(?:[^\$\n]*\$){2}(?P<message>[^\$]+) did the trick, though. After testing, I also found that quotes are parsed correctly when writing to the Windows event log, so that might be an alternative as well.
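
For anyone who wants to try this pattern outside Splunk, here is a small Python sketch using the $$-delimited sample from my question. The second input is a made-up case showing a limitation I would expect: a dollar sign inside the message ends the capture early.

```python
import re

# Skip everything up to and including the two opening dollar signs,
# then capture until the next dollar sign.
DOLLAR = re.compile(r'(?:[^\$\n]*\$){2}(?P<message>[^\$]+)')

event = 'eventid=x, bericht=$$testvalue with quotes " and single \' $$'
message = DOLLAR.search(event).group("message")
print(repr(message))

# Hypothetical message that itself contains a dollar sign.
truncated = DOLLAR.search('bericht=$$costs $5 today$$').group("message")
print(repr(truncated))
```

The captured value keeps the trailing space before the closing $$, so trimming may be needed at search time.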

msboers
Engager

Hello @somesoni2

Using your line breaker I am getting the correct line breaks, but the bericht field (renamed to message for language consistency) is not extracted properly. I've restarted and refreshed Splunk; see image:

[image: screenshot of the field extraction result]

ddrillic
Ultra Champion

What an eloquent explanation @msboers.

XML is amazingly simple in its specification. It defines just five predefined entity references for five characters, and the double quote (&quot;) is one of them.

Maybe you should use the entity reference for the double quote.

Splunk explains this in its documentation under "Special characters in XML files".
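
A minimal sketch of that escaping step, in Python with the standard library's `xml.sax.saxutils` rather than C#. The `escape_value` helper and the sample text are assumptions for illustration; the value would still need to be unescaped again when it is read.

```python
from xml.sax.saxutils import escape, unescape

# escape() handles &, < and > by default; the double quote is added here.
QUOTE_ENTITIES = {'"': "&quot;"}

def escape_value(value: str) -> str:
    """Replace XML special characters, including double quotes."""
    return escape(value, QUOTE_ENTITIES)

raw = 'He said "stop" & left <now>'
escaped = escape_value(raw)
print(escaped)

# Reverse mapping to recover the original text.
print(unescape(escaped, {"&quot;": '"'}) == raw)
```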

msboers
Engager

Thank you for your comment, ddrillic. This would still require me to edit the message and escape these special characters, right? That would be the same problem I had with my second solution.

ddrillic
Ultra Champion

I see - sorry about that...
