Splunk Search

How to extract variable length message field

jravida
Communicator

Hey folks,

So I have some logs coming in CEF format. Splunk is doing its automatic field extraction, but when I look at the msg field, it only contains the first word of the message.

So it looks like this:

msg=The

instead of

msg=The user was granted magical powers for 15 minutes.

It just kicks out the rest of the message. It can't be found within any other fields.
I've only set this up to ingest syslog data being put on my local server, and defined the index/sourcetype. Nothing fancy, yet.

Any help would be greatly appreciated!

0 Karma
1 Solution

Lowell
Super Champion

Have you looked at the CEF (Common Event Format) Extraction Utilities app? http://apps.splunk.com/app/487/


0 Karma

Lowell
Super Champion

I converted my comment to an answer. If it does in fact resolve your issue, click on the check mark icon.

0 Karma

jravida
Communicator

This is EXACTLY what I'm looking for! Splunk community to the rescue. Thanks Lowell!

0 Karma

jravida
Communicator

I'd post them, but I would have to pull them from a lab environment - which I don't always have access to.

They are also too numerous and unique to give you guys anything meaningful by posting them.

I noticed that since the events are coming in CEF, all the field values that are pipe | delimited are extracted just fine, even if there is a space (such as |Delete Attibute| or |Microsoft Windows|).

Once the pipe delimitation ends, it seems to perform 'space delimitation' on the rest of the message: fields such as cs1, cs2, cs3, cs1label, msg, etc.
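To make the symptom concrete, here is a minimal Python sketch (not Splunk configuration) of the behaviour described above. It assumes the standard CEF layout of seven pipe-delimited header fields followed by a key=value extension. The sample event and the names `SAMPLE`, `split_cef`, and `naive_kv` are made up for illustration, and the whitespace-based matcher is only a guess at what the automatic extraction is effectively doing.

```python
import re

# Hypothetical sample event, invented for illustration (not from the real logs).
SAMPLE = (
    "CEF:0|Microsoft|Microsoft Windows|6.1|4720|Delete Attribute|3|"
    "cs1Label=Policy cs1=Default Domain Policy "
    "msg=The user was granted magical powers for 15 minutes."
)

# The seven CEF header fields, in order. The first one keeps its "CEF:" prefix here.
HEADER_NAMES = [
    "version", "vendor", "product", "device_version",
    "signature_id", "name", "severity",
]


def split_cef(event):
    """Split an event into (header dict, extension string)."""
    # Split on unescaped pipes at most 7 times: 7 header fields + the extension.
    parts = re.split(r"(?<!\\)\|", event, maxsplit=7)
    header = dict(zip(HEADER_NAMES, parts[:7]))
    extension = parts[7] if len(parts) > 7 else ""
    return header, extension


def naive_kv(extension):
    """Whitespace-delimited key=value matching: each value stops at the first space."""
    return dict(re.findall(r"(\w+)=(\S+)", extension))


header, extension = split_cef(SAMPLE)
print(header["product"])          # header values keep their spaces
print(naive_kv(extension)["msg"])  # only the first word survives
```

The pipe-delimited header values come through whole because the pipe is the delimiter, while every extension value is cut at its first space, which matches what I'm seeing for msg, cs1, and the rest.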

0 Karma

Lowell
Super Champion

Yeah, post a sample. Depending on what comes after each field, there may be other field extraction options. Just click Edit under your question to post additional content.

jravida
Communicator

Hey folks, thanks for the help so far.

On further inspection of the events, it appears that all fields in all events (the ones coming in CEF format from a remote ArcSight connector and being written to a file via syslog on the Splunk box) suffer from the same symptom.

Regardless of the log source or key field, if the value contains a space, everything after the first word is ignored. Only the first word gets extracted.

I'm hoping there is a higher-level approach than building regexes to process this, as that wouldn't be very scalable and it would be incredibly time-consuming.

0 Karma

somesoni2
Revered Legend

Also, if you have control over the log file format, you may want to enclose string fields with multiple words within double quotes.
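As a rough illustration of why quoting helps, here is a small Python sketch (an assumption about the general idea, not Splunk's actual extraction code). The pattern and the `extract_kv` helper are hypothetical; they accept either a double-quoted value or a bare value that ends at the first space.

```python
import re

# key="quoted value with spaces"  or  key=bare_value
QUOTED_OR_BARE = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def extract_kv(text):
    """Return a dict of fields, keeping quoted values intact."""
    fields = {}
    for key, quoted, bare in QUOTED_OR_BARE.findall(text):
        # Exactly one of the two groups is non-empty for a given match
        # (both are empty only for key="").
        fields[key] = quoted if quoted else bare
    return fields


# Hypothetical log line, invented for illustration.
line = 'src=10.0.0.1 msg="The user was granted magical powers for 15 minutes."'
print(extract_kv(line))
```

With the quotes in place the whole sentence lands in msg; without them the same pattern would stop at "The". Note that this only applies if the sender can be changed to emit quotes.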

kristian_kolb
Ultra Champion

It looks like you'll be forced to create a field extraction yourself. This will likely require some regex skills. You should post some sample events so that others can help you.
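For what it's worth, one regex can cover every key rather than one per field. Below is a minimal Python sketch of the idea, assuming each value runs until the next ` key=` token or the end of the event. The pattern, the `extract_extension` helper, and the sample string are hypothetical, and the regex would still need adapting and testing before being used in a Splunk extraction.

```python
import re

# A value is everything up to (but not including) the next " key=" or end of string.
KV_PATTERN = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|$)")


def extract_extension(extension):
    """Extract key=value pairs whose values may contain spaces."""
    return {key: value for key, value in KV_PATTERN.findall(extension)}


# Hypothetical extension string, invented for illustration.
sample = (
    "cs1Label=Policy cs1=Default Domain Policy "
    "msg=The user was granted magical powers for 15 minutes."
)
for key, value in extract_extension(sample).items():
    print(f"{key} -> {value}")
```

The lookahead is what keeps multi-word values together. The known weak spots are a value that itself contains something shaped like ` word=`, which would be split early, and escaped equals signs (`\=`), which this sketch does not handle.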
