Hello,
So this is my first time trying to consolidate logs and use data extraction, and I am a little lost. I have the payload below, and I would like to extract the following fields from the "line" field in the JSON payload.
An example payload would be:
{"line":"2021/10/25 18:49:52.982|DEBUG|GoogleHomeController|Recieved a request for broadcast: {\"Message\":\"Ring Ring Ring Niko and Xander, someone is at your front door and rung the doorbell!\",\"ExecuteTime\":\"0001-01-01T00:00:00\"}","source":"stdout","tag":"b5fcd8b8b5a4"}
It all follows the format "{TIME}|{LEVEL}|{CONTROLLER}|{MESSAGE}" - basically, the fields are separated by pipe characters.
I have all the information there formatted using NLog in my code, but how do I extract the fields that are nested within a field, so that I can search based on the Time (from the log message), Log Level, Controller, and Message?
How would I go about pulling this information out? I tried going through field extraction, but it only seems to let me do it at the highest level, i.e. the line, source, and tag fields, not the fields within them.
@97WaterPolo Try this and see if it works!
<your_base_search>
| rex "\{\"line\":\"(?<time>.+)\|(?<log_level>.+)\|(?<Controller>.+)\|(?<Message>.+)\}\""
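Since the raw event is JSON and the greedy `.+` captures can over-match across pipe characters, a tightened variant might help. This is only a sketch: it assumes Splunk has already extracted the JSON "line" field (for example via automatic KV/JSON extraction or an spath command), and uses negated character classes so each capture stops at the next pipe:

```
<your_base_search>
| rex field=line "^(?<time>[^|]+)\|(?<log_level>[^|]+)\|(?<Controller>[^|]+)\|(?<Message>.+)$"
```

Running rex against the already-extracted line field (rather than _raw) also avoids capturing the surrounding JSON quoting and the other top-level fields.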
Oh wow, that's pretty cool that it maps the fields from the regex. That is about 80% complete. I still have the issue of it pulling both of the times.
Thank you very much! That was great for splitting it up and making it searchable by controller!!!
We have two different things mixed together here.
One is field extraction; the second is a search doing some field capturing. These are similar but slightly different things.
If you set up an extraction "system-wide" (i.e. in an app), you can match by fields in your search. If you use a search, well, you can only match after performing a full search and a manual rex command. And if you want to do many searches over the same data, you must repeat the manual extraction separately in every search. And if at some point you decide you want to add an additional field to the extraction, or your data format changes, you'd have to manually change all of those searches. Not very convenient.
So it's better to define an extraction that Splunk performs automatically. The rex command is useful for testing extraction regexes, though.
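As a sketch of what such a system-wide extraction could look like, here is a hypothetical props.conf stanza (the sourcetype name is a placeholder; the regex assumes the raw event keeps the JSON structure shown in your sample, with the four pipe-separated parts inside the "line" value):

```
# props.conf -- search-time field extraction (sourcetype name is a placeholder)
[my_docker_json]
EXTRACT-line_fields = "line":"(?<time>[^|]+)\|(?<log_level>[^|]+)\|(?<Controller>[^|]+)\|(?<Message>.+?)","source"
```

With this in place, the fields show up automatically at search time and you can filter with plain terms like Controller=GoogleHomeController without repeating the rex in every search.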
The last thing is the timestamp. While "normal" field extraction is performed by Splunk at search time, timestamp parsing is done at index time - every event has to have some time associated with it. It is stored in the _time field. The best practice is to parse the proper timestamp out of the event, so that events are correctly indexed and searchable by _time. If the indexer is not able to parse a timestamp from the event, there are rules used to generate one (usually it boils down to the moment the indexer receives the event), but that may simply not be the "proper" timestamp for the event.
Why is this important? Because _time is an indexed field (one of the few default ones), and filtering by time is the most efficient way of limiting your search. If you search only a particular time range, then with a properly assigned _time field, Splunk limits the search to the events matching that range. If Splunk instead has to check some other field (unless it's another indexed field), it must fetch every event, parse the requested field, convert it to a timestamp, and compare it with the requested range.
So you should set up your extraction with the proper regex, but you should also set the proper time parsing/detection options (TIME_PREFIX, TIME_FORMAT) for your sourcetype (since these are index-time settings, they won't affect events already ingested; they will only apply to new events). And don't worry about having the same value in multiple fields. It doesn't matter that much (and you'll often have explicitly created aliases of one field to another name, for example to make your data compliant with a datamodel). If you want to _not display_ some fields, you can - as @venkatasri already showed - limit the output of your search to a particular set of fields using the "table" or "fields" command.
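To make that concrete, here is a hedged sketch of what the index-time timestamp settings could look like for your sample event (again, the sourcetype name is a placeholder, and this assumes the timestamp always immediately follows the "line":" prefix in the raw JSON):

```
# props.conf -- index-time timestamp settings (sourcetype name is a placeholder)
[my_docker_json]
TIME_PREFIX = "line":"
TIME_FORMAT = %Y/%m/%d %H:%M:%S.%3N
MAX_TIMESTAMP_LOOKAHEAD = 30
```

Here %3N picks up the millisecond part of a timestamp like 2021/10/25 18:49:52.982, and MAX_TIMESTAMP_LOOKAHEAD keeps Splunk from scanning deep into the message for date-like strings. These settings belong wherever parsing happens (indexer or heavy forwarder), not on the search head.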