Splunk Search

Extract information like time from NLog formatted event?

97WaterPolo
Engager

Hello,

So this is my first time trying to consolidate logs and use data extraction, and I am a little lost. I have the payload below, and I would like to extract the following fields from the "line" field in the JSON payload.

An example payload would be

 

{"line":"2021/10/25 18:49:52.982|DEBUG|GoogleHomeController|Recieved a request for broadcast: {\"Message\":\"Ring Ring Ring Niko and Xander, someone is at your front door and rung the doorbell!\",\"ExecuteTime\":\"0001-01-01T00:00:00\"}","source":"stdout","tag":"b5fcd8b8b5a4"}

 

  • Time - "2021/10/25 18:49:52.982"
  • Level - "DEBUG"
  • Controller - "GoogleHomeController"
  • Message - "Recieved a request for broadcast..."

It all follows the format "{TIME}|{LEVEL}|{CONTROLLER}|{MESSAGE}", basically the fields separated by pipe characters.

I have all the information there, formatted using NLog in my code, but how do I extract the fields that are nested within a field, so that I can search based on the Time (from the log message), Log Level, Controller, and Message?

How would I go about pulling this information out? I tried going through the field extraction, but it only seems to let me do it at the highest level, i.e. the line, source, and tag fields, not the fields within them.

 


venkatasri
SplunkTrust

@97WaterPolo  Try if this works!

<your_base_search>
| rex "\{\"line\":\"(?<time>.+)\|(?<log_level>.+)\|(?<Controller>.+)\|(?<Message>.+)\}\""
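Outside Splunk, the same pipe-delimited split can be sanity-checked in Python. This is just a sketch of what the rex above captures; the raw string is the sample event from the question (shortened message), and it uses non-greedy `[^|]+` captures so each field stops at the next pipe:

```python
import json
import re

# Sample raw event from the question (quotes inside "line" are JSON-escaped;
# the message is shortened here for readability)
raw = ('{"line":"2021/10/25 18:49:52.982|DEBUG|GoogleHomeController|'
      'Recieved a request for broadcast: {\\"Message\\":\\"hi\\"}",'
      '"source":"stdout","tag":"b5fcd8b8b5a4"}')

# Capture time, level, controller, and message, split on pipes.
PATTERN = re.compile(
    r'(?P<time>[^|]+)\|(?P<log_level>[^|]+)\|(?P<Controller>[^|]+)\|(?P<Message>.+)'
)

line = json.loads(raw)["line"]       # pull the inner "line" field first
fields = PATTERN.match(line).groupdict()
```

The `[^|]+` character classes guard against a greedy `.+` swallowing the pipe separators if a later field ever contains a pipe.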

 

97WaterPolo
Engager

Oh wow, that's pretty cool that it maps the fields from the regex. That is like 80% complete. I still have the issue of it pulling both of the times:

  • Is it possible to get rid of the "_time" column and only have the "time" one from the regex? Right now it has both.
  • How do you omit logs that don't meet that format? I have a few logs generated that don't match, so they just show up as empty rows across the page.
  • Is there a way to readjust the columns so that I can make some of them larger than the others without the text wrapping to the next line?

[Screenshot: 97WaterPolo_0-1635224779501.png]

 

Thank you very much! That was great for splitting it up and making it searchable by controller!!!


PickleRick
SplunkTrust

We have different things mixed here.

One is field extraction; the other is a search doing some field capturing. These are similar things, but a bit different.

If you set up an extraction "system-wide" (i.e. in an app), you can match by fields directly in your search. If you use a search-time rex, you can only match after performing the full search and the manual rex command. And if you want to run many searches over the same data, you must repeat that manual extraction separately in every search. If you later decide to add an additional field to the extraction, or your data format changes, you'd have to manually change all the searches. Not very convenient.

So it's better to define an extraction to be performed automatically by Splunk. The rex command is useful for testing the extraction regexes, though.

The last thing is the timestamp. While "normal" field extraction is performed by Splunk at search-time, timestamp parsing is done at index-time - every event has to have some time associated with it. It is stored in the _time field. The best practice is to parse the proper timestamp out of the event, so that events are properly indexed by _time and properly searchable by _time. If the indexer is not able to parse a timestamp from the event, there are rules used to generate one (usually it boils down to the moment the indexer receives the event), but it may simply not be the "proper" timestamp for the event.
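As an aside, the timestamp in these events ("2021/10/25 18:49:52.982") maps cleanly to a strptime-style pattern, which is essentially what Splunk's TIME_FORMAT has to describe. A quick Python illustration:

```python
from datetime import datetime

# The NLog timestamp from the sample event; %f accepts the 3-digit
# millisecond part and interprets it as 982000 microseconds.
ts = datetime.strptime("2021/10/25 18:49:52.982", "%Y/%m/%d %H:%M:%S.%f")
```

(Splunk's strptime dialect writes the millisecond part as %3N rather than %f, but the rest of the pattern carries over.)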

Why is it important? Because _time is an indexed field (one of the few default ones), and filtering by time is the most efficient way of limiting your search. If you search only a particular time range, with a properly assigned _time field, Splunk limits the search to events matching that range. If you instead have to check some other field (unless it's another indexed field), Splunk has to fetch every event, parse the requested field, convert it to a timestamp, and compare it with the requested time range.

So you should set up your extraction with the proper regex, but also set the proper time parsing/detection options (TIME_PREFIX, TIME_FORMAT) for your sourcetype (since these are index-time settings, they won't affect events already ingested; they will only apply to new events). And don't worry about having the same value in multiple fields. It doesn't matter that much (you'll often have explicitly created aliases of one field under another name, for example to make your data compliant with a datamodel). If you want to _not display_ some fields, you can - as @venkatasri already showed - limit the output of your search to a particular set of fields using the "table" or "fields" command.
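Putting the index-time and search-time pieces together, a props.conf stanza might look like the sketch below. The sourcetype name "docker:nlog" and the extraction name "nlog" are assumptions, substitute your own:

```
# props.conf - sketch only; sourcetype name is an assumption
[docker:nlog]
# Timestamp sits right after {"line":" at the start of the event
TIME_PREFIX = \{"line":"
TIME_FORMAT = %Y/%m/%d %H:%M:%S.%3N
MAX_TIMESTAMP_LOOKAHEAD = 25
# Search-time extraction equivalent to the rex from the earlier answer
EXTRACT-nlog = "line":"(?<time>[^|]+)\|(?<log_level>[^|]+)\|(?<Controller>[^|]+)\|(?<Message>.+?)","source"
```

With this in place, time, log_level, Controller, and Message show up automatically in every search over that sourcetype, and _time is parsed from the log message itself for newly ingested events.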

venkatasri
SplunkTrust
  • For the time field: exclude _time, since both hold the same timestamp. Add | fields - _time at the end of the search, or simply don't include _time in the table command.
  • Find a pattern that matches the logs you wish to search: <your_base_search> <pattern> | rex "blabla" | fields - _time | table time log_level Controller Message
  • Column values are automatically wrapped; there is no out-of-the-box way to change that. If the table is part of a Simple XML dashboard, it can be adjusted with custom JS/CSS.
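On the second bullet: rex simply yields no captured fields for events that don't match, so adding a pattern filter before it keeps only well-formed lines. A small Python analogy of that filtering step (the sample lines are made up):

```python
import re

# Same capture pattern as the pipe-delimited NLog format
PATTERN = re.compile(
    r'(?P<time>[^|]+)\|(?P<log_level>[^|]+)\|(?P<Controller>[^|]+)\|(?P<Message>.+)'
)

lines = [
    "2021/10/25 18:49:52.982|DEBUG|GoogleHomeController|Recieved a request",
    "plain text line without the pipe format",   # would show as empty columns
]

# Keep only lines that match the format, like filtering before rex
matched = [m.groupdict() for m in map(PATTERN.match, lines) if m]
```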