Hello,
So this is my first time trying to consolidate logs and use data extraction, and I am a little lost. I have the payload below, and I would like to extract the following fields from the "line" field in the JSON payload.
An example payload would be:
{"line":"2021/10/25 18:49:52.982|DEBUG|GoogleHomeController|Recieved a request for broadcast: {\"Message\":\"Ring Ring Ring Niko and Xander, someone is at your front door and rung the doorbell!\",\"ExecuteTime\":\"0001-01-01T00:00:00\"}","source":"stdout","tag":"b5fcd8b8b5a4"}
It all follows the format "{TIME}|{LEVEL}|{CONTROLLER}|{MESSAGE}" - basically, the fields are separated by pipe characters.
I have all the information there formatted using NLog in my code, but how do I extract the fields that are nested within a field, so that I can search based on the Time (from the log message), Log Level, Controller, and Message?
How would I go about pulling this information out? I tried going through field extraction, but it only seems to let me do it at the highest level, i.e. the line, source, and tag fields, not the fields within them.
@97WaterPolo Try this and see if it works!
<your_base_search>
| rex "\{\"line\":\"(?<time>.+)\|(?<log_level>.+)\|(?<Controller>.+)\|(?<Message>.+)\}\""
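Since the raw event is JSON and the greedy `.+` captures can over-match across pipe characters, a tightened variant might help. This is only a sketch: it assumes Splunk has already extracted the JSON "line" field (for example via automatic KV/JSON extraction or an spath command), and uses negated character classes so each capture stops at the next pipe:

```
<your_base_search>
| rex field=line "^(?<time>[^|]+)\|(?<log_level>[^|]+)\|(?<Controller>[^|]+)\|(?<Message>.+)$"
```

Running rex against the already-extracted line field (rather than _raw) also avoids capturing the surrounding JSON quoting and the other top-level fields.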
Oh wow, that's pretty cool that it maps the fields from the regex. That is about 80% complete. I still have the issue of it pulling both of the times.
Thank you very much! That was great for splitting it up and making it searchable by controller!!!
We have two different things mixed together here.
One is field extraction; the second is a search doing some field capturing. These are similar but slightly different things.
If you set up an extraction "system-wide" (i.e. in an app), you can match by fields in your search. If you use a search, well, you can only match after performing a full search and a manual rex command. And if you want to do many searches over the same data, you must repeat the manual extraction separately in every search. And if at some point you decide you want to add an additional field to the extraction, or your data format changes, you'd have to manually change all of those searches. Not very convenient.
So it's better to define an extraction that Splunk performs automatically. The rex command is useful for testing extraction regexes, though.
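As a sketch of what such a system-wide extraction could look like, here is a hypothetical props.conf stanza (the sourcetype name is a placeholder; the regex assumes the raw event keeps the JSON structure shown in your sample, with the four pipe-separated parts inside the "line" value):

```
# props.conf -- search-time field extraction (sourcetype name is a placeholder)
[my_docker_json]
EXTRACT-line_fields = "line":"(?<time>[^|]+)\|(?<log_level>[^|]+)\|(?<Controller>[^|]+)\|(?<Message>.+?)","source"
```

With this in place, the fields show up automatically at search time and you can filter with plain terms like Controller=GoogleHomeController without repeating the rex in every search.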
The last thing is the timestamp. While "normal" field extraction is performed by Splunk at search time, timestamp parsing is done at index time - every event has to have some time associated with it. It is stored in the _time field. The best practice is to parse the proper timestamp out of the event, so that events are correctly indexed and searchable by _time. If the indexer is not able to parse a timestamp from the event, there are rules used to generate one (usually it boils down to the moment the indexer receives the event), but that may simply not be the "proper" timestamp for the event.
Why is this important? Because _time is an indexed field (one of the few default ones), and filtering by time is the most efficient way of limiting your search. If you search only a particular time range, then with a properly assigned _time field, Splunk limits the search to the events matching that range. If Splunk instead has to check some other field (unless it's another indexed field), it must fetch every event, parse the requested field, convert it to a timestamp, and compare it with the requested range.
So you should set up your extraction with the proper regex, but you should also set the proper time parsing/detection options (TIME_PREFIX, TIME_FORMAT) for your sourcetype (since these are index-time settings, they won't affect events already ingested; they will only apply to new events). And don't worry about having the same value in multiple fields. It doesn't matter that much (and you'll often have explicitly created aliases of one field to another name, for example to make your data compliant with a datamodel). If you want to _not display_ some fields, you can - as @venkatasri already showed - limit the output of your search to a particular set of fields using the "table" or "fields" command.
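To make that concrete, here is a hedged sketch of what the index-time timestamp settings could look like for your sample event (again, the sourcetype name is a placeholder, and this assumes the timestamp always immediately follows the "line":" prefix in the raw JSON):

```
# props.conf -- index-time timestamp settings (sourcetype name is a placeholder)
[my_docker_json]
TIME_PREFIX = "line":"
TIME_FORMAT = %Y/%m/%d %H:%M:%S.%3N
MAX_TIMESTAMP_LOOKAHEAD = 30
```

Here %3N picks up the millisecond part of a timestamp like 2021/10/25 18:49:52.982, and MAX_TIMESTAMP_LOOKAHEAD keeps Splunk from scanning deep into the message for date-like strings. These settings belong wherever parsing happens (indexer or heavy forwarder), not on the search head.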