How to extract fields from Json from selected log and use it as table column?

rnach
Explorer

Hi all, I am new to Splunk. Right now I am trying to build a table from a log that contains several fields, such as Level = INFO. One of them is a field named Log, whose value takes one of the following forms:

 

 

 Log = {"objects":[object1, object2 ...], "info": "some strings", "id1": someInt, "id2": someInt} 

Log = {"objects":[object1, object2 ...], "info": "some other strings", "id1": someOtherInt, "id2": someOtherInt} 

Log = { "info": "some log strings"} 

Log = "some string"

 

 

I have tried a few rex and spath searches, but they don't seem to work well.


I would like to extract the "objects" field depending on the value of "info". For example, sometimes I need the objects from the first Log above and sometimes from the second (for different panels in a dashboard), and "info" is the only way to tell them apart.
I then need to display the objects in a chart, under a single column. Any help or hints are appreciated!

yuanliu
SplunkTrust

Regex should be the last thing to try when extracting information from a structured data set.  Use spath.

 

| spath input=Log

 

| Log | id1 | id2 | info | objects{} |
|---|---|---|---|---|
| {"objects":["object1", "object2"], "info": "some strings", "id1": "someInt", "id2": "someInt"} | someInt | someInt | some strings | object1, object2 |
| {"objects":["object1", "object2", "object3"], "info": "some other strings", "id1": "someOtherInt", "id2": "someOtherInt"} | someOtherInt | someOtherInt | some other strings | object1, object2, object3 |
| { "info": "some log strings"} |  |  | some log strings |  |
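
To try this without touching real data, here is a minimal sketch that fabricates a few Log values with makeresults and runs spath over them. The sample values (including the numeric id1 and id2) are made up for illustration, and makeresults only stands in for your own base search.

```spl
| makeresults
| eval Log = mvappend(
    "{\"objects\":[\"object1\",\"object2\"],\"info\":\"some strings\",\"id1\":1,\"id2\":2}",
    "{\"info\":\"some log strings\"}",
    "some string")
| mvexpand Log
| spath input=Log
| table Log id1 id2 info objects{}
```

The row whose Log is a plain string should simply come back with empty id1, id2, info and objects{} columns, since spath has nothing to extract from it.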

rnach
Explorer

Sorry, I need to rephrase my question. Not every Log value is standard JSON: some are just a string, and sometimes Log is JSON with a totally different structure.


yuanliu
SplunkTrust

In that case, you will need to define how Splunk should behave first.  For example, do you expect to make some use of lines where Log = "some string"?  Such lines will simply give null values for info, objects{}, etc.

| _raw | _time | id1 | id2 | info | objects{} |
|---|---|---|---|---|---|
| Log = {"objects":["object1", "object2"], "info": "some strings", "id1": "someInt", "id2": "someInt"} | 2022-08-05 23:41:37 | someInt | someInt | some strings | object1, object2 |
| Log = {"objects":["object1", "object2", "object3"], "info": "some other strings", "id1": "someOtherInt", "id2": "someOtherInt"} | 2022-08-05 23:41:37 | someOtherInt | someOtherInt | some other strings | object1, object2, object3 |
| Log = { "info": "some log strings"} | 2022-08-05 23:41:37 |  |  | some log strings |  |
| Log = "some string" | 2022-08-05 23:41:37 |  |  |  |  |

Based on your original description, the objective is really just to operate on those two fields.

After spath, you can definitely select objects{} from whichever info value.  For example,

 

| spath input=Log
| where info == "some strings"

 

will give

| _raw | _time | id1 | id2 | info | objects{} |
|---|---|---|---|---|---|
| Log = {"objects":["object1", "object2"], "info": "some strings", "id1": "someInt", "id2": "someInt"} | 2022-08-05 23:41:37 | someInt | someInt | some strings | object1, object2 |

 

| spath input=Log
| where info == "some other strings"

 

gives

| _raw | _time | id1 | id2 | info | objects{} |
|---|---|---|---|---|---|
| Log = {"objects":["object1", "object2", "object3"], "info": "some other strings", "id1": "someOtherInt", "id2": "someOtherInt"} | 2022-08-05 23:41:37 | someOtherInt | someOtherInt | some other strings | object1, object2, object3 |

and so on.

Is that what you described?
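
If each object should end up on its own row under a single column (for a chart or table panel), one possible follow-up is to rename the multivalue objects{} field and expand it. This is only a sketch: the makeresults lines fabricate sample events in place of your base search, and the output field name object is an arbitrary choice.

```spl
| makeresults
| eval Log = mvappend(
    "{\"objects\":[\"object1\",\"object2\"],\"info\":\"some strings\"}",
    "{\"objects\":[\"object1\",\"object2\",\"object3\"],\"info\":\"some other strings\"}",
    "some string")
| mvexpand Log
| spath input=Log
| where info == "some strings"
| rename "objects{}" as object
| mvexpand object
| table info object
```

For a second dashboard panel, the same search with a different info value in the where clause should select the other set of objects.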

rnach
Explorer

Yes, thank you! That works very well. As you mentioned, I do expect some use. For example, I want a chart that shows _time in one column for events where info is "some strings" and _time in another column for events where info is "some other strings" (essentially the time range between the two Logs). I tried using spath twice in the search, but the values are not returned correctly when "info" takes different strings.


yuanliu
SplunkTrust

OK, so events in which Log is not valid JSON do not matter.  Your requirements are

  1. Extract fields such as "info" from JSON.
  2. Use field value as new column name.

The first is achieved by spath.  I haven't found a general approach to the second.  However, if you can enumerate values of info, here is a cheat:

| spath input=Log
| foreach "some strings" "some other strings" "some log strings"
    [ eval "<<FIELD>>" = if(info == "<<FIELD>>", _time, null()) ]

 Alternatively, if you only want to tabulate _time by info value,

| spath input=Log
| table _time info
| transpose header_field=info

rnach
Explorer

Oh, sorry for any confusion. I am not trying to use the values as column names; I only need the values themselves.

What I’m trying to do is find the timestamp of the log with a certain info (for example, some strings), find the timestamp of another log with a different info (for example, some other strings), and display the duration between them. After spath, I have something like:

| eval session_start=if(searchmatch("some string"),min(_time),null()) 
| eval session_end=if(searchmatch("some other string"),max(_time),null())
| stats values(session_start) as start, values(session_end) as end | eval Duration= end-start | table Duration start end

This does not display the duration. session_start and session_end are correct if I put them in a table, but the difference between them is not calculated.

 

 

Update: 
I figured it out 

| eval session_start=if(searchmatch("some string"),_time,null()) 
| eval session_end=if(searchmatch("some other string"),_time,null())
| stats values(session_start) as ss, values(session_end) as se | eval dur=se-ss | table dur

 

yuanliu
SplunkTrust

You just invented transaction😉.  For reference, you can achieve the same with something like

 

| transaction startswith=eval(info=="some strings") endswith=eval(info=="some other strings")
| table duration

 

 But transaction is expensive.  Using stats is usually preferred.
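
As a rough sketch of the stats route, the start and end timestamps can be picked out in a single stats call by wrapping the condition in eval, matching on the extracted info field rather than on raw text. The first four lines only fabricate two events 60 seconds apart so the search runs on its own; the field names n, start, end and duration are arbitrary.

```spl
| makeresults count=2
| streamstats count as n
| eval _time = _time + (n - 1) * 60
| eval Log = if(n == 1, "{\"info\":\"some strings\"}", "{\"info\":\"some other strings\"}")
| spath input=Log
| stats min(eval(if(info == "some strings", _time, null()))) as start
        max(eval(if(info == "some other strings", _time, null()))) as end
| eval duration = end - start
| table start end duration
```

Using min for the start and max for the end keeps start and end single-valued even when several events share the same info value, which values() would not guarantee.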
