Hi all, I am new to Splunk. I am trying to build a table from a log that contains several fields, such as Level = INFO. One of them is a field named Log, which can look like any of these:
Log = {"objects":[object1, object2 ...], "info": "some strings", "id1": someInt, "id2": someInt}
Log = {"objects":[object1, object2 ...], "info": "some other strings", "id1": someOtherInt, "id2": someOtherInt}
Log = { "info": "some log strings"}
Log = "some string"
I have tried a few rex and spath searches, but they are not working well.
I would like to extract the "objects" field depending on the value of "info". For example, sometimes I need the objects from the first Log above and sometimes from the second (for different panels in a dashboard), and "info" is what distinguishes them.
I then need to display the objects in a chart, under a single column. Any help/hints are appreciated!
Regex should be the last thing to try when extracting information from a structured data set. Use spath.
| spath input=Log
Log | id1 | id2 | info | objects{} |
{"objects":["object1", "object2"], "info": "some strings", "id1": "someInt", "id2": "someInt"} | someInt | someInt | some strings | object1 object2 |
{"objects":["object1", "object2", "object3"], "info": "some other strings", "id1": "someOtherInt", "id2": "someOtherInt"} | someOtherInt | someOtherInt | some other strings | object1 object2 object3 |
{ "info": "some log strings"} | | | some log strings | |
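To try this without touching real data, here is a minimal sketch that builds three sample events with makeresults and runs the same spath over them. The sample values and the mvappend/mvexpand scaffolding are assumptions made only for illustration; in a real search, Log would already be an extracted field.

```spl
| makeresults
| eval Log=mvappend(
    "{\"objects\":[\"object1\",\"object2\"],\"info\":\"some strings\",\"id1\":1,\"id2\":2}",
    "{\"info\":\"some log strings\"}",
    "some string")
| mvexpand Log
| spath input=Log
| table Log id1 id2 info objects{}
```

The row whose Log is a plain string should come back with empty id1, id2, info and objects{}. If such rows are not wanted, adding `| where isnotnull(info)` after spath is one way to drop them.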
Sorry, I need to rephrase my question. Not every Log value is standard JSON: some are just a string, and sometimes Log is JSON with a totally different structure.
In that case, you first need to define how Splunk should behave. For example, do you expect to make any use of the events where Log = "some string"? Such events will simply give null values for info, objects{}, etc.
_raw | _time | id1 | id2 | info | objects{} |
Log = {"objects":["object1", "object2"], "info": "some strings", "id1": "someInt", "id2": "someInt"} | 2022-08-05 23:41:37 | someInt | someInt | some strings | object1 object2 |
Log = {"objects":["object1", "object2", "object3"], "info": "some other strings", "id1": "someOtherInt", "id2": "someOtherInt"} | 2022-08-05 23:41:37 | someOtherInt | someOtherInt | some other strings | object1 object2 object3 |
Log = { "info": "some log strings"} | 2022-08-05 23:41:37 | | | some log strings | |
Log = "some string" | 2022-08-05 23:41:37 | | | | |
Based on your original description, the objective is really just to operate on two fields, info and objects{}.
After spath, you can select objects{} for whichever info value you need. For example,
| spath input=Log
| where info == "some strings"
will give
_raw | _time | id1 | id2 | info | objects{} |
Log = {"objects":["object1", "object2"], "info": "some strings", "id1": "someInt", "id2": "someInt"} | 2022-08-05 23:41:37 | someInt | someInt | some strings | object1 object2 |
| spath input=Log
| where info == "some other strings"
gives
_raw | _time | id1 | id2 | info | objects{} |
Log = {"objects":["object1", "object2", "object3"], "info": "some other strings", "id1": "someOtherInt", "id2": "someOtherInt"} | 2022-08-05 23:41:37 | someOtherInt | someOtherInt | some other strings | object1 object2 object3 |
and so on.
Is that what you described?
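Since the original question also asked for the objects to appear under one column, here is a sketch of how the filtered result might be expanded to one row per object. The field name object is just a name chosen for this illustration, not something from the original data.

```spl
| spath input=Log
| where info == "some strings"
| rename "objects{}" as object
| mvexpand object
| table _time info object
```

Each dashboard panel could then use its own info value in the where clause.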
Yes, thank you! That works very well. As you mentioned, I do expect some use. For example, I want to display _time in one column for events where info is "some strings" and _time in another column for events where info is "some other strings" (essentially the time range between the two Logs). I tried using spath twice in the search, but the values are not returned correctly when "info" holds different strings.
OK, so events in which Log is not valid JSON do not matter. As I understand it, your requirements are twofold: extract the fields from the JSON in Log, and show _time in a separate column for each value of info.
The first is achieved by spath. I haven't found a general approach to the second. However, if you can enumerate the values of info, here is a cheat:
| spath input=Log
| foreach "some strings" "some other strings" "some log strings"
[ eval "<<FIELD>>" = if(info == "<<FIELD>>", _time, null()) ]
Alternatively, if you only want to tabulate _time by info value,
| spath input=Log
| table _time info
| transpose header_field=info
Oh, sorry for any confusion. I am not trying to use the values as column names.
What I'm trying to do is find the timestamp of a log with a certain info (for example, "some strings"), find the timestamp of another log with a different info (for example, "some other strings"), and display the duration between them. After spath, I have something like:
| eval session_start=if(searchmatch("some string"),min(_time),null()) | eval session_end=if(searchmatch("some other string"),max(_time),null())
| stats values(session_start) as start, values(session_end) as end | eval Duration= end-start | table Duration start end
But this does not display the duration. session_start and session_end are correct if I put them in a table, but the difference between them is not being calculated.
Update:
I figured it out
| eval session_start=if(searchmatch("some string"),_time,null()) | eval session_end=if(searchmatch("some other string"),_time,null())
| stats values(session_start) as ss, values(session_end) as se | eval dur=se-ss | table dur
You just invented transaction😉. For reference, you can achieve the same with something like
| transaction startswith=eval(info=="some strings") endswith=eval(info=="some other strings")
| table duration
But transaction is expensive. Using stats is usually preferred.
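As a rough illustration of the stats route, the sketch below builds two synthetic events 90 seconds apart with makeresults and computes the gap between them. The row scaffolding and the field names start_time and end_time are assumptions for this example only; against real data, the search would start from `| spath input=Log` instead.

```spl
| makeresults
| eval row=mvappend("0|some strings", "90|some other strings")
| mvexpand row
| eval _time=_time + tonumber(mvindex(split(row, "|"), 0)), info=mvindex(split(row, "|"), 1)
| stats min(eval(if(info == "some strings", _time, null()))) as start_time, max(eval(if(info == "some other strings", _time, null()))) as end_time
| eval duration=end_time - start_time
| table start_time end_time duration
```

Because min and max each return a single value, the subtraction still works when several events match the same info, whereas values() would return a multivalue field in that case. With the synthetic events above, duration should come out as 90.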