Hello!
As the subject of the question says, I'm trying to create SPL queries for several visualizations, but it has become very tedious: spath does not work with the output events because they come in string format, which makes more complex operations very hard to work with.
The event contents are valid JSON (checked using jsonformatter).
Here's the event output:
{"time":"time_here","kubernetes":{"host":"host_name_here","pod_name":"pod_name_here","namespace_name":"namespace_name_here","labels":{"app":"app_label"}},"log":{"jobId":"job_id_here","dc":"dc_here","stdout":"{ \"Componente\" : \"componente_here\", \"channel\" : \"channel_here\", \"timestamp\" : \"timestamp_here\", \"Code\" : \"code_here\", \"logId\" : \"logid_here\", \"service\" : \"service_here\", \"responseMessage\" : \"responseMessage_here\", \"flow\" : \"flow_here\", \"log\" : \"log_here\"}","level":"info","host":"host_worker_here","flow":"flow_here","projectName":"project_name_here","caller":"caller_here"},"cluster_id":"cluster_id_here"}
It seems that Splunk already gives you fields like cluster_id, log.projectName, and log.stdout. log.stdout is embedded JSON. I'm not sure why you say "spath does not work with outputted events." It certainly does. As @richgalloway demonstrated, you just need to use spath's input parameter.
| spath input=log.stdout
Your mock event gives you these extra fields:
Code | Componente | channel | flow | log | logId | responseMessage | service | timestamp |
code_here | componente_here | channel_here | flow_here | log_here | logid_here | responseMessage_here | service_here | timestamp_here |
Play with the emulation @richgalloway gives and compare with your real data.
I'm sorry I didn't see your reply sooner, thank you so much! You're a hero!!
Please explain what you mean by "spath does not work". It works for me in this run-anywhere example (escape characters added to satisfy the SPL parser). What is your query? What results do you expect and what do you get?
| makeresults | eval data="{\"time\":\"time_here\",\"kubernetes\":{\"host\":\"host_name_here\",\"pod_name\":\"pod_name_here\",\"namespace_name\":\"namespace_name_here\",\"labels\":{\"app\":\"app_label\"}},\"log\":{\"jobId\":\"job_id_here\",\"dc\":\"dc_here\",\"stdout\":\"{ \\\"Componente\\\" : \\\"componente_here\\\", \\\"channel\\\" : \\\"channel_here\\\", \\\"timestamp\\\" : \\\"timestamp_here\\\", \\\"Code\\\" : \\\"code_here\\\", \\\"logId\\\" : \\\"logid_here\\\", \\\"service\\\" : \\\"service_here\\\", \\\"responseMessage\\\" : \\\"responseMessage_here\\\", \\\"flow\\\" : \\\"flow_here\\\", \\\"log\\\" : \\\"log_here\\\"}\",\"level\":\"info\",\"host\":\"host_worker_here\",\"flow\":\"flow_here\",\"projectName\":\"project_name_here\",\"caller\":\"caller_here\"},\"cluster_id\":\"cluster_id_here\"}"
| spath input=data
| transpose
And the results
Hello!! Thank you for your response! And I'm sorry I explained myself so poorly!
spath does not work: What I meant by this was that, taking the previous event string as an example, I am unable to use SPL queries such as
index="my_index" logid="log_id_here" service="service_here" responseMessage="response_message_here"
Instead I have to use
index="my_index" "log_id_here" "service_here" "response_message_here"
or
index="my_index" "log_id_here" service logid responseMessage
This is because no data is found when using "variables" such as
responseMessage="response_message_here"
Instead, I must search for specific string fragments within the event output. This is because the output is formatted as a string instead of JSON, which makes SPL query creation a real pain.
What is your query: One example would be to individually get each responseMessage as such:
index="my_index" "log_id_here" logid service responseMessage \\\"responseMessage\\\" : \\\"null\\\"
instead of the normal way, which would be
index="my_index" logid="log_id_here" service responseMessage | stats count by responseMessage | dedup responseMessage
What results do I expect: Currently I'm trying to get the unique services and order them descending by the error count for each one (which is based on the responseMessage).
What results do I get: Currently I'm able to get the count of each service by using string literals such as \\\"service\\\" : \\\"desk\\\"; other than that, I'm stuck. (I'm guessing this could be done with something like
index="my_index" "logid" | stats count by service, responseMessage | eval isError=if(responseMessage!="success",1 ,0) | stats sum(isError) as errorCount by service
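For clarity, here is how the error-count-per-service aggregation sketched above behaves, expressed as a small Python sketch. The sample records, service names, and responseMessage values below are made up purely for illustration; in Splunk, the fields would come from spath and the aggregation from stats.

```python
from collections import Counter

# Hypothetical extracted events (made-up sample data). In Splunk these
# fields would be produced by spath from each event's stdout payload.
events = [
    {"service": "desk", "responseMessage": "success"},
    {"service": "desk", "responseMessage": "timeout"},
    {"service": "auth", "responseMessage": "null"},
    {"service": "auth", "responseMessage": "success"},
    {"service": "auth", "responseMessage": "timeout"},
]

# Rough equivalent of:
#   | eval isError=if(responseMessage!="success",1,0)
#   | stats sum(isError) as errorCount by service
#   | sort - errorCount
error_counts = Counter(
    e["service"] for e in events if e["responseMessage"] != "success"
)

for service, count in error_counts.most_common():
    print(service, count)
```

The key point is that the grouping and filtering operate on extracted fields, which is exactly what the string-only search was preventing.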
I apologize in advance in case I've once again missed important details or given wrong queries; I haven't been able to try them out as the documentation shows :C Thank you very much for your time!!
I find it interesting that you claim the spath command does not work yet none of your searches use spath. The command won't work if it isn't invoked. See my example above.
Once the spath command has extracted the fields, then you can reference those fields in other commands.
I see, should I copy and paste the event data into the search bar to do as the example you provided?
Edit: I used:
index="my_index" "log_id_here" logid responseMessage | spath input=data | transpose
Strangely, most if not all of the vital data was stored inside _raw as a single string.
OK, I think I see where this is going.
You have your data as a JSON structure and want to search it by calling the fields by name in the base search, and that doesn't work. But Splunk will parse your fields if you find your events another way (for example, by searching for the content itself, regardless of where in the event it appears) and then push them through the spath command.
Am I right?
In other words - your events are not automatically interpreted as JSON structures.
There are three separate levels on which Splunk can handle JSON data.
1. On ingest - it can treat the JSON with INDEXED_EXTRACTIONS and parse your data into indexed fields. You generally don't want that, as indexed fields are not how Splunk is typically used.
2. Manual invocation of the spath command - this can be useful if your JSON data is only a part of your whole event (for example, a JSON structure forwarded as a syslog message and prepended with a syslog header; in such a case you'd want to extract the part after the syslog header and manually call the spath command to extract fields from that part).
3. Automatic search-time extraction - it's triggered by proper configuration of your sourcetype. By default, unless explicitly disabled by setting AUTO_KV_JSON to false, Splunk will extract your JSON fields when (and only when) the whole _raw event is a well-formed JSON structure. JSON extraction can also be explicitly triggered (again, only when the whole event is well-formed JSON) by properly configuring KV_MODE in your sourcetype.
Mind you that neither the 1st nor the 3rd option will extract data if you have, for example, a JSON structure as a string field within another JSON structure; in such a case you have to manually use spath to extract the JSON data from that string.
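To see why automatic extraction stops at the string boundary, here is a minimal Python sketch using a trimmed, hypothetical version of the event from this thread: the outer parse succeeds, but the nested JSON inside stdout remains a plain string until it is parsed a second time, which is what `| spath input=log.stdout` does in SPL.

```python
import json

# Trimmed, made-up version of the event: the outer structure is valid
# JSON, but "stdout" holds a JSON object serialized as a string.
raw = ('{"log": {"stdout": "{ \\"service\\" : \\"service_here\\", '
       '\\"responseMessage\\" : \\"responseMessage_here\\" }", '
       '"level": "info"}}')

# First pass: this is roughly what automatic search-time extraction sees.
event = json.loads(raw)
print(type(event["log"]["stdout"]))  # a plain string, not a parsed object

# Second pass, analogous to `| spath input=log.stdout`:
inner = json.loads(event["log"]["stdout"])
print(inner["service"], inner["responseMessage"])
```

One level of parsing gives you log.level and the stdout string; only the explicit second parse exposes service, responseMessage, and the other inner fields.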
So - as you can see - json is a bit tricky to work with.
PS: There is an open idea about extracting only part of the event as json structure - feel free to support that 😉 https://ideas.splunk.com/ideas/EID-I-208
Hello!!
Thanks for your answer! You are indeed correct! The event has a top level that is treated as JSON, but nested in the "log" field, the "stdout" field holds another dictionary that is being treated as a string, making it difficult to work with in SPL.
I did some research, and it seems this might be an issue with the way the data is parsed before arriving at Splunk; until I can check that, I guess I'm stuck with searching for string literals 💔
Thank you for your time and help!!
So you need to do
<your search>
| spath input=stdout
This way you'll parse the contents of the stdout field.
The field name is log.stdout.
| spath input=log.stdout
See my earlier comment https://community.splunk.com/t5/Splunk-Search/How-to-use-spath-with-string-formatted-events/m-p/6707...
I added data to the SPL because I don't have your data indexed in my Splunk. Since you have the data indexed, you can skip that part of my example query. You may need to change the spath command argument to match your events.
I see, I tried with different fields, but _raw seems to hold all the vital data in every case; maybe I'm not doing something right. Perhaps the part that is not in JSON format is the output inside the "stdout" field.
EDIT: Here's the event in log format
{ [-]
cluster_id: cluster_id
kubernetes: { [+]
}
log: { [-]
caller: caller_here
dc: dc_here
flow: flow_here
host: gatling_worker_here
jobId: jobid_here
level: info
projectName: project_name_here
stdout: { "Componente" : "componente_here", "channel" : "channel_here", "timestamp" : "timestamp_here", "Code" : "code_here", "logId" : "logid_here", "service" : "service_here", "responseMessage" : "responsemessage_here", "flow" : "flow_here", "log" : "log_here"}
}
time: time_here
}
stdout is the issue it seems
The _raw field is where Splunk stores the raw event. Many commands default to that field and a few work only on that field. The spath command defaults to _raw, but you can use spath input=_raw, if you wish.
The example event looks fine to me and passes checks at jsonlint.com.
I see...
Well, it seems like spath (and SPL functionality in general) works fine with the events, except for the contents of stdout. I spoke with an acquaintance, and it looks like it's most likely due to the way the data is parsed before arriving at Splunk.
I can't thank you enough for your time and effort helping me!! It looks like this has to be checked outside of Splunk, though; I'll close the ticket and come back with updates if I find a solution.
Look at my explanation above - your stdout field is not a JSON structure; it's a string containing a JSON structure, so it cannot be automatically parsed as JSON. You have to take the stdout field and manually run spath on it to parse out the fields it contains.
Excellent! Is there a way of doing this directly with SPL?