Hi All
I have 2 questions.
Question-1:
I wanted to write auto extract KVs from JSON portion of the events.
Here is the JSON portion of sample event --> "actionId":47533796,"clientIP":"127.0.0.1","actions":["Action1","Action2","Action3"],"user":"alia",...
I wrote regex --> \"([\w]+)\":([^,}]+)
Here, the complexity is value type is sometime array of values with comma separator with variable number of members.
So for the key "actions", the above regex gives the values as ["Action1",
Since "," is also delimiter for separating JSON KVs, the array value also uses "," as delimiter, hence the problem.
how can I auto extract all the KVs from JSON string for the above case? I want to put in transforms.conf for auto extraction of all the KVs from JSON string.......
Question-2:
If I want to replace the "," only for array value using SED command, how can I do it?
E,g: I want to replace comma delimiter with semi-colon only for the values of array type[a,brrrr,c2,d,ehh,.....,w] with [a;brrrr;c2;d;ehh;....;w]
Try this regex:
"([\w]+)":(\[[^\]]+\]|[^,}]+)
It optionally matches a:[b,c,d] or a:b
Ah understood, in that case maybe a scripted input would be a better option?
http://docs.splunk.com/Documentation/Splunk/6.0/Data/Setupcustominputs
You could write some python (or other script) to pre-process the input files and splunk would take the script output as its input.
I'm not sure that the second part is do-able with a regex. Taking a step back though, have you looked at the spath command (which parses JSON from parts of events). I did this:
| stats count | fields - count | eval jsonpart="{\"actionId\":47533796,\"clientIP\":\"127.0.0.1\",\"actions\":[\"Action1\",\"Action2\",\"Action3\"],\"user\":\"alia\"}" | spath input=jsonpart
where I mimic an event where one field contains the json. spath then does a great job of parsing the json bit. The array comes out as a multi valued field, the other fields are plain fields in the same event. Perhaps that will help?
Hi charltones
Thanks for the response. Yes, I'm very well aware of spath usage and I have implement the above problem in manual search query. My question is related to config for auto extraction of the JSON portion in the transforms.conf, which you have responded earlier. Thanks again.
Try this regex:
"([\w]+)":(\[[^\]]+\]|[^,}]+)
It optionally matches a:[b,c,d] or a:b
Thanks. 1st question is solved.
Any idea on 2nd question?