Splunk Search

How to extract field in json format?

vineela
Path Finder

Hi All,

      I have a log which is in Json format. I used spath and extracted the fields. But there is no field value pair for the value which i need to get it extracted.

This is the sample log.
{"log":"100.64.12.88 - idp-psu-int-sanctions-listener-app-npd|696534fc-2f4a-a078-e053-071bf40a21a6|7762ee4c-a769-6413-e053-1d1bf40a3e8e| [03/Apr/2023:15:10:57 +1000] \"GET https://pds-event-api.msaas-badev/payments/history/v3/payments/events/rawContents?receiptNumber=MP10403051048&eventTypeCode=CLRG.RECEIVE_NEW_MSG&messageDefinitionId=pain.001.001.10 HTTP/1.1\" 200 14127 \"-\" \"Java/1.8.0_342\" 190\n","stream":"stdout","docker":{"container_id":"9ce8070c8f3bdde9fd0374a295922ef64e34fa7a007241d528b72286187dc8fe"},"kubernetes":{"container_name":"pds-event-api-psu-api","namespace_name":"msaas-badev","pod_name":"pds-event-api-psu-api-3.17.16-68cfc5f9c6-52fhl","container_image":"pso.docker.internal.cba/pds-event-microservice:3.17.16","container_image_id":"docker-pullable://pso.docker.internal.cba/pds-event-microservice@sha256:44cf819cc3c8b88f6794cac17dbcd775de2a2e4b40cad33418d2ba20d642ef28","pod_id":"3c9080dd-82c6-44f9-9bb4-9e0d7843a8f3","pod_ip":"100.64.15.33","host":"ip-10-3-196-184.ap-southeast-2.compute.internal","labels":{"app":"pds-event-api","app.kubernetes.io/instance":"pds-event-api","app.kubernetes.io/managed-by":"Helm","app.kubernetes.io/name":"pds-event-api","helm.sh/chart":"psu-api-1.7.5","heritage":"Helm","pod-template-hash":"68cfc5f9c6","release":"pds-event-api"},"master_url":"https://172.20.0.1:443/api","namespace_id":"25c93690-5c3b-4f2b-a967-8d0355ea90f2","namespace_labels":{"argocd.argoproj.io/instance":"appspaces","ci":"CM0953076","kubernetes.io/metadata.name":"msaas-badev","name":"msaas-badev","platform":"PSU","service_owner":"somersd","spg":"CBA_PAYMENTS_TEST_COORDINATION"}},"hostname":"ip-10-3-196-184.ap-southeast-2.compute.internal","host_ip":"10.3.196.184","cluster":"nonprod/pmn02"}

      The one which i highlighted in the log "190" is the response time which is getting captured in logs. I need to fetch it under value called response time and calculate the percentage of responsetime. Can anyone please help me out  in writting regex to fetch that value out from log field which is in json format.

Thanks in Advance.

Labels (1)
Tags (1)
0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @vineela,

you have three ways to extract fields from a file in json format:

  • add INDEXED_EXTRACTIONS=json to your props.conf, in this way the file is correctly parsed and you have all the fields, remember that this configuration must be located in the Universal Forwarders, on Heavy Forwarders  (if present), on Indexers, and on Search Heads,
  • using "spath" command (https://docs.splunk.com/Documentation/Splunk/9.0.4/SearchReference/Spath) in your searches,
  • as @tscroggins said, using regexes in your searches, but you have to extract one by one all fields.

Ciao.

Giuseppe

tscroggins
Influencer

Hi,

With default JSON field extraction settings, Splunk should extract a field named log from your events. You can use the rex command to extract the response time from the log field:

| rex field=log ".* (?<response_time>[0-9]+)"
0 Karma

yuanliu
SplunkTrust
SplunkTrust

That is a good strategy.  But you want to be more specific than just picking up numerals.  The following is borrowed from Splunk's default transformation access-extractions which handles standard NCSA/Apache httpd access logs.

| rex field=log "^[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++\"(?<referer>[[bc_domain:referer_]]?+[^\"]*+)\"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]"

Standard transformation is more robust.

tscroggins
Influencer

Alternatively:

| eval _raw=log
| extract access-extractions

which admittedly discards other fields, depending on the search mode.

There's a sea of regular expression syntax to swim through, though, and that may not provide the most value in this context.

yuanliu
SplunkTrust
SplunkTrust

That will only discard the original _raw event, not any JSON node that is already extracted.  If you want to preserve _raw, you can swap.

 

| rename _raw AS temp, log AS _raw
| extract access-extractions
| rename _raw AS log, temp AS _raw

Oh, I forgot to mention, access-extractions extracts the field of interest as "other".  If you are certain that it represents response time, you can rename that, too.

| rename _raw AS temp, log AS _raw
| extract access-extractions
| rename other AS response_time
| rename _raw AS log, temp AS _raw

 

tscroggins
Influencer

All of our answers encourage exploration of JSON parsing, field extraction, and regular expression syntax (with bonus inconsistent escape sequence handling and multiple engines!) in Splunk, but I suspect @vineela just wants to skip ahead to statistical and/or time series analysis of response times. 😉

0 Karma
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...