Getting Data In

Pulling a multivalue field from a JSON array

brajaram
Communicator

Hi,

I'm trying to pull a multivalue field from a JSON array to get statistics from it. The data looks as follows:

"APIHEADERRESPONSE":{...specific information...
    },
"SERVICES": [
  {
    "HTTPStatus":"200",
    "ServRespCode":"Success",
    "ServRespTime":"1200",
    "ServRespCached":"Y",
    "ServiceShortName":"Service1",
    "DataSource":"Source1",
    "ServiceURI":"Service1Url"
    },
    {
    "HTTPStatus":"200",
    "ServRespCode":"Success",
    "ServRespTime":"1200",
    "ServRespCached":"Y",
    "ServiceShortName":"Service2",
    "DataSource":"Service2Source",
    "ServiceURI":"Service2URL"
    },
    {
    "HTTPStatus":"200",
    "ServRespCode":"Success",
    "ServRespTime":"12",
    "ServRespCached":"Y",
    "ServiceShortName":"Service3",
    "DataSource":"Service3Source",
    "ServiceURI":"Service3URL"
    }
  ]

Our data follows the structure of an initial block of identifying information, followed by specific details of further backend calls our APIs make. We are printing the results of those backend calls into an array in the JSON (services block) we print out to splunk.

However, it is not a standardized amount of elements within the array. We can have 1 to many different services printed, and it varies from event to event. Further, the order is not normalized either, so we have no idea which service will print first.

Is there a way for us to extract the details of a specific service without using regex? If we had multiple events, how would I get something like the median response time of service1? Is there a way we can use SPATH to get this information without relying on the expensive nature of regex?

Tags (3)
0 Karma
1 Solution

somesoni2
Revered Legend

Give this a try (everything before spath is to generate sample data, replace it with your base search)

| gentimes start=-1 | eval _raw="{
\"SERVICES\": [
{
  \"HTTPStatus\":\"200\",  \"ServRespCode\":\"Success\",  \"ServRespTime\":\"1200\",  \"ServRespCached\":\"Y\",  \"ServiceShortName\":\"Service1\",   \"DataSource\":\"Source1\",   \"ServiceURI\":\"Service1Url\"
   },
   {
   \"HTTPStatus\":\"200\",   \"ServRespCode\":\"Success\",   \"ServRespTime\":\"1200\",   \"ServRespCached\":\"Y\",   \"ServiceShortName\":\"Service2\",  \"DataSource\":\"Service2Source\",
   \"ServiceURI\":\"Service2URL\"
   },
   {
   \"HTTPStatus\":\"200\",   \"ServRespCode\":\"Success\",   \"ServRespTime\":\"12\",   \"ServRespCached\":\"Y\",   \"ServiceShortName\":\"Service3\",   \"DataSource\":\"Service3Source\",
   \"ServiceURI\":\"Service3URL\"
   }
 ]
 }
" 
| spath | rename SERVICES{}.* as *| eval temp=mvzip(HTTPStatus,mvzip(ServRespCode,mvzip(ServRespTime,mvzip(ServiceShortName,mvzip(DataSource,ServiceURI,"#"),"#"),"#"),"#"),"#")| mvexpand temp | rex field=temp "(?<HTTPStatus>.+)#(?<ServRespCode>.+)#(?<ServRespTime>.+)#(?<ServiceShortName>.+)#(?<DataSource>.+)#(?<ServiceURI>.+)" | fields - temp

View solution in original post

0 Karma

somesoni2
Revered Legend

Give this a try (everything before spath is to generate sample data, replace it with your base search)

| gentimes start=-1 | eval _raw="{
\"SERVICES\": [
{
  \"HTTPStatus\":\"200\",  \"ServRespCode\":\"Success\",  \"ServRespTime\":\"1200\",  \"ServRespCached\":\"Y\",  \"ServiceShortName\":\"Service1\",   \"DataSource\":\"Source1\",   \"ServiceURI\":\"Service1Url\"
   },
   {
   \"HTTPStatus\":\"200\",   \"ServRespCode\":\"Success\",   \"ServRespTime\":\"1200\",   \"ServRespCached\":\"Y\",   \"ServiceShortName\":\"Service2\",  \"DataSource\":\"Service2Source\",
   \"ServiceURI\":\"Service2URL\"
   },
   {
   \"HTTPStatus\":\"200\",   \"ServRespCode\":\"Success\",   \"ServRespTime\":\"12\",   \"ServRespCached\":\"Y\",   \"ServiceShortName\":\"Service3\",   \"DataSource\":\"Service3Source\",
   \"ServiceURI\":\"Service3URL\"
   }
 ]
 }
" 
| spath | rename SERVICES{}.* as *| eval temp=mvzip(HTTPStatus,mvzip(ServRespCode,mvzip(ServRespTime,mvzip(ServiceShortName,mvzip(DataSource,ServiceURI,"#"),"#"),"#"),"#"),"#")| mvexpand temp | rex field=temp "(?<HTTPStatus>.+)#(?<ServRespCode>.+)#(?<ServRespTime>.+)#(?<ServiceShortName>.+)#(?<DataSource>.+)#(?<ServiceURI>.+)" | fields - temp
0 Karma

brajaram
Communicator

Hi,

Thanks for the response. That doesn't get quite what I need...in fact it seems to result in duplicating events? However, I was able to build off of what you gave me and various other questions(which actually led me to other answers you have given!) to get this query:

 initial search  |  table SERVICES{}.* | eval respTable=mvzip('SERVICES{}.ServRespTime', 'SERVICES{}.ServiceShortName', "&&") | eval ResponseTime=mvindex(split(mvfilter(match(respTable,".*&&Service1Name")),"&&"),0) | table ResponseTime

Since I don't use spath, are there any concerns with the cost of using this type of query extensively?

0 Karma

somesoni2
Revered Legend

With just the base search ran on Smart Mode, what all fields do you see(either on the left field sidebar OR by adding | fieldsummary in your base search)??

0 Karma

brajaram
Communicator

I can see all of the fields inside the service array properly - they are labeled like:

Services{}.DataSource
Services{}.ServiceRespTime

etc.

Splunk is correctly identifying the fields as multivalue - it captures multiple response times and lists them all under Services{}.ServiceRespTime.

However, there is nothing in the fields that seem to indicate which block each response time corresponds to. So while it captures all the response times, I don't know how to(without using regex) get the response time of service1 specifically.

0 Karma
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...