Getting Data In

Splitting header and each entry from a nested JSON array into separate events at index time

madstop99
Explorer

I have a JSON (all in one line when fed into Splunk):

{
    "customerName": "Patrick",
    "customerId": "123456",
    "customerCity": "New York",
    "host": "host1",
    "path": "/store/key",
    "sourceType": "purchase",
    "sourceName": "Store",
    "data": [{
        "store": "Store 23",
        "time": "2016/05/06 10:20:20",
        "spending": "$100-$200",
        "category": ["Grocery", "Toys"]
    }, {
        "store": "Store 40",
        "time": "2016/05/20 12:20:30",
        "spending": "$25-$50",
        "category": ["Cloths"]
    }]
}

I want to generate two events at index time, with a result like this:

Event 1: 
{
    "customerName": "Patrick",
    "customerId": "123456",
    "customerCity": "New York",
    "host": "host1",
    "path": "/store/key",
    "sourceType": "purchase",
    "sourceName": "Store",
        "store": "Store 23",
        "time": "2016/05/06 10:20:20",
        "spending": "$100-$200",
        "category": ["Grocery", "Toys"]
}

Event 2:
{
    "customerName": "Patrick",
    "customerId": "123456",
    "customerCity": "New York",
    "host": "host1",
    "path": "/store/key",
    "sourceType": "purchase",
    "sourceName": "Store",
        "store": "Store 40",
        "time": "2016/05/20 12:20:30",
        "spending": "$25-$50",
        "category": ["Cloths"]
}

Question 1:
I tried to do this with transforms.conf and props.conf, and couldn't get it to work. Any thought or suggestion?

Question 2:
I am expecting up to a few thousand entries in "data". Given that the timestamp is within each entry of JSON array, is this something I should do at Index time or search time?

Tags (1)

wjk5828
New Member

Did you find a solution on this problem? I have a similar non-JSON data set where I would just prefer to "flatten" the tree while indexing.

0 Karma

rshoward
Path Finder

This can be done at search time with some spath after mv expansion on data , but you may consider writing function to unroll the "data" array on ingest if you need the individual events broken out. It really depends on how the often the data will be ad-hoc searched.

0 Karma
Get Updates on the Splunk Community!

Optimize Cloud Monitoring

  TECH TALKS Optimize Cloud Monitoring Tuesday, August 13, 2024  |  11:00AM–12:00PM PST   Register to ...

What's New in Splunk Cloud Platform 9.2.2403?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.2.2403! Analysts can ...

Stay Connected: Your Guide to July and August Tech Talks, Office Hours, and Webinars!

Dive into our sizzling summer lineup for July and August Community Office Hours and Tech Talks. Scroll down to ...