Getting Data In

Splitting header and each entry from a nested JSON array into separate events at index time

madstop99
Explorer

I have a JSON document (it is all on one line when fed into Splunk):

{
    "customerName": "Patrick",
    "customerId": "123456",
    "customerCity": "New York",
    "host": "host1",
    "path": "/store/key",
    "sourceType": "purchase",
    "sourceName": "Store",
    "data": [{
        "store": "Store 23",
        "time": "2016/05/06 10:20:20",
        "spending": "$100-$200",
        "category": ["Grocery", "Toys"]
    }, {
        "store": "Store 40",
        "time": "2016/05/20 12:20:30",
        "spending": "$25-$50",
        "category": ["Cloths"]
    }]
}

I want to generate two events at index time, with a result like this:

Event 1:
{
    "customerName": "Patrick",
    "customerId": "123456",
    "customerCity": "New York",
    "host": "host1",
    "path": "/store/key",
    "sourceType": "purchase",
    "sourceName": "Store",
    "store": "Store 23",
    "time": "2016/05/06 10:20:20",
    "spending": "$100-$200",
    "category": ["Grocery", "Toys"]
}

Event 2:
{
    "customerName": "Patrick",
    "customerId": "123456",
    "customerCity": "New York",
    "host": "host1",
    "path": "/store/key",
    "sourceType": "purchase",
    "sourceName": "Store",
    "store": "Store 40",
    "time": "2016/05/20 12:20:30",
    "spending": "$25-$50",
    "category": ["Cloths"]
}

Question 1:
I tried to do this with transforms.conf and props.conf, but couldn't get it to work. Any thoughts or suggestions?

Question 2:
I am expecting up to a few thousand entries in "data". Given that the timestamp is inside each entry of the JSON array, is this something I should do at index time or at search time?


wjk5828
New Member

Did you find a solution to this problem? I have a similar non-JSON data set where I would prefer to just "flatten" the tree while indexing.


rshoward
Path Finder

This can be done at search time with some spath after multivalue expansion (mvexpand) on "data", but you may want to consider writing a function to unroll the "data" array on ingest if you need the individual events broken out. It really depends on how often the data will be searched ad hoc.
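For what it's worth, here is a rough sketch of what "unrolling on ingest" could look like as a small pre-processing step that runs before the data is handed to Splunk. It is not a Splunk API: the function name `unroll` and the `array_key` parameter are made up for illustration, and the sample record is the one from the question. It assumes that a field inside a "data" entry should win if it has the same name as a header field.

```python
import json

def unroll(record, array_key="data"):
    """Yield one flat event per entry of record[array_key].

    Hypothetical helper: each event is a copy of the header fields
    (everything except the array) merged with one array entry.
    """
    header = {k: v for k, v in record.items() if k != array_key}
    for entry in record.get(array_key, []):
        event = dict(header)   # copy so the events do not share state
        event.update(entry)    # entry fields override header fields on a name clash
        yield event

# The single-line JSON document from the question.
raw = (
    '{"customerName": "Patrick", "customerId": "123456", '
    '"customerCity": "New York", "host": "host1", "path": "/store/key", '
    '"sourceType": "purchase", "sourceName": "Store", "data": ['
    '{"store": "Store 23", "time": "2016/05/06 10:20:20", '
    '"spending": "$100-$200", "category": ["Grocery", "Toys"]}, '
    '{"store": "Store 40", "time": "2016/05/20 12:20:30", '
    '"spending": "$25-$50", "category": ["Cloths"]}]}'
)

events = list(unroll(json.loads(raw)))

# Write one JSON object per line, so each line can be treated as its own event.
for event in events:
    print(json.dumps(event))
```

For the sample document this writes two lines, one per "data" entry, each carrying its own "time" field next to the customer fields. How those lines then get into Splunk (file monitor, scripted input, HTTP Event Collector, and so on) and how the timestamp is extracted from "time" are separate configuration choices that this sketch does not cover.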
