Help extracting JSON into multiple events

rmorrison6
Engager

I'm trying to split a JSON file into multiple events. I've read some other answers and tested configurations in the Add Data wizard. From what I've read, SHOULD_LINEMERGE should be set to false and LINE_BREAKER should identify where to break the events. Based on the sample data below, I think it should break on the associatedItems field. I've tried applying different values in the Add Data wizard, but they don't seem to make a difference.
Any ideas?

This is the source type config I've attempted to use:

[_json]
SHOULD_LINEMERGE=false
LINE_BREAKER={"associatedItems"
NO_BINARY_CHECK=true
CHARSET=UTF-8
INDEXED_EXTRACTIONS=json
KV_MODE=none
TRUNCATE=1000000
category=Structured
description=JavaScript Object Notation format. For more information, visit http://json.org/
disabled=false
pulldown_type=true



{
  "limit": 1000,
  "offset": 0,
  "records": [
    {
      "associatedItems": [
        {
          "id": "557058:8bc118d1-552f-4613-9b34-c15add9aad17",
          "name": "removed58",
          "parentId": "10000",
          "parentName": "IDP Directory",
          "typeName": "USER"
        }
      ],
      "category": "user management",
      "changedValues": [
        {
          "changedFrom": "Active",
          "changedTo": "Inactive",
          "fieldName": "Active / Inactive"
        }
      ],
      "created": "2020-03-27T13:27:30.207+0000",
      "eventSource": "",
      "id": 17808,
      "objectItem": {
        "id": "557058:8bc118d1-552f-4613-9b34-c15add9aad17",
        "name": "removed58",
        "parentId": "10000",
        "parentName": "IDP Directory",
        "typeName": "USER"
      },
      "summary": "User updated"
    },
    {
      "associatedItems": [
        {
          "id": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:5b1d2af1-eb27-4451-8461-575ae3c4a9a5",
          "name": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:5b1d2af1-eb27-4451-8461-575ae3c4a9a5",
          "parentId": "10000",
          "parentName": "IDP Directory",
          "typeName": "USER"
        }
      ],
      "authorKey": "testuser",
      "category": "user management",
      "changedValues": [
        {
          "changedTo": "Active",
          "fieldName": "Active / Inactive"
        }
      ],
      "created": "2020-03-27T04:55:07.336+0000",
      "eventSource": "",
      "id": 17807,
      "objectItem": {
        "id": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:5b1d2af1-eb27-4451-8461-575ae3c4a9a5",
        "name": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:5b1d2af1-eb27-4451-8461-575ae3c4a9a5",
        "parentId": "10000",
        "parentName": "IDP Directory",
        "typeName": "USER"
      },
      "summary": "User created"
    },
1 Solution

to4kawa
Ultra Champion
|makeresults
| eval _raw="{\"offset\": 0, \"limit\": 1000, \"total\": 490, \"records\": [{\"id\": 17812, \"summary\": \"User created\", \"created\": \"2020-03-27T20:58:22.837+0000\", \"category\": \"user management\", \"eventSource\": \"\", \"objectItem\": {\"id\": \"qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480\", \"name\": \"qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480\", \"typeName\": \"USER\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}, \"changedValues\": [{\"fieldName\": \"Active / Inactive\", \"changedTo\": \"Active\"}], \"associatedItems\": [{\"id\": \"qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480\", \"name\": \"qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480\", \"typeName\": \"USER\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}]}, {\"id\": 17811, \"summary\": \"User added to group\", \"created\": \"2020-03-27T19:03:08.067+0000\", \"category\": \"group management\", \"eventSource\": \"\", \"objectItem\": {\"name\": \"system-administrators\", \"typeName\": \"GROUP\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}, \"associatedItems\": [{\"id\": \"5be151ae92e2727e0939e20a\", \"name\": \"5be151ae92e2727e0939e20a\", \"typeName\": \"USER\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}]}, {\"id\": 17810, \"summary\": \"User added to group\", \"created\": \"2020-03-27T19:03:08.063+0000\", \"category\": \"group management\", \"eventSource\": \"\", \"objectItem\": {\"name\": \"site-admins\", \"typeName\": \"GROUP\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}, \"associatedItems\": [{\"id\": \"5be151ae92e2727e0939e20a\", \"name\": \"5be151ae92e2727e0939e20a\", \"typeName\": \"USER\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}]}"
| rex mode=sed "s/({\"id\": \d{5})/#\1/g"
| makemv delim="#" _raw
| stats count by _raw
| spath

Hi. To get the results of the search above at index time, try this configuration:

props.conf

# create your own sourcetype
[associated_json]
SHOULD_LINEMERGE=false
LINE_BREAKER=(.){\"id\": \d{5}
NO_BINARY_CHECK=true
CHARSET=UTF-8
INDEXED_EXTRACTIONS=json
KV_MODE=none
TRUNCATE=0
category=Structured
description=JSON
disabled=false
pulldown_type=true
TRANSFORMS-header = null

transforms.conf

[null]
REGEX = offset
DEST_KEY = queue
FORMAT = nullQueue

The header event (the one containing "offset") is not needed, so the transform sends it to nullQueue.
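
For anyone wondering why this LINE_BREAKER splits where it does, here is an annotated copy of that one setting. The comments follow Splunk's documented LINE_BREAKER behavior as I understand it (the text matched by the first capturing group is discarded, and the next event starts right after it). The shortened raw text in the comments is illustrative only, not real log output.

```ini
# props.conf (annotated copy of the LINE_BREAKER setting above)
#
# Shortened, illustrative raw text:
#   ... "records": [{"id": 17812, ... }]}, {"id": 17811, ... }]}
#
# (.)            matches the one character in front of each record
#                ("[" before the first record, a space before later ones).
#                It is the first capturing group, so Splunk discards it
#                and ends the previous event there.
# {\"id\": \d{5} is outside the group, so it is kept and becomes the
#                start of the next event.
# Nested ids such as {"id": "5be151ae..."} start with a quote rather
# than a digit, so \d{5} does not match them and no break happens there.
LINE_BREAKER=(.){\"id\": \d{5}
```

By the same rule, the LINE_BREAKER from the question ({"associatedItems") has no capturing group, which the props.conf spec requires, so it likely had no effect. Note also that \d{5} assumes the record ids have at least five digits.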

rmorrison6
Engager

Hi to4kawa,

I tried ingesting the data with the provided props.conf, but only one event is still being indexed. When I run the makeresults search, it does break the data into multiple events. Any idea why the data isn't being indexed the same way? Are there logs that might identify the issue?

Thanks for your help thus far

-Rob

to4kawa
Ultra Champion

Your sample contains line breaks ([\r\n]+), but the actual log may not.
Please check the log and post the _raw of an event.

rmorrison6
Engager

to4kawa,

I believe you are correct. I had previously copied and pasted from a text editor that was pretty-printing the JSON. The raw log is below.

Also, after looking at the events further, I believe they need to be broken at the initial "id" field of each record, which is followed by a 5-digit number. "id" also shows up several times as a nested field, but the 5-digit number appears to be the audit change number.

{"offset": 0, "limit": 1000, "total": 490, "records": [{"id": 17812, "summary": "User created", "created": "2020-03-27T20:58:22.837+0000", "category": "user management", "eventSource": "", "objectItem": {"id": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480", "name": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480", "typeName": "USER", "parentId": "10000", "parentName": "IDP Directory"}, "changedValues": [{"fieldName": "Active / Inactive", "changedTo": "Active"}], "associatedItems": [{"id": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480", "name": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480", "typeName": "USER", "parentId": "10000", "parentName": "IDP Directory"}]}, {"id": 17811, "summary": "User added to group", "created": "2020-03-27T19:03:08.067+0000", "category": "group management", "eventSource": "", "objectItem": {"name": "system-administrators", "typeName": "GROUP", "parentId": "10000", "parentName": "IDP Directory"}, "associatedItems": [{"id": "5be151ae92e2727e0939e20a", "name": "5be151ae92e2727e0939e20a", "typeName": "USER", "parentId": "10000", "parentName": "IDP Directory"}]}, {"id": 17810, "summary": "User added to group", "created": "2020-03-27T19:03:08.063+0000", "category": "group management", "eventSource": "", "objectItem": {"name": "site-admins", "typeName": "GROUP", "parentId": "10000", "parentName": "IDP Directory"}, "associatedItems": [{"id": "5be151ae92e2727e0939e20a", "name": "5be151ae92e2727e0939e20a", "typeName": "USER", "parentId": "10000", "parentName": "IDP Directory"}]}

to4kawa
Ultra Champion

I see. Please check my answer.

rmorrison6
Engager

Hey to4kawa,

Those configs are working perfectly for splitting the file into the right number of events. However, none of the fields are being extracted now. I can extract them manually, but I assumed INDEXED_EXTRACTIONS=json would take care of that. Thanks for your help thus far.

-Rob

to4kawa
Ultra Champion
INDEXED_EXTRACTIONS=none
KV_MODE=json

Maybe the null events are causing this.
Use KV_MODE=json instead of INDEXED_EXTRACTIONS=json.
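
Putting the thread together, the working configuration presumably looks like the sketch below. It is assembled from the accepted answer plus the two changed settings, not copied from a tested deployment. The sourcetype name associated_json and the value INDEXED_EXTRACTIONS=none are taken as given in the thread.

```ini
# props.conf
# Sketch: the accepted answer with search-time JSON extraction swapped in.
[associated_json]
SHOULD_LINEMERGE=false
LINE_BREAKER=(.){\"id\": \d{5}
NO_BINARY_CHECK=true
CHARSET=UTF-8
# Changed: no index-time JSON parsing (value as given in the thread).
INDEXED_EXTRACTIONS=none
# Changed: fields are extracted from each event at search time instead.
KV_MODE=json
TRUNCATE=0
category=Structured
description=JSON
disabled=false
pulldown_type=true
TRANSFORMS-header = null

# transforms.conf
# Routes the leftover header event (it contains "offset") to nullQueue.
[null]
REGEX = offset
DEST_KEY = queue
FORMAT = nullQueue
```

One thing to keep in mind: KV_MODE is a search-time setting, while LINE_BREAKER and TRANSFORMS apply at parse time. In a distributed deployment the stanza therefore needs to be present both on the search head and on the instance that parses the data.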

rmorrison6
Engager

Perfect! This config is working great now, individual events are extracted along with the appropriate fields. Thanks for your help to4kawa!

-Rob
