Help extracting JSON into multiple events

rmorrison6
Engager

I'm trying to break a JSON file into multiple events. I've read other answers and tested configurations with the Add Data feature, tweaking the settings as I go. From what I've read, SHOULD_LINEMERGE should be set to false and LINE_BREAKER should identify where to break the events. Based on the sample data below, I think it should break on the associatedItems field. I've tried applying different values in the Add Data wizard, but it doesn't seem to make a difference.
Any ideas?

This is the source type config I've attempted to use:

[_json]
SHOULD_LINEMERGE=false
LINE_BREAKER={"associatedItems"
NO_BINARY_CHECK=true
CHARSET=UTF-8
INDEXED_EXTRACTIONS=json
KV_MODE=none
TRUNCATE=1000000
category=Structured
description=JavaScript Object Notation format. For more information, visit http://json.org/
disabled=false
pulldown_type=true



{
  "limit": 1000,
  "offset": 0,
  "records": [
    {
      "associatedItems": [
        {
          "id": "557058:8bc118d1-552f-4613-9b34-c15add9aad17",
          "name": "removed58",
          "parentId": "10000",
          "parentName": "IDP Directory",
          "typeName": "USER"
        }
      ],
      "category": "user management",
      "changedValues": [
        {
          "changedFrom": "Active",
          "changedTo": "Inactive",
          "fieldName": "Active / Inactive"
        }
      ],
      "created": "2020-03-27T13:27:30.207+0000",
      "eventSource": "",
      "id": 17808,
      "objectItem": {
        "id": "557058:8bc118d1-552f-4613-9b34-c15add9aad17",
        "name": "removed58",
        "parentId": "10000",
        "parentName": "IDP Directory",
        "typeName": "USER"
      },
      "summary": "User updated"
    },
    {
      "associatedItems": [
        {
          "id": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:5b1d2af1-eb27-4451-8461-575ae3c4a9a5",
          "name": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:5b1d2af1-eb27-4451-8461-575ae3c4a9a5",
          "parentId": "10000",
          "parentName": "IDP Directory",
          "typeName": "USER"
        }
      ],
      "authorKey": "testuser",
      "category": "user management",
      "changedValues": [
        {
          "changedTo": "Active",
          "fieldName": "Active / Inactive"
        }
      ],
      "created": "2020-03-27T04:55:07.336+0000",
      "eventSource": "",
      "id": 17807,
      "objectItem": {
        "id": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:5b1d2af1-eb27-4451-8461-575ae3c4a9a5",
        "name": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:5b1d2af1-eb27-4451-8461-575ae3c4a9a5",
        "parentId": "10000",
        "parentName": "IDP Directory",
        "typeName": "USER"
      },
      "summary": "User created"
    },
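
Editor's note: two things about the attempted LINE_BREAKER can be checked outside Splunk. The props.conf spec requires the LINE_BREAKER regex to contain a capturing group, and in the pretty-printed sample the opening brace and the "associatedItems" key sit on different lines. The Python sketch below only illustrates those two points with Python's re module; it is not Splunk's implementation, and the pretty fragment is a trimmed, hypothetical stand-in for the sample.

```python
import re

# The breaker from the attempted config, written as a Python regex
# (the brace is escaped here for clarity).
attempted = re.compile(r'\{"associatedItems"')

# A pretty-printed fragment shaped like the sample above (trimmed).
pretty = '''    {
      "associatedItems": [
        {
          "id": "557058:8bc118d1-552f-4613-9b34-c15add9aad17"
        }
      ]
    }'''

# The pattern has no parentheses, so it has no capturing group.
group_count = attempted.groups

# The brace and the key are separated by a newline and spaces,
# so the literal text {"associatedItems" never occurs.
match = attempted.search(pretty)

print(group_count, match)
```

Either issue alone could explain why changing the value in the wizard appeared to have no effect.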

to4kawa
Ultra Champion
|makeresults
| eval _raw="{\"offset\": 0, \"limit\": 1000, \"total\": 490, \"records\": [{\"id\": 17812, \"summary\": \"User created\", \"created\": \"2020-03-27T20:58:22.837+0000\", \"category\": \"user management\", \"eventSource\": \"\", \"objectItem\": {\"id\": \"qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480\", \"name\": \"qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480\", \"typeName\": \"USER\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}, \"changedValues\": [{\"fieldName\": \"Active / Inactive\", \"changedTo\": \"Active\"}], \"associatedItems\": [{\"id\": \"qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480\", \"name\": \"qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480\", \"typeName\": \"USER\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}]}, {\"id\": 17811, \"summary\": \"User added to group\", \"created\": \"2020-03-27T19:03:08.067+0000\", \"category\": \"group management\", \"eventSource\": \"\", \"objectItem\": {\"name\": \"system-administrators\", \"typeName\": \"GROUP\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}, \"associatedItems\": [{\"id\": \"5be151ae92e2727e0939e20a\", \"name\": \"5be151ae92e2727e0939e20a\", \"typeName\": \"USER\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}]}, {\"id\": 17810, \"summary\": \"User added to group\", \"created\": \"2020-03-27T19:03:08.063+0000\", \"category\": \"group management\", \"eventSource\": \"\", \"objectItem\": {\"name\": \"site-admins\", \"typeName\": \"GROUP\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}, \"associatedItems\": [{\"id\": \"5be151ae92e2727e0939e20a\", \"name\": \"5be151ae92e2727e0939e20a\", \"typeName\": \"USER\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}]}"
| rex mode=sed "s/({\"id\": \d{5})/#\1/g"
| makemv delim="#" _raw
| stats count by _raw
| spath

Hi, to get these results at index time, use the following:

props.conf

# create your own sourcetype
[associated_json]
SHOULD_LINEMERGE=false
LINE_BREAKER=(.){\"id\": \d{5}
NO_BINARY_CHECK=true
CHARSET=UTF-8
INDEXED_EXTRACTIONS=json
KV_MODE=none
TRUNCATE=0
category=Structured
description=JSON
disabled=false
pulldown_type=true
TRANSFORMS-header = null

transforms.conf

[null]
REGEX = offset
DEST_KEY = queue
FORMAT = nullQueue

The unnecessary header (the leading fragment that contains "offset") needs to be removed, which is what the nullQueue transform is for.
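
Editor's note: to see what the LINE_BREAKER in this answer does to the single-line log, here is a small Python emulation of the behavior the props.conf spec describes (the previous event ends where the first capturing group starts, the next event starts where it ends, and the captured text is dropped). The helper break_events and the shortened raw string are hypothetical and exist only for this illustration; Splunk's own pipeline is not being reproduced.

```python
import re

def break_events(raw: str, line_breaker: str) -> list[str]:
    """Split raw text on a LINE_BREAKER-style regex.

    The previous event ends at the start of capture group 1, the next
    event begins at the end of group 1, and group 1 itself is discarded.
    """
    events, start = [], 0
    for m in re.finditer(line_breaker, raw):
        events.append(raw[start:m.start(1)])
        start = m.end(1)
    events.append(raw[start:])
    return events

# Shortened stand-in for the single-line raw log (trimmed, hypothetical data).
raw = (
    '{"offset": 0, "limit": 1000, "total": 490, "records": ['
    '{"id": 17812, "summary": "User created", "objectItem": {"id": "qm:6d90"}}, '
    '{"id": 17811, "summary": "User added to group", "objectItem": {"name": "system-administrators"}}, '
    '{"id": 17810, "summary": "User added to group", "objectItem": {"name": "site-admins"}}'
)

# Same pattern as the answer; the brace is escaped for clarity.
events = break_events(raw, r'(.)\{"id": \d{5}')
for event in events:
    print(repr(event))
```

With this input the first piece is the header fragment containing "offset", which is the one the nullQueue transform targets, and each remaining piece starts at a top-level record.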

rmorrison6
Engager

Hi to4kawa,

I tried ingesting the data with the provided props.conf, but I'm still seeing only one event indexed. When I run the makeresults search, it does break the data into multiple events. Any idea why the data isn't being indexed the same way? Are there logs that might identify the issue?

Thanks for your help thus far

-Rob

to4kawa
Ultra Champion

Your sample contains line breaks ([\r\n]+), but the actual log probably does not.
Please check the log and post the _raw value.

rmorrison6
Engager

to4kawa,

I believe you are correct. I had previously copied and pasted from a text editor that was pretty-printing the JSON. The raw log is below.

Also, after looking at the events further, I believe they need to be broken at the initial "id" field, the one followed by a 5-digit number. "id" also shows up multiple times as a nested field, but the 5-digit number appears to be the audit change number.

{"offset": 0, "limit": 1000, "total": 490, "records": [{"id": 17812, "summary": "User created", "created": "2020-03-27T20:58:22.837+0000", "category": "user management", "eventSource": "", "objectItem": {"id": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480", "name": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480", "typeName": "USER", "parentId": "10000", "parentName": "IDP Directory"}, "changedValues": [{"fieldName": "Active / Inactive", "changedTo": "Active"}], "associatedItems": [{"id": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480", "name": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480", "typeName": "USER", "parentId": "10000", "parentName": "IDP Directory"}]}, {"id": 17811, "summary": "User added to group", "created": "2020-03-27T19:03:08.067+0000", "category": "group management", "eventSource": "", "objectItem": {"name": "system-administrators", "typeName": "GROUP", "parentId": "10000", "parentName": "IDP Directory"}, "associatedItems": [{"id": "5be151ae92e2727e0939e20a", "name": "5be151ae92e2727e0939e20a", "typeName": "USER", "parentId": "10000", "parentName": "IDP Directory"}]}, {"id": 17810, "summary": "User added to group", "created": "2020-03-27T19:03:08.063+0000", "category": "group management", "eventSource": "", "objectItem": {"name": "site-admins", "typeName": "GROUP", "parentId": "10000", "parentName": "IDP Directory"}, "associatedItems": [{"id": "5be151ae92e2727e0939e20a", "name": "5be151ae92e2727e0939e20a", "typeName": "USER", "parentId": "10000", "parentName": "IDP Directory"}]}
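
Editor's note: the observation about the two kinds of "id" can be checked with a quick sketch. In the raw log the top-level id is a bare number while the nested ids are quoted strings, so a pattern that requires digits after "id": skips the nested ones. The record string is a trimmed, hypothetical stand-in, and the \d+ variant is an assumption that does not appear in this thread.

```python
import re

# Trimmed, hypothetical stand-in for one record of the raw log above.
record = (
    '{"id": 17811, "summary": "User added to group", '
    '"objectItem": {"name": "system-administrators"}, '
    '"associatedItems": [{"id": "5be151ae92e2727e0939e20a", "typeName": "USER"}]}'
)

any_id = re.findall(r'\{"id": ', record)          # top-level and nested ids
five_digit = re.findall(r'\{"id": \d{5}', record)  # numeric id, at least 5 digits
any_number = re.findall(r'\{"id": \d+', record)    # hypothetical looser variant

print(len(any_id), len(five_digit), len(any_number))
```

Note that \d{5} also matches the first five digits of a longer number, so only ids shorter than five digits would be missed; \d+ would cover those as well, if that ever matters for this source.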

to4kawa
Ultra Champion

I see. Please check my answer.

rmorrison6
Engager

Hey to4kawa,

Those configs work perfectly for breaking the file into the appropriate number of events. However, none of the fields are being extracted now. I can extract them manually, but I figured INDEXED_EXTRACTIONS=json should take care of that? Thanks for your help thus far.

-Rob

to4kawa
Ultra Champion
INDEXED_EXTRACTIONS=none
KV_MODE=json

Maybe the null events are causing this.
Use KV_MODE=json instead.
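
Editor's note: one more observation, offered as a guess rather than a confirmed cause. With the breaker from the accepted answer, events in the middle of the file keep the comma that separated them from the next record, so they are a JSON object followed by extra text. The Python sketch below only shows how a strict parser and a lenient one treat such a string; it says nothing definite about how Splunk's index-time or search-time JSON extraction behaves. The event string is a trimmed, hypothetical example.

```python
import json

# One event as the breaker would leave it: a JSON object plus a trailing comma.
event = '{"id": 17811, "summary": "User added to group", "category": "group management"},'

# Strict parsing rejects the string because of the extra data after the object.
try:
    json.loads(event)
    strict_ok = True
except json.JSONDecodeError:
    strict_ok = False

# raw_decode parses the leading JSON value and reports where it stopped.
obj, end = json.JSONDecoder().raw_decode(event)
leftover = event[end:]

print(strict_ok, obj["summary"], repr(leftover))
```

If the trailing separator turns out to matter in your environment, widening the capturing group so that it also swallows the comma is one thing to try; that variant is untested here.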

rmorrison6
Engager

Perfect! This config is working great now, individual events are extracted along with the appropriate fields. Thanks for your help to4kawa!

-Rob
