Help extracting JSON into multiple events

rmorrison6
Engager

I'm trying to break a JSON file into multiple events. I've read some other answers and tested configurations using the Add Data feature, tweaking the settings as I went. From what I've read, SHOULD_LINEMERGE should be set to false, and LINE_BREAKER should hold a value that identifies where to break the events. Based on the sample data below, I think it should break on the associatedItems field. I've tried applying different values in the Add Data wizard, but nothing seems to make a difference.
Any ideas?

This is the source type config I've attempted to use:

[_json]
SHOULD_LINEMERGE=false
LINE_BREAKER={"associatedItems"
NO_BINARY_CHECK=true
CHARSET=UTF-8
INDEXED_EXTRACTIONS=json
KV_MODE=none
TRUNCATE=1000000
category=Structured
description=JavaScript Object Notation format. For more information, visit http://json.org/
disabled=false
pulldown_type=true



{
  "limit": 1000,
  "offset": 0,
  "records": [
    {
      "associatedItems": [
        {
          "id": "557058:8bc118d1-552f-4613-9b34-c15add9aad17",
          "name": "removed58",
          "parentId": "10000",
          "parentName": "IDP Directory",
          "typeName": "USER"
        }
      ],
      "category": "user management",
      "changedValues": [
        {
          "changedFrom": "Active",
          "changedTo": "Inactive",
          "fieldName": "Active / Inactive"
        }
      ],
      "created": "2020-03-27T13:27:30.207+0000",
      "eventSource": "",
      "id": 17808,
      "objectItem": {
        "id": "557058:8bc118d1-552f-4613-9b34-c15add9aad17",
        "name": "removed58",
        "parentId": "10000",
        "parentName": "IDP Directory",
        "typeName": "USER"
      },
      "summary": "User updated"
    },
    {
      "associatedItems": [
        {
          "id": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:5b1d2af1-eb27-4451-8461-575ae3c4a9a5",
          "name": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:5b1d2af1-eb27-4451-8461-575ae3c4a9a5",
          "parentId": "10000",
          "parentName": "IDP Directory",
          "typeName": "USER"
        }
      ],
      "authorKey": "testuser",
      "category": "user management",
      "changedValues": [
        {
          "changedTo": "Active",
          "fieldName": "Active / Inactive"
        }
      ],
      "created": "2020-03-27T04:55:07.336+0000",
      "eventSource": "",
      "id": 17807,
      "objectItem": {
        "id": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:5b1d2af1-eb27-4451-8461-575ae3c4a9a5",
        "name": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:5b1d2af1-eb27-4451-8461-575ae3c4a9a5",
        "parentId": "10000",
        "parentName": "IDP Directory",
        "typeName": "USER"
      },
      "summary": "User created"
    },

to4kawa
Ultra Champion
|makeresults
| eval _raw="{\"offset\": 0, \"limit\": 1000, \"total\": 490, \"records\": [{\"id\": 17812, \"summary\": \"User created\", \"created\": \"2020-03-27T20:58:22.837+0000\", \"category\": \"user management\", \"eventSource\": \"\", \"objectItem\": {\"id\": \"qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480\", \"name\": \"qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480\", \"typeName\": \"USER\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}, \"changedValues\": [{\"fieldName\": \"Active / Inactive\", \"changedTo\": \"Active\"}], \"associatedItems\": [{\"id\": \"qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480\", \"name\": \"qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480\", \"typeName\": \"USER\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}]}, {\"id\": 17811, \"summary\": \"User added to group\", \"created\": \"2020-03-27T19:03:08.067+0000\", \"category\": \"group management\", \"eventSource\": \"\", \"objectItem\": {\"name\": \"system-administrators\", \"typeName\": \"GROUP\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}, \"associatedItems\": [{\"id\": \"5be151ae92e2727e0939e20a\", \"name\": \"5be151ae92e2727e0939e20a\", \"typeName\": \"USER\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}]}, {\"id\": 17810, \"summary\": \"User added to group\", \"created\": \"2020-03-27T19:03:08.063+0000\", \"category\": \"group management\", \"eventSource\": \"\", \"objectItem\": {\"name\": \"site-admins\", \"typeName\": \"GROUP\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}, \"associatedItems\": [{\"id\": \"5be151ae92e2727e0939e20a\", \"name\": \"5be151ae92e2727e0939e20a\", \"typeName\": \"USER\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}]}"
| rex mode=sed "s/({\"id\": \d{5})/#\1/g"
| makemv delim="#" _raw
| stats count by _raw
| spath

Hi. To get the same result as the search above at index time, use these settings:

props.conf

# create your own sourcetype
[associated_json]
SHOULD_LINEMERGE=false
LINE_BREAKER=(.){\"id\": \d{5}
NO_BINARY_CHECK=true
CHARSET=UTF-8
INDEXED_EXTRACTIONS=json
KV_MODE=none
TRUNCATE=0
category=Structured
description=JSON
disabled=false
pulldown_type=true
TRANSFORMS-header = null

transforms.conf

[null]
REGEX = offset
DEST_KEY = queue
FORMAT = nullQueue

The unnecessary header (the leading fragment that contains "offset") needs to be removed, which is what the nullQueue transform does.
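
To see why this LINE_BREAKER splits the single-line log, here is a rough Python emulation of the rule documented for LINE_BREAKER: the text matched by the first capturing group is discarded, whatever precedes it ends one event, and whatever follows it starts the next. The function name split_events and the shortened sample string are illustrative assumptions. This is only a sketch of the behavior, not how Splunk implements it.

```python
import re

# Same pattern as the props.conf above, written for Python's re module.
LINE_BREAKER = re.compile(r'(.)\{"id": \d{5}')

def split_events(raw, breaker=LINE_BREAKER):
    """Split raw text on the breaker, dropping the text of capture group 1."""
    events, start = [], 0
    for match in breaker.finditer(raw):
        events.append(raw[start:match.start(1)])  # text before the group ends an event
        start = match.end(1)                      # text after the group starts the next
    events.append(raw[start:])
    return events

# Shortened, made-up sample in the same shape as the raw log in this thread.
raw = ('{"offset": 0, "limit": 1000, "total": 490, "records": ['
       '{"id": 17812, "summary": "User created", "objectItem": {"id": "qm:1"}}, '
       '{"id": 17811, "summary": "User added to group", "objectItem": {"id": "5be1"}}]}')

events = split_events(raw)

# Emulates the [null] transform: drop any event whose text matches "offset".
kept = [e for e in events if not re.search(r"offset", e)]

for event in kept:
    print(event)
```

The nested "id" fields hold quoted strings, so \d{5} never matches them and only the numeric audit ids start a new event. Note that the kept fragments still carry a trailing comma or a closing ]}, because the breaker removes only the single character in the capturing group.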

rmorrison6
Engager

Hi to4kawa,

I tried to ingest the data using the provided props.conf, but only one event is still being indexed. I ran the makeresults search, and it does appear to break the data into multiple events. Any idea why the data isn't being indexed the same way? Are there logs that might identify the issue?

Thanks for your help thus far.

-Rob

to4kawa
Ultra Champion

Your pasted sample contains line breaks ([\r\n]+), but the actual log may not have them.
Please check the log and post the _raw value.
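
A small Python sketch of the point above: a breaker that relies on line endings can match a pretty-printed sample and still never match the same data stored on a single line. The pattern NEWLINE_BREAKER and both sample strings are hypothetical and are not taken from any config in this thread.

```python
import re

# Hypothetical breaker that depends on line endings (an assumption for illustration).
NEWLINE_BREAKER = re.compile(r'([\r\n]+)\s*\{\s*"associatedItems"')

# The same made-up data, once pretty-printed and once on a single line.
pretty = ('{\n  "records": [\n    {\n      "associatedItems": []\n    },\n'
          '    {\n      "associatedItems": []\n    }\n  ]\n}')
single_line = '{"records": [{"associatedItems": []}, {"associatedItems": []}]}'

def count_breaks(raw):
    """Count how many times the newline-based breaker would match."""
    return len(NEWLINE_BREAKER.findall(raw))

def has_line_breaks(raw):
    """Report whether the raw text contains any CR or LF at all."""
    return re.search(r"[\r\n]", raw) is not None

print(count_breaks(pretty), has_line_breaks(pretty))
print(count_breaks(single_line), has_line_breaks(single_line))
```

If the real _raw has no line breaks, the breaker has to key on something inside the line instead, such as the record's own opening text.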

rmorrison6
Engager

to4kawa,

I believe you are correct. I had previously copied and pasted from a text editor that was pretty-printing the JSON. The raw log is below.

Also, after looking at the events further, I believe they need to be broken at the top-level "id" field, the one followed by a 5-digit number. "id" also shows up as a nested field several times, but the 5-digit number appears to be the audit change number.

{"offset": 0, "limit": 1000, "total": 490, "records": [{"id": 17812, "summary": "User created", "created": "2020-03-27T20:58:22.837+0000", "category": "user management", "eventSource": "", "objectItem": {"id": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480", "name": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480", "typeName": "USER", "parentId": "10000", "parentName": "IDP Directory"}, "changedValues": [{"fieldName": "Active / Inactive", "changedTo": "Active"}], "associatedItems": [{"id": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480", "name": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480", "typeName": "USER", "parentId": "10000", "parentName": "IDP Directory"}]}, {"id": 17811, "summary": "User added to group", "created": "2020-03-27T19:03:08.067+0000", "category": "group management", "eventSource": "", "objectItem": {"name": "system-administrators", "typeName": "GROUP", "parentId": "10000", "parentName": "IDP Directory"}, "associatedItems": [{"id": "5be151ae92e2727e0939e20a", "name": "5be151ae92e2727e0939e20a", "typeName": "USER", "parentId": "10000", "parentName": "IDP Directory"}]}, {"id": 17810, "summary": "User added to group", "created": "2020-03-27T19:03:08.063+0000", "category": "group management", "eventSource": "", "objectItem": {"name": "site-admins", "typeName": "GROUP", "parentId": "10000", "parentName": "IDP Directory"}, "associatedItems": [{"id": "5be151ae92e2727e0939e20a", "name": "5be151ae92e2727e0939e20a", "typeName": "USER", "parentId": "10000", "parentName": "IDP Directory"}]}

to4kawa
Ultra Champion

I see, check my answer.

rmorrison6
Engager

Hey to4kawa,

Those configs are working perfectly for breaking the file into the appropriate number of events. However, none of the fields are being extracted now. I can extract them manually, but I figured "INDEXED_EXTRACTIONS=json" should take care of that? Thanks for your help thus far.

-Rob

to4kawa
Ultra Champion
INDEXED_EXTRACTIONS=none
KV_MODE=json

Maybe the events sent to the null queue cause this.
Use KV_MODE=json (search-time extraction) instead.
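
The thread does not establish why INDEXED_EXTRACTIONS=json stopped extracting fields. One possible factor, offered here as an assumption rather than a confirmed cause, is that each event left by the breaker is not a complete JSON document: it carries a trailing comma or a trailing ]}. The Python sketch below only illustrates that property of the fragments and says nothing about Splunk's internal parsers. The fragment values and the helper names strict_ok and leading_object are made up for illustration.

```python
import json

# Fragments shaped like the events left after the split (shortened, made-up values).
fragments = [
    '{"id": 17812, "summary": "User created"},',
    '{"id": 17811, "summary": "User added to group"}]}',
]

def strict_ok(text):
    """Return True only if the whole string is one valid JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

def leading_object(text):
    """Parse the first JSON value in the string and ignore whatever trails it."""
    obj, _end = json.JSONDecoder().raw_decode(text)
    return obj

strict_results = [strict_ok(f) for f in fragments]
parsed = [leading_object(f) for f in fragments]

print(strict_results)
print([p["id"] for p in parsed])
```

A strict parser rejects both fragments, while a parser that stops after the first complete object recovers the fields. That is consistent with the result reported in this thread, though it is not proof of the cause.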

rmorrison6
Engager

Perfect! This config is working great now, individual events are extracted along with the appropriate fields. Thanks for your help to4kawa!

-Rob
