I'm trying to break a JSON file into multiple events. I've read some other answers and tested configurations using the Add Data feature, tweaking the settings. From what I've read, SHOULD_LINEMERGE should be set to false and LINE_BREAKER should identify where to break the events. Based on the sample data below, I think it should break on the associatedItems field. I've tried applying different values in the Add Data wizard, but it doesn't seem to make a difference.
Any ideas?
This is the source type config I've attempted to use:
[_json]
SHOULD_LINEMERGE=false
LINE_BREAKER={"associatedItems"
NO_BINARY_CHECK=true
CHARSET=UTF-8
INDEXED_EXTRACTIONS=json
KV_MODE=none
TRUNCATE=1000000
category=Structured
description=JavaScript Object Notation format. For more information, visit http://json.org/
disabled=false
pulldown_type=true
And this is the sample data:
{
"limit": 1000,
"offset": 0,
"records": [
{
"associatedItems": [
{
"id": "557058:8bc118d1-552f-4613-9b34-c15add9aad17",
"name": "removed58",
"parentId": "10000",
"parentName": "IDP Directory",
"typeName": "USER"
}
],
"category": "user management",
"changedValues": [
{
"changedFrom": "Active",
"changedTo": "Inactive",
"fieldName": "Active / Inactive"
}
],
"created": "2020-03-27T13:27:30.207+0000",
"eventSource": "",
"id": 17808,
"objectItem": {
"id": "557058:8bc118d1-552f-4613-9b34-c15add9aad17",
"name": "removed58",
"parentId": "10000",
"parentName": "IDP Directory",
"typeName": "USER"
},
"summary": "User updated"
},
{
"associatedItems": [
{
"id": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:5b1d2af1-eb27-4451-8461-575ae3c4a9a5",
"name": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:5b1d2af1-eb27-4451-8461-575ae3c4a9a5",
"parentId": "10000",
"parentName": "IDP Directory",
"typeName": "USER"
}
],
"authorKey": "testuser",
"category": "user management",
"changedValues": [
{
"changedTo": "Active",
"fieldName": "Active / Inactive"
}
],
"created": "2020-03-27T04:55:07.336+0000",
"eventSource": "",
"id": 17807,
"objectItem": {
"id": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:5b1d2af1-eb27-4451-8461-575ae3c4a9a5",
"name": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:5b1d2af1-eb27-4451-8461-575ae3c4a9a5",
"parentId": "10000",
"parentName": "IDP Directory",
"typeName": "USER"
},
"summary": "User created"
},
|makeresults
| eval _raw="{\"offset\": 0, \"limit\": 1000, \"total\": 490, \"records\": [{\"id\": 17812, \"summary\": \"User created\", \"created\": \"2020-03-27T20:58:22.837+0000\", \"category\": \"user management\", \"eventSource\": \"\", \"objectItem\": {\"id\": \"qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480\", \"name\": \"qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480\", \"typeName\": \"USER\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}, \"changedValues\": [{\"fieldName\": \"Active / Inactive\", \"changedTo\": \"Active\"}], \"associatedItems\": [{\"id\": \"qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480\", \"name\": \"qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480\", \"typeName\": \"USER\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}]}, {\"id\": 17811, \"summary\": \"User added to group\", \"created\": \"2020-03-27T19:03:08.067+0000\", \"category\": \"group management\", \"eventSource\": \"\", \"objectItem\": {\"name\": \"system-administrators\", \"typeName\": \"GROUP\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}, \"associatedItems\": [{\"id\": \"5be151ae92e2727e0939e20a\", \"name\": \"5be151ae92e2727e0939e20a\", \"typeName\": \"USER\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}]}, {\"id\": 17810, \"summary\": \"User added to group\", \"created\": \"2020-03-27T19:03:08.063+0000\", \"category\": \"group management\", \"eventSource\": \"\", \"objectItem\": {\"name\": \"site-admins\", \"typeName\": \"GROUP\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}, \"associatedItems\": [{\"id\": \"5be151ae92e2727e0939e20a\", \"name\": \"5be151ae92e2727e0939e20a\", \"typeName\": \"USER\", \"parentId\": \"10000\", \"parentName\": \"IDP Directory\"}]}"
| rex mode=sed "s/({\"id\": \d{5})/#\1/g"
| makemv delim="#" _raw
| stats count by _raw
| spath
Hi, to get these results at index time, use the following config:
props.conf
# create a custom sourcetype
[associated_json]
SHOULD_LINEMERGE=false
LINE_BREAKER=(.){\"id\": \d{5}
NO_BINARY_CHECK=true
CHARSET=UTF-8
INDEXED_EXTRACTIONS=json
KV_MODE=none
TRUNCATE=0
category=Structured
description=JSON
disabled=false
pulldown_type=true
TRANSFORMS-header = null
transforms.conf
[null]
REGEX = offset
DEST_KEY = queue
FORMAT = nullQueue
The transform removes the unnecessary header: the leading event that contains offset is sent to the nullQueue.
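To see why this LINE_BREAKER gives one event per record, here is a rough Python sketch of the documented behaviour: an event ends where the regex matches, and the text matched by the first capture group is discarded. The `break_events` helper and the shortened sample are hypothetical, and this only approximates what the indexer does. The nullQueue step is imitated by dropping any event that contains `offset`.

```python
import re

# Same pattern as the LINE_BREAKER above (quotes need no escaping in Python).
LINE_BREAKER = re.compile(r'(.)\{"id": \d{5}')

def break_events(raw):
    """Split raw at each match, discarding the text of capture group 1."""
    events, start = [], 0
    for m in LINE_BREAKER.finditer(raw):
        events.append(raw[start:m.start(1)])  # event ends before the group
        start = m.end(1)                      # next event starts after it
    events.append(raw[start:])
    return events

# Shortened, hypothetical version of the raw log from this thread.
raw = ('{"offset": 0, "limit": 1000, "total": 490, "records": ['
       '{"id": 17812, "summary": "User created", '
       '"objectItem": {"id": "qm:6d90", "typeName": "USER"}}, '
       '{"id": 17811, "summary": "User added to group", '
       '"associatedItems": [{"id": "5be151ae", "typeName": "USER"}]}')

events = break_events(raw)
# Imitate the [null] transform: drop any event whose text matches "offset".
kept = [e for e in events if not re.search(r'offset', e)]

for e in kept:
    print(e)
```

In this sketch the nested `"id"` fields are left alone because their values are quoted strings, and every kept event except the last still ends with a trailing comma.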
Hi to4kawa,
I attempted to ingest the data using the provided props.conf, but I'm still seeing only one event being indexed. I ran the makeresults search and it does break the data into multiple events. Any idea why the data isn't being indexed the same way? Are there logs that might identify the issue?
Thanks for your help thus far
-Rob
Your sample contains line breaks ([\r\n]+), but the actual log may not.
Please check the log and post the _raw.
to4kawa,
I believe you are correct. I had previously copied and pasted from a text editor that was pretty-printing the JSON. The raw log is below.
Also, after looking at the events further, I believe they need to be broken at the top-level "id" field, which is followed by a 5-digit number. "id" also shows up multiple times as a nested field, but the 5-digit number appears to be the audit change number.
{"offset": 0, "limit": 1000, "total": 490, "records": [{"id": 17812, "summary": "User created", "created": "2020-03-27T20:58:22.837+0000", "category": "user management", "eventSource": "", "objectItem": {"id": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480", "name": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480", "typeName": "USER", "parentId": "10000", "parentName": "IDP Directory"}, "changedValues": [{"fieldName": "Active / Inactive", "changedTo": "Active"}], "associatedItems": [{"id": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480", "name": "qm:6d90bc78-fd3b-4401-8720-31989454f2b7:d583e9f8-94b1-4ad1-aa5a-67c35ecfd480", "typeName": "USER", "parentId": "10000", "parentName": "IDP Directory"}]}, {"id": 17811, "summary": "User added to group", "created": "2020-03-27T19:03:08.067+0000", "category": "group management", "eventSource": "", "objectItem": {"name": "system-administrators", "typeName": "GROUP", "parentId": "10000", "parentName": "IDP Directory"}, "associatedItems": [{"id": "5be151ae92e2727e0939e20a", "name": "5be151ae92e2727e0939e20a", "typeName": "USER", "parentId": "10000", "parentName": "IDP Directory"}]}, {"id": 17810, "summary": "User added to group", "created": "2020-03-27T19:03:08.063+0000", "category": "group management", "eventSource": "", "objectItem": {"name": "site-admins", "typeName": "GROUP", "parentId": "10000", "parentName": "IDP Directory"}, "associatedItems": [{"id": "5be151ae92e2727e0939e20a", "name": "5be151ae92e2727e0939e20a", "typeName": "USER", "parentId": "10000", "parentName": "IDP Directory"}]}
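A quick way to sanity-check that observation is to count how often each form of `id` occurs. The sketch below is a hypothetical Python check on a shortened sample, not the full log. It assumes the top-level audit ids are bare integers and the nested ids are quoted strings, as in the raw event above.

```python
import re

# Shortened, hypothetical sample with the same shape as the raw log above.
raw = ('{"offset": 0, "records": ['
       '{"id": 17812, "objectItem": {"id": "qm:6d90"}, '
       '"associatedItems": [{"id": "qm:6d90"}]}, '
       '{"id": 17811, "objectItem": {"name": "site-admins"}, '
       '"associatedItems": [{"id": "5be151ae"}]}]}')

all_ids = re.findall(r'"id": ', raw)              # every "id" key
record_ids = re.findall(r'\{"id": (\d{5})', raw)  # unquoted 5-digit ids only

print(len(all_ids), record_ids)

# Caveat: \d{5} is tied to the current id width. A 4-digit id is missed,
# so no break would be placed in front of that record.
missed = re.findall(r'\{"id": (\d{5})', '{"id": 9999, "summary": "x"}')
# \d+ followed by a comma matches numeric ids of any width.
flexible = re.findall(r'\{"id": (\d+),', '{"id": 9999, "summary": "x"}')
```

Only the two record-level ids match the 5-digit pattern, which is what makes it usable as a break point. Whether a width-independent pattern such as `\d+,` is worth using depends on how the audit ids are expected to grow.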
I see, check my answer.
Hey to4kawa,
Those configs are working perfectly for extracting the file into the appropriate number of events. However, none of the fields are being extracted now. I can manually extract them but figured the "INDEXED_EXTRACTIONS=json" should take care of that? Thanks for your help thus far
-Rob
INDEXED_EXTRACTIONS=none
KV_MODE=json
Maybe the events sent to the nullQueue are causing this.
Use KV_MODE=json instead.
Perfect! This config is working great now, individual events are extracted along with the appropriate fields. Thanks for your help to4kawa!
-Rob