Deployment Architecture

How to split this JSON into multiple events?

blzaxe
New Member

Hello

I'm trying to split a JSON file from the Facebook Graph API into multiple events via props.conf.

Here is the JSON sample:

{
  "about": "http://www.appledaily.com.tw",
  "posts": {
    "data": [
      {
        "message": "first post message",
        "created_time": "2016-11-01T11:20:01+0000",
        "id": "232633627068_10155237456442069",
        "likes": {
          "data": [
            {
              "id": "125823837756509",
              "name": "XXX"
            },
            {
              "id": "125547431150532",
              "name": "OOO"
            }
          ],
          "paging": {
            "cursors": {
              "before": "MTI1ODIzODM3NzU2NTA5",
              "after": "Nzk0NDQzNDAzOTEyNjc3"
            }
          }
        }
      },
      {
        "message": "other messages",
        "created_time": "2016-11-01T11:10:00+0000",
        "id": "232633627068_10155237171047069",
        "likes": {
          "data": [
            {
              "id": "434788333331603",
              "name": "AA"
            },
            {
              "id": "1485443865001594",
              "name": "BB"
            }
          ],
          "paging": {
            "cursors": {
              "before": "NDM0Nzg4MzMzMzMxNjAz",
              "after": "NjA4NDc4NTY5MjU5ODQ1"
            }
          }
        }
      }
    ],
    "paging": {
      "previous": "https://graph.facebook.com/v2.8/232633627068/posts?limit=10&fields=likes.limit%2810000%29,message,cr...",
      "next": "https://graph.facebook.com/v2.8/232633627068/posts?limit=10&fields=likes.limit%2810000%29,message,cr..."
    }
  },
  "id": "232633627068"
}

This is my props.conf setting:

[_json]
INDEXED_EXTRACTIONS = json
KV_MODE = JSON
DATETIME_CONFIG = CURRENT
NO_BINARY_CHECK = true
BREAK_ONLY_BEFORE = ^{
TIMESTAMP_FIELDS = created_time
TIME_FORMAT = %FT%T%z
TRUNCATE = 100000000
pulldown_type = true
disabled = false
TZ = UTC

What should props.conf look like to split such a file into multiple events?
Or should I index the whole file as one event and then use spath to split it at search time?
Thank you for your suggestions.


bmacias84
Champion

Hello @blzaxe,

The best way would be to preprocess with a modular input or some kind of script. If that's not an option, you are going to need to use index-time transforms with some additional props settings. I am guessing the data you want to split into multiple events is everything contained within:

{
"about": "http://www.appledaily.com.tw",
"posts": {
"data": [
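If preprocessing is an option, the split itself is trivial outside Splunk. Here is a minimal sketch in Python (not full modular-input boilerplate; the script name, file name, and the `page_id` field it adds are made up for illustration):

```python
import json
import sys

def split_posts(graph_response):
    """Yield one event per post from a Graph API page response."""
    page_id = graph_response.get("id")
    for post in graph_response.get("posts", {}).get("data", []):
        post["page_id"] = page_id  # carry the page context onto each event
        yield post

if __name__ == "__main__" and len(sys.argv) > 1:
    # e.g. as a scripted input: python split_posts.py graph.json
    with open(sys.argv[1]) as f:
        for event in split_posts(json.load(f)):
            # emit one single-line JSON object per event
            print(json.dumps(event, separators=(",", ":")))
```

Each emitted line is then a self-contained JSON object, so your existing BREAK_ONLY_BEFORE = ^{ would break cleanly with no comma cleanup needed.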

I am also assuming it is a single-line event rather than pretty-printed. I will let you figure that out, but for this example I am going to assume your event is a single line like this: {"about": "http://www.appledaily.com.tw","posts": {"data": [

Step one: create transforms to strip out the outer JSON body.

[removeOuterBody1]
# regex captures the outer envelope/message container; the lazy +? stops
# at the first "data": [ (the posts array, not the likes arrays)
REGEX = ^({[^\n]+?\"data\":\s\[)([^\n]+)
FORMAT = $2
DEST_KEY = _raw

[removeOuterBody2]
# regex captures the closing envelope/message container (the posts-level
# paging block and the outer ids at the end of the event)
REGEX = ^([^\n]+)(\],\s?\"paging\":[^\n]+)$
FORMAT = $1
DEST_KEY = _raw
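You can sanity-check those two patterns outside Splunk with Python's re module. This is a sketch against an abridged single-line version of the sample; the exact patterns (the lazy +? in the first, the tail anchored on the posts-level ],"paging":) are my adjustments to fit this sample and may need tuning for your real events:

```python
import json
import re

# Abridged single-line version of the sample event (same structure).
raw = ('{"about": "http://www.appledaily.com.tw","posts": {"data": ['
       '{"message": "first post message","likes": {"data": [{"id": "1"}],'
       '"paging": {"cursors": {"before": "a","after": "b"}}}},'
       '{"message": "other messages","likes": {"data": [{"id": "2"}],'
       '"paging": {"cursors": {"before": "c","after": "d"}}}}'
       '],"paging": {"previous": "p","next": "n"}},"id": "232633627068"}')

# removeOuterBody1: lazy +? stops at the FIRST "data": [ -- the posts array.
raw = re.match(r'^({[^\n]+?"data":\s\[)([^\n]+)', raw).group(2)

# removeOuterBody2: greedy group 1 pushes group 2 to the LAST ],"paging":
# -- the posts-level paging, not the per-post likes paging.
raw = re.match(r'^([^\n]+)(\],\s?"paging":[^\n]+)$', raw).group(1)

# What remains is {post1},{post2}: valid JSON once wrapped in brackets.
posts = json.loads("[" + raw + "]")
```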

Now you need to apply these in your props.conf.

[CustomSourcetype]
TRANSFORMS-cleanMsg = removeOuterBody1, removeOuterBody2
DATETIME_CONFIG = CURRENT
NO_BINARY_CHECK = true
BREAK_ONLY_BEFORE =  ,\{"message":
TIMESTAMP_FIELDS = created_time
TIME_FORMAT = %FT%T%z
TRUNCATE = 100000000
pulldown_type = true 
disabled = false
TZ = UTC

The unfortunate problem is that each broken event will still begin with a comma, which makes it invalid JSON on its own. You could clean this up by doing all this pre-parsing on a heavy forwarder (HF) and then using another transform on the indexers to strip the comma at the beginning of each event.
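That last cleanup only needs to delete a single leading comma from _raw; in props.conf a sed-style rule such as SEDCMD-stripcomma = s/^,// should do it (the class name "stripcomma" is arbitrary). The effect, sketched in Python:

```python
import json
import re

# After breaking on ,{"message": the second event keeps its leading comma:
broken_event = ',{"message": "other messages","id": "232633627068_10155237171047069"}'

# Equivalent of SEDCMD s/^,// : drop one leading comma.
clean_event = re.sub(r'^,', '', broken_event)

parsed = json.loads(clean_event)  # now valid JSON on its own
```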


blzaxe
New Member

Excuse me! I put transforms.conf in \etc\apps\app_names\local.
Why doesn't it work?
