Break JSON Twitter events

mbschriek
Explorer

I would like to know how to break the following JSON array into separate events, one per tweet:

[{
    "created_at": "Tue Aug 09 16:00:34 +0000 2016",
    "id": xx,
    "id_str": "xx",
    "text": "xx",
    "truncated": false,
    "entities": {
        "hashtags": [{
            "text": "xx",
            "indices": [
                60,
                72
            ]
        }, {
            "text": "happiness",
            "indices": [
                74,
                84
            ]
        }],
        "symbols": [],
        "user_mentions": [],
        "urls": [],
        "media": [{
            "id": xx,
            "id_str": "xx",
            "indices": [
                85,
                108
            ],
            "media_url": "http://pbs.twimg.com/media/xx.jpg",
            "media_url_https": "https://pbs.twimg.com/media/xx.jpg",
            "url": "https://t.co/xx",
            "display_url": "pic.twitter.com/xx",
            "expanded_url": "http://twitter.com/xx/status/xx/photo/1",
            "type": "photo",
            "sizes": {
                "small": {
                    "w": 680,
                    "h": 510,
                    "resize": "fit"
                },
                "thumb": {
                    "w": 150,
                    "h": 150,
                    "resize": "crop"
                },
                "large": {
                    "w": 2048,
                    "h": 1536,
                    "resize": "fit"
                },
                "medium": {
                    "w": 1200,
                    "h": 900,
                    "resize": "fit"
                }
            }
        }]
    },
    "extended_entities": {
        "media": [{
            "id": xx,
            "id_str": "xx",
            "indices": [
                85,
                108
            ],
            "media_url": "http://pbs.twimg.com/media/xx.jpg",
            "media_url_https": "https://pbs.twimg.com/media/xx.jpg",
            "url": "https://t.co/xx",
            "display_url": "pic.twitter.com/xx",
            "expanded_url": "http://twitter.com/xx/status/xx/photo/1",
            "type": "photo",
            "sizes": {
                "small": {
                    "w": 680,
                    "h": 510,
                    "resize": "fit"
                },
                "thumb": {
                    "w": 150,
                    "h": 150,
                    "resize": "crop"
                },
                "large": {
                    "w": 2048,
                    "h": 1536,
                    "resize": "fit"
                },
                "medium": {
                    "w": 1200,
                    "h": 900,
                    "resize": "fit"
                }
            }
        }]
    },
    "source": "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>",
    "in_reply_to_status_id": null,
    "in_reply_to_status_id_str": null,
    "in_reply_to_user_id": null,
    "in_reply_to_user_id_str": null,
    "in_reply_to_screen_name": null,
    "user": {
        "id": xx,
        "id_str": "xx",
        "name": "xx xx",
        "screen_name": "xx",
        "location": "xx",
        "description": "xx",
        "url": "https://t.co/xx",
        "entities": {
            "url": {
                "urls": [{
                    "url": "https://t.co/xx",
                    "expanded_url": "http://xx",
                    "display_url": "xx",
                    "indices": [
                        0,
                        23
                    ]
                }]
            },
            "description": {
                "urls": []
            }
        },
        "protected": false,
        "followers_count": 1076,
        "friends_count": 832,
        "listed_count": 36,
        "created_at": "Tue Feb 04 17:49:41 +0000 2014",
        "favourites_count": 162,
        "utc_offset": 7200,
        "time_zone": "xx",
        "geo_enabled": true,
        "verified": false,
        "statuses_count": 637,
        "lang": "nl",
        "contributors_enabled": false,
        "is_translator": false,
        "is_translation_enabled": false,
        "profile_background_color": "89C9FA",
        "profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png",
        "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png",
        "profile_background_tile": false,
        "profile_image_url": "http://pbs.twimg.com/profile_images/xx/xx.png",
        "profile_image_url_https": "https://pbs.twimg.com/profile_images/xx/xx.png",
        "profile_banner_url": "https://pbs.twimg.com/profile_banners/xx/xx",
        "profile_link_color": "1199FF",
        "profile_sidebar_border_color": "000000",
        "profile_sidebar_fill_color": "DDEEF6",
        "profile_text_color": "333333",
        "profile_use_background_image": true,
        "has_extended_profile": false,
        "default_profile": false,
        "default_profile_image": false,
        "following": true,
        "follow_request_sent": false,
        "notifications": false
    },
    "geo": null,
    "coordinates": null,
    "place": null,
    "contributors": null,
    "is_quote_status": false,
    "retweet_count": 9,
    "favorite_count": 21,
    "favorited": false,
    "retweeted": false,
    "possibly_sensitive": false,
    "possibly_sensitive_appealable": false,
    "lang": "en"
}, {
    "created_at": "Tue Aug 09 15:28:16 +0000 2016",
    "id": xx,
    "id_str": "xx",
    "text": "xx",
    "truncated": false,
    "entities": {
        "hashtags": [{
            "text": "xx",
            "indices": [
                103,
                115
            ]
        }],
        "symbols": [],
        "user_mentions": [{
            "screen_name": "xx",
            "name": "xx / xx",
            "id": xx,
            "id_str": "xx",
            "indices": [
                7,
                21
            ]
        }],
        "urls": [],
        "media": [{
            "id": xx,
            "id_str": "xx",
            "indices": [
                116,
                139
            ],
            "media_url": "http://pbs.twimg.com/media/xx.jpg",
            "media_url_https": "https://pbs.twimg.com/media/xx.jpg",
            "url": "https://t.co/xx",
            "display_url": "pic.twitter.com/xx",
            "expanded_url": "http://twitter.com/xx/status/xx/photo/1",
            "type": "photo",
            "sizes": {
                "small": {
                    "w": 680,
                    "h": 510,
                    "resize": "fit"
                },
                "large": {
                    "w": 2048,
                    "h": 1536,
                    "resize": "fit"
                },
                "thumb": {
                    "w": 150,
                    "h": 150,
                    "resize": "crop"
                },
                "medium": {
                    "w": 1200,
                    "h": 900,
                    "resize": "fit"
                }
            }
        }]
    },
    "extended_entities": {
        "media": [{
            "id": xx,
            "id_str": "xx",
            "indices": [
                116,
                139
            ],
            "media_url": "http://pbs.twimg.com/media/xx.jpg",
            "media_url_https": "https://pbs.twimg.com/media/xx.jpg",
            "url": "https://t.co/xx",
            "display_url": "pic.twitter.com/xx",
            "expanded_url": "http://twitter.com/xx/status/xx/photo/1",
            "type": "photo",
            "sizes": {
                "small": {
                    "w": 680,
                    "h": 510,
                    "resize": "fit"
                },
                "large": {
                    "w": 2048,
                    "h": 1536,
                    "resize": "fit"
                },
                "thumb": {
                    "w": 150,
                    "h": 150,
                    "resize": "crop"
                },
                "medium": {
                    "w": 1200,
                    "h": 900,
                    "resize": "fit"
                }
            }
        }, {
            "id": xx,
            "id_str": "xx",
            "indices": [
                116,
                139
            ],
            "media_url": "http://pbs.twimg.com/media/xx.jpg",
            "media_url_https": "https://pbs.twimg.com/media/xx.jpg",
            "url": "https://t.co/xx",
            "display_url": "pic.twitter.com/xx",
            "expanded_url": "http://twitter.com/xx/status/xx/photo/1",
            "type": "photo",
            "sizes": {
                "small": {
                    "w": 680,
                    "h": 510,
                    "resize": "fit"
                },
                "medium": {
                    "w": 1200,
                    "h": 900,
                    "resize": "fit"
                },
                "thumb": {
                    "w": 150,
                    "h": 150,
                    "resize": "crop"
                },
                "large": {
                    "w": 2048,
                    "h": 1536,
                    "resize": "fit"
                }
            }
        }]
    },
    "source": "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>",
    "in_reply_to_status_id": null,
    "in_reply_to_status_id_str": null,
    "in_reply_to_user_id": null,
    "in_reply_to_user_id_str": null,
    "in_reply_to_screen_name": null,
    "user": {
        "id": xx,
        "id_str": "xx",
        "name": "xx xx",
        "screen_name": "xx",
        "location": "xx",
        "description": "xx",
        "url": "https://t.co/xx",
        "entities": {
            "url": {
                "urls": [{
                    "url": "https://t.xx/xx",
                    "expanded_url": "http://www.xx-xx.com",
                    "display_url": "xx-xx.com",
                    "indices": [
                        0,
                        23
                    ]
                }]
            },
            "description": {
                "urls": []
            }
        },
        "protected": false,
        "followers_count": 1076,
        "friends_count": 832,
        "listed_count": 36,
        "created_at": "Tue Feb 04 17:49:41 +0000 2014",
        "favourites_count": 162,
        "utc_offset": 7200,
        "time_zone": "Amsterdam",
        "geo_enabled": true,
        "verified": false,
        "statuses_count": 637,
        "lang": "nl",
        "contributors_enabled": false,
        "is_translator": false,
        "is_translation_enabled": false,
        "profile_background_color": "89C9FA",
        "profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png",
        "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png",
        "profile_background_tile": false,
        "profile_image_url": "http://pbs.twimg.com/profile_images/xx/xx.png",
        "profile_image_url_https": "https://pbs.twimg.com/profile_images/xx/xx.png",
        "profile_banner_url": "https://pbs.twimg.com/profile_banners/2327493595/1469130376",
        "profile_link_color": "1199FF",
        "profile_sidebar_border_color": "000000",
        "profile_sidebar_fill_color": "DDEEF6",
        "profile_text_color": "333333",
        "profile_use_background_image": true,
        "has_extended_profile": false,
        "default_profile": false,
        "default_profile_image": false,
        "following": true,
        "follow_request_sent": false,
        "notifications": false
    },
    "geo": null,
    "coordinates": null,
    "place": null,
    "contributors": null,
    "is_quote_status": false,
    "retweet_count": 5,
    "favorite_count": 4,
    "favorited": false,
    "retweeted": false,
    "possibly_sensitive": false,
    "possibly_sensitive_appealable": false,
    "lang": "en"
}]

I looked at an answer to a similar question and developed a props.conf of my own, but it is not working:

[example]
BREAK_ONLY_BEFORE_DATE = false
BREAK_ONLY_BEFORE = (\{\s+"created_at")
MUST_BREAK_AFTER = ("en"\s+\})
SEDCMD-remove_header = s/\[{\s+\"created_at"/{"created_at"/g
SEDCMD-remove_trailing_commas = s/\}, \{\s+\"created_at"/}{"created_at"/g
SEDCMD-remove_footer = s/"en"\s+\}]/"en"}/g
TIME_PREFIX = \"created_at\":\s+\"
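To illustrate what the TIME_PREFIX line is meant to achieve, here is a minimal Python sketch that finds the first "created_at" key and parses the timestamp that follows it. The format string and the helper name extract_timestamp are assumptions made for illustration; the config above does not set TIME_FORMAT, and Splunk's own timestamp recognition may behave differently.

```python
import re
from datetime import datetime, timezone

# Same regex as the TIME_PREFIX setting in the question.
TIME_PREFIX = re.compile(r'\"created_at\":\s+\"')

# Assumed strptime-style format for Twitter's created_at values
# (hypothetical: the question's config does not define TIME_FORMAT).
TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def extract_timestamp(event):
    """Return the datetime found right after the first TIME_PREFIX match."""
    match = TIME_PREFIX.search(event)
    if match is None:
        return None
    # The timestamp runs from the end of the prefix to the next double quote.
    stamp = event[match.end():].split('"', 1)[0]
    return datetime.strptime(stamp, TIME_FORMAT)


# Trimmed-down event built from the first tweet in the sample data.
event = '{\n    "created_at": "Tue Aug 09 16:00:34 +0000 2016",\n    "lang": "en"\n}'
parsed = extract_timestamp(event)
print(parsed.isoformat())
```

Because the search stops at the first match, the top-level "created_at" is used rather than the one nested inside "user", as long as the top-level key comes first in the event.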

jkat54
SplunkTrust

This worked with your data for me:

[ <SOURCETYPE NAME> ]
SHOULD_LINEMERGE=false
NO_BINARY_CHECK=true
CHARSET=AUTO
disabled=false
LINE_BREAKER=}\, {(\s+)"created_at"
SEDCMD-aaaheader=s/\[\{\s+//g
SEDCMD-bbbfooter=s/\}\]\s+//g
SEDCMD-cccfixjson=s/"created/{\n"created/g
SEDCMD-dddfixjson=s/\}\,\s\{/}/g
