Hi folks,
I've struggled for a couple of days parsing a file that looks like this:
{ "status": { "timestamp": "2018-10-10T23:40:26.101Z", "error_code": 0, "error_message": null, "elapsed": 14, "credit_count": 1 }, "data": [ { "id": 1, "name": "Bitcoin", "symbol": "BTC", "slug": "bitcoin", "circulating_supply": 17315325, "total_supply": 17315325, "max_supply": 21000000, "date_added": "2013-04-28T00:00:00.000Z", "num_market_pairs": 6324, "cmc_rank": 1, "last_updated": "2018-10-10T23:39:30.000Z", "quote": { "USD": { "price": 6593.07839577, "volume_24h": 3781676977.17942, "percent_change_1h": 0.299567, "percent_change_24h": -0.633751, "percent_change_7d": 0.95377, "market_cap": 114161295173.23618, "last_updated": "2018-10-10T23:39:30.000Z" } } }, { "id": 1027, "name": "Ethereum", "symbol": "ETH", "slug": "ethereum", "circulating_supply": 102492584.749, "total_supply": 102492584.749, "max_supply": null, "date_added": "2015-08-07T00:00:00.000Z", "num_market_pairs": 4471, "cmc_rank": 2, "last_updated": "2018-10-10T23:39:42.000Z", "quote": { "USD": { "price": 225.917892111, "volume_24h": 1382931374.09404, "percent_change_1h": 0.569086, "percent_change_24h": -0.704234, "percent_change_7d": 1.85499, "market_cap": 23154908703.502106, "last_updated": "2018-10-10T23:39:42.000Z" } } }, { "id": 52, "name": "XRP", "symbol": "XRP", "slug": "ripple", "circulating_supply": 39997634397, "total_supply": 99991817275, "max_supply": 100000000000, "date_added": "2013-08-04T00:00:00.000Z", "num_market_pairs": 230, "cmc_rank": 3, "last_updated": "2018-10-10T23:40:09.000Z", "quote": { "USD": { "price": 0.461478997819, "volume_24h": 411665959.164198, "percent_change_1h": 0.445149, "percent_change_24h": -3.5819, "percent_change_7d": -13.0624, "market_cap": 18458068236.65832, "last_updated": "2018-10-10T23:40:09.000Z" } } }, { "id": 1831, "name": "Bitcoin Cash", "symbol": "BCH", "slug": "bitcoin-cash", "circulating_supply": 17395525, "total_supply": 17395525, "max_supply": 21000000, "date_added": "2017-07-23T00:00:00.000Z", "num_market_pairs": 342, "cmc_rank": 4, 
"last_updated": "2018-10-10T23:39:36.000Z", "quote": { "USD": { "price": 512.630430925, "volume_24h": 369198906.22134, "percent_change_1h": 0.365485, "percent_change_24h": -0.745641, "percent_change_7d": -1.26889, "market_cap": 8917475476.91661, "last_updated": "2018-10-10T23:39:36.000Z" } } }, { "id": 1765, "name": "EOS", "symbol": "EOS", "slug": "eos", "circulating_supply": 906245117.6, "total_supply": 1006245119.9339, "max_supply": null, "date_added": "2017-07-01T00:00:00.000Z", "num_market_pairs": 174, "cmc_rank": 5, "last_updated": "2018-10-10T23:39:36.000Z", "quote": { "USD": { "price": 5.88204493255, "volume_24h": 532584590.023657, "percent_change_1h": 0.326144, "percent_change_24h": -0.205986, "percent_change_7d": 4.79873, "market_cap": 5330574501.627259, "last_updated": "2018-10-10T23:39:36.000Z" } } }, { "id": 512, "name": "Stellar", "symbol": "XLM", "slug": "stellar", "circulating_supply": 18890616791.6665, "total_supply": 104363617748.491, "max_supply": null, "date_added": "2014-08-05T00:00:00.000Z", "num_market_pairs": 120, "cmc_rank": 6, "last_updated": "2018-10-10T23:39:19.000Z", "quote": { "USD": { "price": 0.241243489647, "volume_24h": 40002371.0719364, "percent_change_1h": 0.417773, "percent_change_24h": -1.36287, "percent_change_7d": -0.740479, "market_cap": 4557238316.405842, "last_updated": "2018-10-10T23:39:19.000Z" } } },
What I need to do is:
- extract the timestamp, which I'm doing successfully
- extract each coin and all its related information, including the nested "quote", as a single event (each coin's info starts at its "id" and ends at the closing "},")
- suppress all the info before "data"
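For reference, the target split can be sketched in Python. This is just an illustration of what the desired events look like, not a Splunk config; `payload` is an abridged stand-in for the raw file contents:

```python
import json

# Abridged stand-in for the raw API response above.
payload = """{ "status": { "timestamp": "2018-10-10T23:40:26.101Z" },
  "data": [
    { "id": 1,    "name": "Bitcoin",  "quote": { "USD": { "price": 6593.07 } } },
    { "id": 1027, "name": "Ethereum", "quote": { "USD": { "price": 225.91 } } }
  ] }"""

doc = json.loads(payload)

# Each element of "data" should become one event, with the nested
# "quote" kept inside it; the "status" header is discarded.
events = [json.dumps(coin) for coin in doc["data"]]

print(len(events))                  # one event per coin
print(doc["status"]["timestamp"])   # header timestamp, handled separately
```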
My current sourcetype looks like this:
[lalaland]
DATETIME_CONFIG =
KV_MODE = json
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = true
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3NT
TIME_PREFIX = \"timestamp\":
TZ = Australia/Sydney
category = Custom
pulldown_type = 1
BREAK_ONLY_BEFORE = \s*{\s*\"id\":
disabled = false
MAX_EVENTS = 10000
What am I doing wrong?
Tks!
I prefer to use the more explicit LINE_BREAKER setting and set SHOULD_LINEMERGE to false.
Also: Taking the timestamp from the 'header' and breaking the rest into separate events may not really be an option. Splunk expects a timestamp per (broken) event, not 1 timestamp and then a bunch of timestampless events. It might be possible to get that to work, but not sure if and how.
Can you not use the last update time in each of the coin sections as the timestamp? Then you could use config somewhat like this:
[lalaland]
KV_MODE = json
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3NT
TIME_PREFIX = \"last_updated\"\:
TZ = Australia/Sydney
category = Custom
pulldown_type = 1
LINE_BREAKER = (?:(^[^\[]+\[\s+)\{\s+\"id\"\:|(,\s+)\{\s+\"id\"\:)
disabled = false
MAX_EVENTS = 10000
regex101 link for the linebreaker regex: https://regex101.com/r/UfbLif/1
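To see what that LINE_BREAKER does: Splunk discards only the text captured by the matching group and keeps the rest of the match at the start of the next event. A rough Python simulation of that behaviour on an abridged sample (`raw` is a stand-in for the real file contents):

```python
import re

# The LINE_BREAKER regex from the config above.
breaker = re.compile(r'(?:(^[^\[]+\[\s+)\{\s+\"id\"\:|(,\s+)\{\s+\"id\"\:)')

raw = ('{ "status": { "timestamp": "2018-10-10T23:40:26.101Z" }, "data": [ '
       '{ "id": 1, "name": "Bitcoin" }, '
       '{ "id": 1027, "name": "Ethereum" }')

# Splunk consumes only the capture-group text as the break; the
# remainder of the match ({ "id":) stays with the next event.
events, pos = [], 0
for m in breaker.finditer(raw):
    grp = 1 if m.group(1) is not None else 2
    events.append(raw[pos:m.start(grp)])
    pos = m.end(grp)
events.append(raw[pos:])
events = [e for e in events if e]   # drop the empty pre-header piece

for e in events:
    print(e)
```

The first alternative swallows everything up to and including `"data": [`, which is how the header gets suppressed; the second breaks on the `, ` between coins.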
Nice one; I didn't read the question carefully enough to see that events should be broken per coin. Still, I'm getting a timestamp recognition error on 7.2 in my lab (maybe due to the %3NT).
Oh, yes, I didn't double-check that part. That should be a Z instead of a T, of course. Or even a %Z, if you actually want to capture that Z as the indicator for the Zulu (UTC) timezone.
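The same mismatch is easy to reproduce outside Splunk. A quick sketch with Python's strptime (Python's %f is the closest stand-in for Splunk's %3N subsecond directive; the trailing Z must be matched literally):

```python
from datetime import datetime, timezone

stamp = "2018-10-10T23:39:30.000Z"

# Correct format: literal Z after the fractional seconds.
parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
print(parsed.isoformat())

# The typo'd format ends in T, so it never matches the data.
try:
    datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fT")
except ValueError:
    print("format ending in T fails to match the timestamp")
```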
Bit strange, by the way, to set TZ = Australia/Sydney while the logs carry that Z, meaning the Zulu (UTC) timezone...
Hi Frank,
This is because the source of that info is based in the US, whilst we're based in AU.
:)
Thanks for the response, guys! Seems to be working fine!
How does your location matter for the TZ you set? If the information is logged in the UTC timezone (which the Z indicates), you should tell Splunk to interpret it as UTC, not as Australia time.
If you want things displayed in Australia time, you do that with your user's timezone settings in Splunk Web, not with props.conf. Telling Splunk to index UTC logs as Australia/Sydney will put skewed values into _time.
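The skew is easy to quantify: the same wall-clock string interpreted as Sydney time versus UTC lands on a different epoch. A sketch using the stdlib zoneinfo module (Python 3.9+):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

naive = datetime.strptime("2018-10-10T23:39:30", "%Y-%m-%dT%H:%M:%S")

as_utc = naive.replace(tzinfo=ZoneInfo("UTC"))
as_sydney = naive.replace(tzinfo=ZoneInfo("Australia/Sydney"))

# Interpreting a UTC log as Sydney time shifts _time by the full UTC
# offset: 11 hours in October, when Sydney is on daylight saving.
skew_hours = (as_utc.timestamp() - as_sydney.timestamp()) / 3600
print(skew_hours)  # 11.0
```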
Hi @ADRIANODL,
Have you tried the configuration below on the Universal Forwarder?
props.conf
[yourSourcetype]
INDEXED_EXTRACTIONS = json
That (on its own) is not going to help him split this data into separate events per coin.