Getting Data In

Need help with a complicated JSON format

jrahikasplunk
New Member

I've got a complicated JSON structure.

Start of the log file:

{
"dataUpdatedTime" : "2017-12-28T12:07:00+02:00",
"links" : [ {
"id" : 27,
"linkMeasurements" : [ {
"fluencyClass" : 5,
"minute" : 329,
"averageSpeed" : 75.851,
"medianTravelTime" : 158,
"measuredTime" : "2017-12-27T05:29:00+02:00"

.
.
.
.

}, {
"fluencyClass" : 5,
"minute" : 1289,
"averageSpeed" : 75.374,
"medianTravelTime" : 159,
"measuredTime" : "2017-12-27T21:29:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 1358,
"averageSpeed" : 72.633,
"medianTravelTime" : 165,
"measuredTime" : "2017-12-27T22:38:00+02:00"
} ],
"measuredTime" : "2017-12-27T22:38:00+02:00"
}, {
"id" : 30,
"linkMeasurements" : [ {
"fluencyClass" : 5,
"minute" : 0,
"averageSpeed" : 43.548,
"medianTravelTime" : 124,
"measuredTime" : "2017-12-27T00:00:00+02:00"

Notice that the id doesn't change for a certain period. How can I index events based on id, which is a unique identifier, when it doesn't appear in every JSON array?
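One way to make the id available on every measurement is to flatten the file before indexing, copying the parent link's id into each measurement record. A minimal preprocessing sketch in Python, assuming the structure shown above (the function name is mine, not anything Splunk provides):

```python
import json

def split_by_link_id(raw):
    """Split the top-level JSON into one flat record per measurement,
    carrying the parent link's id into each record."""
    doc = json.loads(raw)
    events = []
    for link in doc.get("links", []):
        for m in link.get("linkMeasurements", []):
            event = dict(m)           # copy the measurement fields
            event["id"] = link["id"]  # propagate the unique identifier
            events.append(event)
    return events

# Abbreviated sample in the same shape as the log file above
sample = '''{
  "dataUpdatedTime": "2017-12-28T12:07:00+02:00",
  "links": [
    {"id": 27, "linkMeasurements": [
      {"fluencyClass": 5, "minute": 329, "averageSpeed": 75.851,
       "medianTravelTime": 158, "measuredTime": "2017-12-27T05:29:00+02:00"}]},
    {"id": 30, "linkMeasurements": [
      {"fluencyClass": 5, "minute": 0, "averageSpeed": 43.548,
       "medianTravelTime": 124, "measuredTime": "2017-12-27T00:00:00+02:00"}]}
  ]
}'''
events = split_by_link_id(sample)
print([e["id"] for e in events])  # each measurement now carries its link id
```

Writing each flattened record out as one JSON object per line would then let Splunk treat every measurement as its own event, each with its id attached.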


nickhills
Ultra Champion

Assuming I have interpreted your JSON right, spath is parsing it correctly:

I tested with this:

| makeresults | eval samplejson="{
    \"dataUpdatedTime\": \"2017-12-28T12:07:00+02:00\",
    \"links\": [{
        \"id\": 27,
        \"linkMeasurements\": [{
            \"fluencyClass\": 5,
            \"minute\": 329,
            \"averageSpeed\": 75.851,
            \"medianTravelTime\": 158,
            \"measuredTime\": \"2017-12-27T05:29:00+02:00\"
        }, {
            \"fluencyClass\": 5,
            \"minute\": 331,
            \"averageSpeed\": 75.851,
            \"medianTravelTime\": 158,
            \"measuredTime\": \"2017-12-27T05:31:00+02:00\"

        }, {
            \"fluencyClass\": 5,
            \"minute\": 354,
            \"averageSpeed\": 83.807,
            \"medianTravelTime\": 143,
            \"measuredTime\": \"2017-12-27T05:54:00+02:00\"
        }],
        \"measuredTime\": \"2017-12-27T22:38:00+02:00\"
    }, {
        \"id\": 30,
        \"linkMeasurements\": [{
            \"fluencyClass\": 5,
            \"minute\": 0,
            \"averageSpeed\": 43.548,
            \"medianTravelTime\": 124,
            \"measuredTime\": \"2017-12-27T00:00:00+02:00\"
        }]
    }]
}" | spath input=samplejson | table *

I wonder if your issue is truncation - very large JSON events that exceed 10,000 bytes can often cause complications.

Run this:

index=_internal sourcetype=splunkd "LineBreakingProcessor" "Truncating line because limit of 10000 has been exceeded"

Do you see this for your JSON sourcetype?
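If truncation turns out to be the problem, raising the limit for that sourcetype usually resolves it. A minimal props.conf sketch (the sourcetype name here is a placeholder for yours):

```ini
[your_json_sourcetype]
# Raise or disable the default 10,000-byte truncation limit;
# 0 means never truncate (use with care on very large events).
TRUNCATE = 0
```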

If my comment helps, please give it a thumbs up!

gwalford
Path Finder

Are you looking to index only the events that have the unique identifier? If that is the case, then you probably want to do something like this:

https://answers.splunk.com/answers/477356/how-to-only-index-events-that-contain-specific-fie.html

If you are looking to index all the JSON files, and then trace events with the same ID, then you probably want to use the TRANSACTION command:

https://docs.splunk.com/Documentation/SplunkCloud/6.6.3/SearchReference/Transaction

Note: If you have a lot of events per ID, you may want to use STATS instead of TRANSACTION.
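To illustrate why stats scales better: it groups and aggregates by id in one pass instead of stitching whole transactions together in memory. A sketch of the same grouping in Python, over hypothetical flattened records (one dict per measurement, each carrying its link id, as the extraction above would produce):

```python
from collections import defaultdict
from statistics import mean

# Hypothetical flattened records, one per measurement
records = [
    {"id": 27, "averageSpeed": 75.851},
    {"id": 27, "averageSpeed": 83.807},
    {"id": 30, "averageSpeed": 43.548},
]

# Equivalent in spirit to: ... | stats count avg(averageSpeed) by id
groups = defaultdict(list)
for r in records:
    groups[r["id"]].append(r["averageSpeed"])

summary = {link_id: {"count": len(speeds), "avgSpeed": round(mean(speeds), 3)}
           for link_id, speeds in groups.items()}
print(summary)
```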


nickhills
Ultra Champion

Can you paste a complete JSON block?

Ideally, confirm it's well formed first with https://jsonlint.com/

If my comment helps, please give it a thumbs up!

jrahikasplunk
New Member

Since the log file is huge, I will not post the whole thing, but here is a little bit more.

{
"dataUpdatedTime" : "2017-12-28T12:07:00+02:00",
"links" : [ {
"id" : 27,
"linkMeasurements" : [ {
"fluencyClass" : 5,
"minute" : 329,
"averageSpeed" : 75.851,
"medianTravelTime" : 158,
"measuredTime" : "2017-12-27T05:29:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 330,
"averageSpeed" : 75.851,
"medianTravelTime" : 158,
"measuredTime" : "2017-12-27T05:30:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 331,
"averageSpeed" : 75.851,
"medianTravelTime" : 158,
"measuredTime" : "2017-12-27T05:31:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 332,
"averageSpeed" : 75.851,
"medianTravelTime" : 158,
"measuredTime" : "2017-12-27T05:32:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 333,
"averageSpeed" : 75.851,
"medianTravelTime" : 158,
"measuredTime" : "2017-12-27T05:33:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 352,
"averageSpeed" : 83.807,
"medianTravelTime" : 143,
"measuredTime" : "2017-12-27T05:52:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 353,
"averageSpeed" : 83.807,
"medianTravelTime" : 143,
"measuredTime" : "2017-12-27T05:53:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 354,
"averageSpeed" : 83.807,
"medianTravelTime" : 143,
"measuredTime" : "2017-12-27T05:54:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 355,
"averageSpeed" : 83.807,
"medianTravelTime" : 143,
"measuredTime" : "2017-12-27T05:55:00+02:00"

....

}, {
"fluencyClass" : 5,
"minute" : 1274,
"averageSpeed" : 70.496,
"medianTravelTime" : 170,
"measuredTime" : "2017-12-27T21:14:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 1275,
"averageSpeed" : 70.496,
"medianTravelTime" : 170,
"measuredTime" : "2017-12-27T21:15:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 1276,
"averageSpeed" : 70.496,
"medianTravelTime" : 170,
"measuredTime" : "2017-12-27T21:16:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 1277,
"averageSpeed" : 70.496,
"medianTravelTime" : 170,
"measuredTime" : "2017-12-27T21:17:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 1278,
"averageSpeed" : 70.496,
"medianTravelTime" : 170,
"measuredTime" : "2017-12-27T21:18:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 1287,
"averageSpeed" : 75.374,
"medianTravelTime" : 159,
"measuredTime" : "2017-12-27T21:27:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 1288,
"averageSpeed" : 75.374,
"medianTravelTime" : 159,
"measuredTime" : "2017-12-27T21:28:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 1289,
"averageSpeed" : 75.374,
"medianTravelTime" : 159,
"measuredTime" : "2017-12-27T21:29:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 1358,
"averageSpeed" : 72.633,
"medianTravelTime" : 165,
"measuredTime" : "2017-12-27T22:38:00+02:00"
} ],
"measuredTime" : "2017-12-27T22:38:00+02:00"
}, {
"id" : 30,
"linkMeasurements" : [ {
"fluencyClass" : 5,
"minute" : 0,
"averageSpeed" : 43.548,
"medianTravelTime" : 124,
"measuredTime" : "2017-12-27T00:00:00+02:00"

You get the idea?


jrahikasplunk
New Member

"id" basically corresponds to a unique geolocation.
