
In the REST API Modular Input, how do you extract nested JSON with non-unique parent object names?

chrismmckenna
New Member

I've copied only a couple of the JSON objects below. I've tried the spath command, but I'm not sure how to specify the initial path because the parent object name varies (e.g., 201803071708VZ7S2-8MFEU08V, 201803071708VZ7S2-DLFZ62AD, 201803071708VZ7S2-*, etc.). There are hundreds of these variably named objects per event.

I want to extract all of the child fields (_id, label, parameters, etc.) so I can use them for evaluation.

{
  "201803071708VZ7S2-ABCD1234": {
    "_id": "201803071708VZ7S2-8MFEU08V",
    "customer_id": "201803071708VZ7S2",
    "label": "CB-PRD-SEV1-NP-Offline-Authorization-Queue",
    "interval": 5,
    "notifications": [
      {
        "4GHUS": {
          "schedule": "All",
          "delay": 0
        }
      }
    ],
    "runlocations": false,
    "type": "HTTPPARSE",
    "status": "removed",
    "modified": 1541439105629,
    "enable": "inactive",
    "public": true,
    "dep": false,
    "parameters": {
      "target": "https://thatapp.domain.com/status",
      "fields": {
        "FKSTPR": {
          "name": "2.count",
          "min": 0,
          "max": 20
        }
      },
      "threshold": 5,
      "sens": 2
    },
    "created": 1541437905291,
    "queue": false,
    "uuid": "27wr5z4x-wunu-4mh9-8942-zc9vd19xi1sq",
    "firstdown": 1541437932354,
    "state": 0
  },
  "201803071708VZ7S2-ABCD4567": {
    "_id": "201803071708VZ7S2-DLFZ62AD",
    "customer_id": "201803071708VZ7S2",
    "label": "CB-PRD-NP-Checkout-BFR-Status-NA",
    "interval": 5,
    "notifications": [

    ],
    "runlocations": [
      "nam"
    ],
    "type": "HTTPCONTENT",
    "status": "assigned",
    "modified": 1536810209184,
    "enable": "active",
    "public": true,
    "dep": false,
    "parameters": {
      "target": "https://thisapp.domain.com/status",
      "ipv6": false,
      "invert": false,
      "contentstring": "OK",
      "follow": false,
      "threshold": 5,
      "sens": 2
    },
    "created": 1536810209184,
    "queue": "flIuAQDnwJ",
    "uuid": "u3wsf2oz-xudj-4rwi-82e0-63a1m1ykn0ft",
    "firstdown": 0,
    "state": 1
  }
}
0 Karma

sloshburch
Ultra Champion

Reading this again, I have a different perspective (and so a different answer).

Is spath working but producing many fields because each parent node has a unique name? If so, Splunk may already be parsing the data correctly, and we just need a little search-time magic to transform the data to your needs.

For example, you can use wildcards to refer to the fields. Maybe something like:

sourcetype=nodeping_check_config 
| stats values(*id*) AS *id*

The above could be the start of an example on how you might get access to the data you need without having to list every unique field name.
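
To make the wildcard idea concrete outside of Splunk, here is a minimal Python sketch that mimics the dotted field names produced by automatic JSON extraction (with {} marking array levels) and then picks fields by wildcard. The function names flatten and values_matching and the trimmed SAMPLE are illustrative assumptions, not part of Splunk or the REST API Modular Input.

```python
from fnmatch import fnmatchcase

# Trimmed copy of the two objects from the question (illustrative only).
SAMPLE = {
    "201803071708VZ7S2-ABCD1234": {
        "_id": "201803071708VZ7S2-8MFEU08V",
        "customer_id": "201803071708VZ7S2",
        "parameters": {"target": "https://thatapp.domain.com/status"},
    },
    "201803071708VZ7S2-ABCD4567": {
        "_id": "201803071708VZ7S2-DLFZ62AD",
        "customer_id": "201803071708VZ7S2",
        "runlocations": ["nam"],
        "parameters": {"target": "https://thisapp.domain.com/status"},
    },
}


def flatten(node, prefix="", out=None):
    """Map dotted field paths to lists of leaf values."""
    if out is None:
        out = {}
    if isinstance(node, dict):
        for key, value in node.items():
            flatten(value, f"{prefix}.{key}" if prefix else key, out)
    elif isinstance(node, list):
        for item in node:
            # Array levels are marked with {} in the field name.
            flatten(item, prefix + "{}", out)
    else:
        out.setdefault(prefix, []).append(node)
    return out


def values_matching(fields, pattern):
    """Collect the sorted, distinct values of every field whose name matches."""
    found = set()
    for name, values in fields.items():
        if fnmatchcase(name, pattern):
            found.update(values)
    return sorted(found)


if __name__ == "__main__":
    fields = flatten(SAMPLE)
    print(values_matching(fields, "*._id"))
    print(values_matching(fields, "*.parameters.target"))
```

Note that a loose pattern such as *id* also matches customer_id, whereas *._id matches only the _id child of each parent object.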

If we're on the right track here, shoot back with more on what you need to compare in the search so we can see how to take the wildcard magic further.

0 Karma

chrismmckenna
New Member

The data will be used to compare the target field. Each _id represents a different monitor. I also want to store the data in a lookup or KV store for configuration management. I'm not really evaluating the fields; I'm using them for integration.
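
One possible shape for such a lookup is one row per monitor, keyed by _id, with the target pulled up from the nested parameters object. The Python sketch below builds that as CSV text from the raw JSON response. The column choice (LOOKUP_FIELDS), the function name checks_to_lookup_csv, and the trimmed SAMPLE_RESPONSE are assumptions for illustration, and writing the result into a Splunk lookup or KV store is left out.

```python
import csv
import io
import json

# Hypothetical column selection for the lookup; adjust to taste.
LOOKUP_FIELDS = ["_id", "label", "type", "enable", "target"]

# Trimmed copy of the two objects from the question (illustrative only).
SAMPLE_RESPONSE = json.dumps({
    "201803071708VZ7S2-ABCD1234": {
        "_id": "201803071708VZ7S2-8MFEU08V",
        "label": "CB-PRD-SEV1-NP-Offline-Authorization-Queue",
        "type": "HTTPPARSE",
        "enable": "inactive",
        "parameters": {"target": "https://thatapp.domain.com/status"},
    },
    "201803071708VZ7S2-ABCD4567": {
        "_id": "201803071708VZ7S2-DLFZ62AD",
        "label": "CB-PRD-NP-Checkout-BFR-Status-NA",
        "type": "HTTPCONTENT",
        "enable": "active",
        "parameters": {"target": "https://thisapp.domain.com/status"},
    },
})


def checks_to_lookup_csv(raw_response):
    """Return CSV text with one row per check, ignoring the parent key names."""
    checks = json.loads(raw_response)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=LOOKUP_FIELDS, lineterminator="\n")
    writer.writeheader()
    # Iterate over the values so the variable parent names never matter.
    for check in checks.values():
        if not isinstance(check, dict):
            continue
        params = check.get("parameters") or {}
        writer.writerow({
            "_id": check.get("_id", ""),
            "label": check.get("label", ""),
            "type": check.get("type", ""),
            "enable": check.get("enable", ""),
            "target": params.get("target", ""),
        })
    return buffer.getvalue()


if __name__ == "__main__":
    print(checks_to_lookup_csv(SAMPLE_RESPONSE), end="")
```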

0 Karma

sloshburch
Ultra Champion

I'm having a lot of trouble following this. Do you only need to compare the value of the 'target' field? As in, the field that has values like https://thisapp.domain.com/status? If so then we can approach this totally differently.

What is an example output of this? What would that table look like?

You may find reaching out to your account team is going to be a stronger solution here. They'll be able to get into the specifics that you can't share publicly here AND help you make sure the solution you create is sustainable.

0 Karma

sloshburch
Ultra Champion

That appears to be what's in the configuration file, but to be thorough, would you get the complete sourcetype definition from btool?

Likely something like ${SPLUNK_HOME}/bin/splunk btool props list nodeping_check_config, although that's written off the top of my head, so check the "Use btool to troubleshoot configurations" docs topic for the proper usage.

I'm especially interested to see whether KV_MODE is set to json. Learn more in the docs topics "Configure automatic key-value field extraction" and "Extract fields from files with structured data".

If needed: props.conf.spec file.

0 Karma

chrismmckenna
New Member

This is the command executed: splunk btool props list nodeping_check_config

Note: KV_MODE = json is set in /opt/splunk/etc/apps/cb-merchant-info/local. I created the props.conf for it and thought that would be all that was needed, but that doesn't appear to be the case at the moment.

props.conf
[nodeping_check_config]
TRUNCATE = 0
KV_MODE = json

btool output
[nodeping_check_config]
ADD_EXTRA_TIME_FIELDS = True
ANNOTATE_PUNCT = True
AUTO_KV_JSON = true
BREAK_ONLY_BEFORE =
BREAK_ONLY_BEFORE_DATE = True
CHARSET = UTF-8
DATETIME_CONFIG = /etc/datetime.xml
DEPTH_LIMIT = 1000
HEADER_MODE =
KV_MODE = json
LEARN_MODEL = true
LEARN_SOURCETYPE = true
LINE_BREAKER_LOOKBEHIND = 100
MATCH_LIMIT = 100000
MAX_DAYS_AGO = 2000
MAX_DAYS_HENCE = 2
MAX_DIFF_SECS_AGO = 3600
MAX_DIFF_SECS_HENCE = 604800
MAX_EVENTS = 256
MAX_TIMESTAMP_LOOKAHEAD = 128
MUST_BREAK_AFTER =
MUST_NOT_BREAK_AFTER =
MUST_NOT_BREAK_BEFORE =
SEGMENTATION = indexing
SEGMENTATION-all = full
SEGMENTATION-inner = inner
SEGMENTATION-outer = outer
SEGMENTATION-raw = none
SEGMENTATION-standard = standard
SHOULD_LINEMERGE = True
TRANSFORMS =
TRUNCATE = 0
detect_trailing_nulls = false
maxDist = 100
priority =
sourcetype =

0 Karma

chrismmckenna
New Member

These are what the REST API Modular Input App created.

inputs.conf

[rest://PRD-NodePing-GetCheckConfigs]
activation_key = #######################
auth_type = basic
auth_user = ##########################
endpoint = https://api.nodeping.com/api/1/checks/
host = api.nodeping.com
http_method = GET
index = np_metrics
index_error_response_codes = 1
polling_interval = 300
response_type = json
sequential_mode = 0
sourcetype = nodeping_check_config
streaming_request = 0
url_args = customerid=201803071708VZ7S2
delimiter = ,
disabled = 0

props.conf
[nodeping_check_config]
TRUNCATE = 0

0 Karma

sloshburch
Ultra Champion

Would you add in the sourcetype definition you are using for this?

0 Karma

chrismmckenna
New Member

Is this a good use case for parsing at index time versus search time?

0 Karma

sloshburch
Ultra Champion

Absolutely not. Think of index time merely as a field extraction that has more metadata with it. The challenge here precedes index vs. search time. First we need to resolve the ideal way for the data to become valuable for you. At that point, I predict we won't want index-time extraction, because the nature of the data may give us a wide variety of field names or values. Capturing that variability would mean some MASSIVE metadata, which would mean a ton of disk space for little discernible gain in performance.

0 Karma

chrismmckenna
New Member

FYI - I am collecting the data using the REST API Modular Input App. This is returning hundreds of objects in one event. I'd like individual events instead. This would simplify the processing, so I don't have to parse, at search time, events that each contain hundreds of objects.
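
The splitting step itself can be sketched in plain Python: parse the response once and emit one JSON string per check, keeping the variable parent name as an ordinary field. The function name split_checks, the check_key field, and the trimmed SAMPLE_RESPONSE are hypothetical. How this would be wired into the modular input (for example, through a custom response handler, if the app version in use supports one) is an assumption and is not shown.

```python
import json

# Trimmed copy of the two objects from the question (illustrative only).
SAMPLE_RESPONSE = json.dumps({
    "201803071708VZ7S2-ABCD1234": {
        "_id": "201803071708VZ7S2-8MFEU08V",
        "label": "CB-PRD-SEV1-NP-Offline-Authorization-Queue",
        "parameters": {"target": "https://thatapp.domain.com/status"},
    },
    "201803071708VZ7S2-ABCD4567": {
        "_id": "201803071708VZ7S2-DLFZ62AD",
        "label": "CB-PRD-NP-Checkout-BFR-Status-NA",
        "parameters": {"target": "https://thisapp.domain.com/status"},
    },
})


def split_checks(raw_response):
    """Return a list with one compact JSON string per check object."""
    checks = json.loads(raw_response)
    events = []
    for parent_key, check in checks.items():
        if not isinstance(check, dict):
            continue  # skip anything that is not a check object
        # Keep the variable parent name as a normal field (hypothetical name)
        # so every event shares the same fixed set of field names.
        event = {"check_key": parent_key}
        event.update(check)
        events.append(json.dumps(event, sort_keys=True))
    return events


if __name__ == "__main__":
    # Each printed line would become its own event.
    for event in split_checks(SAMPLE_RESPONSE):
        print(event)
```

With one check per event, the field names become fixed (_id, label, parameters.target, and so on), so no wildcard on the parent name is needed at search time.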

0 Karma