In the REST API Modular Input, how do you extract nested JSON with non-unique parent object names?

chrismmckenna · ‎12-13-2018

I've only copied a couple of the JSON objects below. I've tried to use the spath command, but I'm not sure how to identify the initial path because the parent object name is variable (e.g., 201803071708VZ7S2-8MFEU08V, 201803071708VZ7S2-DLFZ62AD, 201803071708VZ7S2-*, etc.). There are hundreds of these variably named objects per event.

I want to extract all of the child objects (_id, label, parameters, etc.) so I can use them for evaluation.

{
  "201803071708VZ7S2-ABCD1234": {
    "_id": "201803071708VZ7S2-8MFEU08V",
    "customer_id": "201803071708VZ7S2",
    "label": "CB-PRD-SEV1-NP-Offline-Authorization-Queue",
    "interval": 5,
    "notifications": [
      {
        "4GHUS": {
          "schedule": "All",
          "delay": 0
        }
      }
    ],
    "runlocations": false,
    "type": "HTTPPARSE",
    "status": "removed",
    "modified": 1541439105629,
    "enable": "inactive",
    "public": true,
    "dep": false,
    "parameters": {
      "target": "https://thatapp.domain.com/status",
      "fields": {
        "FKSTPR": {
          "name": "2.count",
          "min": 0,
          "max": 20
        }
      },
      "threshold": 5,
      "sens": 2
    },
    "created": 1541437905291,
    "queue": false,
    "uuid": "27wr5z4x-wunu-4mh9-8942-zc9vd19xi1sq",
    "firstdown": 1541437932354,
    "state": 0
  },
  "201803071708VZ7S2-ABCD4567": {
    "_id": "201803071708VZ7S2-DLFZ62AD",
    "customer_id": "201803071708VZ7S2",
    "label": "CB-PRD-NP-Checkout-BFR-Status-NA",
    "interval": 5,
    "notifications": [

    ],
    "runlocations": [
      "nam"
    ],
    "type": "HTTPCONTENT",
    "status": "assigned",
    "modified": 1536810209184,
    "enable": "active",
    "public": true,
    "dep": false,
    "parameters": {
      "target": "https://thisapp.domain.com/status",
      "ipv6": false,
      "invert": false,
      "contentstring": "OK",
      "follow": false,
      "threshold": 5,
      "sens": 2
    },
    "created": 1536810209184,
    "queue": "flIuAQDnwJ",
    "uuid": "u3wsf2oz-xudj-4rwi-82e0-63a1m1ykn0ft",
    "firstdown": 0,
    "state": 1
  }
}

sloshburch · ‎01-02-2019

Reading this again, I have a different perspective (and so a different answer).

Is the spath working but producing many fields for you since each node is a unique number? If so, then maybe the situation here is that the data is being parsed correctly by Splunk and we just need to do a little search time magic to transform the data to your needs.

For example, you can use wildcards to refer to the fields. Maybe something like:

sourcetype=nodeping_check_config
| stats values(*id*) AS *id*

The above could be the start of an example on how you might get access to the data you need without having to list every unique field name.
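To make the wildcard idea concrete, here is a minimal Python sketch of the kind of field names that automatic JSON extraction tends to produce for this payload (dotted paths, with `{}` marking array elements) and of how a pattern such as `*id*` selects among them. The `flatten` and `select` helpers are hypothetical names for illustration only; they are not part of Splunk, and the real extraction may differ in details.

```python
import fnmatch
import json


def flatten(node, prefix=""):
    """Flatten nested JSON into {dotted_path: [values]}, roughly as spath-style names."""
    fields = {}
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else key
            fields.update(flatten(value, path))
    elif isinstance(node, list):
        # Array elements share one field name, suffixed with {} (multivalue field).
        for item in node:
            for name, values in flatten(item, prefix + "{}").items():
                fields.setdefault(name, []).extend(values)
    else:
        fields[prefix] = [node]
    return fields


def select(fields, pattern):
    """Return only the fields whose name matches a wildcard pattern."""
    return {name: values for name, values in fields.items()
            if fnmatch.fnmatchcase(name, pattern)}


if __name__ == "__main__":
    sample = json.loads(
        '{"201803071708VZ7S2-ABCD4567": {"_id": "201803071708VZ7S2-DLFZ62AD",'
        ' "customer_id": "201803071708VZ7S2", "runlocations": ["nam"],'
        ' "parameters": {"target": "https://thisapp.domain.com/status"}}}'
    )
    for name, values in select(flatten(sample), "*id*").items():
        print(name, values)
```

Because the variable parent key becomes the first segment of every field name, a wildcard on the trailing part of the name is what lets a search address all checks at once.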

If we're on the right track here, shoot back with more on what you need to compare in the search so we can see how to take the wildcard magic further.

chrismmckenna · ‎01-14-2019

The data will be used for comparing the target field. Each _id represents a different monitor. I also want to store the data in a lookup or KV store for configuration management. I'm not really evaluating the fields. I'm using them for integration.
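As a rough illustration of the lookup idea, the following Python sketch turns one API response into CSV rows keyed by _id, with the nested parameters.target pulled up to a top-level column. The function name `checks_to_lookup_csv` and the chosen column list are assumptions made for this example; how the resulting rows would be loaded into a lookup or KV store is not covered here.

```python
import csv
import io
import json

# Columns chosen for illustration; adjust to whatever the lookup needs.
LOOKUP_FIELDS = ["_id", "label", "type", "enable", "target"]


def checks_to_lookup_csv(raw_json):
    """Convert the {variable_key: check} response into CSV text, one row per check."""
    checks = json.loads(raw_json)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=LOOKUP_FIELDS, lineterminator="\n")
    writer.writeheader()
    # The variable parent key is ignored; each check carries its own _id.
    for check in checks.values():
        params = check.get("parameters") or {}
        writer.writerow({
            "_id": check.get("_id", ""),
            "label": check.get("label", ""),
            "type": check.get("type", ""),
            "enable": check.get("enable", ""),
            "target": params.get("target", ""),
        })
    return buffer.getvalue()
```

Iterating over the values of the top-level object sidesteps the variable parent names entirely, which is the same effect the wildcard approach aims for at search time.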

sloshburch · ‎01-15-2019

I'm having a lot of trouble following this. Do you only need to compare the value of the 'target' field? As in, the field that has values like https://thisapp.domain.com/status? If so then we can approach this totally differently.

What is an example output of this? What would that table look like?

You may find reaching out to your account team is going to be a stronger solution here. They'll be able to get into the specifics that you can't share publicly here AND help you make sure the solution you create is sustainable.

sloshburch · ‎12-13-2018

That appears to be what's in the configuration file, but to be thorough, would you get the complete sourcetype definition from btool?

Likely something like ${SPLUNK_HOME}/bin/splunk btool props list nodeping_check_config - although that's written off the top of my head, so check "Use btool to troubleshoot configurations" for the proper usage.

I'm especially interested to see if KV_MODE is set to json. Learn more in "Configure automatic key-value field extraction" or "Extract fields from files with structured data".

If needed: props.conf.spec file.

chrismmckenna · ‎12-20-2018

This is the command executed: splunk btool props list nodeping_check_config

Note: KV_MODE = json is set in /opt/splunk/etc/apps/cb-merchant-info/local. I created the props.conf for it and thought that was what would be needed, but that doesn't appear to be the case at the moment.

props.conf
[nodeping_check_config]
TRUNCATE = 0
KV_MODE = json

btool output
[nodeping_check_config]
ADD_EXTRA_TIME_FIELDS = True
ANNOTATE_PUNCT = True
AUTO_KV_JSON = true
BREAK_ONLY_BEFORE =
BREAK_ONLY_BEFORE_DATE = True
CHARSET = UTF-8
DATETIME_CONFIG = /etc/datetime.xml
DEPTH_LIMIT = 1000
HEADER_MODE =
KV_MODE = json
LEARN_MODEL = true
LEARN_SOURCETYPE = true
LINE_BREAKER_LOOKBEHIND = 100
MATCH_LIMIT = 100000
MAX_DAYS_AGO = 2000
MAX_DAYS_HENCE = 2
MAX_DIFF_SECS_AGO = 3600
MAX_DIFF_SECS_HENCE = 604800
MAX_EVENTS = 256
MAX_TIMESTAMP_LOOKAHEAD = 128
MUST_BREAK_AFTER =
MUST_NOT_BREAK_AFTER =
MUST_NOT_BREAK_BEFORE =
SEGMENTATION = indexing
SEGMENTATION-all = full
SEGMENTATION-inner = inner
SEGMENTATION-outer = outer
SEGMENTATION-raw = none
SEGMENTATION-standard = standard
SHOULD_LINEMERGE = True
TRANSFORMS =
TRUNCATE = 0
detect_trailing_nulls = false
maxDist = 100
priority =
sourcetype =

chrismmckenna · ‎12-13-2018

These are what the REST API Modular Input App created.

inputs.conf

[rest://PRD-NodePing-GetCheckConfigs]
activation_key = #######################
auth_type = basic
auth_user = ##########################
endpoint = https://api.nodeping.com/api/1/checks/
host = api.nodeping.com
http_method = GET
index = np_metrics
index_error_response_codes = 1
polling_interval = 300
response_type = json
sequential_mode = 0
sourcetype = nodeping_check_config
streaming_request = 0
url_args = customerid=201803071708VZ7S2
delimiter = ,
disabled = 0

props.conf
[nodeping_check_config]
TRUNCATE = 0

sloshburch · ‎12-13-2018

Would you add in the sourcetype definition you are using for this?

chrismmckenna · ‎12-13-2018

Is this a good use case for parsing at index time versus search time?

sloshburch · ‎12-13-2018

Absolutely not. Think of index time merely as a field extraction that has more metadata with it. The challenge here precedes index vs search time. First we need to resolve the ideal way for the data to become valuable for you. At that point, I'm predicting we won't want index-time extraction because we may have a wide variety of field names or values because of the nature of the data. That would mean some MASSIVE metadata in order to capture the variability, which would mean a ton of disk space for little discernible value in performance.

chrismmckenna · ‎12-13-2018

FYI - I am collecting the data using the REST API Modular Input App. This is returning hundreds of objects in one event. I'd like individual events instead. This would simplify the processing so I don't have to parse, at search time, every event that contains hundreds of objects.
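One way to get individual events is to split the response before it is indexed. The sketch below shows only the splitting step in Python: it takes the raw response body and returns one JSON string per check, keeping the variable parent key in a field. The name `split_checks` and the `check_key` field are hypothetical. The REST API Modular Input app can, as far as I know, be given a custom response handler, but how this function would be wired into that hook is an assumption that should be checked against the app's documentation.

```python
import json


def split_checks(raw_json, key_field="check_key"):
    """Split a {variable_key: check} response into one JSON string per check."""
    checks = json.loads(raw_json)
    events = []
    for parent_key, check in checks.items():
        if not isinstance(check, dict):
            # Skip anything that is not a check object (e.g. an error message).
            continue
        # Keep the variable parent name as an ordinary field on the event.
        event = {key_field: parent_key}
        event.update(check)
        events.append(json.dumps(event, sort_keys=True))
    return events


if __name__ == "__main__":
    body = (
        '{"201803071708VZ7S2-ABCD1234": {"_id": "201803071708VZ7S2-8MFEU08V",'
        ' "label": "CB-PRD-SEV1-NP-Offline-Authorization-Queue"},'
        ' "201803071708VZ7S2-ABCD4567": {"_id": "201803071708VZ7S2-DLFZ62AD",'
        ' "label": "CB-PRD-NP-Checkout-BFR-Status-NA"}}'
    )
    for event in split_checks(body):
        print(event)
```

With each check emitted as its own event, the field names no longer carry the variable prefix, so fields such as _id, label, and parameters.target would be the same across events.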

