
How to extract fields from JSON data conforming to XDAS-v2?

dominiquevocat
SplunkTrust

I have some JSON conforming to XDAS-v2 and, unfortunately, the spath command cannot make much sense of it. Is there an easy way to use this kind of JSON that I overlooked?

I tried to do some of it with props.conf:

[xdas-events]
KV_MODE = JSON
INDEXED_EXTRACTIONS = JSON
pulldown_type=1

Sample event:

 Jun 18 12:28:31 IDM : INFO {"Source" : "IDM","Observer" : {"Entity" : {"SysName" : "chhs-sidm017"}},"Initiator" : {"Entity" : {"SvcName" : "CN=INTG2,OU=SYSTEM,O=SOME","SvcComp" : "\\Driver"}},"Target" : {"Data" : {"DATA" : "<status level=\"success\" type=\"driver-status\">Driver state changed to Running.<application>DirXML</application>\n\t<module>WORKER</module>\n\t<object-dn></object-dn>\n\t<component>Subscriber</component>\n</status>","MIME_HINT" : "3","ORIGINATOR_TYPE" : "1","TARGET_TYPE" : "1","TEXT3" : "Driver state changed to Running.","VALUE1" : "2","VALUE2" : "0","VALUE3" : "0"},"Entity" : {"SvcName" : "CN=WORKER,CN=IDM-INTG,OU=IDM,OU=SYSTEM,O=SOME","SvcComp" : "DirXML-State"}},"Action" : {"Event" : {"Id" : "0.0.3.5","Name" : "Enable Service","SubEvent" : "30022"},"Time" : {"Offset" : 1434623311},"Log" : {"Severity" : 7}}} 

The schema of the content is as follows:

{
    "id":"XDASv2",
    "title":"XDAS Version 2 JSON Schema",
    "description":"A JSON representation of an XDASv2 event record.",
    "type":"objectr",
    "properties":{
      "Source":{
        "description":"The original source of the event, if applicable.",
        "type":"string",
        "optional":true
      },
      "Observer":{
        "description":"The recorder (ie., the XDASv2 service) of the event.",
        "type":"object",
        "optional":false,
        "properties":{
          "Account":{"$ref":"account"},
          "Entity":{"$ref":"entity"}
        }
      },
      "Initiator":{
        "description":"The authenticated entity or access token that causes an event.",
        "type":"object",
        "optional":false,
        "properties":{
          "Account":{"$ref":"account","optional":true},
          "Entity":{"$ref":"entity"},
          "Assertions":{
            "description":"Attribute/value assertions about an identity.",
            "type":"object",
            "optional":true
          }
        }
      },
      "Target":{
        "description":"The target object, account, data item, etc of the event.",
        "type":"object",
        "optional":true,
        "properties":{
          "Account":{"$ref":"account"},
          "Entity":{"$ref":"entity"},
          "Data":{                           
            "description":"A set attribute/value pairs describing the target object.",        * 
            "type":"object",        
            "optional":true
          }  
        }
      },
      "Action":{
        "description":"The action describes the event in a uniform manner.",
        "type":"object",
        "optional":false,
        "properties":{
          "Event":{
            "description":"The event identifier in standard XDASv2 taxonomy.",
            "type":"object",
            "optional":false,
            "properties":{
              "Id":{
                "description":"The XDASv2 taxonomy event identifier.",
                "type":"string",
                "optional":false,
                "pattern":"/^[0-9]+(\.[0-9]+)*$/" 
              },
              "Name":{
                "description":"A short descriptive name for the specific event.", eg. a new replica is added 
                "type":"string",
                "optional":true
              },
      "CorrelationID":{
          "description":"Correlation ID, source#uniqueID#connID",
                 "type":"string",
                 "optional":true
      }
     },
     "SubEvent":{
      "type":object
      "description": "Describes the actual domain specific event that has occured.",
      "optional":true,
      "properties":{
        "Name"":{
                    "description":"A short descriptive name for this event.",
                    "type":"string",
                    "optional":true
                  },
      }
            }  
          }
          "Log":{
            "description":"Client-specified logging attributes.",
            "optional":true,
            "properties":{
              "Severity":{"type":"integer", "optional":true},
              "Priority":{"type":"integer", "optional":true},
              "Facility":{"type":"integer", "optional":true}
            }
          },
          "Outcome":{
            "description":"The XDASv2 taxonomy outcome identifier.",
            "type":"string",
            "optional":false,
            "pattern":"/^[0-9]+(\.[0-9]+)*$/"
          },
          "Time":{
            "description":"The time the event occurred.",
            "type":"object",
            "optional":false,
            "properties":{
              "Offset":{
                "description":"Seconds since Jan 1, 1970.",
                "type":"integer"
              },
              "Sequence":{
                "description":"Milliseconds since last integral second.",
                "type":"integer",
                "optional":true
              },
              "Tolerance":{
                "description":"A tolerance value in milliseconds.",
                "type":"integer",
                "optional":true
              },
              "Certainty":{
                "description":"Percentage certainty of tolerance.",
                "type":"integer",
                "optional":true,
                "minimum":0,
                "maximum":100,
                "default":100,
              },
              "Source":{
                "description":"The time source (eg., ntp://time.nist.gov).",
                "type":"string",
                "optional":true
              },
              "Zone":{
                "description":"A valid timezone symbol (eg., MST/MDT).",
                "type":"string",
                "optional":true
              }
            }
          },
          "ExtendedOutcome":{
            "description":"The XDASv2 taxonomy outcome identifier.",
            "type":"string",
            "optional":false,
            "pattern":"/^[0-9]+(\.[0-9]+)*$/"
           }
        }
      }
    }
  },
  {
    "id":"account",
    "description":"A representation of an XDAS account.",
    "type":"object",
    "properties":{
      "Domain":{
        "description":"A (URL) reference to the authority managing this account.",    /* lets take it as the partition?
        "type":"string"
      },
      "Name":{
        "description":"A human-readable account name.",        - DN
        "type":"string",
        "optional":true
      },
      "Id":{
        "description":"A machine-readable unique account identifier value.",  - EntryID
        "type":"integer"
      }
    }
  },
  {
    "id":"entity",                    - Server details for Target, client address details for the initiator
    "description":"A representation of an addressable entity.",
    "type":"object",
    "properties":{
      "SysAddr":{"type":"string","optional":true},  
      "SysName":{"type":"string","optional":true},
      "SvcName":{"type":"string","optional":true},
      "SvcComp":{"type":"string","optional":true},
    }
  }
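
Once the extraction works, the schema above maps straight onto dotted field names in SPL; for example (a sketch, using only the fields visible in the sample event above):

sourcetype="xdas-events" | table Action.Event.Id Action.Event.Name Initiator.Entity.SvcName Target.Entity.SvcName Action.Time.Offset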

liamalexandertm
New Member

Sorry to necro-post, but dominiquevocat's comment seemed the simplest way to go, and I think it was almost there.

On the forwarder receiving the syslog from our eDirectory servers, I created a new eDir app and added a props.conf with

Defined in the eDir app's props.conf (SEDCMDs to remove the preceding "eDirectory : INFO " and "IDM : INFO "):

[eDirXDAS]
SEDCMD-StripEDirInfo = s/eDirectory : INFO {/{/g
SEDCMD-StripIDMInfo = s/IDM : INFO {/{/g
KV_MODE = json
INDEXED_EXTRACTIONS = json
pulldown_type=1

and then set the sourcetype for that syslog listener to eDirXDAS.
Further SEDCMDs can be added for any other strings that are prepended to the supplied JSON (see the sketch below).
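
A catch-all alternative, as an untested sketch, is a single SEDCMD that strips any prefix up to the first opening brace instead of enumerating each string:

SEDCMD-StripAnyHeader = s/^[^{]+//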


alacercogitatus
SplunkTrust

To have spath work correctly, Splunk must recognize the event as JSON. With the leading syslog-style metadata, the event is no longer valid JSON and won't parse correctly. If you have access to the app's output logger, change it to insert a timestamp into the JSON and to emit only the JSON, on a single line with no other information around it. Notice the event below and the two additional fields I added at the front.

{ "log_level":"INFO", "_time": "Jun 18 12:28:31", "Source" : "IDM","Observer" : {"Entity" : {"SysName" : "chhs-sidm017"}},"Initiator" : {"Entity" : {"SvcName" : "CN=INTG2,OU=SYSTEM,O=SOME","SvcComp" : "\\Driver"}},"Target" : {"Data" : {"DATA" : "<status level=\"success\" type=\"driver-status\">Driver state changed to Running.<application>DirXML</application>\n\t<module>WORKER</module>\n\t<object-dn></object-dn>\n\t<component>Subscriber</component>\n</status>","MIME_HINT" : "3","ORIGINATOR_TYPE" : "1","TARGET_TYPE" : "1","TEXT3" : "Driver state changed to Running.","VALUE1" : "2","VALUE2" : "0","VALUE3" : "0"},"Entity" : {"SvcName" : "CN=WORKER,CN=IDM-INTG,OU=IDM,OU=SYSTEM,O=SOME","SvcComp" : "DirXML-State"}},"Action" : {"Event" : {"Id" : "0.0.3.5","Name" : "Enable Service","SubEvent" : "30022"},"Time" : {"Offset" : 1434623311},"Log" : {"Severity" : 7}}}

mschlereth
New Member

I have been struggling with the same problem and finally got it to work.

The first thing I did was modify the layout.ConversionPattern in the default xdasconfig.properties file so that it does not emit the syslog level and timestamp and outputs only the JSON event:

log4j.appender.R.layout.ConversionPattern=%m%n
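# %m is the raw log message and %n a newline, so only the JSON event itself is emitted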

I found through searching many posts that INDEXED_EXTRACTIONS do NOT work on indexers; they are applied on forwarders. From what I can tell, this is something that changed in version 6. Once I put the following in props.conf on the Universal Forwarder that has the xdas log file locally, it started working like a charm.

props.conf on Universal Forwarder
[xdas]
INDEXED_EXTRACTIONS = json
detect_trailing_nulls = auto
SHOULD_LINEMERGE = false
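
For completeness, the matching monitor input on the forwarder would look something like this (the log path here is hypothetical):

inputs.conf on Universal Forwarder
[monitor:///var/log/xdas/xdas0.log]
sourcetype = xdas
disabled = false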

Finally, modify the props.conf file on the indexer to not perform search-time extraction as well, otherwise the fields will appear to be duplicated.

props.conf on the indexer
[xdas]
KV_MODE = none
AUTO_KV_JSON = false

Hope this helps!


dominiquevocat
SplunkTrust

Hi mschlereth, with these parameters the forwarder won't start anymore, complaining about:

06-23-2015 16:37:24.113 +0200 ERROR JsonLineBreaker - JSON StreamID: 13302974020879139619 had parsing error: Unexpected character while looking for value: 'J'

Also, do you mind posting the setting you use to extract the timestamp from

{"Offset" : 1435057782}

?


dominiquevocat
SplunkTrust

Hm, this in props.conf on the forwarder seems to do fairly well:

TIME_FORMAT=%s
TIMESTAMP_FIELDS=Action.Time.Offset
INDEXED_EXTRACTIONS=json
NO_BINARY_CHECK=true
KV_MODE=json
disabled=false
pulldown_type=true
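
A quick sanity check that _time is really taken from the Offset field (a sketch; note the single quotes eval needs around the dotted field name):

sourcetype=xdas-events | eval drift=_time - 'Action.Time.Offset' | stats count by drift

Every event should land on drift=0.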


mschlereth
New Member

I have not gotten around to getting the timestamp working; I was treating the index time as good enough. I think TIMESTAMP_FIELDS needs to go in the props.conf on the indexer rather than the forwarder, but I'm not sure.


dominiquevocat
SplunkTrust

If I do
| eval tmp=substr(_raw,27) | spath input=tmp | fields - tmp
it looks pretty fine, so I think that if I get the indexing as JSON correct it should work out of the box.
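
If the header length ever varies, replace() avoids the hard-coded substring offset (an untested variant of the same idea):

| eval tmp=replace(_raw, "^[^{]+", "") | spath input=tmp | fields - tmp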

Now in props.conf on the indexer I have

[xdas-events]
SEDCMD-StripHeader = ^[^{]+
KV_MODE = json
INDEXED_EXTRACTIONS = json
pulldown_type=1

but it does not seem to work so well...
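
One probable reason: SEDCMD values must be complete sed expressions rather than bare regexes, and, as mschlereth noted above, INDEXED_EXTRACTIONS takes effect on the forwarder, not the indexer. A corrected sketch of the SEDCMD line:

SEDCMD-StripHeader = s/^[^{]+//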
