This is my first time ingesting JSON logs, so I need assistance figuring out why my JSON fields are not auto-extracting.
Environment: SHC, IDX cluster, typical management servers.
I first tested a manual upload of a log sample by going to a SH, then Settings -> Add Data -> Upload. When I uploaded the log, the sourcetype _json was automatically selected. In the preview panel everything looked good, so I saved the sourcetype as foo and completed the upload into index=test. Looking at the data, everything was good: the "Interesting Fields" pane on the left had the auto-extracted fields.
In ../apps/search/local/props.conf, an entry was created...
[foo]
KV_MODE = none
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
category = Structured
description = JavaScript Object Notation format. For more information, visit http://json.org/
disabled = false
pulldown_type = true
BREAK_ONLY_BEFORE =
INDEXED_EXTRACTIONS = json
I noticed it used INDEXED_EXTRACTIONS, which is not what I wanted (I have never used indexed extractions before), but these are just occasional scan logs, literally a few kilobytes every now and then, so it wasn't a big deal.
I copied the above sourcetype stanza into an app in the cluster manager's manager-apps folder (an app where I keep a bunch of random one-off props.conf sourcetype stanzas), then pushed it out to the IDX cluster. I then created an inputs.conf and a server class on the DS to push to the particular forwarder that monitors the folder for the JSON scan logs; a rough sketch of that inputs.conf is below. As expected, the scan logs eventually started being indexed and were viewable on the search head.
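For reference, the inputs.conf I deployed looks roughly like this (the monitor path and index name here are placeholders, not my real values):

# path and index are illustrative
[monitor:///opt/scanlogs/*.json]
sourcetype = foo
index = scan_logs
disabled = false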
Unfortunately, the auto extractions were not being parsed. The Interesting Fields panel on the left had only the default fields. In the right panel where the events are displayed, the field names were highlighted in red, which I guess means Splunk recognizes the field names? But either way, the issue is I had no interesting fields.
I figured maybe the issue was that I had INDEXED_EXTRACTIONS set on the search heads, and that's probably an indexer setting, so I commented it out and tried using KV_MODE = json in its place. I saved the .conf file and restarted the SH, but the issue remains: no interesting fields.
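So, reconstructing from the original stanza, the version in the search head app now effectively reads:

[foo]
KV_MODE = json
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
category = Structured
disabled = false
# INDEXED_EXTRACTIONS = json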
The test upload worked just fine and I had interesting fields in the test index, but once the logs started coming through from the UF, I no longer had interesting fields, despite using the same sourcetype.
What am I missing? Is there more to ingesting a JSON file than simply using KV_MODE or INDEXED_EXTRACTIONS? But then why does my test upload work?
Here is a sample log:
{"createdAt": "2024-09-04T15:23:12-04:00", "description": "bunch of words.", "detectorId": "text/hardcoded-credentials@v1.0", "detectorName": "Hardcoded credentials", "detectorTags": ["secrets", "security", "owasp-top10", "top25-cwes", "cwe-798", "Text"], "generatorId": "something", "id": "LongIDstring", "remediation": {"recommendation": {"text": "a bunch of text.", "url": "a url"}}, "resource": {"id": "oit-aws-codescan"}, "ruleId": "multilanguage-password", "severity": "Critical", "status": "Open", "title": "CWE-xxx - Hardcoded credentials", "type": "Software and Configuration Checks", "updatedAt": "2024-09-18T10:54:02.916000-04:00", "vulnerability": {"filePath": {"codeSnippet": [{"content": " ftp_site = 'something.com'", "number": 139}, {"content": " ftp_base = '/somesite/'", "number": 140}, {"content": " ftp_filename_ext = '.csv'", "number": 111}, {"content": " ", "number": 111}, {"content": " ftp_username = 'anonymous'", "number": 111}, {"content": " ftp_password = 'a****'", "number": 111}, {"content": "", "number": 111}, {"content": " # -- DOWNLOAD DATA -----", "number": 111}, {"content": " # Put all of the data pulls within a try-except case to protect against crashing", "number": 111}, {"content": "", "number": 148}, {"content": " email_alert_sent = False", "number": 111}], "endLine": 111, "name": "somethingsomething.py", "path": "something.py", "startLine": 111}, "id": "LongIDstring", "referenceUrls": [], "relatedVulnerabilities": ["CWE-xxx"]}}
I appreciate any guidance.
Thank you for sharing your detailed process and the issue you're encountering with JSON log ingestion. Your testing approach was thorough, but there are a few key points to address. INDEXED_EXTRACTIONS for a monitored file is applied on the universal forwarder itself, so a stanza that lives only on the indexers never produces index-time extractions; and since the same stanza set KV_MODE = none, there was no search-time extraction either. That is why the manual upload (which runs entirely on the SH, where the stanza exists) worked while the forwarded data did not.
After your KV_MODE = json change, the remaining suspect is configuration precedence: another app may define props for this sourcetype and override yours on the search head. Run the search below to figure out which app takes precedence.
| rest splunk_server=local /services/configs/conf-props/YOURSOURCETYPE | transpose | search column="eai:acl.app"
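If you have CLI access to the search head, btool shows the same precedence information along with the exact file each setting comes from (assuming a default $SPLUNK_HOME):

$SPLUNK_HOME/bin/splunk btool props list foo --debug

As a quick sanity check you can also run "index=yourindex sourcetype=foo | spath" over a few events; if spath populates the fields, the events are valid JSON and the problem is purely which props.conf wins at search time.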
Please upvote / accept as Solved if this helps.