This is my first time ingesting JSON logs, so I need assistance figuring out why my JSON fields are not auto-extracting.
Environment: SHC, IDX cluster, typical management servers.
I first tested a manual upload of a log sample by going to a SH, then Settings -> Add Data -> Upload. When I uploaded the log, the sourcetype _json was automatically selected. In the preview panel everything looked good, so I saved the sourcetype as foo and completed the upload into index=test. Looking at the data, everything was good: the "Interesting Fields" pane on the left had the auto-extracted fields.
In ../apps/search/local/props.conf, an entry was created...
[foo]
KV_MODE = none
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
category = Structured
description = JavaScript Object Notation format. For more information, visit http://json.org/
disabled = false
pulldown_type = true
BREAK_ONLY_BEFORE =
INDEXED_EXTRACTIONS = json
I noticed it used INDEXED_EXTRACTIONS, which is not what I wanted (I have never used indexed extractions before), but these are just occasional scan logs, literally a few kilobytes every now and then, so it wasn't a big deal.
I copied the above sourcetype stanza into an app in the cluster manager's manager-apps folder (an app where I keep a bunch of random one-off props.conf sourcetype stanzas), then pushed it out to the IDX cluster. I then created an inputs.conf and a server class on the DS to push to the particular forwarder that monitors the folder for the JSON scan logs; a rough sketch of that inputs.conf is below. As expected, the scan logs eventually started being indexed and were viewable on the search head.
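For reference, the inputs.conf I deployed looks roughly like this (the monitor path and index name here are placeholders, not my real values):

# path and index are illustrative
[monitor:///opt/scanlogs/*.json]
sourcetype = foo
index = scan_logs
disabled = false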
Unfortunately, the auto extractions were not being parsed. The Interesting Fields panel on the left had only the default fields. In the right panel where the events are displayed, the field names were highlighted in red, which I guess means Splunk recognizes the field names? But either way, the issue is I had no interesting fields.
I figured maybe the issue was that I had INDEXED_EXTRACTIONS set on the search heads, and that's probably an indexer setting, so I commented it out and tried using KV_MODE = json in its place. I saved the .conf file and restarted the SH, but the issue remains: no interesting fields.
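So, reconstructing from the original stanza, the version in the search head app now effectively reads:

[foo]
KV_MODE = json
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
category = Structured
disabled = false
# INDEXED_EXTRACTIONS = json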
The test upload worked just fine and I had interesting fields in the test index, but once the logs started coming through from the UF, I no longer had interesting fields, despite using the same sourcetype.
What am I missing? Is there more to ingesting a JSON file than simply using KV_MODE or INDEXED_EXTRACTIONS? But then why does my test upload work?
Here is a sample log:
{"createdAt": "2024-09-04T15:23:12-04:00", "description": "bunch of words.", "detectorId": "text/hardcoded-credentials@v1.0", "detectorName": "Hardcoded credentials", "detectorTags": ["secrets", "security", "owasp-top10", "top25-cwes", "cwe-798", "Text"], "generatorId": "something", "id": "LongIDstring", "remediation": {"recommendation": {"text": "a bunch of text.", "url": "a url"}}, "resource": {"id": "oit-aws-codescan"}, "ruleId": "multilanguage-password", "severity": "Critical", "status": "Open", "title": "CWE-xxx - Hardcoded credentials", "type": "Software and Configuration Checks", "updatedAt": "2024-09-18T10:54:02.916000-04:00", "vulnerability": {"filePath": {"codeSnippet": [{"content": " ftp_site = 'something.com'", "number": 139}, {"content": " ftp_base = '/somesite/'", "number": 140}, {"content": " ftp_filename_ext = '.csv'", "number": 111}, {"content": " ", "number": 111}, {"content": " ftp_username = 'anonymous'", "number": 111}, {"content": " ftp_password = 'a****'", "number": 111}, {"content": "", "number": 111}, {"content": " # -- DOWNLOAD DATA -----", "number": 111}, {"content": " # Put all of the data pulls within a try-except case to protect against crashing", "number": 111}, {"content": "", "number": 148}, {"content": " email_alert_sent = False", "number": 111}], "endLine": 111, "name": "somethingsomething.py", "path": "something.py", "startLine": 111}, "id": "LongIDstring", "referenceUrls": [], "relatedVulnerabilities": ["CWE-xxx"]}}
I appreciate any guidance.
Thank you for sharing your detailed process and the issue you're encountering with JSON log ingestion. Your testing approach was thorough, but there are a few key points to address. INDEXED_EXTRACTIONS for a monitored file is applied on the universal forwarder itself, so a stanza that lives only on the indexers never produces index-time extractions; and since the same stanza set KV_MODE = none, there was no search-time extraction either. That is why the manual upload (which runs entirely on the SH, where the stanza exists) worked while the forwarded data did not.
After your KV_MODE = json change, the remaining suspect is configuration precedence: another app may define props for this sourcetype and override yours on the search head. Run the search below to figure out which app takes precedence.
| rest splunk_server=local /services/configs/conf-props/YOURSOURCETYPE | transpose | search column="eai:acl.app"
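If you have CLI access to the search head, btool shows the same precedence information along with the exact file each setting comes from (assuming a default $SPLUNK_HOME):

$SPLUNK_HOME/bin/splunk btool props list foo --debug

As a quick sanity check you can also run "index=yourindex sourcetype=foo | spath" over a few events; if spath populates the fields, the events are valid JSON and the problem is purely which props.conf wins at search time.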
Please upvote / accept as Solved if this helps.