I recently had cause to ingest Oracle Unified Directory logs in ODL format. I'm performing pretty simple file-based ingestion and didn't want to go down the path of DB connect, etc. so it made sense to write my own parsing logic. I didn't have a lot of luck finding things precisely relevant to my logs and I could have benefited from the concrete example I'll lay out below, so here it is, hopefully it will help others. This specific example is OUD logs, but it's relevant to any arbitrary _KEY_n, _VAL_n extraction in Splunk.
The ODL logs are mostly well structured, but the format is a little odd to work with.
Below is a small sample of a few different types of logs:
[2016-12-30T11:08:46.216-05:00] [ESSBASE0] [NOTIFICATION:16] [TCP-59] [TCP] [ecid: 1482887126970,0] [tid: 140198389143872] Connected from [::ffff:999.999.99.999]
[2016-12-30T11:08:27.60-05:00] [ESSBASE0] [NOTIFICATION:16] [AGENT-1001] [AGENT] [ecid: 1482887126970,0] [tid: 140198073563456] Received client request: Clear Application/Database (from user [sampleuser@Native Directory])
[2016-12-30T11:08:24.302-05:00] [PLN3] [NOTIFICATION:16] [REQ-91] [REQ] [ecid: 148308120489,0] [tid: 140641102035264] [DBNAME: SAMPLE] Received Command [SetAlias] from user [sampleuser@Native Directory]
[2016-12-30T11:08:26.932-05:00] [PLN3] [NOTIFICATION:16] [SSE-82] [SSE] [ecid: 148308120489,0] [tid: 140641102035264] [DBNAME: SAMPLE] Spreadsheet Extractor Big Block Allocs -- Dyn.Calc.Cache : [202] non-Dyn.Calc.Cache : [0]
[2025-02-07T15:18:35.014-05:00] [OUD] [TRACE] [OUD-24641549] [PROTOCOL] [host: testds01.redacted.fake] [nwaddr: redacted] [tid: 200] [userId: ldap] [ecid: 0000PJY46q64Usw5sFt1iX1bdZJL0003QL,0:1] [category: RES] [conn: 1285] [op: 0] [msgID: 1] [result: 0] [authDN: uid=redacted,ou=redacted,o=redacted,c=redacted] [etime: 0] BIND
[2025-02-07T15:18:35.014-05:00] [OUD] [TRACE] [OUD-24641548] [PROTOCOL] [host: testds01.redacted.fake] [nwaddr: redacted] [tid: 200] [userId: ldap] [ecid: 0000PJY46q64Usw5sFt1iX1bdZJL0003QK,0:1] [category: REQ] [conn: 1285] [op: 0] [msgID: 1] [bindType: SIMPLE] [dn: uid=redacted,ou=redacted,o=redacted,c=redacted] BIND
We opted to use the sourcetype "odl:oud". This allows future extension into other ODL formatted logs.
Note the 3-part REPORT processing. This utilizes regexes in transforms.conf to process the headers, the key-value pairs, and the trailing message. This could then be extended to utilize portions of that extraction for other log types that fall under ODL format.
[odl:oud]
REPORT-oudparse = extractOUDheader, extractODLkv, extractOUDmessage
This was the piece I could have used some concrete examples of. These three report extractions allow a pretty flexible ingestion of all parts of the logs. The key here was separating the _KEY_1 _VAL_1 extraction into its own process with the REPEAT_MATCH flag set
# extract fixed leading values from Oracle Unified Directory log message
[extractOUDheader]
REGEX = ^\[(?<timestamp>\d{4}[^\]]+)\] \[(?<organization_id>[^\]]+)\] \[(?<message_type>[^\]]+)\] \[(?<message_id>[^\]]+)\] \[(?<component>[^\]]+?)\]
# extract N number of key-value pairs from Oracle Diagnostic Logging log body
[extractODLkv]
REGEX = \[(?<_KEY_1>[^:]+?): (?<_VAL_1>[^\]]+?)\]
REPEAT_MATCH = true
# extract trailing, arbitrary message text from Oracle Unified Directory log
[extractOUDmessage]
REGEX = \[[^:]+: [^\]]+\] (?<message>[^\[].*)$.
The final regex there looks for a preceding key-value pair, NOT followed by a new square bracket, with any arbitrary characters thereafter to end the line.
I initially tried to make one regex to perform all of this, which doesn't really work without being very prescriptive in the structure. With an unknown number of key-value pairs, this is not ideal and this solution seemed much more Splunk-esque.
I hope this helps someone else!
Accepting this post as a solution since my "question" contains the solution and was really for information sharing purposes.
Accepting this post as a solution since my "question" contains the solution and was really for information sharing purposes.
@Wiessiet- Thanks for posting your findings on Splunk community.
Though I have a suggestion for you. If you already have a solution to the problem. What I would do is post a question and post a reply to your question & accept your answer. This way other people see that question as resolved & available with solution straightaway.
I hope this make sense!!