Repeat Matches in transforms; keeping groups of data together

icewolf69 · ‎04-09-2023

Hi,

I'm running index-time field extractions for a large TXT report. For this particular regex, I'm capturing 3 fields and then using the REPEAT_MATCH = true setting to crawl the rest of the TXT file.

My goal is to extract the data while keeping each set of captured fields together and separate from the next set of captures. For example:

repeat 0 regex: Title, CCI, FixText

repeat 1 regex: Title, CCI, FixText

repeat 2 regex: Title, CCI, FixText

I need to keep the repeat 0 fields connected somehow, the repeat 1 fields connected somehow, and the repeat 2 fields connected somehow, while also keeping the repeat 0 set separate from the repeat 1 set. In this example, I want to ensure that the Title from repeat 0 doesn't end up attached to the CCI from repeat 1.

In my current version, I get all the data I need, but it ends up in giant multivalue fields, where "CCI" might contain 100 items and "FixText" might contain 100 items. I can't figure out how to divide or expand them so that each group has the correct correlated information. The "FixText" field can contain one line or many lines, so I can't easily separate the values from one another after they get grouped.

I'm OK with expanding these at search time rather than index time, but I'm thinking it might be easier to reference the fields if they are separated at index time. Maybe I could add a pipe or something to the end of each capture, and then use a delimiter to expand the fields?

Any help is appreciated.
Thank you

Transforms:

[SCAP_FAIL_INFO]
REGEX = Title\s+\:\s(?<scap_fail_title>V.+)[\s\S]+?NIST\sSP\s800\-53\sRev\s4\:\s(?<scap_cci>.+);[\s\S]+?Fix\sText\s+(?<scap_fix_text>[\S\s]*?)?\nSeverity
LOOKAHEAD = 600000
REPEAT_MATCH = true
WRITE_META = true

tscroggins · ‎04-09-2023

Hi,

A better approach might be to index each result as a separate event. This allows Splunk to manage the data more efficiently while satisfying your requirement to keep result fields together.

Can you provide a sample of your SCAP report format?
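
In the meantime, here is a rough sketch of the one-event-per-result idea. It assumes, based only on the regex in your transform, that every result starts with a "Title : V..." line and ends with a "Severity" line. The sourcetype name scap_report and the extraction class name scap_fail_info are hypothetical placeholders, and the patterns will likely need adjusting once the real report format is known.

```
# props.conf (sketch; the sourcetype name scap_report is a hypothetical placeholder)
[scap_report]
# Break the file wherever a newline is followed by a "Title :" line,
# so each result is indexed as its own event.
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)(?=Title\s+:\s)
# Fix Text may span many lines. 0 disables truncation; a finite limit
# is safer if the maximum result size is known.
TRUNCATE = 0
# Search-time extraction: one Title, CCI, and Fix Text per event.
EXTRACT-scap_fail_info = Title\s+:\s(?<scap_fail_title>V.+)[\s\S]+?NIST\sSP\s800-53\sRev\s4:\s(?<scap_cci>.+);[\s\S]+?Fix\sText\s+(?<scap_fix_text>[\s\S]*?)\nSeverity
```

With one result per event, each event carries a single scap_fail_title, scap_cci, and scap_fix_text, so the three values stay correlated without REPEAT_MATCH or any multivalue expansion at search time. Any report header that appears before the first Title line would become its own event under this sketch.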
