How do I get REGEX to extract multiple values from an event?

tmarlette
Motivator

I have a log that gets pretty long, and Splunk is currently ingesting it as a single event. The log can run to a couple hundred lines, and it contains multiple occurrences of a value that I need to extract. I am currently using a regex to do the extraction, but it only pulls one instance and does not extract the other instances within the log.

For example, here is my extraction:

NOTE:\sPROCEDURE\s(?<procedure>\w+)\sused

And here is the log file that I am consuming.

NOTE: Deleting WORK.CONTENTS (memtype=DATA).
    NOTE: PROCEDURE DATASETS used (Total process time):
          real time           0.00 seconds
          cpu time            0.01 seconds

    MACROGEN(CONTENTS_CNTR):   data _null_ ;
    MACROGEN(CONTENTS_CNTR):   file "/idn/wsmis/SDPMON_Raw/Logs/SDPMONRaw_Job45_error_log_FDT20150423_RD20151116.txt" mod ;
    MACROGEN(CONTENTS_CNTR):   put "*** value for list_of_files cnt/freq:" @80 "8" @93 "***;";

omitting 197 lines...
    NOTE: PROCEDURE CONTENTS used (Total process time):
          real time           0.00 seconds
          cpu time            0.00 seconds

    MACROGEN(CONTENTS_CNTR):   data _null_ ;
    MACROGEN(CONTENTS_CNTR):   set contents ;
    MACROGEN(CONTENTS_CNTR):   if _n_ = 1 ;
    MACROGEN(CONTENTS_CNTR):   call symput('no_obs',strip(put(NOBS,comma12.)));
    MACROGEN(CONTENTS_CNTR):   call symput('desc',"list_of_files_last");

In this example you can see that there are two PROCEDURE entries: the first is called DATASETS and the next is called CONTENTS.
My extraction only pulls out the DATASETS value and not the other one. Should I be adding a setting to my sourcetype to allow for multiple values here?

Adding the search / index time extractions as requested:

search time settings:

EXTRACT-rT_cpUT = The SAS System used:\s+real\s+time\s+(?<totalRealTime>[^s]+)[^.*]+cpu\stime\s+(?<totalCPUTime>[^s]+)\s+
EVAL-totalCPUTime = replace(totalCPUTime, "^(\d{2})\.(\d{2})","00:00:\1.\2")
EXTRACT-proc = NOTE:\sPROCEDURE\s(?<procedure>\w+)\sused
EXTRACT-logFile = \/idn\/saslogs\/Altlogs_Linux\/(?<fileDate>\d+)\/(?<user>[^-]+)-(?<version>[^-]+)-\d+-(?<startTime>\d+)-PID(?<pid>\d+) in source
EXTRACT-logFile2 = \/idn\/saslogs\/Altlogs\/(?<fileDate>\d+)\/(?<user>[^-]+)-(?<version>[^-]+)-\d+-(?<startTime>\d+)-PID(?<pid>\d+) in source

index time settings:

NO_BINARY_CHECK=1
LINE_BREAKER = ((*FAIL))
SHOULD_LINEMERGE = false
TRUNCATE = 9999999

Thank you for any help!!

1 Solution

woodcock
Esteemed Legend

You need to add MV_ADD = 1 to the appropriate stanza in transforms.conf. This does the same thing:

...  | rex max_match=0 "NOTE:\sPROCEDURE\s(?<procedure>\w+)\sused"
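To see the difference outside Splunk, here is a minimal Python sketch (an analogy only, not Splunk's implementation). It runs the pattern from the question against a shortened copy of the sample log, once stopping at the first match and once collecting every match. Note that Python writes named groups as (?P<name>...), so the pattern is adjusted accordingly.

```python
import re

# Shortened copy of the log from the question; the omitted middle is left out.
LOG = """NOTE: Deleting WORK.CONTENTS (memtype=DATA).
    NOTE: PROCEDURE DATASETS used (Total process time):
          real time           0.00 seconds
          cpu time            0.01 seconds
    NOTE: PROCEDURE CONTENTS used (Total process time):
          real time           0.00 seconds
          cpu time            0.00 seconds
"""

# Same pattern as in the question, with Python's (?P<name>...) group syntax.
PATTERN = re.compile(r"NOTE:\sPROCEDURE\s(?P<procedure>\w+)\sused")

# Stops at the first match: comparable to rex with its default max_match=1.
first_only = PATTERN.search(LOG).group("procedure")

# Collects every match: comparable to rex max_match=0 or MV_ADD = 1.
all_values = PATTERN.findall(LOG)

print(first_only)
print(all_values)
```

The regex itself is fine; what changes is whether matching continues after the first hit.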


Sebastian2
Path Finder

Why don't you use the LINE_BREAKER expression to properly break your events? And what are you trying to achieve with LINE_BREAKER = ((*FAIL)) here?

  1. http://docs.splunk.com/Documentation/Splunk/6.2.2/Data/Indexmulti-lineevents#Line_breaking_general_a...
  2. http://docs.splunk.com/Documentation/Splunk/6.3.1/Admin/Propsconf (search for "LINE_BREAKER")

tmarlette
Motivator

Unfortunately, the logs are not clean enough to use a line breaker. Event start/stop is not clearly delineated.


Sebastian2
Path Finder

Did you read the props.conf documentation carefully? There are a bunch of possibilities to break events (not only LINE_BREAKER):

/edit: I cut that out again; it was way too much badly formatted text. Search for LINE_BREAKER; there are several pages regarding event breaking.


tmarlette
Motivator

I tried this, though it didn't seem to work. By "this" I mean the 'rex' format you mentioned above; I didn't adjust anything in props.


richgalloway
SplunkTrust

You will need to add MV_ADD=1 to the transforms.conf stanza that props.conf references for the extraction to work correctly. Then you will have to use the mv* commands to process the multivalued 'procedure' field.
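As a rough analogy in Python (not SPL), a multivalued field behaves like a list attached to a single event. The sketch below mimics what mvcount, mvdedup, and mvexpand do to such a field. The event dictionary, its 'source' value, and the repeated DATASETS entry are hypothetical and exist only for illustration.

```python
# Hypothetical event whose 'procedure' field holds several values,
# as it might after an extraction with MV_ADD or rex max_match=0.
event = {"source": "job45.log", "procedure": ["DATASETS", "CONTENTS", "DATASETS"]}

# Roughly what mvcount(procedure) reports: the number of values in the field.
value_count = len(event["procedure"])

# Roughly what mvdedup(procedure) does: drop repeats, keep first-seen order.
unique_values = list(dict.fromkeys(event["procedure"]))

# Roughly what the mvexpand command does: one copy of the event per value.
expanded = [{**event, "procedure": value} for value in event["procedure"]]

print(value_count)
print(unique_values)
for row in expanded:
    print(row)
```

In Splunk itself the equivalent step would be an eval function or the mvexpand command applied after the extraction.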


yulianaif
New Member

I tried this and it worked.

Config in props.conf:

# testmv_add is the sourcetype
[testmv_add]
SHOULD_LINEMERGE = true
REPORT-testmv_add = mv_addreport

Config in transforms.conf:

[mv_addreport]
REGEX=PROCEDURE\s([^\s]+)
FORMAT = ProcedureName::$1
MV_ADD=true


richgalloway
SplunkTrust

Please provide the entire props.conf stanza for this sourcetype, if you're doing an index-time extraction.
If you're doing a search-time extraction, please provide the search.
