Splunk Search

How do I get REGEX to extract multiple values from an event?

Motivator

I have a log that gets pretty long, and currently Splunk is ingesting it as a whole. The log can run to a couple hundred lines, and there are multiple events within it that I need to extract. I am currently using REGEX to do the extraction, but it is only pulling a single instance and not extracting the other instances within the log.

For example, here is my extraction:

NOTE:\sPROCEDURE\s(?<procedure>\w+)\sused

And here is the log file that I am consuming.

NOTE: Deleting WORK.CONTENTS (memtype=DATA).
    NOTE: PROCEDURE DATASETS used (Total process time):
          real time           0.00 seconds
          cpu time            0.01 seconds

    MACROGEN(CONTENTS_CNTR):   data _null_ ;
    MACROGEN(CONTENTS_CNTR):   file "/idn/wsmis/SDPMON_Raw/Logs/SDPMONRaw_Job45_error_log_FDT20150423_RD20151116.txt" mod ;
    MACROGEN(CONTENTS_CNTR):   put "*** value for list_of_files cnt/freq:" @80 "8" @93 "***;";

omitting 197 lines...
    NOTE: PROCEDURE CONTENTS used (Total process time):
          real time           0.00 seconds
          cpu time            0.00 seconds

    MACROGEN(CONTENTS_CNTR):   data _null_ ;
    MACROGEN(CONTENTS_CNTR):   set contents ;
    MACROGEN(CONTENTS_CNTR):   if _n_ = 1 ;
    MACROGEN(CONTENTS_CNTR):   call symput('no_obs',strip(put(NOBS,comma12.)));
    MACROGEN(CONTENTS_CNTR):   call symput('desc',"list_of_files_last");

In this example you can clearly see that there are two PROCEDUREs: the first is called DATASETS and the next is called CONTENTS.
My extraction is only pulling out the DATASETS value and not the other. Should I be adding a setting to my sourcetype to allow for multiple values here?

Adding the search / index time extractions as requested:

search time settings:

EXTRACT-rT_cpUT = The SAS System used:\s+real\s+time\s+(?<totalRealTime>[^s]+)[^.*]+cpu\stime\s+(?<totalCPUTime>[^s]+)\s+
EVAL-totalCPUTime = replace(totalCPUTime, "^(\d{2})\.(\d{2})","00:00:\1.\2")
EXTRACT-proc = NOTE:\sPROCEDURE\s(?<procedure>\w+)\sused
EXTRACT-logFile = \/idn\/saslogs\/Altlogs_Linux\/(?<fileDate>\d+)\/(?<user>[^-]+)-(?<version>[^-]+)-\d+-(?<startTime>\d+)-PID(?<pid>\d+) in source
EXTRACT-logFile2 = \/idn\/saslogs\/Altlogs\/(?<fileDate>\d+)\/(?<user>[^-]+)-(?<version>[^-]+)-\d+-(?<startTime>\d+)-PID(?<pid>\d+) in source

index time settings:

NO_BINARY_CHECK=1
LINE_BREAKER = ((*FAIL))
SHOULD_LINEMERGE = false
TRUNCATE = 9999999

Thank you for any help!!


Path Finder

Why don't you use a LINE_BREAKER expression to break your events properly? And what are you trying to achieve with LINE_BREAKER = ((*FAIL))?

  1. http://docs.splunk.com/Documentation/Splunk/6.2.2/Data/Indexmulti-lineevents#Line_breaking_general_a...
  2. http://docs.splunk.com/Documentation/Splunk/6.3.1/Admin/Propsconf (search for "LINE_BREAKER")

Motivator

Unfortunately, the logs are not clean enough to use a line breaker. Event start/stop is not clearly delineated.


Path Finder

Did you read the props.conf documentation carefully? There are several ways to break events (not only LINE_BREAKER).

/edit I cut that out again, way too much ugly formatted text. Search for LINE_BREAKER, there are several pages regarding event breaking.


Esteemed Legend

You need to add MV_ADD = 1 to the appropriate stanza in transforms.conf. This does the same thing:

...  | rex max_match=0 "NOTE:\sPROCEDURE\s(?<procedure>\w+)\sused"
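As a rough sketch of what the transforms.conf route might look like for the extraction in the question: the sourcetype name sas_log and the transform name sas_procedure_mv below are hypothetical placeholders, so substitute your own. The regex is the one from the question, with the named group replaced by a numbered group that FORMAT maps to the field name.

```
# props.conf
# [sas_log] is a hypothetical sourcetype name
[sas_log]
REPORT-proc = sas_procedure_mv

# transforms.conf
# sas_procedure_mv is a hypothetical transform name
[sas_procedure_mv]
REGEX = NOTE:\sPROCEDURE\s(\w+)\sused
FORMAT = procedure::$1
MV_ADD = true
```

If you go this way, the inline EXTRACT-proc line in props.conf should no longer be needed, since an inline EXTRACT keeps only the first match. REPORT is a search-time extraction, so it should apply to events that are already indexed.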


Motivator

I tried this, though it didn't seem to work. By "this" I mean the rex command you mentioned above; I didn't change anything in props.conf.


SplunkTrust

You will need to add MV_ADD=1 to a transforms.conf stanza and reference that stanza from props.conf (with a REPORT- setting) for the field to work correctly. Then you will have to use the mv* commands and functions to process the multi-valued 'procedure' field.
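To illustrate the mv* handling, here is a self-contained sketch you can paste into the search bar. It fakes a single event with makeresults (the _raw text is a shortened, made-up stand-in for the log in the question), extracts every match with rex max_match=0, and then uses mvcount, mvindex and mvexpand on the resulting multi-valued field. The field names procedure_count and first_procedure are just illustrative.

```
| makeresults
| eval _raw="NOTE: PROCEDURE DATASETS used (Total process time): real time 0.00 seconds NOTE: PROCEDURE CONTENTS used (Total process time): real time 0.00 seconds"
| rex max_match=0 "NOTE:\sPROCEDURE\s(?<procedure>\w+)\sused"
| eval procedure_count=mvcount(procedure)
| eval first_procedure=mvindex(procedure, 0)
| mvexpand procedure
| table procedure procedure_count first_procedure
```

mvexpand splits the event into one row per value of procedure, which is usually the form you want before running something like stats count by procedure.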

---
If this reply helps you, an upvote would be appreciated.

New Member

I tried this and it worked.

Config in props.conf:

# The stanza name is the sourcetype
[testmv_add]
SHOULD_LINEMERGE = true
REPORT-testmv_add = mv_addreport

Config in transforms.conf:

[mv_addreport]
REGEX=PROCEDURE\s([^\s]+)
FORMAT = ProcedureName::$1
MV_ADD=true


SplunkTrust

Please provide the entire props.conf stanza for this sourcetype, if you're doing an index-time extraction.
If you're doing a search-time extraction, please provide the search.
