Splunk Search

How do I get REGEX to extract multiple values from an event?

tmarlette
Motivator

I have a log that gets pretty long, and currently Splunk is ingesting it as a whole. The log can run to a couple hundred lines, and there are multiple events within it that I need to extract. I am currently using a regex to do the extraction, but it is only pulling one instance of the match and not the other instances within the log.

For example, here is my extraction:

NOTE:\sPROCEDURE\s(?<procedure>\w+)\sused

And here is the log file that I am consuming.

NOTE: Deleting WORK.CONTENTS (memtype=DATA).
    NOTE: PROCEDURE DATASETS used (Total process time):
          real time           0.00 seconds
          cpu time            0.01 seconds

    MACROGEN(CONTENTS_CNTR):   data _null_ ;
    MACROGEN(CONTENTS_CNTR):   file "/idn/wsmis/SDPMON_Raw/Logs/SDPMONRaw_Job45_error_log_FDT20150423_RD20151116.txt" mod ;
    MACROGEN(CONTENTS_CNTR):   put "*** value for list_of_files cnt/freq:" @80 "8" @93 "***;";

omitting 197 lines...
    NOTE: PROCEDURE CONTENTS used (Total process time):
          real time           0.00 seconds
          cpu time            0.00 seconds

    MACROGEN(CONTENTS_CNTR):   data _null_ ;
    MACROGEN(CONTENTS_CNTR):   set contents ;
    MACROGEN(CONTENTS_CNTR):   if _n_ = 1 ;
    MACROGEN(CONTENTS_CNTR):   call symput('no_obs',strip(put(NOBS,comma12.)));
    MACROGEN(CONTENTS_CNTR):   call symput('desc',"list_of_files_last");

In this example you can clearly see that there are two PROCEDUREs: the first is called DATASETS and the next is called CONTENTS.
My extraction is only pulling out the DATASETS value and not the other. Should I be adding a setting to my sourcetype to allow for multiple values here?

Adding the search / index time extractions as requested:

search time settings:

EXTRACT-rT_cpUT = The SAS System used:\s+real\s+time\s+(?<totalRealTime>[^s]+)[^.*]+cpu\stime\s+(?<totalCPUTime>[^s]+)\s+
EVAL-totalCPUTime = replace(totalCPUTime, "^(\d{2})\.(\d{2})","00:00:\1.\2")
EXTRACT-proc = NOTE:\sPROCEDURE\s(?<procedure>\w+)\sused
EXTRACT-logFile = \/idn\/saslogs\/Altlogs_Linux\/(?<fileDate>\d+)\/(?<user>[^-]+)-(?<version>[^-]+)-\d+-(?<startTime>\d+)-PID(?<pid>\d+) in source
EXTRACT-logFile2 = \/idn\/saslogs\/Altlogs\/(?<fileDate>\d+)\/(?<user>[^-]+)-(?<version>[^-]+)-\d+-(?<startTime>\d+)-PID(?<pid>\d+) in source

index time settings:

NO_BINARY_CHECK=1
LINE_BREAKER = ((*FAIL))
SHOULD_LINEMERGE = false
TRUNCATE = 9999999

Thank you for any help!!

0 Karma
1 Solution

woodcock
Esteemed Legend

You need to add MV_ADD = 1 to the appropriate stanza in transforms.conf. This does the same thing:

...  | rex max_match=0 "NOTE:\sPROCEDURE\s(?<procedure>\w+)\sused"
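To illustrate why a plain extraction returns only one value, here is a minimal sketch in Python rather than Splunk. It is only an analogy: Python's `re` module is not the engine Splunk uses, and the `event` string is a shortened excerpt of the log from the question. Stopping at the first match stands in for an extraction without MV_ADD or max_match, and collecting every match stands in for `max_match=0` or `MV_ADD`.

```python
import re

# Shortened excerpt of the log from the question, treated as one event.
event = """NOTE: Deleting WORK.CONTENTS (memtype=DATA).
    NOTE: PROCEDURE DATASETS used (Total process time):
          real time           0.00 seconds
          cpu time            0.01 seconds
    NOTE: PROCEDURE CONTENTS used (Total process time):
          real time           0.00 seconds
          cpu time            0.00 seconds
"""

# Same pattern as in the question; Python spells named groups (?P<name>...).
pattern = re.compile(r"NOTE:\sPROCEDURE\s(?P<procedure>\w+)\sused")

# Stop at the first match: comparable to an extraction that keeps one value.
first = pattern.search(event)
first_only = first.group("procedure") if first else None

# Collect every match: comparable to a multivalue field.
all_values = [m.group("procedure") for m in pattern.finditer(event)]

print(first_only)
print(all_values)
```

The first result holds only DATASETS, while the second holds both DATASETS and CONTENTS, which is the behaviour the question is asking for.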


Sebastian2
Path Finder

Why don't you use the LINE_BREAKER expression to properly break your events? And what are you trying to achieve with LINE_BREAKER = ((*FAIL)) here?

  1. http://docs.splunk.com/Documentation/Splunk/6.2.2/Data/Indexmulti-lineevents#Line_breaking_general_a...
  2. http://docs.splunk.com/Documentation/Splunk/6.3.1/Admin/Propsconf (search for "LINE_BREAKER")
0 Karma

tmarlette
Motivator

Unfortunately, the logs are not clean enough to use a line breaker. Event start/stop is not clearly delineated.

0 Karma

Sebastian2
Path Finder

Did you read the props.conf documentation carefully? There are a bunch of possibilities to break events (not only the LINE_BREAKER):

/edit I cut that out again, way too much badly formatted text. Search for LINE_BREAKER, there are several pages regarding event breaking.

0 Karma

woodcock
Esteemed Legend

You need to add MV_ADD = 1 to the appropriate stanza in transforms.conf. This does the same thing:

...  | rex max_match=0 "NOTE:\sPROCEDURE\s(?<procedure>\w+)\sused"

tmarlette
Motivator

I tried this, though it didn't seem to work. By "this" I mean the rex command you mentioned above; I didn't change anything in props.

0 Karma

richgalloway
SplunkTrust

You will need to add MV_ADD=1 to the transforms.conf stanza (referenced from props.conf with a REPORT- setting) for the extraction to return every match. Then you will have to use the mv* commands to process the multi-valued 'procedure' field.

---
If this reply helps you, Karma would be appreciated.
0 Karma

yulianaif
New Member

I tried this and it worked.

Config in props.conf:

[testmv_add] => this is the sourcetype
SHOULD_LINEMERGE = true
REPORT-testmv_add = mv_addreport

Config in transforms.conf:

[mv_addreport]
REGEX=PROCEDURE\s([^\s]+)
FORMAT = ProcedureName::$1
MV_ADD=true
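Note that the REGEX in this transform is looser than the one in the question: it does not require the leading "NOTE:" or the trailing "used". The sketch below, in Python and only as an analogy to Splunk's matching, compares the two patterns. The third log line is hypothetical (it does not appear in the original log) and is there only to show a case where the two patterns disagree.

```python
import re

lines = [
    "NOTE: PROCEDURE DATASETS used (Total process time):",
    "NOTE: PROCEDURE CONTENTS used (Total process time):",
    # Hypothetical line, not from the original log, added for illustration.
    "ERROR: PROCEDURE SORT not found.",
]
event = "\n".join(lines)

# Pattern from the transforms.conf stanza in this reply.
loose = re.compile(r"PROCEDURE\s([^\s]+)")

# Pattern from the question, with the named group reduced to a plain group.
strict = re.compile(r"NOTE:\sPROCEDURE\s(\w+)\sused")

# findall returns every captured value, like a multivalue field.
loose_values = loose.findall(event)
strict_values = strict.findall(event)

print(loose_values)
print(strict_values)
```

If only the "NOTE: PROCEDURE ... used" lines should count, keeping the stricter pattern from the question in the transform may be the safer choice.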


0 Karma

richgalloway
SplunkTrust

Please provide the entire props.conf stanza for this sourcetype, if you're doing an index-time extraction.
If you're doing a search-time extraction, please provide the search.

---
If this reply helps you, Karma would be appreciated.
0 Karma