Splunk Search

Extract fields with multiple results or generate fake/dummy events

michwii
New Member

Hi all,

I've been struggling lately with regular expressions and field extractions on events that contain multiple results.

We are trying to parse SVN logs and run some statistics on them.
In an SVN log entry we have a date, a userID, one or more actions (D = Delete, A = Add, U = Update), each with an associated file, and finally a comment.

Here are 3 examples:
Very simple, 1 file added:

Fri Jul 31 09:36:48 CEST 2015 --- X9896 --- A BUSINESSOBJECTS/TAGS/PROD_2564845/ --- Progetto BI1500624 - Mapping

Multiple files committed:

Wed Jul 29 11:05:03 CEST 2015 --- X9896 --- A BATCH/BRANCHES/PROD/MENS/DEW/EXECUTE_95_22 A BATCH/BRANCHES/PROD/MENS/DEW/assicurazione_nuove_fasce_riporti__95_22.ksh A BATCH/BRANCHES/PROD/MENS/DEW/Inssurance__95_22.sas U BATCH/BRANCHES/PROD/MENS/DEW/coeff_rip.ksh A BATCH/BRANCHES/PROD/MENS/DEW/coeff_cc_istituzionale_rip__95_22.ksh A BATCH/BRANCHES/PROD/MENS/DEW/coeff_cc_rip__95_22.sas A BATCH/BRANCHES/PROD/MENS/DEW/coeff_cc_rip__95_22__201105.sas A BATCH/BRANCHES/PROD/MENS/DEW/incassi_nuove_fasce_riporti__95_22.ksh A BATCH/BRANCHES/PROD/MENS/DEW/incassi_rip__95_22.sas A BATCH/BRANCHES/PROD/MENS/DEW/incorso_afterInstutition_rip__95_22.ksh A BATCH/BRANCHES/PROD/MENS/DEW/ino_After_rip__95_22.sas A BATCH/BRANCHES/PROD/MENS/DEW/incorsve_fasce_report__95_22.ksh A BATCH/BRANCHES/PROD/MENS/IAS/tnew_fasce__95_22.ksh A BATCH/BRANCHES/PROD/MENS/DEW/taxe_rip__95_22.sas A BATCH/BRANCHES/PROD/MENS/DEW/teorike_rip__95_22.ksh A BATCH/BRANCHES/PROD/MENS/DEW/teorico_rip__95_22.sas --- Addscript presentin ~mens/DEW

More complicated (there are now spaces in the file names):

Wed Jul 29 10:10:06 CEST 2015 --- G5461 --- D BUSINESSOBJECTS/TRUNK/Nero.sev D BUSINESSOBJECTS/TRUNK/Cadran.sev D BUSINESSOBJECTS/TRUNK/Controllo Metodologico.unv D BUSINESSOBJECTS/TRUNK/MaraCredit.sev D BUSINESSOBJECTS/TRUNK/DM_RISK.unv D BUSINESSOBJECTS/TRUNK/Rers.sev D BUSINESSOBJECTS/TRUNK/Search.sev D BUSINESSOBJECTS/TRUNK/Cars.sev D BUSINESSOBJECTS/TRUNK/uni_rec.unv --- Dismissione universi Nero, Cadran, Maracredit, Controllo Metodologico, DM_RISCHIO, Piani, Ricerca, Universo Recupero e Veicoli

To simplify my problem, I decided to first focus on extracting 3 fields (userID, Commit, Comment) with this regular expression:

sourcetype=svn  source="script-svn_log" | rex max_match=0 "---(?<userID>.*)---(?<Commit>.*)---(?<Comment>.*)"

Now I would like to parse the Commit field again: I need to identify every committed file together with its associated action (U, A or D).

First question: How can I run another regular expression on a specific field (in my case Commit)?

Second question: For each value that is found, is it possible to create dummy/fake events that will help me do statistics on my commits? Those dummy events need to have the same fields as the parent event.

Don't hesitate to ask if you need further details on my problem.

Thank you guys for your time.

Have a nice day =D

0 Karma

somesoni2
Revered Legend

Something like the searches below worked for me with your sample data.

Ans 1: Add field=<fieldname> to rex to run it against a specific field; by default it runs against _raw.

sourcetype=svn  source="script-svn_log" | rex "---(?<userID>.*) --- (?<Commit>.*) --- (?<Comment>.*)" | rex max_match=0 field=Commit "(?<Action>\w) (?<FileName>(\w+\/)+((\w*\s*)*(\.\w+)*))"
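
If you want to try this without touching indexed data, a minimal sketch is to build one event with makeresults (assuming your Splunk version provides that command) and run the same two rex calls against it. The event below is the third sample from the question with the file list abridged to two entries; the rex patterns are unchanged from the search above.

```spl
| makeresults
| eval _raw="Wed Jul 29 10:10:06 CEST 2015 --- G5461 --- D BUSINESSOBJECTS/TRUNK/Nero.sev D BUSINESSOBJECTS/TRUNK/Controllo Metodologico.unv --- Dismissione universi Nero, Cadran, Maracredit, Controllo Metodologico, DM_RISCHIO, Piani, Ricerca, Universo Recupero e Veicoli"
| rex "---(?<userID>.*) --- (?<Commit>.*) --- (?<Comment>.*)"
| rex max_match=0 field=Commit "(?<Action>\w) (?<FileName>(\w+\/)+((\w*\s*)*(\.\w+)*))"
| table userID Commit Action FileName Comment
```

Action and FileName should both come back as multivalue fields with one value per file, including the name that contains a space.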

Ans 2: For the dummy events, once you have the file names as a multivalued field, use mvexpand (as below) to split them into separate events; the values from the base event are kept on each one. Because you need the Action alongside each file name, there is an extra step: mvzip pairs Action and FileName before the expansion, and a final rex splits them apart again.

sourcetype=svn  source="script-svn_log" | rex "---(?<userID>.*) --- (?<Commit>.*) --- (?<Comment>.*)" | rex max_match=0 field=Commit "(?<Action>\w) (?<FileName>(\w+\/)+((\w*\s*)*(\.\w+)*))" | eval temp=mvzip(Action,FileName,"#") | mvexpand temp | rex field=temp "(?<Action>.*)#(?<FileName>.*)" | fields - temp
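
Once there is one event per file, the statistics are an ordinary stats call. As a rough sketch (the trim step and the choice of stats fields are my assumptions, not something from the question), this counts files per user and action. The trim is there because the first rex captures userID with a leading space.

```spl
sourcetype=svn source="script-svn_log"
| rex "---(?<userID>.*) --- (?<Commit>.*) --- (?<Comment>.*)"
| rex max_match=0 field=Commit "(?<Action>\w) (?<FileName>(\w+\/)+((\w*\s*)*(\.\w+)*))"
| eval temp=mvzip(Action,FileName,"#")
| mvexpand temp
| rex field=temp "(?<Action>.*)#(?<FileName>.*)"
| fields - temp
| eval userID=trim(userID)
| stats count AS files by userID, Action
```

Spot-check the extracted FileName values before trusting the counts: the pattern appears to swallow the following entry when a name has no extension (EXECUTE_95_22 in the second sample), which would undercount that commit.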
0 Karma

richgalloway
SplunkTrust

To run a regex on a specific field, specify that field in the rex command.

sourcetype=svn  source="script-svn_log" | rex max_match=0 "---(?<userID>.*) --- (?<Commit>.*) --- (?<Comment>.*)" | rex field=Commit "(?<Action>\w) (?<file>.*)"

You can create separate events using the mvexpand command. See Example 3 at http://docs.splunk.com/Documentation/Splunk/6.2.5/SearchReference/Mvexpand#Examples
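
For mvexpand to have something to expand, Commit first has to be split into one value per "action path" pair. A possible sketch, assuming the only actions are A, D and U and every path contains at least one slash: capture each pair into a hypothetical field called entry, stopping at a lookahead for the next action letter, then expand and apply the Action/file rex to each entry. The test event is the second sample from the question, abridged to three files, and makeresults is assumed to be available in your version.

```spl
| makeresults
| eval _raw="Wed Jul 29 11:05:03 CEST 2015 --- X9896 --- A BATCH/BRANCHES/PROD/MENS/DEW/EXECUTE_95_22 A BATCH/BRANCHES/PROD/MENS/DEW/assicurazione_nuove_fasce_riporti__95_22.ksh U BATCH/BRANCHES/PROD/MENS/DEW/coeff_rip.ksh --- Addscript presentin ~mens/DEW"
| rex "---(?<userID>.*) --- (?<Commit>.*) --- (?<Comment>.*)"
| rex max_match=0 field=Commit "(?<entry>[ADU] .+?)(?= [ADU] \w+/|$)"
| mvexpand entry
| rex field=entry "(?<Action>\w) (?<file>.*)"
| table userID Action file Comment
```

This should yield one row per file, each carrying the parent's userID and Comment. Because the split relies on the next action letter rather than on the shape of the file name, names with spaces or without an extension stay in one piece. A file name that itself contains something like " A dir/" would still be split wrongly.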

---
If this reply helps you, Karma would be appreciated.
0 Karma

diogofgm
SplunkTrust

Have you tried using something like https://regex101.com?
Are the fields always split by ---?

------------
Hope I was able to help you. If so, some karma would be appreciated.
0 Karma