Splunk Search

Regex help on multiline fields and mvexpand

Super Champion

My data sample is as below

C12345  my1Surname, my1First        Role Access (Group1) - I
                            Role Access (Group2) - II
                            HR Department1
                            Email Contractors
P98765  Þórarinsson, Guðmundur        Role Access (Group1) - I
                            IT Department1
                            Email Permanent                               
P12567  my3Surname, my3First        Role Access (Group2) - II
                            IT Department1
                            AWS Access Role2
                            Email Permanent  

The first column (is Surname and Firstname), then there is a tab and then it is "multiline" roles of the user
I'm using below regex. The regex splits correctly, but the "\n" of the roles are missed, so unable to split it afterwards

| rex "^(?<employeeID>.*)\t(?<user>.*)\t(?<roles>(?s).*)"

I'm looking for final output into a key-value format possibly to have a relational-value (or json hierarchy). eg C12345 has 4 roles

C12345 = Role Access (Group1) - I
C12345 = Role Access (Group2) - II
C12345 = HR Department1
C12345 = Email Contractors

PS: The data is encoded in iso-8859-1 (European languages)

0 Karma
1 Solution

SplunkTrust
SplunkTrust

This chops up your provided data into each field, format the final output as you need:

| makeresults | eval _raw = " C12345  my1Surname, my1First        Role Access (Group1) - I
                             Role Access (Group2) - II
                             HR Department1
                             Email Contractors
 P98765  Þórarinsson, Guðmundur        Role Access (Group1) - I
                             IT Department1
                             Email Permanent                               
 P12567  my3Surname, my3First        Role Access (Group2) - II
                             IT Department1
                             AWS Access Role2
                             Email Permanent"
| rex max_match=0 "(?m)^(?<entry>\s{0,3}\S[^\r\n]+([\r\n]+\s{4,}[^\r\n]+)*)" 
| mvexpand entry 
| rex field=entry "(?s)^\s*(?<employeeid>\S+)\s+(?<user>.*?)\s\s+(?<roles>.*)" 
| rex max_match=0 field=roles "(?<roles>.*?)(\s\s+|\s*$)"
| table employeeid user roles

Output:

employeeid        user                      roles   
C12345            my1Surname, my1First      Role Access (Group1) - I
                                            Role Access (Group2) - II
                                            HR Department1
                                            Email Contractors
P98765            Þórarinsson, Guðmundur    Role Access (Group1) - I
                                            IT Department1
                                            Email Permanent
P12567            my3Surname, my3First      Role Access (Group2) - II
                                            IT Department1
                                            AWS Access Role2
                                            Email Permanent 

View solution in original post

SplunkTrust
SplunkTrust

This chops up your provided data into each field, format the final output as you need:

| makeresults | eval _raw = " C12345  my1Surname, my1First        Role Access (Group1) - I
                             Role Access (Group2) - II
                             HR Department1
                             Email Contractors
 P98765  Þórarinsson, Guðmundur        Role Access (Group1) - I
                             IT Department1
                             Email Permanent                               
 P12567  my3Surname, my3First        Role Access (Group2) - II
                             IT Department1
                             AWS Access Role2
                             Email Permanent"
| rex max_match=0 "(?m)^(?<entry>\s{0,3}\S[^\r\n]+([\r\n]+\s{4,}[^\r\n]+)*)" 
| mvexpand entry 
| rex field=entry "(?s)^\s*(?<employeeid>\S+)\s+(?<user>.*?)\s\s+(?<roles>.*)" 
| rex max_match=0 field=roles "(?<roles>.*?)(\s\s+|\s*$)"
| table employeeid user roles

Output:

employeeid        user                      roles   
C12345            my1Surname, my1First      Role Access (Group1) - I
                                            Role Access (Group2) - II
                                            HR Department1
                                            Email Contractors
P98765            Þórarinsson, Guðmundur    Role Access (Group1) - I
                                            IT Department1
                                            Email Permanent
P12567            my3Surname, my3First      Role Access (Group2) - II
                                            IT Department1
                                            AWS Access Role2
                                            Email Permanent 

View solution in original post

Super Champion

Thank you perfect.
The missing thing in my search was:

| rex max_match=0 field=roles "(?<roles>.*?)(\s\s+|\s*$)"

So my whole search is now

<raw search>| rex "^(?<employeeID>.*)\t(?<user>.*)\t(?<roles>(?s).*)"| rex max_match=0 field=roles "(?<roles>.*?)(\s\s+|\s*$)"| mvexpand roles | table employeeID,user,roles
0 Karma
State of Splunk Careers

Access the Splunk Careers Report to see real data that shows how Splunk mastery increases your value and job satisfaction.

Find out what your skills are worth!