Splunk Search

Regex help on multiline fields and mvexpand

koshyk
Super Champion

My data sample is as below

C12345  my1Surname, my1First        Role Access (Group1) - I
                            Role Access (Group2) - II
                            HR Department1
                            Email Contractors
P98765  Þórarinsson, Guðmundur        Role Access (Group1) - I
                            IT Department1
                            Email Permanent                               
P12567  my3Surname, my3First        Role Access (Group2) - II
                            IT Department1
                            AWS Access Role2
                            Email Permanent  

The first column (is Surname and Firstname), then there is a tab and then it is "multiline" roles of the user
I'm using below regex. The regex splits correctly, but the "\n" of the roles are missed, so unable to split it afterwards

| rex "^(?<employeeID>.*)\t(?<user>.*)\t(?<roles>(?s).*)"

I'm looking for final output into a key-value format possibly to have a relational-value (or json hierarchy). eg C12345 has 4 roles

C12345 = Role Access (Group1) - I
C12345 = Role Access (Group2) - II
C12345 = HR Department1
C12345 = Email Contractors

PS: The data is encoded in iso-8859-1 (European languages)

0 Karma
1 Solution

martin_mueller
SplunkTrust
SplunkTrust

This chops up your provided data into each field, format the final output as you need:

| makeresults | eval _raw = " C12345  my1Surname, my1First        Role Access (Group1) - I
                             Role Access (Group2) - II
                             HR Department1
                             Email Contractors
 P98765  Þórarinsson, Guðmundur        Role Access (Group1) - I
                             IT Department1
                             Email Permanent                               
 P12567  my3Surname, my3First        Role Access (Group2) - II
                             IT Department1
                             AWS Access Role2
                             Email Permanent"
| rex max_match=0 "(?m)^(?<entry>\s{0,3}\S[^\r\n]+([\r\n]+\s{4,}[^\r\n]+)*)" 
| mvexpand entry 
| rex field=entry "(?s)^\s*(?<employeeid>\S+)\s+(?<user>.*?)\s\s+(?<roles>.*)" 
| rex max_match=0 field=roles "(?<roles>.*?)(\s\s+|\s*$)"
| table employeeid user roles

Output:

employeeid        user                      roles   
C12345            my1Surname, my1First      Role Access (Group1) - I
                                            Role Access (Group2) - II
                                            HR Department1
                                            Email Contractors
P98765            Þórarinsson, Guðmundur    Role Access (Group1) - I
                                            IT Department1
                                            Email Permanent
P12567            my3Surname, my3First      Role Access (Group2) - II
                                            IT Department1
                                            AWS Access Role2
                                            Email Permanent 

View solution in original post

martin_mueller
SplunkTrust
SplunkTrust

This chops up your provided data into each field, format the final output as you need:

| makeresults | eval _raw = " C12345  my1Surname, my1First        Role Access (Group1) - I
                             Role Access (Group2) - II
                             HR Department1
                             Email Contractors
 P98765  Þórarinsson, Guðmundur        Role Access (Group1) - I
                             IT Department1
                             Email Permanent                               
 P12567  my3Surname, my3First        Role Access (Group2) - II
                             IT Department1
                             AWS Access Role2
                             Email Permanent"
| rex max_match=0 "(?m)^(?<entry>\s{0,3}\S[^\r\n]+([\r\n]+\s{4,}[^\r\n]+)*)" 
| mvexpand entry 
| rex field=entry "(?s)^\s*(?<employeeid>\S+)\s+(?<user>.*?)\s\s+(?<roles>.*)" 
| rex max_match=0 field=roles "(?<roles>.*?)(\s\s+|\s*$)"
| table employeeid user roles

Output:

employeeid        user                      roles   
C12345            my1Surname, my1First      Role Access (Group1) - I
                                            Role Access (Group2) - II
                                            HR Department1
                                            Email Contractors
P98765            Þórarinsson, Guðmundur    Role Access (Group1) - I
                                            IT Department1
                                            Email Permanent
P12567            my3Surname, my3First      Role Access (Group2) - II
                                            IT Department1
                                            AWS Access Role2
                                            Email Permanent 

koshyk
Super Champion

Thank you perfect.
The missing thing in my search was:

| rex max_match=0 field=roles "(?<roles>.*?)(\s\s+|\s*$)"

So my whole search is now

<raw search>| rex "^(?<employeeID>.*)\t(?<user>.*)\t(?<roles>(?s).*)"| rex max_match=0 field=roles "(?<roles>.*?)(\s\s+|\s*$)"| mvexpand roles | table employeeID,user,roles
0 Karma
Get Updates on the Splunk Community!

CX Day is Coming!

Customer Experience (CX) Day is on October 7th!! We're so excited to bring back another day full of wonderful ...

Strengthen Your Future: A Look Back at Splunk 10 Innovations and .conf25 Highlights!

The Big One: Splunk 10 is Here!  The moment many of you have been waiting for has arrived! We are thrilled to ...

Now Offering the AI Assistant Usage Dashboard in Cloud Monitoring Console

Today, we’re excited to announce the release of a brand new AI assistant usage dashboard in Cloud Monitoring ...