Splunk Search

How to use rex to capture a list of fields?

Path Finder

I have the following log data:

Number of Users:3
[1]UserId:1 NumberOfUserRoles:2
    [1]UserRoleCode:1 UserRoleText:Admin
    [2]UserRoleCode:2 UserRoleText:Developer
[2]UserId:2 NumberOfUserRoles:1
    [1]UserRoleCode:1 UserRoleText:Developer
[3]UserId:3 NumberOfUserRoles:1
    [1]UserRoleCode:1 UserRoleText:Tester

I need to parse the above fields as following result:

NumberOfUsers    UserId   UserRoleCode   UserRoleText
3                 1        1             Admin
                           2             Developer
                  2        1             Developer
                  3        1             Tester

In the above results, UserId, UserRoleCode & UserRoleText are multivalue fields. I used traditional rex command with max_match attribute but this is not working.

| rex "Number of Users:(?<NumberOfUsers >\d+)"
| rex max_match=0 "UserId:(?<UserId>\d+) NumberOfUserRoles:(?<NumberOfUserRoles>\d+)\n(?<teststring>.*)"
| rex max_match=0 field="teststring" "UserRoleCode:(?<UserRoleCode>\d+) UserRoleText<.*>:(?<UserRoleText>.*)$"

But the above is not working. Please suggest how to parse the above message to get the expected result.

Thanks.

0 Karma

Esteemed Legend

It would probably be best to use rex to convert this to JSON and then use spath but this works, too:

| makeresults 
| eval _raw="Number of Users:3
[1]UserId:1 NumberOfUserRoles:2
 [1]UserRoleCode:1 UserRoleText:Admin
 [2]UserRoleCode:2 UserRoleText:Developer
[2]UserId:2 NumberOfUserRoles:1
 [1]UserRoleCode:1 UserRoleText:Developer
[3]UserId:3 NumberOfUserRoles:1
 [1]UserRoleCode:1 UserRoleText:Tester"

| rename COMMENT AS "Everything above generates sample event data; everything below is your solution"

| streamstats count AS serial
| rex max_match=0 "(?s)[\r\n]\[\d+\](?<User>UserId:.*?)(?=(?:[\r\n]\[)|$)"
| mvexpand User
| rex field=User "UserId:(?<UserId>\d+)"
| rex max_match=0 field=User "[\r\n]\s+(?<UserRole>\[(?<UserPos>\d+)\]UserRoleCode:(?<UserRoleCode>\d+)\s+UserRoleText:(?<UserRoleText>.*))"
| stats list(UserRoleCode) AS UserRoleCode list(UserRoleText) AS UserRoleText BY UserId serial
| eventstats max(UserId) AS NumberOfUsers
| table NumberOfUsers UserId UserRoleCode UserRoleText serial
0 Karma

Ultra Champion

Couple of things:

First regex contains a space behind field name, that will not work.

Second regex will not properly capture the whole set of user roles in each userID. The following regex would work better:

| rex max_match=0 "UserId:(?<UserId>\d+) NumberOfUserRoles:(?<NumberOfUserRoles>\d+)\n(?<teststring>(?:\s+\[\d+\]UserRoleCode:\d+\s+UserRoleText:\w+)+)"

https://regex101.com/r/E1HPXv/1

Third regex also contains some issues (like the <.*> behind UserRoleText). Following should work (you may need to replace that last \w+ by something else if role text can be more than just a single word.

| rex max_match=0 field="teststring" "UserRoleCode:(?<UserRoleCode>\d+) UserRoleText:(?<UserRoleText>\w+)"

https://regex101.com/r/E1HPXv/2

0 Karma

Path Finder

This just lists out the UserId, UserRoleCode & UserRoleText fields as follows:

 NumberOfUsers    UserId   UserRoleCode   UserRoleText
    3                 1        1             Admin
                      2        2             Developer
                      3        1             Developer
                               1             Tester

I need the result in a hierarchical order rather than the list of items. Just like the one in the question

0 Karma

Ultra Champion

Ah, right, sorry, I did think of that, but after fixing the regexes I forgot to look at that part.

This needs a few tweaks to the regexes and then some mvexpands in the right place:

| rex "Number of Users:(?<NumberOfUsers>\d+)"
| rex max_match=0 "UserId:(?<UserId>\d+) NumberOfUserRoles:(?<NumberOfUserRoles>\d+)\n(?<teststring>(?:\s+\[\d+\]UserRoleCode:\d+\s+UserRoleText:\w+)+)"
| mvexpand UserId
| fields - _time,_raw
| eval NumberOfUserRoles = mvindex(NumberOfUserRoles,UserId-1)
| eval teststring = mvindex(teststring,UserId-1)
| rex max_match=0 field="teststring" "(?<roles>UserRoleCode:\d+ UserRoleText:\w+)"
| fields - teststring
| mvexpand roles
| rex max_match=0 field="roles" "UserRoleCode:(?<UserRoleCode>\d+) UserRoleText:(?<UserRoleText>\w+)"
| table NumberOfUsers, UserId, NumberOfUserRoles, UserRoleCode, UserRoleText
0 Karma

Path Finder

Thanks for your valuable support. This actually picks only the first role of each user. Not picking all the roles. Following is the result what I am getting:

NumberOfUsers    UserId   UserRoleCode   UserRoleText
     3                 1        1             Admin
     3                 2        1             Developer
     3                 3        1             Tester

I have tried changing your query to get the proper result. But failed to do so. Do you have the sample run anywhere query which is working for you?

I am fine, even if we can get the user roles as comma separated for each user like below:

NumberOfUsers    UserId   UserRoleCode   UserRoleText
     3                 1        1             Admin, Developer
     3                 2        1             Developer
     3                 3        1             Tester
0 Karma

Ultra Champion

It worked for me when I run it like this:

| makeresults | eval _raw="Number of Users:3
 [1]UserId:1 NumberOfUserRoles:2
     [1]UserRoleCode:1 UserRoleText:Admin
     [2]UserRoleCode:2 UserRoleText:Developer
 [2]UserId:2 NumberOfUserRoles:1
     [1]UserRoleCode:1 UserRoleText:Developer
 [3]UserId:3 NumberOfUserRoles:1
     [1]UserRoleCode:1 UserRoleText:Tester"
| rex "Number of Users:(?<NumberOfUsers>\d+)"
| rex max_match=0 "UserId:(?<UserId>\d+) NumberOfUserRoles:(?<NumberOfUserRoles>\d+)\n(?<teststring>(?:\s+\[\d+\]UserRoleCode:\d+\s+UserRoleText:\w+)+)"
| mvexpand UserId
| fields - _time,_raw
| eval NumberOfUserRoles = mvindex(NumberOfUserRoles,UserId-1)
| eval teststring = mvindex(teststring,UserId-1)
| rex max_match=0 field="teststring" "(?<roles>UserRoleCode:\d+ UserRoleText:\w+)"
| fields - teststring
| mvexpand roles
| rex max_match=0 field="roles" "UserRoleCode:(?<UserRoleCode>\d+) UserRoleText:(?<UserRoleText>\w+)"
| table NumberOfUsers, UserId, NumberOfUserRoles, UserRoleCode, UserRoleText

You might want to try and remove some lines from the bottom of the search to see if the first steps work. E.g. check if the teststring field contains both roles and whether the roles field gets extracted properly with both values.

0 Karma