Splunk Search

How to do regex based on commas

JoshuaJohn
Contributor

I have data like this:

21,enrollmentgroup,19936,40:G6:7Q:G6:89:FG,,nitro - Circle.one10,Phone,11.1.11313,C,10/25/18 4:58,Enroll,enroll@ms.com,Movies & TV,microsoft,Public,YES,,2019.11111.10112.0,0,0,0,Installed,App installed by user,10/24/18 14:50,10/25/18 4:00

21,enrollmentgroup,19935,20:11:32:11:61:71,,nitro - Circle.one10,Phone,10.0.14393,C,10/25/18 4:58,Enroll,enroll@ms.com,PC Manager,c10f24324242432-43242,Internal,YES,11.5.1.11119,18.10.3.1009,0,0,0,Pending Install,Installed,10/24/18 14:49,10/24/18 22:09

443,S-0001,11222,01:11:7F:E1:D1:71,,nitro - loop.one12,Phone,10.0.14393,C,9/3/18 12:23,Enroll,enroll@ms.com, Mapping Better,631d-2e321-4312-b511f-83f2,Internal,YES,10.0.0.623,20.0.0.5921,0,0,0,Pending Install,Unkown,8/31/18 9:33,10/2/18 0:20

I wrote this but it appears to be too much for Splunk

rex field=_raw "(?<ig1>.+?)([,])(?<loc>.+?)([,])(?<ig2>.+?)([,])(?<MacAddress>.+?)([,])(?<modelType>.+?)([,])(?<ig3>.+?)([,])(?<OS>.+?)([,])(?<ig4>.+?)([,])(?<lastSeen>.+?)([,])(?<user>.+?)([,])(?<useremail>.+?)([,])(?<Application>.+?)([,])(?<ig5>.+?)([,])(?<deployed>.+?)([,])(?<public>.+?)([,])(?<assignedVersion>.+?)([,])(?<installedVersion>.+?)([,])(?<ig6>.+?)([,])(?<ig7>.+?)([,])(?<ig8>.+?)([,])(?<status>.+?)([,])(?<installedby>.+?)([,])(?<appFirstSeen>.+?)([,])(?<appUpdatedTime>.+?)([,]|$)"

Any suggestions? I cannot edit profs etc. I must do this via regex within the query.
Thanks,

0 Karma
1 Solution

jimmoriarty
Path Finder
| rex field=_raw "^(?<f1>[^,]+)?,(?<loc>[^,]+)?,(?<f3>[^,]+)?,(?<MacAddress>[^,]+)?,(?<modelType>[^,]+)?,(?<OS>[^,]+)?,(?<f7>[^,]+)?,(?<f8>[^,]+)?,(?<f9>[^,]+)?,(?<lastSeen>[^,]+)?,(?<user>[^,]+)?,(?<useremail>[^,]+)?,(?<Application>[^,]+)?,(?<f14>[^,]+)?,(?<deployed>[^,]+)?,(?<public>[^,]+)?,(?<assignedVersion>[^,]+)?,(?<installedVersion>[^,]+)?,(?<f19>[^,]+)?,(?<f20>[^,]+)?,(?<f21>[^,]+)?,(?<status>[^,]+)?,(?<installedby>[^,]+)?,(?<appFirstSeen>[^,]+)?,(?<appUpdatedTime>[^,]+)?"

This time with everything included...

View solution in original post

jimmoriarty
Path Finder
| rex field=_raw "^(?<f1>[^,]+)?,(?<loc>[^,]+)?,(?<f3>[^,]+)?,(?<MacAddress>[^,]+)?,(?<modelType>[^,]+)?,(?<OS>[^,]+)?,(?<f7>[^,]+)?,(?<f8>[^,]+)?,(?<f9>[^,]+)?,(?<lastSeen>[^,]+)?,(?<user>[^,]+)?,(?<useremail>[^,]+)?,(?<Application>[^,]+)?,(?<f14>[^,]+)?,(?<deployed>[^,]+)?,(?<public>[^,]+)?,(?<assignedVersion>[^,]+)?,(?<installedVersion>[^,]+)?,(?<f19>[^,]+)?,(?<f20>[^,]+)?,(?<f21>[^,]+)?,(?<status>[^,]+)?,(?<installedby>[^,]+)?,(?<appFirstSeen>[^,]+)?,(?<appUpdatedTime>[^,]+)?"

This time with everything included...

JoshuaJohn
Contributor

Awesome thank you!

0 Karma

ddrillic
Ultra Champion

It seems like a perfect csv format, except the missing header...

0 Karma

jimmoriarty
Path Finder

Try this: | rex field=_raw "^(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?"

0 Karma

JoshuaJohn
Contributor

I think this improved it slightly again, still getting the error though.

,(?<loc>.+?),.+?,(?<MacAddress>.+?),,(?<ModelType>.+?),.+?,(?<OS>.+?),.+?,(?<LastSeen>.+?),(?<user>.+?),(?<useremail>.+?),(?<Application>.+?),.+?,(?<deployed>.+?),(?<public>.+?),(?<assignedVersion>.+?),(?<installedVersion>.+?),.+?,.+?,.+?,(?<status>.+?),(?<installedby>.+?),(?<appFirstSeen>.+?),(?<appUpdatedTime>.+?)$
0 Karma

FrankVl
Ultra Champion

All those .+? are very unnecessarily complex. It matches everything and the regex processor constantly needs to check if perhaps it should stop capturing and continue reading the comma and proceed with the next capturing group.

As @jimmoriarty has already mentioned in his answer: replace the .+? by [^,]+ (so matching all characters except comma), which makes it much, much more efficient.

JoshuaJohn
Contributor

Made some changes to try and lower the greediness of my search

^(.+?),(?<loc>.+?),(.+?),(?<MacAddress>.+?),,(?<ModelType>.+?),(.+?),(?<OS>.+?),(.+?),(?<LastSeen>.+?),(?<user>.+?),(?<useremail>.+?),(?<Application>.+?),(.+?),(?<deployed>.+?),(?<public>.+?),(?<assignedVersion>.+?),(?<installedVersion>.+?),(.+?),(.+?),(.+?),(?<status>.+?),(?<installedby>.+?),(?<appFirstSeen>.+?),(?<appUpdatedTime>.+?)$

Still getting the error: has exceeded configured match_limit

0 Karma
Get Updates on the Splunk Community!

Improve Your Security Posture

Watch NowImprove Your Security PostureCustomers are at the center of everything we do at Splunk and security ...

Maximize the Value from Microsoft Defender with Splunk

 Watch NowJoin Splunk and Sens Consulting for this Security Edition Tech TalkWho should attend:  Security ...

This Week's Community Digest - Splunk Community Happenings [6.27.22]

Get the latest news and updates from the Splunk Community here! News From Splunk Answers ✍️ Splunk Answers is ...