Splunk Search

How to do regex based on commas

JoshuaJohn
Contributor

I have data like this:

21,enrollmentgroup,19936,40:G6:7Q:G6:89:FG,,nitro - Circle.one10,Phone,11.1.11313,C,10/25/18 4:58,Enroll,enroll@ms.com,Movies & TV,microsoft,Public,YES,,2019.11111.10112.0,0,0,0,Installed,App installed by user,10/24/18 14:50,10/25/18 4:00

21,enrollmentgroup,19935,20:11:32:11:61:71,,nitro - Circle.one10,Phone,10.0.14393,C,10/25/18 4:58,Enroll,enroll@ms.com,PC Manager,c10f24324242432-43242,Internal,YES,11.5.1.11119,18.10.3.1009,0,0,0,Pending Install,Installed,10/24/18 14:49,10/24/18 22:09

443,S-0001,11222,01:11:7F:E1:D1:71,,nitro - loop.one12,Phone,10.0.14393,C,9/3/18 12:23,Enroll,enroll@ms.com, Mapping Better,631d-2e321-4312-b511f-83f2,Internal,YES,10.0.0.623,20.0.0.5921,0,0,0,Pending Install,Unkown,8/31/18 9:33,10/2/18 0:20

I wrote this but it appears to be too much for Splunk

rex field=_raw "(?<ig1>.+?)([,])(?<loc>.+?)([,])(?<ig2>.+?)([,])(?<MacAddress>.+?)([,])(?<modelType>.+?)([,])(?<ig3>.+?)([,])(?<OS>.+?)([,])(?<ig4>.+?)([,])(?<lastSeen>.+?)([,])(?<user>.+?)([,])(?<useremail>.+?)([,])(?<Application>.+?)([,])(?<ig5>.+?)([,])(?<deployed>.+?)([,])(?<public>.+?)([,])(?<assignedVersion>.+?)([,])(?<installedVersion>.+?)([,])(?<ig6>.+?)([,])(?<ig7>.+?)([,])(?<ig8>.+?)([,])(?<status>.+?)([,])(?<installedby>.+?)([,])(?<appFirstSeen>.+?)([,])(?<appUpdatedTime>.+?)([,]|$)"

Any suggestions? I cannot edit profs etc. I must do this via regex within the query.
Thanks,

0 Karma
1 Solution

jimmoriarty
Path Finder
| rex field=_raw "^(?<f1>[^,]+)?,(?<loc>[^,]+)?,(?<f3>[^,]+)?,(?<MacAddress>[^,]+)?,(?<modelType>[^,]+)?,(?<OS>[^,]+)?,(?<f7>[^,]+)?,(?<f8>[^,]+)?,(?<f9>[^,]+)?,(?<lastSeen>[^,]+)?,(?<user>[^,]+)?,(?<useremail>[^,]+)?,(?<Application>[^,]+)?,(?<f14>[^,]+)?,(?<deployed>[^,]+)?,(?<public>[^,]+)?,(?<assignedVersion>[^,]+)?,(?<installedVersion>[^,]+)?,(?<f19>[^,]+)?,(?<f20>[^,]+)?,(?<f21>[^,]+)?,(?<status>[^,]+)?,(?<installedby>[^,]+)?,(?<appFirstSeen>[^,]+)?,(?<appUpdatedTime>[^,]+)?"

This time with everything included...

View solution in original post

jimmoriarty
Path Finder
| rex field=_raw "^(?<f1>[^,]+)?,(?<loc>[^,]+)?,(?<f3>[^,]+)?,(?<MacAddress>[^,]+)?,(?<modelType>[^,]+)?,(?<OS>[^,]+)?,(?<f7>[^,]+)?,(?<f8>[^,]+)?,(?<f9>[^,]+)?,(?<lastSeen>[^,]+)?,(?<user>[^,]+)?,(?<useremail>[^,]+)?,(?<Application>[^,]+)?,(?<f14>[^,]+)?,(?<deployed>[^,]+)?,(?<public>[^,]+)?,(?<assignedVersion>[^,]+)?,(?<installedVersion>[^,]+)?,(?<f19>[^,]+)?,(?<f20>[^,]+)?,(?<f21>[^,]+)?,(?<status>[^,]+)?,(?<installedby>[^,]+)?,(?<appFirstSeen>[^,]+)?,(?<appUpdatedTime>[^,]+)?"

This time with everything included...

JoshuaJohn
Contributor

Awesome thank you!

0 Karma

ddrillic
Ultra Champion

It seems like a perfect csv format, except the missing header...

0 Karma

jimmoriarty
Path Finder

Try this: | rex field=_raw "^(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?"

0 Karma

JoshuaJohn
Contributor

I think this improved it slightly again, still getting the error though.

,(?<loc>.+?),.+?,(?<MacAddress>.+?),,(?<ModelType>.+?),.+?,(?<OS>.+?),.+?,(?<LastSeen>.+?),(?<user>.+?),(?<useremail>.+?),(?<Application>.+?),.+?,(?<deployed>.+?),(?<public>.+?),(?<assignedVersion>.+?),(?<installedVersion>.+?),.+?,.+?,.+?,(?<status>.+?),(?<installedby>.+?),(?<appFirstSeen>.+?),(?<appUpdatedTime>.+?)$
0 Karma

FrankVl
Ultra Champion

All those .+? are very unnecessarily complex. It matches everything and the regex processor constantly needs to check if perhaps it should stop capturing and continue reading the comma and proceed with the next capturing group.

As @jimmoriarty has already mentioned in his answer: replace the .+? by [^,]+ (so matching all characters except comma), which makes it much, much more efficient.

JoshuaJohn
Contributor

Made some changes to try and lower the greediness of my search

^(.+?),(?<loc>.+?),(.+?),(?<MacAddress>.+?),,(?<ModelType>.+?),(.+?),(?<OS>.+?),(.+?),(?<LastSeen>.+?),(?<user>.+?),(?<useremail>.+?),(?<Application>.+?),(.+?),(?<deployed>.+?),(?<public>.+?),(?<assignedVersion>.+?),(?<installedVersion>.+?),(.+?),(.+?),(.+?),(?<status>.+?),(?<installedby>.+?),(?<appFirstSeen>.+?),(?<appUpdatedTime>.+?)$

Still getting the error: has exceeded configured match_limit

0 Karma
Get Updates on the Splunk Community!

Exporting Splunk Apps

Join us on Monday, October 21 at 11 am PT | 2 pm ET!With the app export functionality, app developers and ...

Cisco Use Cases, ITSI Best Practices, and More New Articles from Splunk Lantern

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Build Your First SPL2 App!

Watch the recording now!.Do you want to SPL™, too? SPL2, Splunk's next-generation data search and preparation ...