Splunk Search

How to do regex based on commas

JoshuaJohn
Contributor

I have data like this:

21,enrollmentgroup,19936,40:G6:7Q:G6:89:FG,,nitro - Circle.one10,Phone,11.1.11313,C,10/25/18 4:58,Enroll,enroll@ms.com,Movies & TV,microsoft,Public,YES,,2019.11111.10112.0,0,0,0,Installed,App installed by user,10/24/18 14:50,10/25/18 4:00

21,enrollmentgroup,19935,20:11:32:11:61:71,,nitro - Circle.one10,Phone,10.0.14393,C,10/25/18 4:58,Enroll,enroll@ms.com,PC Manager,c10f24324242432-43242,Internal,YES,11.5.1.11119,18.10.3.1009,0,0,0,Pending Install,Installed,10/24/18 14:49,10/24/18 22:09

443,S-0001,11222,01:11:7F:E1:D1:71,,nitro - loop.one12,Phone,10.0.14393,C,9/3/18 12:23,Enroll,enroll@ms.com, Mapping Better,631d-2e321-4312-b511f-83f2,Internal,YES,10.0.0.623,20.0.0.5921,0,0,0,Pending Install,Unkown,8/31/18 9:33,10/2/18 0:20

I wrote this but it appears to be too much for Splunk

rex field=_raw "(?<ig1>.+?)([,])(?<loc>.+?)([,])(?<ig2>.+?)([,])(?<MacAddress>.+?)([,])(?<modelType>.+?)([,])(?<ig3>.+?)([,])(?<OS>.+?)([,])(?<ig4>.+?)([,])(?<lastSeen>.+?)([,])(?<user>.+?)([,])(?<useremail>.+?)([,])(?<Application>.+?)([,])(?<ig5>.+?)([,])(?<deployed>.+?)([,])(?<public>.+?)([,])(?<assignedVersion>.+?)([,])(?<installedVersion>.+?)([,])(?<ig6>.+?)([,])(?<ig7>.+?)([,])(?<ig8>.+?)([,])(?<status>.+?)([,])(?<installedby>.+?)([,])(?<appFirstSeen>.+?)([,])(?<appUpdatedTime>.+?)([,]|$)"

Any suggestions? I cannot edit profs etc. I must do this via regex within the query.
Thanks,

0 Karma
1 Solution

jimmoriarty
Path Finder
| rex field=_raw "^(?<f1>[^,]+)?,(?<loc>[^,]+)?,(?<f3>[^,]+)?,(?<MacAddress>[^,]+)?,(?<modelType>[^,]+)?,(?<OS>[^,]+)?,(?<f7>[^,]+)?,(?<f8>[^,]+)?,(?<f9>[^,]+)?,(?<lastSeen>[^,]+)?,(?<user>[^,]+)?,(?<useremail>[^,]+)?,(?<Application>[^,]+)?,(?<f14>[^,]+)?,(?<deployed>[^,]+)?,(?<public>[^,]+)?,(?<assignedVersion>[^,]+)?,(?<installedVersion>[^,]+)?,(?<f19>[^,]+)?,(?<f20>[^,]+)?,(?<f21>[^,]+)?,(?<status>[^,]+)?,(?<installedby>[^,]+)?,(?<appFirstSeen>[^,]+)?,(?<appUpdatedTime>[^,]+)?"

This time with everything included...

View solution in original post

jimmoriarty
Path Finder
| rex field=_raw "^(?<f1>[^,]+)?,(?<loc>[^,]+)?,(?<f3>[^,]+)?,(?<MacAddress>[^,]+)?,(?<modelType>[^,]+)?,(?<OS>[^,]+)?,(?<f7>[^,]+)?,(?<f8>[^,]+)?,(?<f9>[^,]+)?,(?<lastSeen>[^,]+)?,(?<user>[^,]+)?,(?<useremail>[^,]+)?,(?<Application>[^,]+)?,(?<f14>[^,]+)?,(?<deployed>[^,]+)?,(?<public>[^,]+)?,(?<assignedVersion>[^,]+)?,(?<installedVersion>[^,]+)?,(?<f19>[^,]+)?,(?<f20>[^,]+)?,(?<f21>[^,]+)?,(?<status>[^,]+)?,(?<installedby>[^,]+)?,(?<appFirstSeen>[^,]+)?,(?<appUpdatedTime>[^,]+)?"

This time with everything included...

JoshuaJohn
Contributor

Awesome thank you!

0 Karma

ddrillic
Ultra Champion

It seems like a perfect csv format, except the missing header...

0 Karma

jimmoriarty
Path Finder

Try this: | rex field=_raw "^(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?"

0 Karma

JoshuaJohn
Contributor

I think this improved it slightly again, still getting the error though.

,(?<loc>.+?),.+?,(?<MacAddress>.+?),,(?<ModelType>.+?),.+?,(?<OS>.+?),.+?,(?<LastSeen>.+?),(?<user>.+?),(?<useremail>.+?),(?<Application>.+?),.+?,(?<deployed>.+?),(?<public>.+?),(?<assignedVersion>.+?),(?<installedVersion>.+?),.+?,.+?,.+?,(?<status>.+?),(?<installedby>.+?),(?<appFirstSeen>.+?),(?<appUpdatedTime>.+?)$
0 Karma

FrankVl
Ultra Champion

All those .+? are very unnecessarily complex. It matches everything and the regex processor constantly needs to check if perhaps it should stop capturing and continue reading the comma and proceed with the next capturing group.

As @jimmoriarty has already mentioned in his answer: replace the .+? by [^,]+ (so matching all characters except comma), which makes it much, much more efficient.

JoshuaJohn
Contributor

Made some changes to try and lower the greediness of my search

^(.+?),(?<loc>.+?),(.+?),(?<MacAddress>.+?),,(?<ModelType>.+?),(.+?),(?<OS>.+?),(.+?),(?<LastSeen>.+?),(?<user>.+?),(?<useremail>.+?),(?<Application>.+?),(.+?),(?<deployed>.+?),(?<public>.+?),(?<assignedVersion>.+?),(?<installedVersion>.+?),(.+?),(.+?),(.+?),(?<status>.+?),(?<installedby>.+?),(?<appFirstSeen>.+?),(?<appUpdatedTime>.+?)$

Still getting the error: has exceeded configured match_limit

0 Karma
Get Updates on the Splunk Community!

Troubleshooting the OpenTelemetry Collector

  In this tech talk, you’ll learn how to troubleshoot the OpenTelemetry collector - from checking the ...

Adoption of Infrastructure Monitoring at Splunk

  Splunk's Growth Engineering team showcases one of their first Splunk product adoption-Splunk Infrastructure ...

Modern way of developing distributed application using OTel

Recently, I had the opportunity to work on a complex microservice using Spring boot and Quarkus to develop a ...