Solved: Regex for extraction between text and second comma

Bliide · ‎07-09-2014

I am working on a field extraction. I have created an extraction that pulls the field I want, but I need it to capture more. It currently captures the data between the text I identify and the first comma; I need it to capture from that text up to the second comma. Here is an example log event:

6/25/2014 15:05:12.724 | 18072 | EXCEPTION(V): PARN476_02HLOALP_RD:TF F RgstrData(0)(0): RegNum: 5.100.1, size of 0 bytes is invalid (-2147483638), RegisterMsg.cpp line 263 (class CRegisterFromDeviceMsg). Handled: RegisterMsg.cpp(class CRegisterFromDeviceMsg) line 269 |

My current REGEX looks like this:

(?i) regnum:(?P{FIELDNAME}[^,]+)

I need to either change the regex to get everything up to the second comma or change it to find everything between RegNum: and RegisterMsg.cpp

Please advise

somesoni2 · ‎07-09-2014

Give this a try. (replaces your whole regex)

(?i) RegNum:\s(?P<FIELDNAME>.*)(,\s*\w+\.\w+) line
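To see what this regex captures outside of Splunk, here is a minimal sketch using Python's re module against the sample event from the question. Python is only a stand-in here (Splunk uses PCRE), and the variable names are made up for illustration.

```python
import re

# Sample event copied from the question.
event = (
    "6/25/2014 15:05:12.724 | 18072 | EXCEPTION(V): PARN476_02HLOALP_RD:TF F "
    "RgstrData(0)(0): RegNum: 5.100.1, size of 0 bytes is invalid (-2147483638), "
    "RegisterMsg.cpp line 263 (class CRegisterFromDeviceMsg). "
    "Handled: RegisterMsg.cpp(class CRegisterFromDeviceMsg) line 269 |"
)

# The suggested regex, unchanged. The greedy .* backtracks until it finds
# ", <word>.<word> line", which in this event is ", RegisterMsg.cpp line".
pattern = re.compile(r"(?i) RegNum:\s(?P<FIELDNAME>.*)(,\s*\w+\.\w+) line")

match = pattern.search(event)
value = match.group("FIELDNAME") if match else None
print(value)
```

On this one sample the named group holds everything between "RegNum: " and the comma before "RegisterMsg.cpp line". Events with a different layout may behave differently.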

Bliide · 07-09-2014

Works great. What is "line" for?

somesoni2 · 07-09-2014

It's the literal string 'line' in your logs (from 'line 263'). If all your logs are similar, this word should stay the same, so I included it in the regex.

bluger_splunk · ‎07-09-2014

Hi Bliide --

If I understand you correctly (please correct me if I'm wrong), you would like to capture the following from the log above:

RegNum: 5.100.1, size of 0 bytes is invalid (-2147483638)

And not the entire RegNum field, correct?

RegNum: 5.100.1, size of 0 bytes is invalid (-2147483638), RegisterMsg.cpp line 263 (class CRegisterFromDeviceMsg).

For the former, you can capture it in many different ways, but each relies on the assumption that there will always be a second comma within that field. If there isn't, the regex will likely fail. If you can rely on there always being two commas within that field, you may be able to use the following regex to capture that data.

(?<field_name>RegNum\:.*\b\,.*)(?=\,)

However, if you'd rather capture the entire field value you could use the following:

(?<=RegNum\:\s{1})(?<regnum>.*)(?=Handled\:)
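As a rough check of the two patterns above, here is a sketch in Python's re module run against the sample event. One adjustment is an assumption on my part: Python does not accept the (?<name>...) named-group form, so the groups are written as (?P<name>...); everything else is kept as posted. The variable names are hypothetical.

```python
import re

# Sample event copied from the question.
event = (
    "6/25/2014 15:05:12.724 | 18072 | EXCEPTION(V): PARN476_02HLOALP_RD:TF F "
    "RgstrData(0)(0): RegNum: 5.100.1, size of 0 bytes is invalid (-2147483638), "
    "RegisterMsg.cpp line 263 (class CRegisterFromDeviceMsg). "
    "Handled: RegisterMsg.cpp(class CRegisterFromDeviceMsg) line 269 |"
)

# First pattern: stops just before the last comma it can reach.
# Note that the captured value includes the "RegNum:" label itself.
narrow_re = re.compile(r"(?P<field_name>RegNum\:.*\b\,.*)(?=\,)")

# Second pattern: everything after "RegNum: " up to "Handled:".
wide_re = re.compile(r"(?<=RegNum\:\s{1})(?P<regnum>.*)(?=Handled\:)")

narrow = narrow_re.search(event).group("field_name")
wide = wide_re.search(event).group("regnum")

print(narrow)
print(wide.strip())  # the raw capture ends with a space before "Handled:"
```

The first pattern keeps the "RegNum:" prefix in the value, and the second leaves a trailing space, so some trimming may be needed depending on how the field is used.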

Hope this helps!

Kind Regards,

~Brian

Bliide · ‎07-09-2014

I am attempting to create a field extraction that will pull the data between the RegNum: and RegisterMsg.cpp

So in the example log it would pull:

5.100.1, size of 0 bytes is invalid (-2147483638)

When I try to use your suggested regex, Splunk gives me an "Invalid regex: syntax error". I am sure it is something I am typing incorrectly. The commas are a constant in the log. That is why I was attempting to use the second comma as the end point for the extraction. Where in my field extraction do I plug in your suggested regex?

(?i) regnum:(?P{FIELDNAME}[^,]+)
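For the second-comma idea described above, one possible sketch is to repeat the "anything but a comma" class on both sides of a literal comma. This is not from any of the replies. It is an illustration in Python's re module (Splunk uses PCRE), it assumes the group is written as (?P<FIELDNAME>...), and it assumes the value always contains exactly one comma before the comma that ends it.

```python
import re

# Sample event copied from the question.
event = (
    "6/25/2014 15:05:12.724 | 18072 | EXCEPTION(V): PARN476_02HLOALP_RD:TF F "
    "RgstrData(0)(0): RegNum: 5.100.1, size of 0 bytes is invalid (-2147483638), "
    "RegisterMsg.cpp line 263 (class CRegisterFromDeviceMsg). "
    "Handled: RegisterMsg.cpp(class CRegisterFromDeviceMsg) line 269 |"
)

# [^,]+ runs up to the first comma; ",[^,]+" then consumes that comma and
# continues up to (but not including) the second comma.
second_comma_re = re.compile(r"(?i) regnum:\s*(?P<FIELDNAME>[^,]+,[^,]+)")

m = second_comma_re.search(event)
extracted = m.group("FIELDNAME") if m else None
print(extracted)
```

Because this version depends only on the commas, it does not need the literal word "line" or the file name to appear after the value.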
