Solved: Regex Extraction

leonheart78 · ‎08-15-2016

Hi,

I encountered an error while trying to use Splunk Web to extract the bolded field below (the first token of each Remark value):

Remark="B78OH30% OV- 22V1797-2 Open"
Remark="F2O2-1 OV- 102V1723-4 Open"
Remark="BSC794 OV- 32V1415-4 Open"
Remark="F2SO4-1 OV- 101V2023-1 Open"

However, I keep getting "Extraction failed" in the process.
How can I extract just the required field?
Thank you.

javiergn · ‎08-15-2016

Use rex instead:

| rex field=Remark "^(?<key>\S+)"

Example:

| makeresults | fields - _time
| eval Remark = "B78OH30% OV- 22V1797-2 Open;F2O2-1 OV- 102V1723-4 Open;BSC794 OV- 32V1415-4 Open;F2SO4-1 OV- 101V2023-1 Open"
| eval Remark = split(Remark, ";")
| mvexpand Remark
| rex field=Remark "^(?<key>\S+)"
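
For anyone who wants to sanity-check the pattern outside Splunk, here is a minimal Python sketch that applies the same regex to the four sample values from the question. It only exercises the regular expression, not the rex command itself, and Python spells named groups as (?P<key>...) rather than (?<key>...). The variable names are illustrative only.

```python
import re

# Sample Remark values copied from the question.
remarks = [
    "B78OH30% OV- 22V1797-2 Open",
    "F2O2-1 OV- 102V1723-4 Open",
    "BSC794 OV- 32V1415-4 Open",
    "F2SO4-1 OV- 101V2023-1 Open",
]

# Same pattern as the rex command: everything from the start of the
# value up to (not including) the first whitespace character.
pattern = re.compile(r"^(?P<key>\S+)")

keys = [pattern.match(remark).group("key") for remark in remarks]
print(keys)  # ['B78OH30%', 'F2O2-1', 'BSC794', 'F2SO4-1']
```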

michael_sleep · ‎08-22-2016

If you are trying to capture everything after Remark=" and before the first space then you could use the following search time extraction:

^Remark\=\"(?P<Remark>[^\s]+)
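
A quick way to see what this pattern captures is to run it in Python against one raw line from the question. This is only a sketch of the regex behaviour, not of Splunk's extraction engine, and it assumes the raw event starts with Remark= (the leading ^ anchor will not match otherwise).

```python
import re

# One raw line as posted in the question (assumed to start the event).
raw_event = 'Remark="B78OH30% OV- 22V1797-2 Open"'

# The pattern from the post, unchanged; (?P<name>...) is valid in Python too.
pattern = re.compile(r'^Remark\=\"(?P<Remark>[^\s]+)')

match = pattern.search(raw_event)
first_token = match.group("Remark") if match else None
print(first_token)  # B78OH30%

# With other text in front of Remark=, the ^ anchor prevents a match.
shifted = pattern.search("other text " + raw_event)
print(shifted)  # None
```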

michael_sleep · ‎08-23-2016

I see this is answered now, but I didn't put my regex in code format so it was all messed up... if you use the above as a search-time extraction it will work fine though.

leonheart78 · ‎08-15-2016

Unfortunately, the data has other values, so I will not be able to use the eval command to list them all out. Is there any way to extract the value at index time?

Thank you.

Richfez · ‎08-15-2016

You have two questions. In order, the answers are:

Javiergn's example is what's called a "run anywhere" example that we/you/anyone can test with. If you use his first code sample, | rex field=Remark "^(?<key>\S+)", it should extract from your field Remark a new field called "key" (call it whatever you want) containing everything in Remark up to the first space. The second block of code was just the "proof" that it works, which anyone can run.

As to the latter, please read the docs on "Creating custom fields at index time" carefully. One generally does not need the field created at index time; most of the time you just need it extracted "automatically" on the back end so that your fields show up without you having to rex them every time. That's done with one of two settings (REPORT-something or EXTRACT-something) in props.conf, which is what I think you really want.

To make an extraction that "always" happens, find the stanza for the sourcetype involved in your $SPLUNK_HOME/etc/system/local/props.conf and add this to it:

EXTRACT-key-from-remark = ^(?<key>\S+) in Remark

Try that, and once you get it working (be sure to restart Splunk, and ask if you have problems!) you can decide if it works well enough for your needs. If after that you really feel it should be index-time instead of just "always available", you can move the above into transforms.conf and refer to it from props.conf with a TRANSFORMS-key-from-remark, but we can cross that bridge if you need it.

Does that help?

sowings · ‎08-15-2016

The EXTRACT bit shown above uses the "in <field>" syntax, which requires that the field already be extracted before this regex fires. The problem is that the automatic key=value recognition that Splunk does (governed by the KV_MODE setting) runs after EXTRACT statements. You'd first have to write an extraction "EXTRACT-0_get_remark" with a value like Remark=\"(?<Remark>[^\"]+)\" (ish).
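
The ordering point can be illustrated with a small Python simulation. It is only a sketch: the function and its remark_first flag are hypothetical, it does not model Splunk's actual search-time pipeline, and the group name Remark in the first pattern is an assumption. The idea it shows is that the "key" regex has nothing to run against unless the Remark field has been extracted from the raw event first.

```python
import re

# Step 1 (assumed pattern): pull the quoted Remark value out of the raw event.
REMARK_RE = re.compile(r'Remark=\"(?P<Remark>[^\"]+)\"')
# Step 2: take everything in Remark up to the first whitespace.
KEY_RE = re.compile(r"^(?P<key>\S+)")


def extract(raw_event, remark_first=True):
    """Simulate two chained extractions; returns a dict of fields."""
    fields = {}
    if remark_first:
        m = REMARK_RE.search(raw_event)
        if m:
            fields["Remark"] = m.group("Remark")
    # The second extraction reads from the Remark field, so it can only
    # run if that field already exists.
    if "Remark" in fields:
        m = KEY_RE.match(fields["Remark"])
        if m:
            fields["key"] = m.group("key")
    return fields


raw = 'Remark="F2O2-1 OV- 102V1723-4 Open"'
with_remark = extract(raw)
without_remark = extract(raw, remark_first=False)
print(with_remark)     # {'Remark': 'F2O2-1 OV- 102V1723-4 Open', 'key': 'F2O2-1'}
print(without_remark)  # {}
```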

Richfez · ‎08-16-2016

Oh, right, thanks. Good catch! And right after I got done explaining about the run-anywhere example, I went and confused it with a production sample. Bad Rich, Bad. 😞

michael_sleep · ‎08-23-2016

Is there any reason to use index-time extractions over search-time? Search-time extractions are better in 99% of cases. This seems like an overly complicated solution for something that could be done with just a search-time extraction:

^Remark\=\"(?P<Remark>[^\s]+)
