Extracting fields from a large text field

psomeshwar
Path Finder

I have a field called pluginText with the following content (italicized words are placeholders for the values they represent):

<plugin_output>
The following software are installed on the remote host:

Vendor Software  [version versionnumber] [installed on date]
...
...
...
</plugin_output>

I want to extract Vendor, Software, and versionnumber into separate fields and need a rex to do so. I am unfamiliar with using rex on this type of list, so I was hoping someone could point me in the right direction.


marnall
Motivator

I would highly recommend the website https://regex101.com/ as it allows you to see previews of your regex extractions as you write them. 

This regex might work:

on the remote host:\n\n(?<Vendor>[^\[\s]*)\s(?<Software>[^\[\s]*)\s*\[version\s(?<Version>[^\]]*)\]\s\[installed on (?<Date>[^\]]*)\]
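
To check this pattern outside Splunk, a minimal Python sketch follows. The sample text (vendor "Acme", software "Widget", and the version and date values) is made up for illustration, and the named groups are rewritten as `(?P<name>...)` because Python's `re` module does not accept the `(?<name>...)` form that Splunk uses.

```python
import re

# Hypothetical sample: the vendor, software, version, and date are invented.
SAMPLE = (
    "<plugin_output>\n"
    "The following software are installed on the remote host:\n"
    "\n"
    "Acme Widget  [version 1.2.3] [installed on 2020/01/15]\n"
    "</plugin_output>"
)

# Same pattern as above, with (?<name>...) rewritten as (?P<name>...).
PATTERN = re.compile(
    r"on the remote host:\n\n(?P<Vendor>[^\[\s]*)\s(?P<Software>[^\[\s]*)\s*"
    r"\[version\s(?P<Version>[^\]]*)\]\s\[installed on (?P<Date>[^\]]*)\]"
)


def extract_first(text):
    """Return the captured fields as a dict, or None when nothing matches."""
    match = PATTERN.search(text)
    return match.groupdict() if match else None


print(extract_first(SAMPLE))
# {'Vendor': 'Acme', 'Software': 'Widget', 'Version': '1.2.3', 'Date': '2020/01/15'}
```

Because the pattern is anchored on the header line, it picks up only the first entry in the list, and it assumes that the vendor and the software name are each a single word.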

 


richgalloway
SplunkTrust

@marnall has better eyes than I do and spotted the mix of italics and non-italics in the bracketed text. The final regex will likely be a combination of our suggestions.
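
One way such a combination might look is sketched below in Python. This pattern is not taken from either reply: it assumes the vendor is a single word, the software name may contain spaces, and each entry sits on its own line. The sample entries are invented, and `(?P<name>...)` is used because Python's `re` module does not accept `(?<name>...)`.

```python
import re

# Hypothetical sample with two entries; all names and values are invented.
SAMPLE = (
    "<plugin_output>\n"
    "The following software are installed on the remote host:\n"
    "\n"
    "Acme Widget  [version 1.2.3] [installed on 2020/01/15]\n"
    "Initech Report Builder [version 4.0] [installed on 2021/03/02]\n"
    "</plugin_output>"
)

# Assumption: first word is the vendor, the rest up to "[version" is the software.
# [ \t]+ (rather than \s+) keeps every match on a single line.
ENTRY = re.compile(
    r"^(?P<Vendor>\S+)[ \t]+(?P<Software>[^\[\n]+?)[ \t]+"
    r"\[version[ \t]+(?P<Version>[^\]]+)\][ \t]+"
    r"\[installed on[ \t]+(?P<Date>[^\]]+)\]",
    re.MULTILINE,
)


def extract_all(text):
    """Return one dict of captured fields per matching line."""
    return [m.groupdict() for m in ENTRY.finditer(text)]


for row in extract_all(SAMPLE):
    print(row)
# {'Vendor': 'Acme', 'Software': 'Widget', 'Version': '1.2.3', 'Date': '2020/01/15'}
# {'Vendor': 'Initech', 'Software': 'Report Builder', 'Version': '4.0', 'Date': '2021/03/02'}
```

In Splunk, the analogous knobs would probably be the `(?m)` inline flag and the `max_match` option of rex, which returns multivalue fields; that part is untested here.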


richgalloway
SplunkTrust

This regular expression works in regex101.com using the sample data.

| rex field=pluginText "host:\s+(?<vendorSoftware>.+?)\s+\[(?<version>[^\]]+)] \[(?<installedDate>[^\]]+)"

It looks for the "host:" introductory text and skips the whitespace that follows. The next run of text (terminated by whitespace before a left bracket) is captured as vendorSoftware, the combined vendor and software name. The text inside the two sets of brackets becomes version and installedDate, respectively.
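
To see what each group captures, the same pattern can be run in Python against made-up data, as in this minimal sketch. The sample values are hypothetical, and the groups use `(?P<name>...)` because Python's `re` module does not accept `(?<name>...)`.

```python
import re

# Hypothetical sample: the vendor, software, version, and date are invented.
SAMPLE = (
    "<plugin_output>\n"
    "The following software are installed on the remote host:\n"
    "\n"
    "Acme Widget  [version 1.2.3] [installed on 2020/01/15]\n"
    "</plugin_output>"
)

# Same pattern as the rex above, with Python-style named groups.
PATTERN = re.compile(
    r"host:\s+(?P<vendorSoftware>.+?)\s+\[(?P<version>[^\]]+)] \[(?P<installedDate>[^\]]+)"
)


def extract(text):
    """Return the captured fields as a dict, or None when nothing matches."""
    match = PATTERN.search(text)
    return match.groupdict() if match else None


print(extract(SAMPLE))
# {'vendorSoftware': 'Acme Widget', 'version': 'version 1.2.3',
#  'installedDate': 'installed on 2020/01/15'}
```

Note that the two bracket captures keep the literal words "version" and "installed on", and that vendor and software stay together in one field.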
