Splunk Enterprise

Field Extraction

mbasharat
Contributor

Hi,

I have data set that is getting ingested from the source to Splunk. Using auto extraction for, fields are extracted as they should. In this data, I have a field name pluginText. This field contains a lot of information e.g. software installed on endpoints, updates installed etc. I need to extract this information from this field. Sample is below. What is the best approach? I need both from configuring field extraction for this in configs or in actual Splunk search using rex or eval.

pluginText: <plugin_output> The following software are installed on the remote host :
KB3171021 [version 12.2.5000.0] [installed on 2018/06/11]
Service Pack 3 for SQL Server 2014 (KB4022619) (64-bit) [version 12.3.6024.0] [installed on 2020/06/23] KB4052725 [version 12.2.5571.0] [installed on 2018/06/11]
Veritas NetBackup Client [version 8.1.2] [installed on 2020/05/18]
Windows Policy Checker 8.0.1 SQL Server 2014 Reporting Services [version 12.3.6024.0] [installed on 2020/06/23]
Microsoft Visual Studio 2015 Shell (Minimum) [version 14.0.23107] [installed on 2019/09/11]
Microsoft Visual Studio Tools for Applications 2015 Language Support - ENU Language Pack [version 14.0.23107.20] [installed on 2019/09/11]

The following updates are installed :
Microsoft .NET Framework 4 Multi-Targeting Pack : KB2504637 [version 1] [installed on 9/11/2019]
Microsoft Visual C++ 2010 x64 Redistributable - 10.0.40219 : KB2151757 [version 1] [installed on 6/8/2018]
KB2467173 [version 1] [installed on 6/8/2018] KB2565063 [version 1] [installed on 6/8/2018] KB982573 [version 1] [installed on 6/8/2018]
Microsoft Visual C++ 2010 x86 Redistributable - 10.0.40219 : KB2151757 [version 1] [installed on 6/11/2018]
KB2467173 [version 1] [installed on 6/11/2018] KB2565063 [version 1] [installed on 6/11/2018]
</plugin_output>

 

Thanks in-advance!!

Labels (1)
0 Karma
1 Solution

to4kawa
SplunkTrust
SplunkTrust
index=_internal |head 1 | fields _time _raw |eval _raw="pluginText: <plugin_output> The following software are installed on the remote host :
KB3171021 [version 12.2.5000.0] [installed on 2018/06/11]
Service Pack 3 for SQL Server 2014 (KB4022619) (64-bit) [version 12.3.6024.0] [installed on 2020/06/23] KB4052725 [version 12.2.5571.0] [installed on 2018/06/11]
Veritas NetBackup Client [version 8.1.2] [installed on 2020/05/18]
Windows Policy Checker 8.0.1 SQL Server 2014 Reporting Services [version 12.3.6024.0] [installed on 2020/06/23]
Microsoft Visual Studio 2015 Shell (Minimum) [version 14.0.23107] [installed on 2019/09/11]
Microsoft Visual Studio Tools for Applications 2015 Language Support - ENU Language Pack [version 14.0.23107.20] [installed on 2019/09/11]

The following updates are installed :
Microsoft .NET Framework 4 Multi-Targeting Pack : KB2504637 [version 1] [installed on 9/11/2019]
Microsoft Visual C++ 2010 x64 Redistributable - 10.0.40219 : KB2151757 [version 1] [installed on 6/8/2018]
KB2467173 [version 1] [installed on 6/8/2018] KB2565063 [version 1] [installed on 6/8/2018] KB982573 [version 1] [installed on 6/8/2018]
Microsoft Visual C++ 2010 x86 Redistributable - 10.0.40219 : KB2151757 [version 1] [installed on 6/11/2018]
KB2467173 [version 1] [installed on 6/11/2018] KB2565063 [version 1] [installed on 6/11/2018]
</plugin_output>"
| xmlkv
| rex mode=sed field=plugin_output "s/(?m)^.*following.*$//g"
| table plugin_output
| rex field=plugin_output max_match=0 "(?ms)(?<soft>\w.*?)\s\[version (?<version>[^\]]+)\]\s\[installed on (?<date>[^\]]+)\]"
| eval tmp=mvzip(soft,mvzip(version,date))
| stats count by tmp
| eval soft=mvindex(split(tmp,","),0), version=mvindex(split(tmp,","),1), date=mvindex(split(tmp,","),2)
| eval update=strftime(strptime(date,"%m/%d/%Y"),"%Y/%m/%d"), remote_install = if(isnotnull(update),NULL,date)
| table soft version remote_install update

View solution in original post

0 Karma

to4kawa
SplunkTrust
SplunkTrust
index=_internal |head 1 | fields _time _raw |eval _raw="pluginText: <plugin_output> The following software are installed on the remote host :
KB3171021 [version 12.2.5000.0] [installed on 2018/06/11]
Service Pack 3 for SQL Server 2014 (KB4022619) (64-bit) [version 12.3.6024.0] [installed on 2020/06/23] KB4052725 [version 12.2.5571.0] [installed on 2018/06/11]
Veritas NetBackup Client [version 8.1.2] [installed on 2020/05/18]
Windows Policy Checker 8.0.1 SQL Server 2014 Reporting Services [version 12.3.6024.0] [installed on 2020/06/23]
Microsoft Visual Studio 2015 Shell (Minimum) [version 14.0.23107] [installed on 2019/09/11]
Microsoft Visual Studio Tools for Applications 2015 Language Support - ENU Language Pack [version 14.0.23107.20] [installed on 2019/09/11]

The following updates are installed :
Microsoft .NET Framework 4 Multi-Targeting Pack : KB2504637 [version 1] [installed on 9/11/2019]
Microsoft Visual C++ 2010 x64 Redistributable - 10.0.40219 : KB2151757 [version 1] [installed on 6/8/2018]
KB2467173 [version 1] [installed on 6/8/2018] KB2565063 [version 1] [installed on 6/8/2018] KB982573 [version 1] [installed on 6/8/2018]
Microsoft Visual C++ 2010 x86 Redistributable - 10.0.40219 : KB2151757 [version 1] [installed on 6/11/2018]
KB2467173 [version 1] [installed on 6/11/2018] KB2565063 [version 1] [installed on 6/11/2018]
</plugin_output>"
| xmlkv
| rex mode=sed field=plugin_output "s/(?m)^.*following.*$//g"
| table plugin_output
| rex field=plugin_output max_match=0 "(?ms)(?<soft>\w.*?)\s\[version (?<version>[^\]]+)\]\s\[installed on (?<date>[^\]]+)\]"
| eval tmp=mvzip(soft,mvzip(version,date))
| stats count by tmp
| eval soft=mvindex(split(tmp,","),0), version=mvindex(split(tmp,","),1), date=mvindex(split(tmp,","),2)
| eval update=strftime(strptime(date,"%m/%d/%Y"),"%Y/%m/%d"), remote_install = if(isnotnull(update),NULL,date)
| table soft version remote_install update

View solution in original post

0 Karma

mbasharat
Contributor

I am looking into both responses and doing validations at my end. Will get back with you shortly. Just wanted to let you all know how much I appreciate your assistance ...... always! 

0 Karma

mbasharat
Contributor

Have been doing some validations and adjustments so apology for delay. I ended up using t4kawa's solution. Rich's solution is also good and I want to up-vote that but don't see an option in Splunk community.

0 Karma

richgalloway
SplunkTrust
SplunkTrust
Click the "thumbs-up" icon to up-vote a posting in this new forum.
---
If this reply helps you, an upvote would be appreciated.
0 Karma

richgalloway
SplunkTrust
SplunkTrust

This is a bit of a hack, but it will do the extractions at search time.  Index-time extraction is left as a exercise for the reader.  😉

| makeresults | eval pluginText="<plugin_output> The following software are installed on the remote host :
KB3171021 [version 12.2.5000.0] [installed on 2018/06/11]
Service Pack 3 for SQL Server 2014 (KB4022619) (64-bit) [version 12.3.6024.0] [installed on 2020/06/23] KB4052725 [version 12.2.5571.0] [installed on 2018/06/11]
Veritas NetBackup Client [version 8.1.2] [installed on 2020/05/18]
Windows Policy Checker 8.0.1 SQL Server 2014 Reporting Services [version 12.3.6024.0] [installed on 2020/06/23]
Microsoft Visual Studio 2015 Shell (Minimum) [version 14.0.23107] [installed on 2019/09/11]
Microsoft Visual Studio Tools for Applications 2015 Language Support - ENU Language Pack [version 14.0.23107.20] [installed on 2019/09/11]

The following updates are installed :
Microsoft .NET Framework 4 Multi-Targeting Pack : KB2504637 [version 1] [installed on 9/11/2019]
Microsoft Visual C++ 2010 x64 Redistributable - 10.0.40219 : KB2151757 [version 1] [installed on 6/8/2018]
KB2467173 [version 1] [installed on 6/8/2018] KB2565063 [version 1] [installed on 6/8/2018] KB982573 [version 1] [installed on 6/8/2018]
Microsoft Visual C++ 2010 x86 Redistributable - 10.0.40219 : KB2151757 [version 1] [installed on 6/11/2018]
KB2467173 [version 1] [installed on 6/11/2018] KB2565063 [version 1] [installed on 6/11/2018]
</plugin_output>" 
``` Above just creates text data"```
```Start by stripping out text that is not a plugin```
| rex mode=sed field=pluginText "s/\<plugin_output>.*\s:\n//"
| rex mode=sed field=pluginText "s/The following updates are installed\s://"
| rex mode=sed field=pluginText "s/\<\/plugin_output>//"
| rex mode=sed field=pluginText "s/\n{2,}//g"
| rex field=pluginText "(?<software>[\s\S]+)"
```Now parse the plugin parts into fields```
| rex field=software max_match=0 "(?<package>[^\[]+) \[version\s(?<version>[^\]]+)] \[installed on (?<installedOn>[^\]]+)]\s*"
`` Assemble the 3 multi-value fields into a single multi-value field then expand the result into separate events and break the events up again```
| eval packages = mvzip(package,mvzip(version, installedOn)) | mvexpand packages | eval packages=split(packages,",") 
```Pull the individual fields out of the multi-value field
| eval package=mvindex(packages,0), version=mvindex(packages,1), installedOn=mvindex(packages,2)
| table package, version, installedOn
---
If this reply helps you, an upvote would be appreciated.