Splunk Search

How to normalize field values that differ only slightly? Regex? Match? Replace?

UMDTERPS
Communicator

Hi! 👨‍💼


I am a little stuck on how to normalize the "Operating System" data I have. We have a field called "Operating System", and its values look something like this:

 

Operating System

Windows 10        Enterprise
Windows 10 
Windows 10 enterprise 
Windows 10 
windows 10 
windows 10 20H2
Windows 10 V2004
windows 10 2004
Windows Server 
windows server
RHEL8
RHEL 8
rhel8
rhel 8
rhel 8.6
Linux Server rel 8
Windows 2012r2
Windows Server 2012 R2
Windows Server 2012

 

A stats count on this field returns 170+ distinct operating systems because the data isn't normalized. What is the most efficient way to normalize it without writing 170+ "replace" or "match" statements?

For example, how would I make the following just "RHEL 8": 

RHEL8
RHEL 8
rhel8
rhel 8
rhel 8.6
Linux Server rel 8

Thanks!


ITWhisperer
SplunkTrust

What do you mean by "normalize"? Do you want all the Windows * operating systems to be simply Windows, and all the rest to be *nix for example?


UMDTERPS
Communicator

For example, how would I make the following just "RHEL 8": 

RHEL8
RHEL 8
rhel8
rhel 8
rhel 8.6
Linux Server rel 8

For example, how would I make the following just "Windows 10":

 Windows 10 Enterprise
Windows 10
Windows 10 enterprise
Windows 10
windows 10
windows 10 20H2
Windows 10 V2004
windows 10 2004

For example, how would I make the following just "Windows Server 2012":

Windows 2012r2
Windows Server 2012 R2
Windows Server 2012

Etc...

Thanks


ITWhisperer
SplunkTrust
| eval os=case(
    match(os,"(?i)rhel\s*8[\d\.]*"),"RHEL 8",
    match(os,"Linux Server rel 8"),"RHEL 8",
    match(os,"(?i)\s*windows\s10.*"),"Windows 10",
    match(os,"Windows (|Server )2012.*"),"Windows Server 2012",
    1==1,os)

and so on
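
A self-contained way to try this pattern against the sample values from the question is sketched below. It is only an illustration: the `normalized` field name and the "Solaris 11" row (added to show the fall-through branch) are assumptions, and the patterns are slightly loosened variants of the ones above rather than the exact originals.

```spl
| makeresults
| eval os=split("RHEL8|RHEL 8|rhel8|rhel 8|rhel 8.6|Linux Server rel 8|windows 10 20H2|Windows 2012r2|Solaris 11", "|")
| mvexpand os
| eval normalized=case(
    match(os, "(?i)rhel\s*8"), "RHEL 8",
    match(os, "(?i)linux server rel 8"), "RHEL 8",
    match(os, "(?i)windows\s+10"), "Windows 10",
    match(os, "(?i)windows\s+(server\s+)?2012"), "Windows Server 2012",
    true(), os)
| table os normalized
```

Because case() returns the value for the first condition that is true, more specific patterns should come before more general ones, and the final true() branch leaves any unmatched value unchanged so it is easy to spot which ones still need a rule.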

UMDTERPS
Communicator

It doesn't seem to be working. The field we have for OS is called "Operating System", and there is one entry that is "RHEL 8". I tried the following SPL:

 

|eval "os"=case(match("os","RHEL 8"),"RHEL 8")
|fields ip "system" os

 


The search runs with no errors, but it returns nothing for "os":

IP                         system    os
192.168.1.1      ABC 

"os" is blank, any ideas?

Thanks!


ITWhisperer
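SplunkTrust

If your field name has spaces in it, you need to enclose it in single quotes, not double quotes. In an eval expression, double quotes create a string literal, so match("os","RHEL 8") tests the literal text os rather than the value of the field.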
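
The difference between the two quoting styles can be seen in a small throwaway search. This is a sketch only: `literal_test` and `field_test` are hypothetical field names introduced for the comparison.

```spl
| makeresults
| eval "Operating System"="RHEL 8"
| eval literal_test=if(match("os", "RHEL 8"), "matched", "no match")
| eval field_test=if(match('Operating System', "RHEL 8"), "matched", "no match")
| table "Operating System" literal_test field_test
```

Here literal_test compares the string os against the pattern, while field_test reads the value of the Operating System field. Applied to the search in the previous post, the eval would then look something like this, assuming the source field really is named Operating System:

```spl
| eval os=case(match('Operating System', "(?i)rhel\s*8"), "RHEL 8", true(), 'Operating System')
| fields ip system os
```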

UMDTERPS
Communicator

Ahh Yes! Thanks!  It works now!  Karma Granted!
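
One way to check progress while adding rules is to rerun the stats count from the original question on the normalized field and watch the number of distinct values shrink. The sketch below is meant to be appended to your own base search and assumes the source field is named Operating System; only two of the rules are shown.

```spl
| eval os=case(
    match('Operating System', "(?i)rhel\s*8"), "RHEL 8",
    match('Operating System', "(?i)windows\s+10"), "Windows 10",
    true(), 'Operating System')
| stats count by os
| sort - count
```

Values that still appear in several spellings in the output are the ones that need another match() rule.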
