Automatically generating multivalue fields in generic Key=Value logs

JustinSC
Explorer

I've got a situation that I thought I understood but clearly don't. I have logs that look like this:

2021-11-22 14:00:00 Event=InventoryComplete ComputerName=Server1 ComputerName=Server2 ComputerName=ServerN

I thought that ComputerName would automatically be a multivalue field because that Key=Value pair appears multiple times, and that I'd be able to search on any of the values. I thought there were instances where this works automatically, but it isn't working right now.

| search sourcetype=inventory_audit ComputerName=Server1 ```works```
| search sourcetype=inventory_audit ComputerName=Server2 ```no results```
| search sourcetype=inventory_audit "ComputerName=Server2" ``` forcing text search works```

Is there something I can do to make these events implicitly multivalue? Ideally for the entire sourcetype regardless of the specific field name, as this sourcetype covers a wide variety of audit logs with different object classes.

richgalloway
SplunkTrust

As you've discovered, Splunk does not default to using multi-value fields.  Once a field has a value, it is not replaced or appended unless specifically requested.

To request it, define a transform like this:

[mytransform]
REGEX = (\S+)=(\S+)
FORMAT = $1::$2
MV_ADD = true

then invoke the transform in props.conf.
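
For a search-time extraction, the props.conf side might look like the sketch below. The stanza name assumes the sourcetype inventory_audit from the question, and the class name mv_kv is an arbitrary label chosen for illustration.

```
[inventory_audit]
# Search-time field extraction that calls the transform defined in transforms.conf
REPORT-mv_kv = mytransform
```

Because this is applied at search time, it would typically live on the search head and applies to events that are already indexed as well as new ones.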

This may not solve your problem, however, since many SPL commands don't work with multi-value fields.  You may have to modify your queries.
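
If a particular command needs one value per result, one common workaround is to split the multivalue field with mvexpand first. A minimal sketch, assuming ComputerName has already been extracted as a multivalue field:

```
sourcetype=inventory_audit Event=InventoryComplete
| mvexpand ComputerName
| table _time Event ComputerName
```

Each original event becomes one result per ComputerName value, which most commands can then handle as an ordinary single-value field.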

---
If this reply helps you, Karma would be appreciated.

JustinSC
Explorer

Thanks! As I thought about it further I must have been doing a search after something like...

| rex max_match=0 "ComputerName=(?<ComputerName>.+?)\b"

What is the $s in the FORMAT example you gave? Was that meant to be $2?

I should have also mentioned that I control the scripts writing these logs and could write out the computer names in a denser format. I could also generate more distinct events (repeating the event per computer name instead of packing them all into one event), but I'm not sure which is worse: requiring the regex in the transform or generating more log data.

This gave me an idea too; perhaps I could call out the fields I want parsed as multivalue by giving them a suffix like ComputerName[]=Server1 ComputerName[]=Server2; unless there's a smarter way.

richgalloway
SplunkTrust

Yes, "$s" should have been "$2".  Thanks for catching that.  I've updated my answer.

If you have control over how the events are generated then I suggest generating them in a way that best fits how you plan to use the data (without painting yourself into a corner).  I prefer to have one event represent one thing that happened in one place.  If the thing happens in many places then many events would be generated.

Another option is to log the events in JSON format, which better handles multi-value fields.
