Splunk Search

Automatically generating multivalue fields in generic Key=Value logs

JustinSC
Explorer

I've got a situation that I thought I understood but clearly don't. I have logs that look like this:

2021-11-22 14:00:00 Event=InventoryComplete ComputerName=Server1 ComputerName=Server2 ComputerName=ServerN

I thought that ComputerName would automatically be a multivalue field because there are multiple copies of that Key=Value pair, and that I'd be able to search on any of the values. I thought there were instances where this works automatically, but it isn't working right now.

| search sourcetype=inventory_audit ComputerName=Server1 ```works```
| search sourcetype=inventory_audit ComputerName=Server2 ```no results```
| search sourcetype=inventory_audit "ComputerName=Server2" ``` forcing text search works```

Is there something I can do to make these events implicitly multivalue? Ideally for the entire sourcetype regardless of the specific field name, as this sourcetype covers a wide variety of audit logs with different object classes.


richgalloway
SplunkTrust

As you've discovered, Splunk does not default to using multi-value fields.  Once a field has a value, it is not replaced or appended unless specifically requested.

To request it, define a transform like this:

[mytransform]
REGEX = (\S+)=(\S+)
FORMAT = $1::$2
REPEAT_MATCH = true

then invoke the transform in props.conf.
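
For reference, here is a minimal sketch of how the two files might fit together for a search-time extraction. The stanza name inventory_kv_mv and the REPORT class name are made up for illustration; the sourcetype is the one from the question. One assumption worth checking against the transforms.conf spec for your version: REPEAT_MATCH is documented there as an index-time setting, while MV_ADD is the search-time setting that appends repeated values to a field instead of discarding them, so the sketch uses MV_ADD.

```
# transforms.conf (stanza name is hypothetical)
[inventory_kv_mv]
REGEX = (\S+)=(\S+)
FORMAT = $1::$2
# Search-time: append repeated values to the field instead of dropping them
MV_ADD = true

# props.conf (sourcetype taken from the question)
[inventory_audit]
REPORT-inventory_kv_mv = inventory_kv_mv
```

Because the regex captures the field name as well as the value, this should apply to any repeated key in the sourcetype, not just ComputerName.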

This may not solve your problem, however, since many SPL commands don't work with multi-value fields.  You may have to modify your queries.

---
If this reply helps you, Karma would be appreciated.

JustinSC
Explorer

Thanks! Thinking about it further, in the cases where this seemed to work I must have been searching after something like:

| rex max_match=0 "ComputerName=(?<ComputerName>.+?)\b"

What is the $s in the FORMAT example you gave? Was that meant to be $2?

I should have also mentioned that I have control of the scripts writing these logs and could write out the computer names in a denser format. I could also just generate more distinct events (repeat the event per computer name instead of shoving them all into one event), but I'm not sure which is worse: requiring the regex in the transform or generating more log data.

This gave me an idea too; perhaps I could call out the fields I want parsed as multivalue by giving them a suffix like ComputerName[]=Server1 ComputerName[]=Server2; unless there's a smarter way.


richgalloway
SplunkTrust

Yes, "$s" should have been "$2".  Thanks for catching that.  I've updated my answer.

If you have control over how the events are generated then I suggest generating them in a way that best fits how you plan to use the data (without painting yourself into a corner).  I prefer to have one event represent one thing that happened in one place.  If the thing happens in many places then many events would be generated.
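
As a rough illustration of the one-event-per-place approach, a logging script could expand the list of computers into separate Key=Value lines. This is only a sketch: the function name per_computer_events is hypothetical, and Python is assumed here because the thread does not say what language the scripts are written in.

```python
def per_computer_events(timestamp, event, computer_names):
    """Return one Key=Value log line per computer name."""
    return [
        f"{timestamp} Event={event} ComputerName={name}"
        for name in computer_names
    ]


if __name__ == "__main__":
    lines = per_computer_events(
        "2021-11-22 14:00:00",
        "InventoryComplete",
        ["Server1", "Server2", "ServerN"],
    )
    for line in lines:
        print(line)
```

Each line then carries a single ComputerName value, so the default Key=Value extraction is enough and no multivalue handling is needed, at the cost of more events.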

Another option is to log the events in JSON format, which better handles multi-value fields.
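
A minimal sketch of what JSON output from the logging script could look like, again assuming Python and using a hypothetical helper named format_inventory_event. The key names mirror the fields in the original event; the "timestamp" key is an assumption.

```python
import json
from datetime import datetime, timezone


def format_inventory_event(event, computer_names, now=None):
    """Render one audit event as a single-line JSON string."""
    now = now or datetime.now(timezone.utc)
    record = {
        "timestamp": now.strftime("%Y-%m-%d %H:%M:%S"),
        "Event": event,
        # A JSON array keeps all values under one key
        "ComputerName": list(computer_names),
    }
    return json.dumps(record)


if __name__ == "__main__":
    print(format_inventory_event(
        "InventoryComplete", ["Server1", "Server2", "ServerN"]
    ))
```

On the Splunk side this relies on JSON field extraction being enabled for the sourcetype (for example KV_MODE = json in props.conf). As far as I know, array values are then extracted under a name with a {} suffix, such as ComputerName{}, so searches would need to use that name.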
