Splunk Search

How to edit my configurations to extract a multivalue field from an extracted field?

reedmohn
Communicator

I am trying to extract fields for OpenDNS logs.
These come in a CSV format:

  "2015-01-01 20:39:57","client1","client1,site1","1.1.1.1","2.2.2.2","Allowed","1 (A)","NOERROR","www.google.com.","Search Engines"

The challenge here is that fields "identities" and "categories" are often multi-valued (also comma-separated).
I went off the idea from here: https://answers.splunk.com/answers/112311/multi-value-field-extraction.html

  1. Extract all the main fields
  2. Do a second transform to extract the multi-values

The first part works fine:

**props.conf:**
[opendns:dnslog]
REPORT-opendns-fields = opendns_aws_s3

**transforms.conf:**
[opendns_aws_s3]
DELIMS = ","
FIELDS = timestamp,granular_id,identities,internal_ip,external_ip,action,query_type,resp_code,domain,categories
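
As a rough illustration of what this first transform yields, here is a minimal Python sketch. It assumes, as reported above, that quoted values containing commas stay together; Python's `csv` module only stands in for Splunk's delimiter handling, and `extract_main_fields` is a hypothetical helper name.

```python
import csv

# Field names copied from the FIELDS line above.
FIELD_NAMES = [
    "timestamp", "granular_id", "identities", "internal_ip", "external_ip",
    "action", "query_type", "resp_code", "domain", "categories",
]

def extract_main_fields(raw_event):
    """Split one CSV event on commas outside quotes and pair values with names."""
    values = next(csv.reader([raw_event]))
    return dict(zip(FIELD_NAMES, values))

if __name__ == "__main__":
    sample = ('"2015-01-01 20:39:57","client1","client1,site1","1.1.1.1",'
              '"2.2.2.2","Allowed","1 (A)","NOERROR","www.google.com.",'
              '"Search Engines"')
    fields = extract_main_fields(sample)
    # "identities" is still one comma-separated string at this point.
    print(fields["identities"])
    print(fields["categories"])
```

The multi-valued fields come out as single strings, which is why a second step is needed.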

But this does not yet split "identities" and "categories" into separate values.
So I added a second transform, to work on the categories field:

**props.conf:**
[opendns:dnslog]
REPORT-opendns-fields = opendns_aws_s3
REPORT-opendns-category = opendns_aws_s3_category

**transforms.conf:**
[opendns_aws_s3_category]
SOURCE_KEY=categories
DELIMS = ","
FIELDS = category
MV_ADD=true

Here I did something wrong, because this isn't working. I get no new field named "category", and the "categories" field is unchanged.
Should I maybe not have added the FIELDS= entry? This was to name the new field. But that was perhaps not a good idea?
How else can I name this as a new field?


woodcock
Esteemed Legend

Try this:

props.conf:

[opendns:dnslog]
REPORT-opendns-fields = opendns_aws_s3, opendns_aws_s3_category

transforms.conf:

[opendns_aws_s3]
DELIMS = ","
FIELDS = timestamp,granular_id,identities,internal_ip,external_ip,action,query_type,resp_code,domain,categories

[opendns_aws_s3_category]
SOURCE_KEY=categories
REGEX = ([^,]+)(?:,|$)
FORMAT = category::$1
MV_ADD=true
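
To see what the REGEX line above matches, here is a small Python sketch. Python's `re` module is only a stand-in for Splunk's regex engine, and `split_categories` is a hypothetical helper; the sample values are the ones quoted elsewhere in this thread.

```python
import re

# Same pattern as the REGEX line in the transform above.
CATEGORY_REGEX = re.compile(r"([^,]+)(?:,|$)")

def split_categories(categories):
    """Return every capture-group match, one per comma-separated value.

    This mimics MV_ADD=true, which appends each match as another value
    of the same field instead of keeping only the first one.
    """
    return CATEGORY_REGEX.findall(categories)

if __name__ == "__main__":
    print(split_categories("Adult Themes,Nudity,Pornography,Sexuality"))
    # ['Adult Themes', 'Nudity', 'Pornography', 'Sexuality']
    print(split_categories("Search Engines"))
    # ['Search Engines']
```

Values with spaces or slashes survive intact, because the capture group only stops at a comma.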

koshyk
Super Champion

Is the "categories" field split by double quote, comma, double quote, or just by a comma? A few more examples with multiple values would be great.

reedmohn
Communicator

It's just the comma. Only the original field is enclosed in quotes.

Values vary a lot, some domains fit into 4-5 categories. Actual values may contain spaces and slashes.
Could be stuff like:
"Software/Technology,Business Services" (2 categories)
"Adult Themes,Nudity,Pornography,Sexuality" (4 categories)

(disappointingly, that last one shows up frequently just because we have a monitor running to confirm the filter is in place... sad, I know 😉 )

reedmohn
Communicator

This was solved by clearing up the props.conf stanza:

This doesn't work:

[opendns:dnslog]
REPORT-opendns-fields = opendns_aws_s3
REPORT-opendns-category = opendns_aws_s3_category

This works:

[opendns:dnslog]
REPORT-opendns-fields = opendns_aws_s3, opendns_aws_s3_category 

Thanks to woodcock for the right syntax.

woodcock
Esteemed Legend

Yes, otherwise they are processed in alphabetical order of the class name, and your order was wrong ("c" comes before "f"), so the category transform ran before the "categories" field existed.
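
The ordering described here can be sketched as a toy model in Python. This is not Splunk's implementation; it only assumes that REPORT classes run in sorted order of their class names and that transforms within one class run in the order listed. All function names are hypothetical stand-ins.

```python
def extract_main(fields):
    # Stand-in for opendns_aws_s3: pretend DELIMS produced "categories".
    fields["categories"] = "Software/Technology,Business Services"

def extract_category(fields):
    # Stand-in for opendns_aws_s3_category: SOURCE_KEY must already exist.
    if "categories" in fields:
        fields["category"] = fields["categories"].split(",")

TRANSFORMS = {
    "opendns_aws_s3": extract_main,
    "opendns_aws_s3_category": extract_category,
}

def run_reports(report_classes, transforms):
    """Run classes in sorted name order, transforms in listed order."""
    fields = {}
    for class_name in sorted(report_classes):
        for transform_name in report_classes[class_name]:
            transforms[transform_name](fields)
    return fields

# Two REPORT lines: "opendns-category" sorts before "opendns-fields".
BROKEN = {
    "opendns-fields": ["opendns_aws_s3"],
    "opendns-category": ["opendns_aws_s3_category"],
}
# One REPORT line listing both transforms in the intended order.
WORKING = {"opendns-fields": ["opendns_aws_s3", "opendns_aws_s3_category"]}

if __name__ == "__main__":
    print("category" in run_reports(BROKEN, TRANSFORMS))   # False
    print(run_reports(WORKING, TRANSFORMS)["category"])
```

Under these assumptions, renaming the classes so that they sort in the intended order would have the same effect as putting both transforms on one line.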


reedmohn
Communicator

Thanks, I'll run that.
I expect regex will do the trick.
I was kind of hoping that, since Splunk has a built-in mechanism for handling delimited values, it would be the obvious and most efficient choice.

reedmohn
Communicator

Problem solved: I found the answer in your post, but in a different part than you might've intended...
I changed the props.conf stanza so that both transforms were on the same line.

That did it!

So, thanks for clearing up my syntax mistake 🙂
