Getting Data In

Lookup tables: 2 source fields aliased to match 1 lookup field

borgy95
Path Finder

I have some very large lookup tables of known bad domains (4M+ entries).

The lookup has a field called 'kap_chk' that I want to match against two search-time extractions: one is kap_uri, the other is kap_tld.
To get them both to match, I figured I'd alias both of them to kap_chk.

Therefore my props.conf and transforms.conf would look like this:
props.conf

[access_combined_wcookie]
REPORT-kap_uri = kapersky_uri
REPORT-kap_tld = kapersky_tld
FIELDALIAS-kapersky_uri = kap_uri AS kap_chk
FIELDALIAS-kapersky_tld = kap_tld AS kap_chk

transforms.conf

[ss2url_lookup]
filename = ss2url_lookup.csv
case_sensitive_match = false

[ss1url_lookup]
filename = ss1url_lookup.csv
case_sensitive_match = false

[kapersky_uri]
SOURCE_KEY = uri
REGEX = (?:( http\:\/\/www\.|\w+:\/\/|www\.|)(?<kap_uri>.+))

[kapersky_tld]
SOURCE_KEY = uri
REGEX = (http|https|ftp)?(:\/\/)?(www\.)?(?<kap_tld>.+?)(\/|:).*

In your opinion, is this the correct way to go about achieving this?
In the end it should mean I can do the following:

sourcetype="access_combined_wcookie" AND kap_chk="*" | dedup kap_chk | lookup ss1url_lookup kap_chk OUTPUT masktype maskid kap_chk | table masktype, maskid, kap_chk

thanks

1 Solution

woodcock
Esteemed Legend

You have not stated your goal, but I assume it is this: look up both kap_uri and kap_tld, and if the two are the same, do the lookup only once. If so, this will work efficiently:

sourcetype="access_combined_wcookie" | eval kap_chk=if(kap_uri==kap_tld,kap_uri,kap_uri . ":::" . kap_tld) | makemv delim=":::" kap_chk | mvexpand kap_chk | dedup kap_chk | lookup ss1url_lookup kap_chk OUTPUT masktype maskid kap_chk | table masktype, maskid, kap_chk


borgy95
Path Finder

My aim is to run search-time extractions on the uri field to produce the custom fields kap_uri and kap_tld. I would then like to alias both fields to kap_chk so that each can be matched against the lookup table's kap_chk field.
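
For illustration only, one way to match both extracted fields against the lookup's kap_chk column without any alias is to name the event field in the lookup command itself with AS. This is a sketch, not something proposed in this thread: the output names uri_masktype, uri_maskid, tld_masktype, and tld_maskid are hypothetical, chosen so the second lookup does not overwrite the results of the first, and it has not been tested against a 4M-row table.

```
sourcetype="access_combined_wcookie"
| lookup ss1url_lookup kap_chk AS kap_uri OUTPUT masktype AS uri_masktype maskid AS uri_maskid
| lookup ss1url_lookup kap_chk AS kap_tld OUTPUT masktype AS tld_masktype maskid AS tld_maskid
| table kap_uri, uri_masktype, uri_maskid, kap_tld, tld_masktype, tld_maskid
```

Unlike the deduplicated approach, this performs two lookups per event, so it trades efficiency for keeping each event on a single row.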


woodcock
Esteemed Legend

The documentation says:

Note: Splunk Enterprise's field aliasing functionality does not currently support multivalue fields.

This probably also means that if your events have both fields, kap_chk will only be set once, to one of them (probably by the first alias, with the second alias thwarted by the fact that kap_chk already exists). But even if you could create a multivalue field with FIELDALIAS, it still would not work with the lookup; you would still have to pass the stream through mvexpand. If you need something "automatic-ish", I suggest creating a macro out of my solution and always calling the macro.
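
As a rough sketch of that macro suggestion, the pipeline from the solution could be stored in macros.conf along these lines. The macro name kap_lookup is hypothetical, and the definition simply repeats the search from the solution minus the base search, so adjust it to your own field and lookup names.

```ini
# macros.conf (the stanza name kap_lookup is an assumed, illustrative name)
[kap_lookup]
definition = eval kap_chk=if(kap_uri==kap_tld,kap_uri,kap_uri . ":::" . kap_tld) | makemv delim=":::" kap_chk | mvexpand kap_chk | dedup kap_chk | lookup ss1url_lookup kap_chk OUTPUT masktype maskid kap_chk | table masktype, maskid, kap_chk
iseval = 0
```

It would then be called by wrapping the macro name in backticks after the base search: sourcetype="access_combined_wcookie" | `kap_lookup`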
