I have some very large lookup tables of known bad domains (4M+ entries).
The lookup has a field called 'kap_chk' which I want to match against two search-time extractions: one is kap_uri, the other kap_tld.
To get them both to match, I figured I'd use an alias for them: kap_chk.
Therefore my props.conf and transforms.conf would look like this:
[access_combined_wcookie]
REPORT-kap_uri = kapersky_uri
REPORT-kap_tld = kapersky_tld
FIELDALIAS-kapersky_uri = kap_uri AS kap_chk
FIELDALIAS-kapersky_tld = kap_tld AS kap_chk

[ss2url_lookup]
filename = ss2url_lookup.csv
case_sensitive_match = false

[ss1url_lookup]
filename = ss1url_lookup.csv
case_sensitive_match = false

[kapersky_uri]
SOURCE_KEY = uri
REGEX = (?:( http\:\/\/www\.|\w+:\/\/|www\.|)(?<kap_uri>.+))

[kapersky_tld]
SOURCE_KEY = uri
REGEX = (http|https|ftp)?(:\/\/)?(www\.)?(?<kap_tld>.+?)(\/|:).*

In your opinion, is this the correct way to go about achieving this?
In the end, it should mean I can do the following:
sourcetype="access_combined_wcookie" AND kap_chk="*" | dedup kap_chk | lookup ss1url_lookup kap_chk OUTPUT masktype maskid kap_chk | table masktype, maskid, kap_chk
You have not stated your goal, but I assume it is that if kap_uri and kap_tld are the same, the lookup should only be done once. If so, this will work efficiently:
sourcetype="access_combined_wcookie" | eval kap_chk=if(kap_uri==kap_tld,kap_uri,kap_uri . ":::" . kap_tld) | makemv delim=":::" kap_chk | mvexpand kap_chk | dedup kap_chk | lookup ss1url_lookup kap_chk OUTPUT masktype maskid kap_chk | table masktype, maskid, kap_chk
The documentation says:
Note: Splunk Enterprise's field aliasing functionality does not currently support multivalue fields.
This probably also means that if your events have both fields, then kap_chk will only be set once, to one of them (probably the first one, and then the second alias will be thwarted by the fact that kap_chk already exists). But even if you could create a multivalue field with FIELDALIAS, it still would not work with the lookup; you would still have to pass the stream through mvexpand. If you need something "automatic-ish", then I suggest creating a macro out of my solution and then always calling the macro.
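As a rough sketch of the macro suggestion, the pipeline above could be wrapped in a macros.conf stanza along these lines. The macro name kap_chk_lookup is made up for illustration; the field and lookup names are the ones from the question, and this assumes the kap_uri and kap_tld extractions are already in place for the sourcetype.

```ini
# macros.conf
# NOTE: "kap_chk_lookup" is a hypothetical macro name; rename to taste.
# The definition combines kap_uri and kap_tld into a multivalue kap_chk,
# expands it to one event per value, de-duplicates, then runs the lookup.
[kap_chk_lookup]
definition = eval kap_chk=if(kap_uri==kap_tld,kap_uri,kap_uri . ":::" . kap_tld) | makemv delim=":::" kap_chk | mvexpand kap_chk | dedup kap_chk | lookup ss1url_lookup kap_chk OUTPUT masktype maskid kap_chk

# Usage in a search (macros are called with backticks):
#   sourcetype="access_combined_wcookie" | `kap_chk_lookup` | table masktype, maskid, kap_chk
```

Keeping the table command outside the macro leaves each search free to choose its own output fields.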
My aim is to do a search-time extraction on the uri field, producing the custom fields kap_uri and kap_tld. I would then like to alias these fields to kap_chk so that both can be matched against the lookup table field kap_chk.