Hi splunkers. I'm running Splunk v6.4.3 and I need to match the output of a normal sourcetype="cisco:syslog" search against a specific list. In other words, only the logs that match that list should show up in the output. I've read plenty of forums and blogs and tried a lot of different combinations without success. It seems pretty simple, actually: I am trying to achieve this for a much longer list of OR'd items:
sourcetype="cisco:syslog" (item1 OR item2 OR item4 OR itemN)
BUT in this case the "itemN" is a list of dozens or hundreds of items (they're actually hardware/MAC Addresses), an endless list of OR'd items basically, which is why a list is needed. The list is a CSV file imported into the /opt/splunk/etc/apps/search/lookups folder as filename "mac_suspect.csv", and looks like this:
DeviceName  MacAddress
device1     11:22:33:ff:33:aa:cc
device1     22:22:33:aa:33:aa:22
device1     33:22:33:bb:33:aa:bb
device1     44:22:33:dd:33:aa:11
.....       .......
When I run "| inputlookup mac_suspect.csv" in Search, the list comes up in the Statistics tab, no issues there. When I run sourcetype="cisco:syslog", all those syslogs come up, and I've verified over and over that my test devices (test MAC addresses) are there and that they match if I just search for ' sourcetype="cisco:syslog" 33:44:55:66:77:88 '. No issues there. What I'm trying to say is that the MAC addresses from the list are in the search results, and the list pulls up fine using inputlookup. It is available to Splunk for use AFAIK, and the events contain MAC addresses from the list, so there should be matches.
The search string I'm using to try to get only the matching lines is:
sourcetype="cisco:syslog" | lookup mac_suspect.csv MacAddress
Which says to me: filter all logs from this sourcetype and output ONLY those logs that match one of the items in the "MacAddress" column of the lookup list "mac_suspect.csv".
What happens is that the search runs, but it doesn't filter anything at all. It just outputs all the syslogs from Cisco devices, as if the match were a wildcard (*) or something. From everything I've read, this specific search string should apply the filter from column MacAddress, but it does not seem to do so. I've tried two dozen combinations of this and cannot get it to work. Am I missing something? Is there a better way to do this? Thanks.
You're on the right track, but not quite there yet.
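lookup enriches search results, roughly like an SQL left join: it takes the search results, matches a field in each result against a column in the lookup table, and adds the other columns from the lookup table to the results as fields. Results with no match in the table are still returned, just without the extra fields.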
lookup doesn't filter.
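As a rough sketch of how lookup can still be turned into a filter: enrich first, then keep only the events where the enrichment actually produced a value. This assumes the MAC address in your events is already extracted into a field, here called src_mac (an assumption; substitute whichever field holds it in your data). Note also that CSV lookups match case-sensitively by default, so aa:bb and AA:BB would not match each other.

```spl
sourcetype="cisco:syslog"
| lookup mac_suspect.csv MacAddress AS src_mac OUTPUT DeviceName
| where isnotnull(DeviceName)
```

Events whose src_mac is not in the CSV get no DeviceName field, so the where clause drops them.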
In order to correctly match up your two sets of data, make sure the MacAddress field is correctly extracted from your syslog data. Once you have done that, you have several options.
Quick and dirty:
sourcetype=cisco:syslog [| inputlookup mac_suspect.csv | fields MacAddress]
That will build a huge OR'd string of all entries in your csv, and add that to your search. To confirm, open the job inspector and look at for example the normalizedSearch value. If your list is enormous the resulting search string will also be enormous.
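If you want to see that OR'd string without opening the job inspector, one option is to run the subsearch on its own and append the format command, which is what Splunk applies implicitly at the end of a subsearch:

```spl
| inputlookup mac_suspect.csv
| fields MacAddress
| format
```

This should return a single row with a field named search, containing something of the form ( ( MacAddress="..." ) OR ( MacAddress="..." ) ... ). The field name on the left of each "=" is the name the outer search will filter on, so it has to exist in your events.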
More polished: Make sure the lookup has a column marking an entry as "interesting", "suspicious", or whatever. Define a lookup definition based on your csv file, and an automatic lookup for that sourcetype based on the lookup definition. Have the automatic lookup use the MAC address as input and that marker column as output. Then search like
sourcetype=cisco:syslog marker=suspicious
Underneath, Splunk will do the magic necessary to reverse-resolve the lookup and load only the matching MAC addresses.
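For reference, a minimal sketch of what that setup could look like in configuration files (the same thing can be done under Settings > Lookups in the UI). The stanza name mac_suspect, the event field src_mac, and the marker column are all assumptions for illustration; the marker column would have to be added to the CSV first.

```
# transforms.conf: lookup definition (stanza name is a hypothetical choice)
[mac_suspect]
filename = mac_suspect.csv

# props.conf: automatic lookup for the sourcetype
# src_mac is assumed to be the event field holding the MAC address;
# marker is assumed to be a column you have added to the CSV
[cisco:syslog]
LOOKUP-mac_suspect = mac_suspect MacAddress AS src_mac OUTPUT marker
```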
Hi Martin. Thanks so much for your response. Just focusing on the quick and dirty, this does not work. I'm wondering if there is a bug, because I tried both:
sourcetype="cisco:syslog" [ inputlookup mac_suspect.csv | fields MacAddress ]
sourcetype="cisco:syslog" [ | inputlookup mac_suspect.csv | fields MacAddress ]
And to answer one of your questions, there are a couple of fields that already parse out MAC addresses on the sourcetype, EapMacAddress and src_mac, so I know those are correctly extracted. It might also be significant that if I use ONLY:
| inputlookup mac_suspect.csv | fields MacAddress
This outputs the .csv list but only the MacAddress column.
Regarding the more polished solution, I get lost at "define a lookup definition based on your csv file and an automatic lookup for that sourcetype based on the lookup definition". I'll research that separately, but I wanted to see the quick & dirty work first. The fact that it doesn't makes me think there could be something buggy with lookups or matches on this version, maybe?
The QnD version expects a field called MacAddress in your syslog data. If the name is different, you can add a "| rename MacAddress AS yourotherfieldname" after the "| fields MacAddress" to make the field names match up.
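Put together, and assuming src_mac (one of the two fields you mentioned) is the one that carries the address you care about, the full search would look something like this:

```spl
sourcetype="cisco:syslog" [| inputlookup mac_suspect.csv | fields MacAddress | rename MacAddress AS src_mac]
```

The subsearch then expands to src_mac="..." OR src_mac="..." and so on, which filters on a field that actually exists in the events. Swap in EapMacAddress if that is the field holding the addresses from your list.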