I've been working on a search that I *finally* managed to get working that would look for events generated by a provided network switch and port name and then gives me all the devices that have connected to that specific port over a period of time. Fortunately, most of the device data is included alongside the events which contain the switch/port information.....that is....evenything except the hostname. Because of this, I've tried to use the join command to perform a second search through a second data set which contains the hostnames for all devices which have connected to the network and match those hostnames based on the shared MAC address field.
The search works, and that's great, but it can only work over a time period of about a day or so before the subsearch breaks past the 50k event limit. Is these anyway I can get rid of the join command and maybe use the stats command instead? That's what simialr posts to this one seem to suggest, but I have trouble wrapping my head around how the stats command can be used to correlate data from two different events from different data sets.....in this case the dhcp_host_name getting matched to the corresponding device in my networking logs. I'll gladly take any assistance. Thank you.
index="indexA" log_type IN(Failed_Attempts, Passed_Authentications) IP_Address=* SwitchID=switch01 Port_Id=GigabitEthernet1/0/13
| rex field=message_text "\((?<src_mac>[A-Fa-f0-9]{4}\.[A-Fa-f0-9]{4}\.[A-Fa-f0-9]{4})\)"
| eval src_mac=lower(replace(src_mac, "(\w{2})(\w{2})\.(\w{2})(\w{2})\.(\w{2})(\w{2})", "\1:\2:\3:\4:\5:\6"))
| eval time=strftime(_time,"%Y-%m-%d %T")
| join type=left left=L right=R max=0 where L.src_mac=R.src_mac L.IP_Address=R.src_ip
[| search index="indexB" source="/var/logs/devices.log"
| fields src_mac src_ip dhcp_host_name]
| stats values(L.time) AS Time, count as "Count" by L.src_mac R.dhcp_host_name L.IP_Address L.SwitchID L.Port_Id
Hi dtaylor,
You have seen this https://community.splunk.com/t5/Splunk-Search/How-to-compare-fields-over-multiple-sourcetypes-withou... since you are already thinking of using `stats`
The important thing is really to get a common field from the various data sets and use that in your stats in your case you could use the field `src_mac` as simple as
| stats values(*) AS * by _time src_mac
after your base search should work as long as you get src_mac for all data sets.
Hope this helps ...
cheers, MuS
To build on what @MuS says, here's a simple example that simulates two data sets, the switch data (index A) and the devices data (index B) and the stats command shows how to "join" on the two.
So, everything up to the last two lines is just setting up dummy data sets to model your example and then the search/stats does sort of what you are looking to do - you can just copy/paste this to a search window
| makeresults count=1000
| fields - _time
| eval SwitchID=printf("Switch%02d",random() % 5)
| eval Mac=printf("00-B0-D0-63-C2-%02d", random() % 10)
| eval index="A"
| append [
| makeresults count=1000
| fields - _time
| eval r=random() % 10
| eval Mac=printf("00-B0-D0-63-C2-%02d", r)
| eval dhcp_host_name=printf("Host%02d", r)
| eval index="B", source="/var/logs/devices.log"
| fields - r
]
| eval r=random() % 10
| sort r
| fields - r
``` Now we have a bunch of rows from index A and B```
| search (index="A" SwitchID=switch01) OR (index="B" source="/var/logs/devices.log")
| stats count values(dhcp_host_name) as dhcp_host_name values(SwitchID) as SwitchID by Mac
Hope this helps
Hi dtaylor,
You have seen this https://community.splunk.com/t5/Splunk-Search/How-to-compare-fields-over-multiple-sourcetypes-withou... since you are already thinking of using `stats`
The important thing is really to get a common field from the various data sets and use that in your stats in your case you could use the field `src_mac` as simple as
| stats values(*) AS * by _time src_mac
after your base search should work as long as you get src_mac for all data sets.
Hope this helps ...
cheers, MuS