To make sure that you have defined your use case and tech path clearly, let me highlight several factors that are not clear to me. (Trust me, explaining to someone with less intimate knowledge will help you find the right path. I was in your position.)

You clarified that a major performance inhibitor is the 10-million-events/day index _network. Could you run a performance test to see whether the join or eventstats contributes much to the slowdown? In other words, is your original main (outer) search,

index=_network snat IN (10.10.10.10*,20.20.20.20*)
| bucket span=1m _time
| eval client_ip = mvindex(split(client, ":"), 0)
| stats count by _time snat client_ip

much faster than your original join, or than the solution that is working? If they are comparable, constantly constructing a lookup is just not worth the trouble.

Just as fundamentally, I realized that my previous search was mathematically different from your original join search. Your original search has a common field name, count, in both the outer search and the subsearch. I am not sure whether a mathematical definition exists for this condition, but Splunk will output count from the subsearch, i.e., from the unrestricted | stats count by _time snat client_ip. Given that the subsearch could cover millions of events a day, its count could be many times bigger than the count from my previously proposed search, which counts matching events ONLY. Which count do you need? In the following, I will revert to your original math.

Because you need to count _network events unrestricted, I have even more doubt whether using a lookup (or any other method) will really improve performance.

I deduce (mind-read :-)) from your original join search that you expect to embed the search in a dashboard with IP (Source_Network_Address, matching snat) and Account_Name selectors. Is this correct? How narrow your selections are can have a profound impact on performance. (I will embed comments in places where you should insert the tokens.)
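To make the field collision concrete, here is a toy model in Python (made-up rows and numbers; it mimics join's default overwrite=true behavior, where subsearch fields replace outer-search fields of the same name):

```python
# Toy model of Splunk join's field collision, assuming default overwrite=true:
# when both sides carry a field named "count", the subsearch value replaces
# the outer value. Rows and numbers here are illustrative only.
outer_rows = [{"snat": "10.10.10.10:443", "count": 7}]   # restricted (matching-only) count
sub_rows = {"10.10.10.10:443": {"count": 5000}}          # unrestricted count, keyed by snat

for row in outer_rows:
    match = sub_rows.get(row["snat"])
    if match:
        row.update(match)  # the shared field "count" is overwritten by the subsearch

print(outer_rows[0]["count"])  # -> 5000, not the outer search's 7
```

This is why the two searches are mathematically different even though they group by the same fields: the surviving count comes from whichever side wins the collision, not from the intersection.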
You say index _ad is small, and you want to turn it into a lookup table in order to speed up the search. So far only you know how this table is constructed, which limits other people's ability to help you. I will construct a reference implementation below so we are on the same page. (For clarity, I will call the table index_ad_lookup.)

On the subject of this lookup table, you say inclusion of time buckets is paramount. I want to remind you that using a lookup containing time limits how close your search can run to the latest production of the lookup. Unlike an index search, you can only produce the table at fixed intervals. Say you want the time bucket to be 5 minutes and you produce the lookup every 5 minutes. This means the closest match you can get may be up to 5 minutes old. Is this acceptable?

There is also a question of cost. If you want a 1-minute time bucket, are you willing to refresh the lookup every minute? The search that produces this lookup will also need its search interval to match your final search interval. If you are looking at an interval of up to 24 hours, running a 24-hour search every minute can be taxing even for a small index. Aside from that, the search interval used to produce the lookup also limits the maximum interval of the main search, i.e., if the lookup is produced with a 24-hour search, the maximum your _network search can cover is 24 hours. Is this acceptable?

Reference implementation of lookup table

Now, assuming that you still want to pursue the lookup path, here is my reference implementation of the table before I propose what I see as an efficient matching search.

index=_ad (EventCode=4625 OR (EventCode=4771 Failure_Code=0x18))
| bucket span=1m _time
| eval Account_Name4625= case(EventCode=4625,mvindex(Account_Name,1))
| eval Account_Name4771= case(EventCode=4771,Account_Name)
| eval Account_Name = coalesce(Account_Name4771, Account_Name4625)
| eval snat = Source_Network_Address +":"+Source_Port
| eval DCName=mvindex(split(ComputerName, "."), 0)
| stats count by _time snat Account_Name EventCode DCName
| outputlookup index_ad_lookup

This is effectively your original outer search with the restrictions on snat and Account_Name removed. Note my reference table name is index_ad_lookup.

Using index_ad_lookup

If you can keep the lookup fresh enough to suit your needs, this is how to use it to match index _network and add Account_Name, etc.

index=_network ```snat IN ($snat_tok$)```
| bucket span=1m _time
| eval client_ip = mvindex(split(client, ":"), 0)
| stats count by _time snat client_ip
| lookup index_ad_lookup snat _time ``` add Account_Name, EventCode, DCName where a match exists ```

The comment is a speculation of how you may eliminate events with an input token. Because your original join does not require the events to have a match with index _ad, I seriously doubt that this will perform better. (In fact, I had already written a search that requires events to have a match in order to be counted before I realized what your original join was doing. That would have improved performance if the matching sets are small.)

Alternative search without lookup or join?

I was also working on an alternative search based on possible event reduction by requiring a match between _network and _ad before I realized the mathematical difference. If your requirement is to count all _network events and just add Account_Name, etc. where a match exists, any alternative will probably perform similarly to the join command. Like this one:

index=_network snat IN (10.10.10.10*,20.20.20.20*)
| bucket span=1m _time
| eval client_ip = mvindex(split(client, ":"), 0)
| stats count by _time snat client_ip
| append
[search index=_ad (EventCode=4625 OR (EventCode=4771 Failure_Code=0x18)) Account_Name=JohnDoe Source_Network_Address IN (10.10.10.10, 20.20.20.20)
| bucket span=1m _time
| eval Source_Network_Address1 = case(EventCode==4771, trim(Client_Address, "::ffff:")) ``` this field is not used ```
| eval Account_Name4625= case(EventCode=4625,mvindex(Account_Name,1))
| eval Account_Name4771= case(EventCode=4771,Account_Name)
| eval Account_Name = coalesce(Account_Name4771, Account_Name4625)
| eval snat = Source_Network_Address+":"+Source_Port
| eval DCName=mvindex(split(ComputerName, "."), 0)
| stats count as count_ad by _time snat Account_Name EventCode DCName]
| eventstats values(Account_Name) as Account_Name values(EventCode) as EventCode values(DCName) as DCName by _time snat
| where isnotnull(client_ip)

(Note: a final stats grouped by client_ip and count would silently discard the appended _ad rows, because they carry neither field; eventstats attaches the account fields to every row sharing _time and snat, and the where clause then drops the bare _ad rows.)

In short, you need to clarify whether you want to count all events from index _network, or only count events that find a match in index _ad, or maybe every event is a match, in which case there is no difference. Because the main performance inhibitor is the number of events in _network, there is little to be gained if the requirement is not to restrict events.
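For intuition, the enrichment the alternative aims for, keeping every _network row and attaching account fields wherever an _ad row shares the same (_time, snat), can be modeled in plain Python (made-up rows; field names mirror the SPL):

```python
# Toy model of the append-then-enrich pattern: every _network row survives,
# and account fields are attached wherever an _ad row shares the same
# (_time, snat) key. Data is made up; field names mirror the SPL above.
from collections import defaultdict

network_rows = [
    {"_time": "12:00", "snat": "10.10.10.10:443", "client_ip": "192.0.2.1", "count": 12},
    {"_time": "12:00", "snat": "20.20.20.20:80",  "client_ip": "192.0.2.2", "count": 4},
]
ad_rows = [
    {"_time": "12:00", "snat": "10.10.10.10:443", "Account_Name": "JohnDoe",
     "EventCode": "4625", "DCName": "DC01"},
]

# Gather account fields per (_time, snat), like values(...) grouped by those keys
by_key = defaultdict(lambda: defaultdict(set))
for row in ad_rows:
    key = (row["_time"], row["snat"])
    for f in ("Account_Name", "EventCode", "DCName"):
        by_key[key][f].add(row[f])

# Every _network row is kept; matches gain the extra fields, non-matches stay bare
enriched = []
for row in network_rows:
    extra = by_key.get((row["_time"], row["snat"]), {})
    enriched.append({**row, **{f: sorted(v) for f, v in extra.items()}})

print(enriched[0].get("Account_Name"))  # -> ['JohnDoe']
print(enriched[1].get("Account_Name"))  # -> None (no match, row still kept)
```

Note the asymmetry with join: nothing here overwrites the _network count, and unmatched rows are preserved rather than dropped, which is exactly the "count everything, annotate where possible" requirement.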