I am trying to run the search below against a huge index that contains Cisco firewall data:
index=ciscofirewall host="firewallname" | search action!="teardown" AND action!="success" AND action!="failure" AND vendorclass=acl AND srcinterface="Intra" | rex "access-list (?<ACL>.*) +permitted" | lookup dnslookup clientip as srcip OUTPUT clienthost as SourceDNS | lookup dnslookup clientip as destip OUTPUT clienthost as DestDNS | rename srcip AS SourceIP, srcport AS SRCPort, destip AS DestIP, transport AS Protocol, destport AS DestPort | table SourceIP, SourceDNS, SRCPort, DestIP, DestDNS, Protocol, DestPort, ACL | dedup SourceIP,DestIP,DestPort
I need to run the search over old data that is stored on NAS storage as cold buckets. The search never completes. It gets stuck at around 91% of the time range scanned. The data is spread across 40 indexers. The search environment is a search head cluster with 3 members.
Is there any way I can improve the search performance? The job inspector shows that most of the time is spent in the command.search component.
You can do a couple of things here:
1. Narrow down the time range. For example, if you need to search from a year ago to 9 months ago, you can split the search into three and run it one month at a time. Send the results to a summary index; once all three are done, you can quickly search the summarized data.
2. Try the fields command so the search returns only the fields you need, something like this:
... | fields action vendor_class src_interface ...
3. I think Splunk is better at looking for what is there than for what is not. Try changing
action!="teardown" AND action!="success" AND action!="failure"
to a list of the values you do want:
action=success OR action=blah OR action=whatever ...
4. You can modify limits.conf, but I would advise doing so only with professional help, as it can have an impact on your system.
5. Try leaving out all the | rename for now and check whether it helps.
6. Use | dedup before | table. In some testing I did, it was faster that way, and it also makes sense: you don't want to table all the results and only then dedup them.
7. Finally, here is my suggestion for your search:
index=cisco_firewall host="firewall_name*" | search (action=foo OR action=bar OR action=whatever) AND vendor_class=acl AND src_interface="Intra*" | fields action vendor_class src_interface field4 field5 ... fieldN | rex "access-list (?<ACL>.*) +permitted" | lookup dnslookup clientip as src_ip OUTPUT clienthost as Source_DNS | lookup dnslookup clientip as dest_ip OUTPUT clienthost as Dest_DNS | rename src_ip AS Source_IP, src_port AS SRC_Port, dest_ip AS Dest_IP, transport AS Protocol, dest_port AS Dest_Port | dedup Source_IP,Dest_IP,Dest_Port | table Source_IP, Source_DNS, SRC_Port, Dest_IP, Dest_DNS, Protocol, Dest_Port, ACL
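To illustrate point 1, here is a rough sketch of one monthly slice written to a summary index, followed by a search over the summarized results. The summary index name summary_firewall_acl is hypothetical and the index must already exist; the action values are the same placeholders as above; and the time modifiers cover only the oldest of the three months, so they would need to be shifted for the other two runs.

```spl
index=cisco_firewall host="firewall_name*" earliest=-12mon@mon latest=-11mon@mon (action=foo OR action=bar) vendor_class=acl src_interface="Intra*"
| rex "access-list (?<ACL>.*) +permitted"
| lookup dnslookup clientip as src_ip OUTPUT clienthost as Source_DNS
| lookup dnslookup clientip as dest_ip OUTPUT clienthost as Dest_DNS
| rename src_ip AS Source_IP, src_port AS SRC_Port, dest_ip AS Dest_IP, transport AS Protocol, dest_port AS Dest_Port
| dedup Source_IP, Dest_IP, Dest_Port
| table Source_IP, Source_DNS, SRC_Port, Dest_IP, Dest_DNS, Protocol, Dest_Port, ACL
| collect index=summary_firewall_acl
```

Once all the slices have been collected, the final report only has to read the small summary index:

```spl
index=summary_firewall_acl
| dedup Source_IP, Dest_IP, Dest_Port
| table Source_IP, Source_DNS, SRC_Port, Dest_IP, Dest_DNS, Protocol, Dest_Port, ACL
```

The second dedup is there because the same source/destination/port combination can show up in more than one monthly slice.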
As a side note, maybe you can extract the field you are after with the | rex command.
I assume you are searching in fast mode.
We are waiting for your feedback.
Additionally, can you provide the search.log from the search and let us know what it says? You should be able to track down where the problem is, most likely a remote peer timeout or similar. (NAS can be horribly fickle depending on the topology.)
Yes, the error is: Timed out waiting for peer indexerXX. If this occurs frequently, receiveTimeout in distsearch.conf may need to be increased. Search results might be incomplete!
I'm getting the timeout error for around 3 search peers every time.
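For reference, the setting named in that error message lives in the [distributedSearch] stanza of distsearch.conf on the search heads. The fragment below is only a sketch: the value 1200 is an illustrative assumption, not a recommendation, and how the file is distributed to the three search head cluster members (for example through the deployer) depends on your setup. As noted above for limits.conf, it is worth getting advice before changing it, and a longer timeout only hides slow reads from the NAS rather than fixing them.

```ini
# distsearch.conf on the search heads (location within your apps is an assumption)
[distributedSearch]
# Seconds the search head waits for data from a search peer.
# 1200 is an illustrative value; check the default for your version first.
receiveTimeout = 1200
```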