Search Very Large Data set

hartfoml
Motivator

I need to search my firewall logs for the past year and find the unique policy names.

I can run this search: index=firewall policy_name=* | dedup policy_name

But it still has to look at about 48 billion records to find the unique policy_name values.

The reason we are doing this is to compare our existing policy names against the ones we have used in the past, some of which may since have been deleted.

Any ideas on how to speed up a search over 48 billion events?


kristian_kolb
Ultra Champion

Well, the general idea is to be as specific as possible before the first pipe, i.e. specify host, sourcetype, index and earliest/latest. If your firewall index only contains information relevant to the search, then you can't really do much more than you already have.

Oh, and turn off field discovery / run in Fast Mode.
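
As a rough sketch of what "specific before the first pipe" could look like for the search in the question: the sourcetype name below is a made-up placeholder (substitute whatever your firewall data actually uses), and the time range assumes "the past year" means the last 365 days snapped to midnight. Swapping dedup for stats is my own suggestion, not something confirmed for this data; stats is commonly recommended for this kind of job because it returns one small row per policy_name instead of a whole event, but I have not measured it on an index this size.

```
index=firewall sourcetype=your_firewall_sourcetype policy_name=* earliest=-1y@d latest=@d
| stats count AS events, max(_time) AS last_seen BY policy_name
```

The last_seen column (epoch time of the most recent event per policy) is optional, but it may help when deciding which of the old policy names have been deleted.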

See some of the other posts regarding this as well.

http://splunk-base.splunk.com/answers/24082/best-tips-for-speeding-up-searches
http://splunk-base.splunk.com/answers/73941/search-performance-and-optimization

/k

hartfoml
Motivator

Thanks, this helped me think of ways I could speed things up.
