I am using Splunk version 6.1.2 and running a simple search that specifies only an index name. The search returns 27 lakh (2.7 million) events for the last 30 days, but it takes too long to execute: about 2.03 minutes, which I feel is a bit slow. Is this retrieval time good or bad? I feel the customer experience would suffer if the customer has to wait 2 minutes for the results.
Even when I add the required fields to the search to narrow down the results, the 2 minute run time increases rather than decreases. Can anyone explain why Splunk takes this long to return search results?
I have used the following search.
There are a couple of things to consider here.
First off, if I understand you correctly, you are returning 2.7 million raw events from a time range of 30 days. This may look like a simple search, but from a technical perspective, a search returning raw events is one of the slower searches you can run: every one of those 2.7 million events has to be fetched from disk. Compare the speed of a plain index search like yours with this one:
index="summary" | stats count
In my case, with a similar number of events (2.8 million), the first search (without stats) took 97 seconds and the second took 6 seconds (this is on a rather slow machine; I'm sure an adequate machine will be faster). The difference arises because the second search does not need to fetch any events from disk. Rather, Splunk works some internal magic with how it stores data and metadata to retrieve the number of events, and you get the result much quicker.
Now, I wouldn't say this is a limitation. Consider the use case for each type of search: the first one returns raw events for you to look through, determine interesting fields manually, check whether your indexing and sourcetype definitions are correct, and discover interesting stuff. It is generally the type of search you run when you want to look specifically at your raw data, i.e. what exactly you have indexed. For this task, it is pointless to return more than a few hundred to a thousand events, because you will never be able to look through them. Keep in mind you can always refine searches with filtering mechanisms.
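As a minimal sketch of such filtering: the search below narrows the raw events by sourcetype, keyword, and time, and caps the number of events returned. The sourcetype name `my_sourcetype` and the keyword `error` are placeholders I made up, not values from your data; substitute whatever applies to your index.

```
index="summary" sourcetype="my_sourcetype" error earliest=-24h | head 200
```

With `head`, the search can stop as soon as it has collected the requested number of events, so you get something to look at without waiting for the full time range.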
The second search on the other hand returns statistics, so a summarized description of your data. You use it to aggregate information into an overview of your data. So in comparison, the first search is used to look at your data, while the second search is to have some statistics about your data. The first one is generally more suited to work with smaller data samples, while the second one is used on larger samples.
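To make the idea of a summarized description more concrete, here are two sketches of statistical searches, each to be run on its own. They assume the same `summary` index as above and only use the default `sourcetype` field and the event timestamp; your own fields will differ.

```
index="summary" | stats count by sourcetype
index="summary" | timechart span=1d count
```

The first returns one row per sourcetype with its event count, the second one row per day. Either result is small enough to read at a glance, even over 30 days.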
Applying this to your customer situation, I would never advise a customer to run a search for raw data over such large time spans, knowing that it is neither a nice experience (the search will take time to run for larger amounts of data) nor of any benefit over searching a smaller amount of data.
So what should you do? As a quick fix, if you really have to look through larger amounts of raw data, you could consider using "Fast Mode" below the time range picker. This will disable any field extraction and other search-time knowledge enrichment and thus finish much faster (the above search took 39 seconds in my case, so less than half of the initial value). It will however leave you without any fields, so you can really only look at the raw data.
In general, I would advise you to learn when to use which search, and for which purpose. A raw event search is most sensible when used to check a given sample of your data (say, the last day or so) for specific requirements: Am I missing any data I would like to see? Are my line breaking settings and field extractions working? I got a call that something didn't work yesterday at around 12:30, what exactly happened from 12:20 to 12:40?
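For the last of those questions, a raw event search restricted to the 20 minute window might look like the sketch below. The date is purely hypothetical (it stands in for "yesterday"), and the index name is carried over from the example above; the time format is the one the `earliest` and `latest` modifiers expect for absolute times.

```
index="summary" earliest="06/15/2015:12:20:00" latest="06/15/2015:12:40:00"
```

A window this narrow should return few enough events that you can actually read through them.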
In any other case where you do not need to look at individual events, you should learn to use the many methods Splunk offers to work with your data: statistics, most prominently, but also have a look at how to work with fields and other knowledge objects. On a side note, if you find your statistics to be slow because you are running them over long time frames, have a look at report acceleration and summary indexing. It then becomes a matter of asking the right questions (forming the right queries). Your initial search is simply very broad, so the answer is very lengthy and broad as well.
Feel free to come back with any further questions, we're happy to help you out 🙂
Awesome answer @jeffland 🙂 As a supplement, a similar/identical question was asked before with a great answer as well http://answers.splunk.com/answers/225289/why-does-a-simple-splunk-search-such-as-indexabc-t.html