We are running a search on an index, filtering on a single field:
index=cog-nativedatastore-nonprod AND source="/logs/uamdsgl/nds-app-subscription-service/splunk-integrator/splunk-application." | search tracking.system=itrac
When inspecting the job, I see:
This search has completed and has returned 4,025 results by scanning 106,856 events in 9.964 seconds.
I believe this is slow. The internal Splunk SME mentioned that the field I searched on is already indexed based on the screenshot below. Is there any other way to improve the performance on this search?
Have you tried it this way?
index=cog-nativedatastore-nonprod AND source="/logs/uamdsgl/nds-app-subscription-service/splunk-integrator/splunk-application." tracking.system="itrac"
@DalJeanis @niketnilay, are those interesting fields actual (SQL-like) indexes in Splunk? Can we specifically add "indexes" on fields?
@echen6... indexes in Splunk are not the same as indexes in SQL. Every type of data fed to Splunk is indexed, i.e. an index in Splunk is better thought of as a database where the data is stored (there is actually a whole lot more to how indexing works; see http://docs.splunk.com/Documentation/Splunk/latest/Indexer/HowSplunkstoresindexes).
Since Splunk stores time-series data from any kind of structured or unstructured source, the sourcetype is what you use to define the kind of data you are ingesting. So you can loosely compare a sourcetype to a table in a database. Again, it is not that simple: Splunk provides schema-on-the-fly with search-time field discovery, which Smart Mode and Verbose Mode enable at search time. By default, field discovery is based on key-value pairs plus any field extractions/transformations you have defined. It can be changed to json or xml, or turned off, depending on your data. Please go through the Splunk Tutorial course on Splunk Education for a quick start with Splunk:
http://www.splunk.com/view/SP-CAAAH9U
Also, for transitioning from SQL to the Splunk Processing Language (SPL), you can refer to the following: https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/SQLtoSplunk
I just want to caveat what @niketn said about "Every type of data fed to Splunk is Indexed". That's not literally true in the way you might think. Every word is indexed, but not every field is indexed. You can have index-time extracted fields, which ARE indexed, and search-time extracted fields, which are NOT.
So, for example, if you are looking for all records where pet_type is "dog", and pet_type is a search-time field, then the system will have to scan all the records with the word "dog" in them - including the records for old TV shows about "Dog the Bounty Hunter", movie reviews about Benji and Cujo and Lady & the Tramp, and references to Congressional testimony by lawyer Lloyd Doggett to see if the value "dog" is in the pet_type field.
The only way to add indexes on search-time extracted fields is to change them to index-time extraction... which should be done with caution, because it is not retroactive, and slows down indexing.
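To make the pet_type example concrete, here is a toy Python sketch of the idea. It is not how Splunk is implemented internally; the events, the `tokenize` and `extract_fields` helpers, and the simple key=value parsing are all made up for illustration. The word index narrows the search to events containing the token "dog", and every candidate then has to be read and checked against the search-time field.

```python
from collections import defaultdict

# Hypothetical raw events, invented for illustration only.
EVENTS = [
    "pet_type=dog name=Rex",
    "pet_type=cat name=Tom note=chased by a dog",
    "show=Dog the Bounty Hunter channel=tv",
    "pet_type=dog name=Benji",
    "pet_type=bird name=Tweety",
]


def tokenize(raw):
    """Split a raw event into a set of lowercase word tokens."""
    for ch in "=,":
        raw = raw.replace(ch, " ")
    return set(raw.lower().split())


def build_token_index(events):
    """Map every token to the ids of the events that contain it."""
    index = defaultdict(set)
    for event_id, raw in enumerate(events):
        for token in tokenize(raw):
            index[token].add(event_id)
    return index


def extract_fields(raw):
    """Toy search-time extraction: parse key=value pairs from the raw text."""
    fields = {}
    for part in raw.split():
        if "=" in part:
            key, value = part.split("=", 1)
            fields[key] = value
    return fields


def search_field(events, index, field, value):
    """Use the token index to find candidates, then verify the field on each one."""
    candidates = sorted(index.get(value.lower(), set()))
    matches = [
        event_id
        for event_id in candidates
        if extract_fields(events[event_id]).get(field) == value
    ]
    return candidates, matches


TOKEN_INDEX = build_token_index(EVENTS)
candidates, matches = search_field(EVENTS, TOKEN_INDEX, "pet_type", "dog")
print("events containing the word 'dog':", candidates)
print("events where pet_type=dog:", matches)
```

In this toy data set four events contain the word "dog" but only two of them have pet_type=dog, so half of the candidate reads are wasted work. Roughly speaking, an index-time field lets the lookup go straight to the matching events instead.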
Thanks @DalJeanis. After I moved the filter into the base search, the search time dropped to 1/3 of what it was before.
@niketnilay, adding the sourcetype didn't gain much, as I only have one sourcetype in this index.
And switching to fast mode brought 25% performance gain in this case!
Nice. The performance gain from fast mode is usually going to be even better than that, but in this particular case, where your filters had all been moved to the base search and there wasn't a lot of calculation being done on the records before they were rejected, the savings were smaller than usual.
I second @DalJeanis's suggestion... putting as many filters as possible in the base search is better than filtering the results afterwards. Your search fetches about 100K events from the index and then filters them down to about 4K results, while DalJeanis's search fetches only about 4K events to start with.
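As a rough illustration of the difference, here is a toy Python model that uses the counts from the job inspector above. It is only a sketch: the `make_events` helper, the in-memory list, and the placeholder source name "app.log" are assumptions for illustration and say nothing about how Splunk actually executes or optimizes a search.

```python
import random

TOTAL_EVENTS = 106_856    # events scanned, from the job inspector message
MATCHING_EVENTS = 4_025   # results returned, from the job inspector message


def make_events(total, matching, seed=0):
    """Build hypothetical events; only `matching` of them have tracking.system=itrac."""
    rng = random.Random(seed)
    systems = ["itrac"] * matching + ["other"] * (total - matching)
    rng.shuffle(systems)
    return [{"source": "app.log", "tracking.system": s} for s in systems]


def filter_after_pipe(events):
    """Base search matches on source only; a later stage discards most events."""
    retrieved = [e for e in events if e["source"] == "app.log"]
    kept = [e for e in retrieved if e["tracking.system"] == "itrac"]
    return len(retrieved), len(kept)


def filter_in_base_search(events):
    """Both conditions are applied up front, so only matching events are retrieved."""
    retrieved = [
        e
        for e in events
        if e["source"] == "app.log" and e["tracking.system"] == "itrac"
    ]
    return len(retrieved), len(retrieved)


events = make_events(TOTAL_EVENTS, MATCHING_EVENTS)
piped = filter_after_pipe(events)
base = filter_in_base_search(events)
print("filter after pipe:     retrieved=%d kept=%d" % piped)
print("filter in base search: retrieved=%d kept=%d" % base)
print("share of retrieved events kept after the pipe: {:.1%}".format(piped[1] / piped[0]))
```

Both variants end up with the same 4,025 results; the difference is how many events are handed to the next stage before most of them are thrown away.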
Try to include the sourcetype in your base search as well, so that Splunk knows the data type while pulling the results.
Finally, since you are seeing all the interesting fields, you are probably running the search in Verbose (or Smart) mode. Once you have identified/analyzed your interesting fields, you should run the searches in Fast Mode.
Refer to following docs:
https://docs.splunk.com/Documentation/Splunk/latest/Search/Quicktipsforoptimization
http://docs.splunk.com/Documentation/Splunk/latest/Search/Writebettersearches
If these do not resolve the performance problem, do let us know more details, as it might be due to more serious infrastructure issues such as index bucket size.
Plus 1 for Fast Mode. If you name every field you are interested in in the base search, like this: field=*
Splunk will extract them for you even in Fast Mode.