We have indexes per environment (e.g. prod, qa, dev), with all logs from instances of an application in a particular environment being forwarded to that index. In this way, we can find errors in production in a single index. As user requests originate in an externally facing application, and are shuttled about various back-end services, we can find it all in one Splunk index.
Still, the majority of our searches are isolated to one application. We have a search-time field extracted for the application: "app". It's actually determined via lookup from the parent folder of the log file. For example,
C:/logs/Admin/My.Admin.log has parent folder "Admin", which the lookup maps to the "app" value "admin".
Since it's rare that we have to search across applications, our searches usually focus on a single app, like:
index=prod app=admin SomeText
This is so frequent that my knee-jerk reaction is to think, "I should make that an index-time field!" But every resource I come across, including this other answer, says not to do this. They all hint that there are cases where one might actually want an index-time field, but are vague about what those cases are.
So, without assuming an index-time field is necessary, what is the best way to ensure that such common searches are as efficient as possible?
The way you are doing it is fine, but here is what I would suggest...
"Since it's rare that we have to search across applications, our searches usually focus on a single app"
This tells me that you will get a performance benefit by creating separate indexes for each application and routing your data to them accordingly. If you are looking for faster searches, this is your best bet.
I would also recommend a couple other things, namely:
BTW, if you route your data to separate indexes (which you would do using separate source-based stanzas in your inputs.conf file(s) and specifying different indexes) you will also be able to get rid of the lookup table.
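As a rough sketch of that routing, assuming the Windows log layout from the question, each application's folder would get its own monitor stanza pointing at its own index. The "Billing" folder and the index names "admin" and "billing" are made up for illustration, and the indexes themselves would have to be defined on the indexers (indexes.conf) before any data is sent to them.

```
# inputs.conf on the forwarder (illustrative sketch only)
# Folder names other than Admin, and both index names, are assumptions.

# Everything under the Admin folder goes to the "admin" index
[monitor://C:\logs\Admin]
index = admin
disabled = false

# A second, hypothetical application routed to its own index
[monitor://C:\logs\Billing]
index = billing
disabled = false
```

With something like this in place, the common search would become index=admin SomeText, with no lookup needed to work out the app. Note that this sketch says nothing about environment; that would still have to come from somewhere else.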
Isn't shifting my indices to an app-axis instead of environment-axis just trading one for the other? Now I could search an app (e.g. 'admin'), but I'd have to have an environment field (e.g. "env=prod") somehow derived. And since I sometimes host more than one environment on a host, I can't even use host as a surrogate (e.g. vapptest1 hosts both the 'qa' and 'uat' environments).
Or are you really suggesting one index per combination of application and environment (e.g. adminprod, adminqa, adminuat)?
Regarding making the searches specific, my staging and production environments are load balanced across multiple hosts. So, for example, the "production environment" is split across vappprod1, vappprod2, vappprod3 and vappprod4. If we choose to scale, we'll add more. I'd much prefer that production support people be able to just say "prod" instead of having to remember which particular hosts a certain application is balanced across at that time.
That was one of the two main things driving me to split the indices by environment (e.g. index=prod). The other was being able to search a transaction across all apps/services (e.g. index=prod ActivityId=123abc789), regardless of app. The latter is rare, but still sometimes needed.