There are several possible approaches. Whichever you choose, test it first and proceed with caution, especially with Option I or III.
Option I: Create a separate index for each site.
For each site, create a new index with a name matching the site.
Then set which indexes are searched by default, either with srchIndexesDefault in authorize.conf or through the Manager's Role settings. See How do I Set the Default Index? and authorize.conf.
Now you can search on index=NY, etc.
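As a rough sketch of what Option I could look like in the config files: the index names, the monitored path, and the choice of the user role below are assumptions for illustration, not taken from your environment. As far as I know, recent Splunk versions only accept lowercase index names, hence ny rather than NY here.

```
# indexes.conf (on the indexer): one index per site
[ny]
homePath   = $SPLUNK_DB/ny/db
coldPath   = $SPLUNK_DB/ny/colddb
thawedPath = $SPLUNK_DB/ny/thaweddb

[la]
homePath   = $SPLUNK_DB/la/db
coldPath   = $SPLUNK_DB/la/colddb
thawedPath = $SPLUNK_DB/la/thaweddb

# inputs.conf (on each NY forwarder): send new data to the site index
# The monitored path is hypothetical.
[monitor:///var/log/app]
index = ny

# authorize.conf: indexes searched when a search names no index
[role_user]
srchIndexesAllowed = main;ny;la
srchIndexesDefault = main;ny;la
```

Keeping main in srchIndexesDefault means the existing data stays searchable by default until it ages out.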
For existing data, you can re-index, but you probably don't need to. Take a look at this thread for some hints on moving data into a new index. Or, you can just live with the existing data in the default main index until it ages out.
See here for more information:
http://answers.splunk.com/questions/5479/how-to-rename-an-index
Note that the link is for version 4.1.4 and earlier. I suspect that it will still work, but have not verified -- there may be other implications depending on version.
Since this can impact your existing data, you'll definitely want to test ahead of time and verify that your procedure works as you expect. It would also be a good idea to contact Splunk support and ask them to review your plan.
Option II: Create an eventtype for each site.
You still need to identify a list of hosts (or other criteria) for each site up front, but you only have to manage it in one place, the eventtype definition, rather than per-search or per-view. When things change, you update the eventtype and nothing else.
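A minimal sketch of such an eventtype, assuming made-up hostnames and a hypothetical site_ny / site_la naming scheme:

```
# eventtypes.conf
# Hostnames are placeholders; list whatever identifies each site.
[site_ny]
search = host=nyweb01 OR host=nyweb02 OR host=nydb01

[site_la]
search = host=laweb01 OR host=ladb01
```

A search then only needs eventtype=site_ny in place of the host list.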
Option III: Create an indexed field
Usually not recommended, but may work well in your situation. It will increase the size of the index somewhat and could have implications for search performance. And, of course, it's a mostly permanent choice: the field is written at index time, so changing it later for existing events means re-indexing them.
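For reference, an index-time field is normally set up across three files. This is only a sketch: the field name site, the transform name, and the host pattern ny-* are assumptions, and if your hostnames do not follow a pattern you would need one props.conf stanza per host.

```
# transforms.conf (on the indexer or heavy forwarder)
# Writes site::NY into the index for every event the transform is applied to.
[add_site_ny]
REGEX      = .
FORMAT     = site::NY
WRITE_META = true

# props.conf: apply the transform to events from matching hosts
# The host pattern is hypothetical.
[host::ny-*]
TRANSFORMS-site = add_site_ny

# fields.conf (on the search head): tell search the field is indexed
[site]
INDEXED = true
```

With that in place, site=NY (or site::NY) should be answered from the index. Only events indexed after the change carry the field.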
Option IV: Use a lookup table to map between host and site
Listed only for completeness. This option sounds good in theory and technically accomplishes what you want, but it is likely to destroy performance: a search by site has to scan across all events before the lookup can be applied.
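To make the performance problem concrete, here is a sketch assuming a hypothetical lookup named host_site backed by a CSV file with host and site columns:

```
# transforms.conf: lookup definition
# host_site.csv is a hypothetical file in the app's lookups directory,
# with a header row of: host,site
[host_site]
filename = host_site.csv
```

A site search then looks something like index=main | lookup host_site host OUTPUT site | search site=NY. The filter on site comes after the lookup, so every event in the time range is retrieved before any are discarded.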
Option V: Adopt a naming convention for hosts that includes site, or use IP addresses for host
Probably not realistic, but worth mentioning. If, e.g., all of your LA machines are named 'LA-XXXXX' or have an IP address in 10.1.2.XXX, then it's easy to do a search on host="LA-*" or similar.
If you go this route with IP addresses as the host value, you can use a lookup (scripted or static) to resolve the hostname for display purposes. It does create significant performance issues, though, if you want to regularly search based on hostname.