Hi, I am still confused about ITSI entity management and best practices after taking the training. Can someone enlighten me on this?
Here is an example to illustrate my confusion.
Suppose I plan to create two services that monitor the health of a web server: the first one, called A, focuses on IT metrics such as error rates and response times; the second one, called B, focuses on VM OS performance such as CPU and memory.
I have imported all Splunk forwarders as entities, so the server is already there as an entity E1, which has the alias hostname=web01.
Now, in the IIS/Apache logs, the server can be found by "host=web01";
in the OS performance logs, the server can be found by "host=web-host".
My question is: what is the best practice for managing entities in this case?
1. Should I create two other entities with different aliases? That is, E2 with alias "host=web01" and E3 with alias "host=web-host".
2. Or should I somehow normalize the data so that there is only one entity? If so, how?
First, these shouldn't be two services. Services are higher-level 'groups' of things; you will have one service with different KPIs.

When entities are imported, you pick what the entity title is. As long as both hostname and host say 'web01', it doesn't matter whether the field is host or hostname, because the entity title will be web01. Do not create duplicate entities.

I like to create searches that are broad and then normalize the field that has the entity name in it, to make sure I get a clean set of results. For example:

index=main sourcetype=weblog | eval tmp_entity=host

In my next import I use:

index=windows sourcetype=wineventlog | eval tmp_entity=host

If you do this, each search will have an alias field of tmp_entity, and then in your service you can filter by alias field tmp_entity for web01.
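To make that concrete, here is roughly what one of those scheduled entity import searches looks like end to end (the stats at the end is just one way to get one row per entity; adjust the index and sourcetype to your data):

index=main sourcetype=weblog
| eval tmp_entity=host
| stats latest(_time) as last_seen by tmp_entity

In the import wizard you then map tmp_entity as the entity title and also as an alias field. The second import does the same, so both searches resolve to the same web01 entity instead of creating a duplicate.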
If you use the Splunk App for Infrastructure this problem goes away, because it normalizes the entity name and entities are automatically imported into ITSI.
Thanks! Let me try to understand what you said.
* About services. I agree that for something this simple, one service with multiple KPIs should suffice. I proposed multiple services for other reasons, one of them being siloed departments: the OS team and the web team are completely separate from each other, and they demand clear delineation of "their stuff". In your experience, when would you go with multiple services with dependencies between them?
* Let's go with one service for the simplified case. I remember being taught that the natural business key of an entity is the combination of title + aliases + info fields. That's why it is theoretically possible to have two entities with the same title. See https://docs.splunk.com/Documentation/ITSI/4.5.0/Entity/EntityImportConflicts for an example that can lead to two entities with the same title, e.g., two servers with the same host name in different data centers. My question is: when would you treat potential entities from different sources as essentially one entity, and when would you keep them separate?
* Let me apply your strategy of unifying entities to my case; please correct me if I'm wrong:
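Two scheduled import searches, where I normalize the OS host value into the web name (the index and sourcetype names below are just my guesses, and the case() mapping is my own idea for reconciling web-host with web01):

index=web (sourcetype=iis OR sourcetype=apache)
| eval tmp_entity=host

index=os sourcetype=perfmon
| eval tmp_entity=case(host=="web-host", "web01", true(), host)

Both searches then yield tmp_entity=web01, so the second import should match the existing entity instead of creating a new one, and KPIs from both data sources can filter on the alias field tmp_entity.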
I appreciate any comments.
I understand groups want their 'stuff' separate from others, BUT this doesn't mean you have to create separate services. Best practice: don't create services based on how people work; create them based on the dependencies between the components. You can still give each team a view of their own stuff, either through a glass table or through views in Service Analyzer, Deep Dives, or Episodes.

Personally I like glass tables, because you can create boxes and show whatever you want based on a Splunk search. Instead of dragging over a KPI that may include entities they don't care about, write a search pulling from the itsi_summary index and filter to the entities they do care about. That lets them see just what they care about without you breaking a service apart to accommodate how they work.

Best practice: if you are measuring the same thing on 5 hosts that serve the same purpose (app servers, web servers, databases, or a business service), that is one service.
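For example, a glass table widget scoped to just the OS team's entity could run something like this against the summary index (entity_title, alert_value, and is_service_aggregate are fields ITSI writes to itsi_summary; the KPI name here is a placeholder):

index=itsi_summary is_service_aggregate=0 entity_title="web01" kpi="CPU Utilization"
| timechart avg(alert_value) as cpu

The OS team sees only web01's numbers even though the KPI lives in a shared service.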
For entities and entity filtering, you always want to try to filter by something other than a host name. If you use only a host name, or even an alias that combines a host with something else, you move toward 'static' membership. If instead you create enrichment fields that you can filter by, membership becomes dynamic.

For instance, if you have a host naming convention like CHDBBNK3455, where the first two characters are the location (Chicago), the next two are the role (Database), and the next three are the app or service it serves (Banking), you can regex those into new fields and create a lookup. Or, if you have info from a CMDB, you can create a lookup from that. Then you schedule entity import searches like this:

index=main host=* | lookup myenrichmentdata.csv host

This is basically each host saying, "Hey lookup, this is who I am, what do you know about me?", and all the fields in the lookup become enrichment fields on your entities. If you then want to create a view based on something like support group, you can easily filter by those fields. Membership is dynamic instead of static because the import is scheduled: when a new host starts reporting in, it too does the lookup, pulls its info, and gets imported as an entity.

You never want duplicate entities, even if the names differ, when the underlying 'machine' or CI is the same thing, because duplicates will throw off your aggregate health scores. Hope that helps!
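Putting the naming convention and the lookup together, a full scheduled import search might look like this (the location/role/app field names are mine, and support_group is a hypothetical column in the lookup):

index=main host=*
| rex field=host "^(?<location>\w{2})(?<role>\w{2})(?<app>\w{3})"
| lookup myenrichmentdata.csv host
| fillnull value="unknown" support_group
| stats count by host, location, role, app, support_group

Each row becomes an entity with location, role, app, and support_group available as enrichment fields to filter on.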