Summary of the long post:
On the universal forwarders, I need to add some kind of identifier (a tag or metadata value) to all data before it is sent, so that it distinguishes the environment it came from, is searchable by that value, and can be used by a heavy/intermediate forwarder with props/transforms to route and forward data based on it.
I'm currently working on a large environment that will have universal forwarders from multiple environments reporting into it. Here's how we are set up:
- We have about 10 customers, each with their own environment
- Each environment will have roughly 10-50 servers in AWS
- Each server will have a universal forwarder installed, pointed at my Splunk environment
- The universal forwarders will use data cloning to send data to my indexers and to an intermediate forwarder on the edge of my environment.
- All of the data will be indexed on my indexers and certain inputs that are sent to the intermediate forwarder will be sent to another environment for security monitoring.
- The intermediate forwarder has props and transforms set up to forward data to the external Splunk environment based on sourcetype. Now that we are adding multiple customer environments that want the security monitoring, each using different indexes, the transforms need to be modified.
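For context, the sourcetype-based routing described in the last bullet is typically done with something like the following on the intermediate forwarder. This is a sketch of one common pattern, not our actual config; the transform name, output group name, and server address are illustrative:

```ini
# props.conf on the intermediate (heavy) forwarder
[linuxaudit]
TRANSFORMS-route_external = route_to_external_security

# transforms.conf
[route_to_external_security]
REGEX = .
DEST_KEY = _TCP_ROUTING
FORMAT = external_security

# outputs.conf
[tcpout:external_security]
server = external-indexer.example.com:9997
```

The catch-all `REGEX = .` matches every event of that sourcetype and rewrites its `_TCP_ROUTING` key, so everything tagged `linuxaudit` goes out the `external_security` output group.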
So here is my question:
Is there a way to tag data or add an identifier on the universal forwarders in an environment, so that the intermediate forwarder can route the data to a specific index in the external Splunk environment?
The intermediate is a heavy forwarder without local indexing and is the only node with routes to the external Splunk environment. The reason for all of this, and the way it's constructed, is the level of security requirements from our primary.
- Customer A has 20 servers with universal forwarders installed. The forwarders add an identifier matching the customer's environment name, like CustA, to all data as it is sent.
- Customer B has 40 servers and, much like Customer A, its forwarders add an identifier, CustB, to all data.
The inputs for both environments are configured to go to their respective indexes on my indexers: Customer A to customerAdata and Customer B to customerBdata. The data is forwarded both to my indexers and to the intermediate forwarder.
The indexes customerAdata and customerBdata exist on my indexers and receive the data, but the external Splunk security environment has custA_security and custA_application, and custB_security and custB_application.
The intermediate forwarder would use props and transforms: when it receives data with sourcetype=linuxaudit and the identifier CustA, it sends that data to the external environment's custA_security index; when it receives sourcetype=nginx (or any application sourcetype) and the identifier CustB, it sends that data to the external environment's custB_application index.
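A rough sketch of what that identifier-aware routing could look like, assuming the identifier is shipped as an index-time field (e.g. `customer::CustA` set via `_meta` on the UFs) and that transforms on the heavy forwarder can match it with `SOURCE_KEY = _meta` — a pattern worth verifying against your Splunk version before relying on it:

```ini
# props.conf on the intermediate forwarder
[linuxaudit]
TRANSFORMS-route_custA = custA_security_route

# transforms.conf
[custA_security_route]
# match the index-time identifier field, not _raw
SOURCE_KEY = _meta
REGEX = customer::CustA
# rewrite the destination index for matching events
DEST_KEY = _MetaData:Index
FORMAT = custA_security
```

An analogous stanza pair per customer/sourcetype combination would cover the custB_application case, paired with a `_TCP_ROUTING` transform to send the matched events to the external environment.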
While all of this is occurring on the intermediate, all data is still being sent from the universal forwarders to my indexers, where it is indexed and searchable using CustA or CustB.
Thanks in advance, it's a lot of information.
I don't think it's possible to add a tag, but you can configure hostnames and sourcetypes for each group of hosts.
So you can identify logs from a group of forwarders based on hostname (e.g. hostnames that start with nw belong to the New York group) or sourcetype (e.g. wineventlog from one group of forwarders is called 1_wineventlog).
If this is possible, you can filter and route logs on Heavy Forwarders based on hostname or sourcetype.
If you have both choices, choose hostname, because the sourcetype approach needs additional work: you have to configure the indexers to override the values and restore the original ones.
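The hostname-based routing suggested above can be sketched roughly like this on the heavy forwarder (the host pattern and output group name are illustrative, matching the nw/New York example):

```ini
# props.conf on the heavy forwarder
[host::nw*]
TRANSFORMS-route_ny = route_ny_group

# transforms.conf
[route_ny_group]
# match every event from nw* hosts and redirect its output group
REGEX = .
DEST_KEY = _TCP_ROUTING
FORMAT = ny_indexers
```

The `ny_indexers` group would then be defined as a `[tcpout:ny_indexers]` stanza in outputs.conf pointing at the destination environment.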
I ended up adding a meta tag to individual monitored inputs, and some in the [default] stanza of inputs.conf, and was able to do most of what I needed using the tag. Performing props/transforms on these fields is possible but very limited; for searching, though, it's a great way to tag data from multiple customers stored in a single index.
Customer A inputs.conf:
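(The original config isn't shown here; a representative sketch, with illustrative monitor paths and tag names, would look something like this:)

```ini
# inputs.conf on Customer A's universal forwarders
[default]
_meta = customer::CustA

[monitor:///var/log/audit/audit.log]
sourcetype = linuxaudit
index = customerAdata
# per-stanza _meta replaces the [default] one, so repeat the customer tag
_meta = customer::CustA logtype::security
```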
Then I added the following to fields.conf on the indexers and search heads for each meta tag to make them searchable. This step is optional: with it you can simply search for "customer=CustA", and without it you can still search for "customer::CustA". The "::" is Splunk's "=" for meta tags.
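For a meta tag named customer, the fields.conf stanza would be roughly:

```ini
# fields.conf on indexers and search heads
[customer]
# tell Splunk this is an index-time field, so customer=CustA works at search time
INDEXED = true
```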
After all that was done, because the customer tag was under [default], it was applied to all inputs in the config, so all of Customer A's events were tagged. Each individual input then had additional meta tags added for when we wanted to narrow searches down to a specific app or log for a specific customer.
I performed all the UF meta-tagging configuration with Puppet, so any new servers created for customers automatically get the tags. The fields.conf was deployed to the indexer cluster and search head cluster from the master/deployment server.