We are about to ingest logs from multiple suppliers, where the individual supplier has full control over their infrastructure.
My plan was to create a couple of heavy forwarders and dedicate a port to each supplier:
supplier1 sends data to port 9991
supplier2 sends data to port 9992
This part I think I have working.
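For reference, a minimal sketch of what the receiving side could look like, assuming the suppliers send from Universal Forwarders over plain (non-SSL) forwarder connections. The port numbers are the ones from the post; everything else is an assumption, and an SSL setup would use splunktcp-ssl stanzas instead.

```ini
# inputs.conf on the heavy forwarder (sketch, not a confirmed setup)

# Port dedicated to supplier1
[splunktcp://9991]
disabled = 0

# Port dedicated to supplier2
[splunktcp://9992]
disabled = 0
```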
The next problem is that I need to keep the data from supplier1 separate from the data from supplier2. My thought was to create an index per supplier.
The problem is then how to route data received on port 9991 to index1 regardless of what is configured on the Universal Forwarder, except for Splunk's own internal data (internal ...). The different suppliers might use the same source or sourcetype, so the receiving port on the heavy forwarder is the only thing I can use to separate the data.
Any help is much appreciated.
Do the suppliers' systems send data by syslog, or are they Universal Forwarders?
If they send syslog, you can identify the supplier from the source (tcp:9991 or tcp:9992) and use the source field to perform an index override.
On your heavy forwarders:
transforms.conf
[overrideindex]
DEST_KEY = _MetaData:Index
REGEX = .
FORMAT = index1
props.conf
[source::tcp:9991]
TRANSFORMS-index = overrideindex
If they are Universal Forwarders, it is more difficult: you need a way to identify them, e.g. one or more values in one or more fields that identify the supplier's servers.
Or if you know the list of servers from each supplier, you could use a lookup to identify them.
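The lookup idea could be sketched as an automatic, search-time lookup. The lookup name supplier_hosts, the file supplier_hosts.csv (with columns host and Supplier), and the sourcetype stanza supplier_sourcetype are all hypothetical names chosen for illustration; this would live on the search head, not on the heavy forwarder.

```ini
# transforms.conf (hypothetical lookup definition)
[supplier_hosts]
# CSV file with two columns: host,Supplier
filename = supplier_hosts.csv

# props.conf (hypothetical sourcetype stanza)
[supplier_sourcetype]
# Match the event's host against the CSV and add a Supplier field
LOOKUP-supplier = supplier_hosts host OUTPUT Supplier
```

Note that this only labels events at search time; it does not change which index the events are written to.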
Anyway, I don't like having multiple indexes containing the same kind of data; I prefer to identify sources in a different way (source or another field).
My assumption was that the data would come from Universal Forwarders.
Generally I agree with you that it is a bad idea to have the same sourcetypes in multiple indexes, but here it makes sense, as only a select few will have access to data from a specific supplier.
Find one or more fields that identify the suppliers; then you can create an automatic field to classify events by supplier.
e.g. if all hostnames of Supplier1 start with "srv" and hostnames of Supplier2 don't, you can create a calculated field:
| eval Supplier=if(like(host,"srv%"),"Supplier1","Supplier2")
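To make the field automatic rather than typing the eval in every search, it could be defined as a calculated field in props.conf on the search head. This is a sketch: the stanza name supplier_sourcetype is a hypothetical sourcetype, and the "srv" prefix is only the example from the post. Note that eval does not treat * as a wildcard in an = comparison, so the sketch uses like() with % instead.

```ini
# props.conf on the search head (hypothetical sourcetype stanza)
[supplier_sourcetype]
# Calculated field: hosts starting with "srv" are classified as Supplier1
EVAL-Supplier = if(like(host, "srv%"), "Supplier1", "Supplier2")
```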
I still haven't put this one to rest.
If I use _meta in inputs.conf:
[splunktcp-ssl:9990]
disabled = 0
_meta = userindex::index1

[splunktcp-ssl:9991]
disabled = 0
_meta = userindex::index2
then I might be able to route everything through to my transform, where I could do something like this:
[force_index]
DEST_KEY = _MetaData:Index
REGEX = userindex::(\w*)\s
FORMAT = $1
Do you think this might be a possible solution?
I have not been successful yet, but I'm still working on the principle and have asked another question on how to reference the _meta variable in a transform.
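One way the transform might read the metadata is through SOURCE_KEY = _meta, which transforms.conf lists as a valid key. The sketch below is untested: it assumes that the _meta value set on the receiving input is actually attached to the forwarded events, which this thread does not confirm, and the props.conf stanza name supplier_sourcetype is hypothetical. The trailing \s is dropped from the regex so that a userindex value at the end of the metadata string still matches.

```ini
# transforms.conf on the heavy forwarder (untested sketch)
[force_index]
# Read the indexed metadata instead of the raw event text
SOURCE_KEY = _meta
# Capture the index name that follows "userindex::"
REGEX = userindex::(\w+)
DEST_KEY = _MetaData:Index
FORMAT = $1

# props.conf on the heavy forwarder (hypothetical sourcetype stanza)
[supplier_sourcetype]
TRANSFORMS-force_index = force_index
```

Because Universal Forwarders send unparsed data, index-time transforms like this one would run on the heavy forwarder, which is the first full Splunk instance to parse the events.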
Since you want to do it on the HF, you can modify the input stanza to specify the default index.
[tcp://9991]
index = supplier1

[tcp://9992]
index = supplier2
Or you can add props and transforms.
transforms.conf (if you want to filter, you can use SOURCE_KEY and REGEX)
[tcp9991_syslog_supplier1]
SOURCE_KEY = MetaData:Host
REGEX = \b10\.\d+\.\d+\.\d+$
DEST_KEY = _MetaData:Index
FORMAT = supplier1

[tcp9992_syslog_supplier2]
SOURCE_KEY = MetaData:Host
REGEX = \b10\.\d+\.\d+\.\d+$
DEST_KEY = _MetaData:Index
FORMAT = supplier2
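The transforms above only take effect once props.conf refers to them. A sketch of that binding, assuming raw TCP (syslog) inputs whose events carry the default source tcp:<port>; the class names after TRANSFORMS- are arbitrary labels chosen here:

```ini
# props.conf on the heavy forwarder (sketch)

# Events received on the raw TCP input for supplier1
[source::tcp:9991]
TRANSFORMS-supplier_index = tcp9991_syslog_supplier1

# Events received on the raw TCP input for supplier2
[source::tcp:9992]
TRANSFORMS-supplier_index = tcp9992_syslog_supplier2
```

This relies on the source reflecting the receiving port, which holds for raw tcp:// inputs but not for data relayed from a Universal Forwarder, where the source is set by the forwarder.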
Yes, that is the easy way. My problem is that I'm getting data from a Universal Forwarder that I have no control over: it is located at a vendor, but I'm still required to store the data. The Universal Forwarders set index in their input stanzas, so there is no easy way to override it.