That seems like a good solution. Unfortunately, I can't come up with a filter that would reduce the entries significantly while still leaving enough data to identify a faulty client.
I think if it's not possible to de-duplicate the logs before indexing (which I don't believe it is), then there may not be a good solution available to me.
However, your reply did answer the question so would it be good form for me to mark your answer as 'Accepted'?
I have a number of Windows clients using the Universal Forwarder to send a small log file to Splunk — typically around 15 KB per day per client.
However, when testing this I found a client that is sending almost 1 GB a day rather than the expected 15 KB. It appears this client is having issues and is writing a massive number of errors to its log daily.
If I scale up the deployment of the UF for this app to more clients, then I am concerned that multiple clients having this issue could push my data ingest up to an unsustainable level.
I need to reduce the amount of data this client (and any future clients with the same issue) sends, but I don't want to exclude it entirely, as then I won't be able to see which clients are having this manic log-writing issue.
What is the best way to solve this? Can I limit the total data that can be forwarded per client for this app, or can I do some de-duplication on the data prior to forwarding to reduce the amount sent? The client writes the same log lines repeatedly within the same timestamp.
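For context, the closest thing I've found so far is the forwarder-side throughput throttle in `limits.conf`, which caps the forwarder's send rate rather than enforcing a hard daily quota — a noisy client would still send everything, just more slowly. A sketch of what I mean (the 64 KBps figure is an arbitrary example, not a recommendation):

```ini
# limits.conf deployed to the Universal Forwarder
# (e.g. in an app's local/ directory pushed via the deployment server)
[thruput]
# Caps the forwarder's total send bandwidth in KB per second.
# Note: this throttles the rate; it does not set a daily volume limit,
# so a runaway client still eventually sends all of its log data.
maxKBps = 64
```

Is there something better suited than this, given that it only slows the flood rather than capping it?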
Thanks for any advice you can offer.