We have an environment where we write data directly to Splunk indexers via TCP inputs.
The reason for this setup is that Kafka consumers consume data from Kafka, and it is then written into Splunk.
Sometimes this TCP write takes too long and the data does not make it into the indexers.
Is there a way to monitor whether this data really gets into Splunk?
Also, how can we find out how much data is written into any indexer at any given time? (We do not have a clustered indexer setup and would like to know how much gets into every indexer - we have around 100 indexers.)
@Harishma,
Sometimes this TCP takes too much time and data doesn't get into the indexers.
I think you must be using a TCP input in Splunk on some port. In that case, all data received on that port should be indexed by Splunk. Make sure there is no throughput limit for network data transfer set at the operating system level on either the sender or the receiver machine.
Is there a way to monitor if this data really gets into Splunk?
Use the query index=<your data index> | stats count
to check whether all the events were received. This query returns the total number of events.
If you have multiple indexers in a cluster, use the query index=<your data index> | stats count by splunk_server
to check the event count on each indexer. This shows the data distribution in a Splunk clustered environment.
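Since the original concern is TCP writes taking too long, it may also help to measure indexing lag directly, by comparing each event's indexing time (_indextime) with its event time (_time); both are standard Splunk fields. A minimal sketch, with <your data index> as a placeholder for your actual index:

```
index=<your data index>
| eval lag_seconds=_indextime-_time
| stats avg(lag_seconds) AS avg_lag_s, max(lag_seconds) AS max_lag_s by splunk_server
```

A consistently large or growing lag would suggest delivery is slow even though the events do eventually arrive.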
Hope this helps!
Interesting, we had a similar question yesterday about syslog at How to calculate volume of syslog traffic on syslog-ng server.
You can use the following search on the license master to see how much each indexer has indexed:
index=_internal sourcetype=splunkd source=*license_usage.log*
| fields b idx splunk_server
| eval MB=b/1024/1024
| stats sum(MB) as MB by idx splunk_server
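If you want near-real-time throughput per indexer rather than licensed volume, each indexer also records its own indexing throughput in metrics.log. A sketch, assuming the _internal index from all indexers is searchable from your search head:

```
index=_internal source=*metrics.log* group=thruput name=thruput
| timechart span=1m sum(kb) AS indexed_KB by host
```

Here host identifies the indexer that wrote the metric, so this gives a per-minute view of how much each of your ~100 indexers is ingesting.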
Hi @VatsalJagani,
Sorry for the late reply.
Yes, you're right, TCP takes too long to write. I'll check the throughput part, maybe - I never looked into it.
But are you aware of why TCP collapses occur?
It can have many causes, such as network issues, throughput limits, etc.