My first suggestion is to use the Distributed Management Console that is built into Splunk.
You can also run this search, which charts the KB indexed per hour for each index:
index=_internal source=*metrics.log group=per_index_thruput | timechart span=1h sum(kb) as kb_indexed by series | rename series as index
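If you would rather have a single ranked total per index than an hourly chart, something like the following might work. Treat it as a sketch: the gb_indexed field name is my own, and as far as I know metrics.log only records the busiest series in each sampling interval by default, so totals for small indexes can come out understated.

```
index=_internal source=*metrics.log group=per_index_thruput
| stats sum(kb) as kb_indexed by series
| eval gb_indexed=round(kb_indexed/1024/1024, 3)
| sort - gb_indexed
| rename series as index
```

The stats command collapses the whole search window into one row per index, and sort puts the heaviest indexes at the top.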
This search will help you identify the most active forwarders in your environment:
index=_internal source=*metrics.log group=tcpin_connections | eval sourceHost=coalesce(sourceHost,hostname) | fields sourceHost kb | timechart sum(kb) AS kb_forwarded by sourceHost
If you are on Splunk 6.0 or higher, you can go to the following site, click Previous 30 days, then split by index, host, sourcetype, etc. to figure out what's sending all that data.
If you don't have access to it, you have to search against your _internal index to figure it out, which can be a bit trickier. As a start, the search behind the by-index split on the above page is shown below:
index=_internal source=*license_usage.log type="Usage" | eval h=if(len(h)=0 OR isnull(h),"(SQUASHED)",h) | eval s=if(len(s)=0 OR isnull(s),"(SQUASHED)",s) | eval idx=if(len(idx)=0 OR isnull(idx),"(UNKNOWN)",idx) | bin _time span=1d | stats sum(b) as b by _time, pool, s, st, h, idx | timechart span=1d sum(b) AS volumeB by idx fixedrange=false | foreach * [eval <<FIELD>>=round('<<FIELD>>'/1024/1024/1024, 3)]
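The same license_usage.log events can be split another way by swapping the field in the by clause. Here is a rough variant, assuming you want a 30-day total per sourcetype (the st field used in the search above) rather than a daily chart per index. The bytes and GB field names and the 30-day window are my own choices, not something taken from the built-in page.

```
index=_internal source=*license_usage.log type="Usage" earliest=-30d@d
| eval st=if(len(st)=0 OR isnull(st),"(UNKNOWN)",st)
| stats sum(b) as bytes by st
| eval GB=round(bytes/1024/1024/1024, 3)
| sort - GB
| fields st GB
```

Replacing st with h or s should give the equivalent breakdown by host or source, with the caveat that those two fields may be empty, which is what the (SQUASHED) handling in the search above is for.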