Splunk Search

Comparing raw data volume vs. indexed data volume

gnanaraj_mcc
Loves-to-Learn Lots

How do I compare my raw data volume to the indexed data volume for a specific sourcetype?

Can someone help with this query?

We have index clustering, a deployment server, and a distributed management console.

I want to make sure the same data is not indexed more than once (that is, double or triple indexing of the same data).


sloshburch
Ultra Champion

To find duplicate events, you could run | stats count by _raw, _time, host, source and look for any count greater than 1, although I promise that will be a slow and painful process.
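As a minimal sketch of that duplicate check: YOUR_INDEX and YOUR_SOURCETYPE are placeholders (not names from this thread), and the one-hour window is only an assumption to keep the search small enough to finish.

```spl
index=YOUR_INDEX sourcetype=YOUR_SOURCETYPE earliest=-1h latest=now
| stats count by _raw, _time, host, source
| where count > 1
| sort - count
```

Any row returned is an event that appears more than once with the same raw text, timestamp, host, and source. Widen the time range only after a narrow window shows something worth chasing, since grouping by _raw gets expensive quickly.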

The volume of data indexed, as measured for licensing, is captured in index=_internal source=*/license_usage.log sourcetype=splunkd, and you can narrow it to a single sourcetype using the st= field.
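A rough sketch of that license usage search, assuming the usual type=Usage events where b holds the byte count and st holds the sourcetype. YOUR_SOURCETYPE is a placeholder, and the search needs to run somewhere the license manager's _internal data is searchable.

```spl
index=_internal source=*/license_usage.log sourcetype=splunkd type=Usage st=YOUR_SOURCETYPE earliest=-1d@d latest=@d
| stats sum(b) as bytes by st
| eval GB = round(bytes / 1024 / 1024 / 1024, 3)
```

If the total for a day is roughly two or three times what the source systems actually produced for that sourcetype, that is a hint the same data is arriving more than once.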

Where do you think you have duplication? Starting with the symptoms that motivated your question will help us be more surgical in what would otherwise be a very involved process.
