I'm trying to track down the reason my Data Summary in the Search app is reporting BILLIONS of events going back 15 years. Any ideas on how I can track down where the issue is?
[Data Summary panel: 241,568,189,244 events indexed; earliest event: 15 years ago; latest event: now]
While there are probably a number of trickier ways to go about this, one of the easiest is to just run a wide-open search against a moderately sized time frame from 14 years ago:
index=* earliest=-14y latest=-13y
You could use the time picker drop-down instead of specifying latest. I left it as a year-long period because, while the data is most likely spread fairly smoothly, it's also possible it's clumpy. Let it run until you have 100,000 or some other reasonably sizable number of events, then stop it (or let it continue to run while you check; it doesn't really matter).
Click on the source, sourcetype, index, host, and other fields on the left, and they'll help you narrow down where that data came from.
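If clicking through the field summaries is slow at that event volume, a tstats search reads the index-time metadata directly instead of the raw events and returns the same breakdown much faster. A sketch (adjust the time range to match whatever window you're investigating):

| tstats count where index=* earliest=-15y latest=-13y by index, sourcetype, source, host

Sorting that by count will usually point straight at the input responsible for the bulk of the old events.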
Now, how did it get there? History! You likely pointed an input at a set of logs or data with history going back that far. It's also possible the events were timestamped incorrectly by Splunk at index time, but I don't think that's the case here.
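If timestamp extraction does turn out to be the problem, props.conf can constrain it per sourcetype. A minimal sketch, assuming a placeholder sourcetype name; TIME_PREFIX and TIME_FORMAT would need to match your actual event layout, and this only affects data indexed after the change:

# props.conf (my_sourcetype is a placeholder for the sourcetype you identified)
[my_sourcetype]
# Anchor timestamp extraction to the start of the event
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S
# Ignore extracted timestamps more than roughly a year in the past
MAX_DAYS_AGO = 366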
What to do about it? Honestly, if it didn't blow up your storage, I'd just leave it. There are settings in indexes.conf to age out old data, but it doesn't really hurt anything. If you do want to clean it up, we can help with that too, so just ask. It may or may not be trivially easy to clean, though.
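For reference, the aging setting lives in indexes.conf. A sketch, assuming a placeholder index name; note that frozen data is deleted by default unless you configure an archive location:

# indexes.conf (my_index is a placeholder)
[my_index]
# Roll buckets to frozen (deleted by default) once their newest event
# is older than one year (value is in seconds)
frozenTimePeriodInSecs = 31536000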
I think the issue is that there's a field somewhere that Splunk is interpreting as a timestamp when it isn't one. Any idea how I can see what's going into the input queue?
As a test, I used your search above with a slight modification: index=* earliest=-15y latest=-2y. That search only returns about 1.5 million records.
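One way to test the bad-timestamp theory without digging into the input queue itself is to compare each event's extracted timestamp (_time) against when Splunk actually wrote it to disk (_indextime). A sketch over the same window:

index=* earliest=-15y latest=-2y | eval lag_days=round((_indextime-_time)/86400) | stats count by index, sourcetype, lag_days | sort - lag_days

Events with a large lag_days were indexed long after their extracted timestamps. Whether that reflects genuinely old logs onboarded recently or a mis-parsed field, the index/sourcetype breakdown shows where they came from.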