How can I identify raw events that are not indexed?

sasankganta · ‎01-20-2021

When I run index=X sourcetype=Y cribl_pipe=Z over 24 hours or 1 week, the index and sourcetype fields show 100%.

When I run the same search over 2 weeks or 1 month, the index and sourcetype fields do not show 100%.

I'm searching a single index and a single sourcetype. Why do the fields show 100% for 1 week but less than 100% for 2 weeks?

How can I identify raw events that are not indexed (source tcp:9997)?

sasankganta · ‎02-05-2021

Okay, I will explain it simply and briefly:

When I search index=abc over the last 2 weeks, the index field shows 98% of events.

If I split the time range and search index=abc for this week and for last week separately, each search shows 100% of events.

So my question is: why does the 2-week search show 98% while the split searches show 100%? Is it because of the larger data volume and the bucket policy, or is there another reason?
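
One way to check whether any events are actually missing, independent of the percentage shown beside a field, is to compare raw event counts. The following is a minimal sketch, assuming the placeholder index abc from the post above; the field name total is illustrative. The daily counts over 14 days should add up to the same total whether the range is searched as a whole or split by week.

```spl
| tstats count where index=abc earliest=-14d@d latest=now by _time span=1d
| eventstats sum(count) AS total
```

If the per-day counts match those from the two one-week searches, the events are all present and the difference lies in how the percentage is displayed rather than in what was indexed.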

richgalloway · ‎02-05-2021

I was hoping to see the SPL you are using and the output of that SPL because I'm having difficulty understanding the English description of what is happening.

---
If this reply helps you, Karma would be appreciated.

sasankganta · ‎02-03-2021

Even if there is a delay in _time, the events are still being processed. The issue is that when we search 10 days of raw data, the index field shows 100% if the event count is below 10 lakh (1,000,000) and less than 100% if the count is above 10 lakh. How can I fix this? Do we have to change limits.conf for data received on port tcp:9997, or is there another way?

richgalloway · ‎02-03-2021

Please show the searches you are using and their results.

sasankganta · ‎01-22-2021

What is the maximum or minimum delay in events that can affect these index percentages?

richgalloway · ‎01-22-2021

I don't understand this question. There was no mention of percentages before now. Please explain what you want.

sasankganta · ‎01-21-2021

Or is there a search command to verify timestamps? I tried index=x sourcetype=y host= source= | convert ctime(_indextime) AS indextime | delay =_indextime-_time | table _time indextime date_zone host source sourcetype _raw but it didn't work.

richgalloway · ‎01-21-2021

One of my pet peeves on this forum is postings that state "it didn't work" without explanation.

The search fails because _indextime was converted to text and then compared to an integer (_time), which is not valid. _indextime is already an integer and so can be compared to _time directly. Like this:

index=x sourcetype=y host=* source=*
| eval indextime=_indextime
| eval delay=indextime-_time
| table _time indextime delay date_zone host source sourcetype _raw

That will show how long it took for an event to be indexed from the time it was generated (sort of - a number of factors can skew this number). It won't, however, "verify timestamps". Splunk does some of that for you and logs it in _internal. Use this search to find the messages.

index=_internal sourcetype=splunkd component=DateParserVerbose log_level=WARN
| rex "Context:\s+source=(?<data_source>[^\|]+)\|host=(?<data_host>[^\|]+)\|(?<data_sourcetype>[^\|]+)"
| stats count as Count values(data_source) values(data_host) dc(data_source) as "Source Count" dc(data_host) as "Host Count" BY data_sourcetype
| sort 0 - Count
| rename data_sourcetype as Sourcetype
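
As a follow-on sketch (not part of the original reply), the indexing delay can be summarized per host and source instead of listed event by event. It assumes the same placeholder index x and sourcetype y used in this thread; the output field names min_delay, avg_delay, and max_delay are illustrative.

```spl
index=x sourcetype=y
| eval delay=_indextime-_time
| stats count min(delay) AS min_delay avg(delay) AS avg_delay max(delay) AS max_delay BY host source
| sort 0 - max_delay
```

Delays are in seconds. Large or negative values for a particular host or source are a hint that its timestamps may be parsed incorrectly.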

sasankganta · ‎01-21-2021

Can I use https://docs.splunk.com/Documentation/Splunk/6.5.2/Data/Configuretimestamprecognition to verify this, or do you have a command or other suggestions for checking?

richgalloway · ‎01-21-2021

Yes, you can use that document, but it would be better to use a more recent version. Version 6 is not supported.

sasankganta · ‎01-21-2021

Sorry, I had already tried the above search before posting here, and it didn't work.

For index=X source=tcp:9997 sourcetype=Y cribl_pipe=Z over the last 2 weeks of data, the fields do not show 100%.

I'm searching a single index, source, sourcetype, and cribl_pipe, but I'm unable to find the raw events that are not indexed.

richgalloway · ‎01-21-2021

It's impossible to search for events that are not indexed. Splunk searches its indexes for data so anything not indexed cannot be sought.

Have you tried less-restrictive searches to see if the data is there, but with different attributes? Have you tried different time ranges (including future times) in case event timestamps were mis-interpreted?
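
One way to try the future-time suggestion, as a hedged sketch: widen the time range past now and keep only events whose event time is later than their index time. The index and sourcetype are the thread's placeholders, the 30-day window on each side is an arbitrary assumption, and most_negative_delay is an illustrative field name.

```spl
index=X sourcetype=Y earliest=-30d latest=+30d
| eval delay=_indextime-_time
| where delay<0
| stats count min(delay) AS most_negative_delay BY host source
```

Any rows returned point to hosts or sources whose events were stamped with a time after they were indexed, which would place them outside a search that ends at now.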

richgalloway · ‎01-20-2021

Try this search to find the events that arrived via TCP without a sourcetype.

index=X source="TCP:9997" NOT sourcetype=*

Go to Settings->Data inputs->TCP to change each input to have a sourcetype.

Even better: stop sending events directly to a Splunk TCP/UDP port. Doing so will cause data loss each time the listening instance restarts. Use a dedicated syslog server or other intermediary process.
