Knowledge Management

How can I identify raw events that are not indexed?

sasankganta
Path Finder

When I run index=X sourcetype=Y cribl_pipe=Z over the last 24 hours or the last week, the index and sourcetype fields show 100%.

When I run the same search over the last 2 weeks or the last month, the index and sourcetype fields do not show 100%.

I'm searching a single index and a single sourcetype, yet the field value shows 100% for 1 week and less than 100% for 2 weeks. What could be the issue?

How can I identify raw events that are not indexed (source tcp:9997)?


sasankganta
Path Finder

Okay, I will explain it simply and briefly:

I'm searching index=abc over the last 2 weeks, and the index field shows a value for only 98% of events.

If I split the time range and search index=abc separately for this week and for last week, I see 100% of events in each.

So my question is: why does a search over the last 2 weeks show 98% of events, while splitting it into separate time intervals shows 100%? Is the 98% caused by the data volume and the bucket policy over the last 2 weeks, or is there another reason?

 


richgalloway
SplunkTrust

I was hoping to see the SPL you are using and the output of that SPL because I'm having difficulty understanding the English description of what is happening.

---
If this reply helps you, an upvote would be appreciated.

sasankganta
Path Finder

Even if there is a delay in _time, events are still being processed. The issue is that when we run a search over 10 days and the raw event count is below 10 lakh (1,000,000), the index field shows 100%. When the raw event count is above 10 lakh, the index field shows less than 100%. How can I fix this? Do we have to change limits.conf for data arriving on the tcp:9997 port, or is there another way?


richgalloway
SplunkTrust

Please show the searches you are using and their results.


sasankganta
Path Finder

What is the maximum or minimum delay in events that can affect these index percentages?


richgalloway
SplunkTrust

I don't understand this question.  There was no mention of percentages before now.  Please explain what you want.


sasankganta
Path Finder

Or is there a search command to verify timestamps? I tried index=x sourcetype=y host= source=  | convert ctime(_indextime) AS indextime | delay =_indextime-_time | table _time indextime date_zone host source sourcetype _raw but it didn't work.


richgalloway
SplunkTrust

One of my pet peeves on this forum is postings that state "it didn't work" without explanation.

The search fails because _indextime was converted to text and then compared to an integer (_time), which is not valid.  _indextime is already an integer and so can be compared to _time directly.  Like this:

index=x sourcetype=y host=* source=*
| eval indextime=_indextime
| eval delay=indextime-_time
| table _time indextime delay date_zone host source sourcetype _raw

That will show how long it took for an event to be indexed from the time it was generated (sort of - a number of factors can skew this number).  It won't, however, "verify timestamps".  Splunk does some of that for you and logs it in _internal.  Use this search to find the messages.

index=_internal sourcetype=splunkd component=DateParserVerbose log_level=WARN
| rex "Context:\s+source=(?<data_source>[^\|]+)\|host=(?<data_host>[^\|]+)\|(?<data_sourcetype>[^\|]+)"
| stats count as Count values(data_source) values(data_host) dc(data_source) as "Source Count" dc(data_host) as "Host Count" BY data_sourcetype
| sort 0 - Count
| rename data_sourcetype as Sourcetype
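
Coming back to the indexing delay, a per-host summary may be easier to read than a table of individual events, and it speaks to the question about minimum and maximum delay. This is only a rough sketch: index=x and sourcetype=y are the placeholders used earlier in the thread, and the output field names (min_delay, max_delay, avg_delay, p95_delay) are made up for illustration.

```spl
index=x sourcetype=y
| eval delay=_indextime-_time
| stats count min(delay) AS min_delay max(delay) AS max_delay avg(delay) AS avg_delay perc95(delay) AS p95_delay BY host source
| sort 0 - max_delay
```

The delay is in seconds. A large positive max_delay suggests events that arrived late or were given an old timestamp; a negative min_delay suggests timestamps that were parsed as being in the future.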

 


sasankganta
Path Finder

Can I use the documentation at https://docs.splunk.com/Documentation/Splunk/6.5.2/Data/Configuretimestamprecognition to verify this, or do you have a command or other suggestion for checking?


richgalloway
SplunkTrust

Yes, you can use that document, but it would be better to use a more recent version.  Version 6 is not supported.


sasankganta
Path Finder

Sorry, I had already tried the above search before posting here, and it didn't work.

With index=X source=tcp:9997 sourcetype=Y cribl_pipe=Z over the last 2 weeks of data, the fields do not show 100%.

I'm searching a single index, source, sourcetype, and cribl_pipe, but I'm unable to find the raw events that are not indexed.


richgalloway
SplunkTrust

It's impossible to search for events that are not indexed.  Splunk searches its indexes for data so anything not indexed cannot be sought.

Have you tried less-restrictive searches to see if the data is there, but with different attributes?  Have you tried different time ranges (including future times) in case event timestamps were mis-interpreted?
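
As one way to try the wider time range, the sketch below selects events by when they were indexed (_index_earliest/_index_latest) while leaving the event-time window wide open in both directions, so events with mis-parsed timestamps still match. The 30-day windows and the one-day (86400 second) threshold are arbitrary assumptions; adjust them to your data.

```spl
index=X source="tcp:9997" earliest=-30d latest=+30d _index_earliest=-14d _index_latest=now
| eval delay=_indextime-_time
| where delay < 0 OR delay > 86400
| eval indextime=strftime(_indextime, "%Y-%m-%d %H:%M:%S")
| table _time indextime delay host source sourcetype
```

Any rows returned are events indexed in the last 2 weeks whose timestamp is either in the future or more than a day older than the time they were indexed, which would place them outside the time range you expected.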


richgalloway
SplunkTrust

Try this search to find the events that arrived via TCP without a sourcetype.

index=X source="TCP:9997" NOT sourcetype=*

Go to Settings->Data inputs->TCP to change each input to have a sourcetype.

Even better: stop sending events directly to a Splunk TCP/UDP port.  Doing so will cause data loss each time the listening instance restarts.  Use a dedicated syslog server or other intermediary process.
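
On the percentages themselves: rather than reading them off the fields sidebar, you could count field coverage across every event in the range. This is a sketch that assumes the index, source, and field names from the question; total, with_sourcetype, with_cribl_pipe, and the *_pct fields are illustrative names only.

```spl
index=X source="tcp:9997" earliest=-14d
| stats count AS total count(sourcetype) AS with_sourcetype count(cribl_pipe) AS with_cribl_pipe
| eval sourcetype_pct=round(100*with_sourcetype/total, 2), cribl_pipe_pct=round(100*with_cribl_pipe/total, 2)
```

If these come out at 100 for the 2-week range, the events all carry the fields and the lower figure you saw would have to come from how the search results were displayed, not from missing data.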
