Why does increasing the time range for a search on 2 indexes give incorrect results in a timechart?

Explorer

Maybe I'm not understanding the way this works, but I have other searches that use it just fine. The only difference is that the searches giving me incorrect results are the ones that search two indexes, for instance, search index="example1" OR index="example2".

Anyway, here is a sample search using a 7 day window:

index=example1 OR index=example2 
| eval name1=coalesce(lower(name1),lower(name2)) 
| eventstats values(index) as index values(field7) as field values(field8) as field2 
| eval Countdata=if(field="Enabled" AND NOT like(field2, "%excluded%") AND index!="example2","True",null()) 
| timechart span=1d count(eval(Countdata="True")) AS Count

So there's a simple version of the search. I get accurate numbers when I zero in on specific dates, but when I want to create trend data for a week or larger time range, then I get different counts. The strange thing is, when I click on the day and view events in a different search window I get the correct counts. Is there a way to either correct this or get the weekly trend data without using a 7 day time range?

UPDATE
So I did some major digging to see exactly which events were present in the daily searches but missing from the weekly searches. There are a handful of assets that show up when I search individual days but don't show up when I search the entire week. There isn't really anything unique about them. It seems to stem from the search condition that specifies the index name. I'm not sure why the condition is fulfilled when I search the day, but not when I search the week. I have tried using where statements as well as the if condition and I have the same issue. I have also tried using append/join instead of the OR in the search and I still get it.

UPDATE 2
So it looks like the issue is with how the data is being laid out in the eventstats command.

| eventstats values(index) AS index dc(index) as Indexcount BY Name

When an item from days 1-5 for instance, goes from being in one index to both indexes, it shows as having been in both indexes for all days, for some reason. Now to figure out why.
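The effect described above can be reproduced with synthetic events. This is only a sketch: the events come from makeresults, and demo_index, days_ago, index_whole_range and index_per_day are made-up field names standing in for the real data. One Name appears only in example1 on the two earliest days and in both indexes on the latest day.

```spl
| makeresults count=4
| streamstats count AS n
| eval Name="host-a"
| eval days_ago=case(n=1, 3, n=2, 2, true(), 1)
| eval demo_index=if(n=4, "example2", "example1")
| eval _time=relative_time(now(), "-" . tostring(days_ago) . "d@d")
| eventstats values(demo_index) AS index_whole_range BY Name
| bin _time span=1d
| eventstats values(demo_index) AS index_per_day BY Name, _time
| eval index_whole_range=mvjoin(index_whole_range, ","), index_per_day=mvjoin(index_per_day, ",")
| table _time Name demo_index index_whole_range index_per_day
```

Because eventstats aggregates over every event the search returns, index_whole_range should read "example1,example2" on all rows, including the earlier days. index_per_day is grouped by the bucketed _time as well, so it should show only "example1" for those days. That would explain why a single-day search and a 7-day search disagree.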

Thanks
UPDATE 3 Working

Here is the search that ended up giving me correct numbers - sorting by time seemed to work better than date_mday, which was another option, but it limits me

index=index1 OR index=index2 | eval Name=coalesce(lower(Name),lower(Name2)) | bucket _time span=1d | eventstats values(index) as index BY Name,_time | where Status="Enabled" AND (Type="Type1" OR Type="Type2" OR Type="Type3") AND index!="bhlhencryption" | timechart span=1d dc(Name)


Legend

Why do you need the eventstats? Also, you seem to be calculating a bunch of stuff that you never use. Here is a simplified version of your search, as I see it:

 index=example1 OR index=example2 
 | eventstats values(index) AS combinedIndex dc(index) AS Indexcount values(field7) AS field values(field8) AS field2 BY Name
 | search field="Enabled" AND NOT field2="*excluded*" AND combinedIndex!="example2"
 | timechart span=1d count

It appears that you are using the eventstats to examine and combine the values of various fields. By doing this, you will end up with multi-valued fields. For example, a given Name has 3 events, 2 from example1 and 1 event from example2: the combinedIndex field will appear in all 3 events with the same value - "example1,example2". Is that what you want?
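To see the three-event case concretely, here is a small sketch on synthetic data. The events are generated with makeresults, and demo_index and combinedIndex_joined are hypothetical field names used only for illustration.

```spl
| makeresults count=3
| streamstats count AS n
| eval Name="host-a", demo_index=if(n <= 2, "example1", "example2")
| eventstats values(demo_index) AS combinedIndex dc(demo_index) AS Indexcount BY Name
| eval combinedIndex_joined=mvjoin(combinedIndex, ",")
| table n Name demo_index combinedIndex_joined Indexcount
```

All three rows should show combinedIndex_joined as "example1,example2" and Indexcount as 2, regardless of which index each individual event came from.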

Also, there is a difference between fieldA!="xyz" and NOT fieldA="xyz".
The first ( fieldA!="xyz" ) means "fieldA exists, but has some value other than xyz."
The second ( NOT fieldA="xyz" ) means "either fieldA does not exist, or it has a value other than xyz."
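A quick way to check the difference is with three synthetic events, one where fieldA is "xyz", one where it is "abc", and one where it is missing. This is a sketch using makeresults; the two searches below are meant to be run separately.

```spl
| makeresults count=3
| streamstats count AS n
| eval fieldA=case(n=1, "xyz", n=2, "abc")
| search fieldA!="xyz"

| makeresults count=3
| streamstats count AS n
| eval fieldA=case(n=1, "xyz", n=2, "abc")
| search NOT fieldA="xyz"
```

The first search should return only the n=2 event. The second should return both n=2 and n=3, because the event with no fieldA at all also satisfies the NOT.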

Can you explain the conditions that you are searching for?


Explorer

I just find eventstats to be easier because it adds the _time field and doesn't convert to epoch times like it does when you add via stats. I actually figured it out; I'm not sure if it's the best way, but I got it going. You were basically right when you said I had too many unnecessary fields. They were clouding up the numbers. The only thing I needed to group together in an mvfield was the indexes.

The data would be accurate unless a particular item was either removed from an index or added to an index throughout the week. If this occurred then when looking over a 7day time range it would show that the item always had two indexes in the index field, even though for certain days that was not the case.

Here is the search that ended up giving me correct numbers - sorting by time seemed to work better than date_mday, which was another option, but it limits me

index=index1 OR index=index2 | eval Name=coalesce(lower(Name),lower(Name2)) | bucket _time span=1d | eventstats values(index) as index BY Name,_time | where Status="Enabled" AND (Type="Type1" OR Type="Type2" OR Type="Type3") AND index!="bhlhencryption" | timechart span=1d dc(Name)


Legend

_time is actually epoch time, Splunk just presents it as formatted when possible. You can always format it yourself using eval (or fieldformat) with the strftime function.

I would definitely avoid using date_mday because it doesn't consider the time zone - it is simply the day of the month extracted from the raw data.
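As a small illustration of both points, the sketch below takes one synthetic event and shows _time as a raw number, as a formatted string, and as a day-of-month value derived from _time instead of date_mday. The field names raw_epoch, formatted_epoch, day_label and day_of_month are made up for this example.

```spl
| makeresults
| eval raw_epoch=_time
| eval formatted_epoch=_time
| fieldformat formatted_epoch=strftime(formatted_epoch, "%Y-%m-%d %H:%M:%S")
| eval day_label=strftime(_time, "%Y-%m-%d")
| eval day_of_month=strftime(_time, "%d")
| table _time raw_epoch formatted_epoch day_label day_of_month
```

fieldformat only changes how formatted_epoch is displayed, so the underlying value stays numeric, while eval with strftime produces a new string field. Since day_of_month is computed from _time at search time, it should follow the time zone the search runs in, which date_mday does not.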


SplunkTrust

Can you give the example numbers - if you do day by day for the past week you get these values, but if you do the whole range, you get those numbers? That might give us a clue.

Also can you take just this one search and try it comparing day-by-day with weekly range over each index separately?


Explorer

Here are the numbers I get for when I perform the search over the past week with a 1d span

Count _time
51 2015-11-04
51 2015-11-05
204 2015-11-06
51 2015-11-07
51 2015-11-08
51 2015-11-09
53 2015-11-10

When I run the search on each individual day I get this

Count _time
66 2015-11-04
60 2015-11-05
224 2015-11-06
56 2015-11-07
56 2015-11-08
58 2015-11-09
63 2015-11-10

I can't really break up the search because I'm using this method to combine a field of names and the data is reliant on fields from both searches.
