Splunk Search

How to timechart on only one set of logs per 24-hour period in Splunk?

daryllj
Path Finder

Hi there, I have a simple dashboard that lets me see growth in the number of Live / Archived accounts we manage in Google.

We currently have a daily pull of the directory service into Splunk, which allows for the following query to be run (I have a few like this with Archived / Live being the adjustments I make):

index="google" sourcetype="*directory*" "emails{}.address"="*@mydomain.com" 
| timechart count by archived span=1d cont=FALSE

In the last week or so we have had some issues: sometimes we get two or three directory pulls into Splunk in a single day, which results in the graph displaying double / triple the count (see attached image).

My question is as follows:

Are there any additional variables I can add to my query to ONLY interpret one data pull per 24-hour period? This would allow for consistent reporting in the face of inconsistent directory pulls into Splunk.

I have poked around a bit with timechart but feel perhaps I should be using a stats command instead...? Any direction on which approach to use is appreciated!

1 Solution

moliminous
Path Finder

Can you instead try a distinct count, assuming the archived account values are what is unique?
Something like this:

index="google" sourcetype="*directory*" "emails{}.address"="*@mydomain.com" 
| timechart dc(archived) span=1d cont=FALSE
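
One caveat on the query above: dc(archived) counts the distinct values of the archived field itself, so if archived only ever holds true/false the result can be at most 2 per day. A hedged variant, assuming every account carries some field that is unique per account (primaryEmail below is a hypothetical field name, substitute whatever is unique in your events), counts distinct accounts per day split by archived status. That count should stay the same no matter how many pulls land on the same day:

```
index="google" sourcetype="*directory*" "emails{}.address"="*@mydomain.com"
| timechart dc(primaryEmail) by archived span=1d cont=FALSE
```

Note that an account whose archived status changes between two pulls on the same day would be counted once in each series for that day.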


ITWhisperer
SplunkTrust

It depends on what is unique and what is duplicated in the events pulled on the same day.


daryllj
Path Finder

In this case, it is a full directory dump of a few thousand account names and email addresses, a point-in-time snapshot of the existing directory at the time it was exported.


yuanliu
SplunkTrust

Do any of the dumps overlap in time? Some rough ideas if not:

All records in a dump have the same timestamp: use the earliest of the day
Dumps are periodic: bucket time according to the period, then use the first period of the day
Dumps are random but sufficiently separate from one another: use a time-based transaction (expensive)
Each dump has a unique identifier: use the earliest of the day
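
As a rough sketch of the "earliest of the day" idea for periodic dumps, the query below snaps each event to its hour and its day with relative_time, keeps only the events from the earliest hour bucket seen on each day, and then charts as before. It assumes that pulls are at least an hour apart and that a single dump never straddles an hour boundary; the field names pull_bucket, day and first_bucket are made up for illustration.

```
index="google" sourcetype="*directory*" "emails{}.address"="*@mydomain.com"
| eval pull_bucket=relative_time(_time, "@h"), day=relative_time(_time, "@d")
| eventstats min(pull_bucket) AS first_bucket BY day
| where pull_bucket=first_bucket
| timechart count by archived span=1d cont=FALSE
```

Unlike a fixed-hour filter, this still returns one pull for a day on which the first dump arrived late.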

daryllj
Path Finder

The good news is that they do not overlap in time. Looking at the directory dumps, they seem to come in every hour or so, so I feel we are on the right track with "use earliest of the day". Now I can do a bit more digging to figure out the code for that.

I am going to explore variables in the query a bit more to see if there are some extra flags that can reference the earliest entry in a day... let's see how far I can get!


yuanliu
SplunkTrust

If that hourly cycle is reliable, first-hour-of-the-day events can be filtered by

| where date_hour=0

date_hour is a meta field that Splunk automatically provides. No need for addinfo.
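
A possible caveat: as far as I know, the date_* fields are taken from the timestamp text in the raw event, so they may be missing when the timestamp was not parsed from the event itself, and they do not follow the search-time timezone. If that turns out to matter here, a hedged alternative is to derive the hour from _time instead (pull_hour is a made-up field name):

```
index="google" sourcetype="*directory*" "emails{}.address"="*@mydomain.com"
| eval pull_hour=tonumber(strftime(_time, "%H"))
| where pull_hour=0
| timechart count by archived span=1d cont=FALSE
```

Either way, a fixed-hour filter returns nothing for a day on which no pull landed in that hour.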
