Getting Data In

Average Daily Unique IPs by Weekday in Web Log Files

jason_hunsberge
Path Finder

Hi all,
I'm having a great time with Splunk and using it to analyze some IIS web logs. I've been successful in creating a search that counts the daily unique IPs. But I am wanting to use that same data and show the average number of unique IPs by weekday over different periods of time (3 months, 6 months, all time).

My issue is that I have created a successful search for 3 months, but when I try to run it for anything greater than that, the calculations are incorrect. I presume I am hitting some sort of summarization limit and it is grouping the data in an odd way.

Here is my search string that works at three months or less:

sourcetype=iis
| bin _time span=1d 
| dedup c_ip
| timechart count(c_ip) as daily_uniques
| eval weekday = strftime(_time, "%w")
| stats avg(daily_uniques) by weekday

Why would this search work correctly for the last three months but not for 6 or 12 month periods?

Tags (2)
1 Solution

jason_hunsberge
Path Finder

Thanks to both linu1988 and martin_mueller, my question was answered. The correct query for what I am trying to accomplish is:

sourcetype=iis
| timechart span=1d dc(c_ip) as daily_uniques
| eval weekday = strftime(_time, "%w")
| stats avg(daily_uniques) by weekday

View solution in original post

0 Karma

jason_hunsberge
Path Finder

Thanks to both linu1988 and martin_mueller, my question was answered. The correct query for what I am trying to accomplish is:

sourcetype=iis
| timechart span=1d dc(c_ip) as daily_uniques
| eval weekday = strftime(_time, "%w")
| stats avg(daily_uniques) by weekday

0 Karma

linu1988
Champion

That would give you the redundant count of IP. I was assuming that the c_ip was the IP extracted field. So it should be done as dc(IP) by date_wday that would be appropriate.

0 Karma

jason_hunsberge
Path Finder

Brilliant. Just what i needed to know. Thanks!

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

Yeah, dedup is not aware of any bucketing but instead operates over the entire dataset.

jason_hunsberge
Path Finder

martin, are you saying that dedup is not deduping on the time bucket of 1 day but across the entire dataset?

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

Your original query is incorrect because of the way you use dedup. If someone visits on two days you're only counting him for one of the days, dedup throws out the other visit before it was counted for that day.

0 Karma

linu1988
Champion

logically they should work equally, but the timechart aggregation should be faster. Honestly i don't know internally how would splunk process but would only prefer a line of code rather than 3 lines doing the same thing.

would like to see this running.

|timechart dc(IP) by date_wday

0 Karma

jason_hunsberge
Path Finder

so i ran the command you gave above:
sourcetype=iis
| timechart span=1d dc(c_ip) as daily_uniques
| eval weekday = strftime(_time, "%w")
| stats avg(daily_uniques) by weekday

and i'm getting results, but the counts are different than with my original query. can you tell me how these two command sequences are different?

timechart span=1d dc(c_ip) as daily_uniques

vs.

bin _time span=1d
| dedup c_ip
| timechart count(c_ip) as daily_uniques

why would is structure a search as you did versus the way i did?

0 Karma

jason_hunsberge
Path Finder

looks like date_wday is present. I ran the following command on the last 3 months of data and got results as expected:

sourcetype=iis
| timechart count by date_wday

0 Karma

linu1988
Champion

date_wday is available in splunk or not when you run with sourcetype=iis?

sourcetype=iis
| timechart span=1d dc(c_ip) as daily_uniques
| eval weekday = strftime(_time, "%w")
| stats avg(daily_uniques) by weekday

could you try if date_wday is not present?

0 Karma

jason_hunsberge
Path Finder

when i do that, i get "No results found"

0 Karma

linu1988
Champion

it was an after thought 😛

0 Karma

jason_hunsberge
Path Finder

huh, the query didn't show when i first looked at your comment. trying now.

0 Karma

linu1988
Champion

did you try the query which is provided? what was the result?

0 Karma

jason_hunsberge
Path Finder

Sorry, I should have clarified that point.

So the commands above from sourcetype through timechart group the data by day, then dedup by client IP address, then counts the number of daily uniques. From there I organize the unique IP counts by weekday and calculate the average number of unique IPs by weekday.

When I choose a time slice larger than three months, however, my averages are no longer calculated by day of the week, but calculate to some unknown grouping.

This has the effect of not showing every single work day or showing hundreds (instead of 10s) of unique IPs as the average.

0 Karma

linu1988
Champion

how do you say it's not working?

sourcetype=iis
| dedup c_ip,date_wday
|timechart avg(c_ip) by date_wday

could you try this and see where do you see the issue?

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...