I'm trying to put together a time chart that's basically a representation of many separate searches. A stacked column chart would be ideal for this. Here's my base search...
index=main extra.user_id=* AND Create_Date>'2014-02-25 AND Create_Date<'2014-03-03 |dedup extra.user_id date_mday |timechart count
Create_Date represents the date the account was created. (That's pulled in via a lookup table if you're really interested.)
So, what I'm looking to create is a chart of user activity over typically a weeks time where the total of the column will represent the number of unique active users that day. That part is easy.
What I'd like to do is have the column made up of columns based on the range the account was created. I don't mind having to specify the Create_Date date ranges (which are weeks) manually to make up the components.
If someone just wants to give me a template then I can likely adapt it out from there.
A template would look like this:
base search | eval category = case(...) | timechart count by category
The inside of the case()
expression needs to compute the category for each event.
Edit: You might be better off using this to do weekly grouping:
base search | eval Create_Date = "Week starting ".strftime(relative_time(strptime(Create_Date, "%F"), "@w1"), "%F") | timechart count by Create_Date
Change the @w1 to @w0 if your week starts on Sunday rather than Monday.
A template would look like this:
base search | eval category = case(...) | timechart count by category
The inside of the case()
expression needs to compute the category for each event.
Edit: You might be better off using this to do weekly grouping:
base search | eval Create_Date = "Week starting ".strftime(relative_time(strptime(Create_Date, "%F"), "@w1"), "%F") | timechart count by Create_Date
Change the @w1 to @w0 if your week starts on Sunday rather than Monday.
Thanks! I went ahead and up voted it and marked it as answered. :beer:
I've added the most recent search to the answer. As for that pint, I'll be at Splunk Live Hamburg on May 6th and at Comply&Secure Kiel on June 26th/27th... 🙂
Ok, I've weeded out 99% of the NULLs so I think we're good.
Martin, I can't thank you enough. I'd love to buy you a pint sometime if the opportunity comes up. Would you mind posting the ultimate solution of the weekly grouping in a separate answer so I can up vote that correct answer and you get credit for it?
Thanks!
Ah, that would make sense, thanks for pointing that out.
The weekly grouping looks a ton better thanks! But, since nothing ever goes quite to plan, I'm getting some NULL values for Create_Date in the stats. Looking at the events that seem to be generating them there is a valid Create_Date field on them and I see it correctly being converted into a Week Starting value.
I'm not confused, you're confusing Splunk by telling it to subtract 4 and 25 from 2014... add double quotes around strings.
However, you might be better off using this to do weekly grouping:
base search | eval Create_Date = "Week starting ".strftime(relative_time(strptime(Create_Date, "%F"), "@w1"), "%F") | timechart count by Create_Date
Change the @w1 to @w0 if your week starts on Sunday rather than Monday.
Oh, I see where you're getting confused, the value of Create_Date is YYYY-MM-DD and represents the date the account was created. What I'm trying to do is create a stacked column chart showing active users for each day. The components of the column will be the week the account was created. That way I can see of the active users that day what portion were users created in week1, what component were created in week2, etc.
Those comparisons in the case()
look very weird, even wrong. Here's what I see in natural language:
Take the value of Create_Date, and compare with 2014-02-25... first, evaluate (2014 - 02 - 25) to get 1987, then compare that number with a string...
That's bound to fail. To verify, leave off the timechart
at the end and check if your category values are filled as you expect.
index=main extra.user_id=* AND extra.endpoint=set_playback_position |dedup extra.user_id date_mday|eval category = case(Create_Date>=2014-02-25 AND Create_Date<=2014-03-03, "week1") |timechart count by category
I don't get what I expect, I get results but, no stats or chart showing just events where the create_date falls in the range specified in the case statement.
I've cleaned up the csv file and got rid of quotations and spaces and so that's clean. A basic search works and I can get results tightened down to the date range I'm looking for. So I'm sure I'm just not getting the case() ok
Getting somewhere, the single quotes in the field value seem to be messing things up. So, I'm modifying the script that generates the CSV lookup table to not have the single quotes in the field values AND getting rid of any spaces in the values.
Stay tuned.
case()
takes pairs of arguments. The first member of the pair is a boolean expression, if true the second member is the result of the case. Here's an example with numbers and colours:
... | eval range = case(errors<=0, "green", errors<10, "yellow", errors < 100, "orange", errors>=100, "red") | ...
I'm not sure what you want to do with the date, so I can't come up with a reasonable example. If you just want weeks you can use bucket
:
base search | bucket Create_Date span=1w | timechart count by Create_Date
bucket
will "round down" the field value to the nearest week.
Here's what I've tried initially:
index=main extra.user_id=* AND extra.endpoint=set_playback_position |eval category = case(Create_Date>'2014-02-25 AND Create_Date<'2014-03-03, "week1") |timechart count by category
Could you give me some insight into making case work? I've not used it before sorry and so far after searching around a bit I'm still at a loss.