All Apps and Add-ons

Calculate percentage total over time using timewrap

demkic
Explorer

Hello there,

I am struggling to incorporate the eval command into a timewrap command. I would like to calculate the percent_total, over time. Specifically, day over day, for the last 4 weeks. Using the example provided in the custom timewrap app is not super helpful:

| inputlookup rides-by-hour.csv | eval _time=strptime(Date_Time, "%D %R") | timechart span=1h sum(Rides) as "Uber Rides"

I tried to tweak it as shown below, however, it is not broken down by Month as nicely as in the example from above.

base search |
| bucket _time span=1h
| stats count(eval(payment_method="credit_card")) as creditcard, count(payment_method) AS total by _time
| eval percent=(creditcard*100/total)

Any help is much appreciated.

0 Karma
1 Solution

sfishel18
Explorer

Hi @demkic,

Just to clarify for some of the previous responders, since you're using the timewrap custom visualization there's no need to wrap the results in your SPL. The visualization takes in a single time series and wraps it for you. There are, however, a couple of limitations to the visualization:

  1. It assumes that the time series is uniformly binned with no gaps. Using | bin _time will not fill in time gaps that exist in your data.
  2. It only looks at the first two columns of your results, and assumes that the first contains the timestamps and the second contains the values.

So I think if you switch to using the timechart command, and remove the intermediate calculation fields from the final results, you'll get what you're looking for:

base search
| timechart span=1h count(eval(payment_method="credit_card")) as creditcard, count(payment_method) AS total
| eval percent=(creditcard*100/total)
| fields _time, percent

Let me know if that helps!

View solution in original post

sfishel18
Explorer

Hi @demkic,

Just to clarify for some of the previous responders, since you're using the timewrap custom visualization there's no need to wrap the results in your SPL. The visualization takes in a single time series and wraps it for you. There are, however, a couple of limitations to the visualization:

  1. It assumes that the time series is uniformly binned with no gaps. Using | bin _time will not fill in time gaps that exist in your data.
  2. It only looks at the first two columns of your results, and assumes that the first contains the timestamps and the second contains the values.

So I think if you switch to using the timechart command, and remove the intermediate calculation fields from the final results, you'll get what you're looking for:

base search
| timechart span=1h count(eval(payment_method="credit_card")) as creditcard, count(payment_method) AS total
| eval percent=(creditcard*100/total)
| fields _time, percent

Let me know if that helps!

demkic
Explorer

Yes, precisely! I see now. If I wanted to see a week-over-week view for the month of February, would I tweak the above with the following?:

base search | where match(Date, "Feb") 
| eval _time = strptime(Date, "%d-%b-%y") 
| timechart span=1h count(eval(payment_method="credit_card")) as creditcard, count(payment_method) AS total 
| eval percent=(creditcard*100/total) 
| fields _time, percent
0 Karma

sfishel18
Explorer

Yep, you're on the right track. First off, if you're creating a week-over-week view, you'll want to use the span=1d option to timechart so that the granularity of the results match what the visualization is going to show.

0 Karma

sfishel18
Explorer

And I'm guessing you can remove the | eval _time=strptime(Date_Time, "%D %R"). The examples use that because they're pulling the raw data from a CSV lookup file so they have to manually create the _time field. If you're searching over indexed data, the _time field will already exist.

0 Karma

demkic
Explorer

thank you for your advice

0 Karma

demkic
Explorer

It seems that when I try to use the sample query's from the "Timewrap Chart Gallery" in the custom timewrap app, I am not able to translate the SPL correctly to fit my data. I am having the same struggles with trying to plot the percent, hour over hour, for March 8, 2017 for example.

base search |
| eval _time=strptime(Date_Time, "%D %R") 
| timechart span=1h count(eval(payment_method="credit_card")) as creditcard, count(payment_method) AS total 
| eval percent=(creditcard*100/total) 
| fields _time, percent
0 Karma

woodcock
Esteemed Legend

Like this:

Spoof your data:

|makeresults | eval raw="02/01/2017   cc::
02/01/2017   cc::
02/01/2017   cc::
02/01/2017   dc::
02/01/2017   dc::
02/01/2017   r::
02/01/2017   r::
02/02/2017   cc::
02/02/2017   dc::
02/02/2017   ach::
02/03/2017   ach::
02/03/2017   dp::
02/04/2017   dp::
02/04/2017   dp"
| makemv delim="::" raw
| mvexpand raw
| eval _time = strptime(raw, "%m/%d/%y")
| rename raw AS _raw
| rex "\s+(?<PaymentMethod>\D+)"

Now the solution(s):

| bin _time span=1d
| top PaymentMethod BY _time

Or, doing the top magic ourselves:

| bin _time span=1d
| timechart span=1d count BY PaymentMethod
| untable _time PaymentMethod count
| eventstats sum(count) AS total BY _time
| eval pct = 100*(count/total)
| timechart span=1d first(pct) BY PaymentMethod

demkic
Explorer

Can you please explain your reasoning for the untable and first command within the query? It is much appreciated, thank you.

0 Karma

woodcock
Esteemed Legend

You see how timechart takes distinct values of a single BY field (one column) and makes each value a new field name (many columns)? The untable command allows you to reverse that and take many columns and return them to a single column so that we can do the row-only math that we need to do. I chose first when rolling back up because all the values are the same and first would be the most efficient; I could have used just about anything to get the same result (e.g. last, avg, man, max, etc.).

0 Karma

DalJeanis
Legend

man -> min

0 Karma

woodcock
Esteemed Legend

HA!, yes, true.

0 Karma

demkic
Explorer

Thank you so much for your response, this result is really wonderful to have. However, it doesn't show week-over-week for only the PaymentMethod=cc. I am trying to see the % of total for PaymentMethod=cc for this week as compared to the last week as compared to 2 weeks ago. Essentially in the x-axis I should have Monday, Tuesday, Wednesday, etc...and the legend will be this week, last week. 2 weeks ago.

What this will tell me is that out of all payment methods used (cc, dc, dd, ach), credit cards were used 40% of the time on Wednesday of this week, but 36% of the time on Wednesday of last week, etc...

0 Karma

DalJeanis
Legend

woodcock has pretty much done that for you.

 your base search that produces one record for each transaction with only _time and PaymentMethod
 | bin _time span=1d
 | stats count as countPM by PaymentMethod _time
 | eventstats sum(countPM) as countTot by _time
 | eval countPct = round(100*countPM/countTot,2)
 | where PaymentMethod="cc"
 | table _time PaymentMethod countPM countTot countPct

demkic
Explorer

I appreciate both of your inputs, thank you. I was initially using the custom timewrap extension but am seeing that timechart could be an easier method sometimes.

0 Karma

woodcock
Esteemed Legend

Be sure to Up-Vote good/working answers, and Accept the best one.

0 Karma

woodcock
Esteemed Legend

OK, like this:
| bin _time span=1d
| timechart span=1d count BY PaymentMethod
| untable _time PaymentMethod count
| eventstats sum(count) AS total BY _time
| eval pct = 100*(count/total)
| search PaymentMethod="cc"
| timechart span=1d first(pct) BY PaymentMethod
| timewrap 1w

DalJeanis
Legend

Can you provide specific numbers as an example?

For instance, suppose we just wanted to do it for week over week with this data for two weeks - what would the expected output look like?

DATE         CC   Total

02/01/2017   30   100
02/02/2017   25    90   
02/03/2017   30   100
02/04/2017   38   120
02/05/2017   40    85
02/06/2017   31   100 
02/07/2017   30   113

02/08/2017   30   100
02/09/2017   26   124
02/10/2017   30   102
02/11/2017   42   105
02/12/2017   32    87
02/13/2017   28   136
02/14/2017   30   100

Here's some run-anywhere code to generate test data, for when I know what kind of a report you are actually looking for.

| makeresults count=28 | streamstats count as dayno
| eval cc=tonumber(substr(tostring(random()),2,2)) + tonumber(substr(tostring(random()),2,2))
| eval total=cc+tonumber(substr(tostring(random()),2,2)) + tonumber(substr(tostring(random()),2,2))+tonumber(substr(tostring(random()),2,2)) + tonumber(substr(tostring(random()),2,2))
| eval _time=strptime("01/28/2017","%m/%d/%Y")+dayno*86400
| table _time dayno cc total
| eval weekstart=strftime(relative_time(_time,"@w"),"%Y-%m-%d")
| eval cc_pct=cc/total
| delta cc_pct p=7

demkic
Explorer

Hello,

Here is an example of what the data looks like:

 DATE          PaymentMethod  

 02/01/2017   cc   
 02/01/2017   cc     
 02/01/2017   cc   
 02/01/2017   dc   
 02/01/2017   dc   
 02/01/2017   r   
 02/01/2017   r   

 02/02/2017   cc   
 02/02/2017   dc   
 02/02/2017   ach   
 02/03/2017   ach   
 02/03/2017   dp    
 02/04/2017   dp   
 02/04/2017   dp   

I would like to plot only the percent where payment_method=cc was used for week-over-week or day-over-day. The total will be adding up all the columns (maybe using eventstats?). In the example above, essentially total for 02/01/2017=7 and therefore percent_creditcard=3/7=43% for that day.

0 Karma

DalJeanis
Legend

woodcock has pretty much done that for you.

 your base search that produces one record for each transaction with only _time and PaymentMethod
 | bin _time span=1d
 | stats count as countPM by PaymentMethod _time
 | eventstats sum(countPM) as countTot by _time
 | eval countPct = round(100*countPM/countTot,2)
 | where PaymentMethod="cc"
 | table _time PaymentMethod countPM countTot countPct

That gives you a single series of one record per day, with counts and percentages.

| delta countPct as changePct1day p=1
| delta countPct as changePct7days p=7

That gives you the change from the prior day and from the same day the prior week.

Get Updates on the Splunk Community!

Splunk Enterprise Security 8.0.2 Availability: On cloud and On-premise!

Splunk Enterprise Security 8.0 revolutionizes the SOC workflow experience from the ground up. Now security ...

Logs to Metrics

Logs and Metrics Logs are generally unstructured text or structured events emitted by applications and written ...

Developer Spotlight with Paul Stout

Welcome to our very first developer spotlight release series where we'll feature some awesome Splunk ...