Hi folks,
I have Splunk version 6.2.7 and am trying to create a report to display the top 10 products sold within the last 4 hours (in quantity and displayed per product name) and compare those results to the results of 30 days ago.
Ideally, what I would like to be able to see is what are the "new" products (if any) that have been purchased within the last 4 hours that were possibly not purchased 30 days ago. I am trying to find anomalies in products sold.
I am not quite sure my search is displaying any "new" products that may have been sold within the last 4 hours as it compares to 30 days ago.
sourcetype=.... earliest=-30d@d | stats count(product_name) as monthproduct by productname | sort 10 - monthproduct | streamstats count as MonthRank | append [ search sourcetype=... earliest=-4h@h | stats count(product_name) as hourproduct by productname | sort 10 - hourproduct | streamstats count as HourRank ] | stats first(MonthRank) as MonthRank first(HourRank) as HourRank by product_name | eval diff=MonthRank-HourRank | sort HourRank | table product_name, HourRank, MonthRank, diff
Is it appropriate to use append
before the subsearch even?
Thanks so much for any insights.
Give this a try
sourcetype=.... earliest=-30d@d | stats count(product_name) as monthproduct by product_name | sort 10 - monthproduct | streamstats count as MonthRank | append [ search sourcetype=... earliest=-4h@h | stats count(product_name) as hourproduct by product_name | sort 10 - hourproduct | streamstats count as HourRank ] | stats first(MonthRank) as MonthRank first(HourRank) as HourRank values(monthproduct) as monthproduct values(hourproduct) as hourproduct by product_name | eval diff=MonthRank-HourRank | sort HourRank | table product_name, HourRank, MonthRank, diff, monthproduct , hourproduct
Give this a try
sourcetype=.... earliest=-30d@d | stats count(product_name) as monthproduct by product_name | sort 10 - monthproduct | streamstats count as MonthRank | append [ search sourcetype=... earliest=-4h@h | stats count(product_name) as hourproduct by product_name | sort 10 - hourproduct | streamstats count as HourRank ] | stats first(MonthRank) as MonthRank first(HourRank) as HourRank values(monthproduct) as monthproduct values(hourproduct) as hourproduct by product_name | eval diff=MonthRank-HourRank | sort HourRank | table product_name, HourRank, MonthRank, diff, monthproduct , hourproduct
That did the trick. My only concern is that the query runs for a really long time now. I am not sure whether this will be manageable in the long-run (for analysis purposes). I did try to break the initial search down as best as I can but it is just really taking a long time. Is there any way to remedy this? Thanks!
Well, you're querying about 31 days worth of data and it's bound to be slower. See this this work better:-
sourcetype=.... earliest=-30d@d | top 10 product_name showperc=f | streamstats count as MonthRank | append [ search sourcetype=... earliest=-4h@h | top 10 product_name showperc=f | streamstats count as HourRank ] | stats first(MonthRank) as MonthRank first(HourRank) as HourRank values(monthproduct) as monthproduct values(hourproduct) as hourproduct by product_name | eval diff=MonthRank-HourRank | sort HourRank | table product_name, HourRank, MonthRank, diff, monthproduct , hourproduct
OR
sourcetype=.... earliest=-30d@d latest=now | eval Type=if(_time>relative_time(now(),"-4h@h") ,"Hour", "Month") | top 10 product_name by Type showperc=f | streamstats count as rank by Type | chart values(count) as count values(rank) as rank over product_name by Type | rename "count: Hour" as hourproduct "count: Month" as monthproduct "rank: Hour" as HourRank "rank: Month" as MonthRank
| eval diff=MonthRank-HourRank | sort HourRank | table product_name, HourRank, MonthRank, diff, monthproduct , hourproduct
There's is a typo in your search, productname vs product_name in the last segement. Other than that, your search should return right results. Why do you believe is missing in the result?
Do you happen to know what can I add extra to the query to display the actual quantities of products sold in addition to what is already displayed? Unfortunately "table product_name, HourRank, MonthRank, diff, monthproduct, hourproduct" doesn't do the trick.
Also, thank you!!
Oh yes, thank you regarding that typo. Actually, it seems as if it is returning the results that I wanted. The only part that it is not able to display on the table for me is the actual quantity of the products for "monthproduct" and "hourproduct" (...when I add in the last command: table product_name, HourRank, MonthRank, diff, monthproduct, hourproduct)
I believe this is because the streamstats command and because I am turning the quantities into rankings. However, how would I be able to tweak the query to also create a column named "monthproduct" and "hourproduct" and display the number of products sold per product?
Many thanks in advance!