Splunk Search

Help with syntax to search across all bins

tcouture37
Explorer

Howdy,

I'm trying to come up with a query that charts the most occurring x_forwarded_for and respective count in each of the bins over whatever window. Currently, the below query creates a sorted chart of the most occurring x_forwarded_for and their respective count over the entire lookback window, instead of each bin. I think I need to fit head 1 in there somewhere.

It's likely some or all of the x_forwarded_for's across those bins are repeats and I'd like that charted, so no unique counts.

Any help is appreciated!

 

 

 

 

index="canvas_*" cluster="*" | where isnull(user_id)| bin _time span=5m | stats count by x_forwarded_for | sort - count

 

 

 

 

Labels (3)
0 Karma
1 Solution

PickleRick
SplunkTrust
SplunkTrust

As I said, if you want top X from each bin, you have to count and filter 🙂

So instead of the | stats first... you do

| streamstats count as rank by _time
| where rank<=X

Oh, and if you have so many results, add limit of 0 to sort. Like instead of

| sort _time - count

 do

| sort 0 _time - count

(i keep forgetting about it)

View solution in original post

0 Karma

PickleRick
SplunkTrust
SplunkTrust

You start pretty well but then you lose the time information along the way

index="canvas_*" cluster=* NOT user_id=*
| bin _time span=5m
| stats count by x_forwarded_for _time
| sort _time - count
| stats first(*) by _time

If I understand correctly what you want to achieve.

If you want more than one "top" result, instead of stats first you'd need to do streamstats count to add rank order and do a where to limit results by this rank value.

tcouture37
Explorer

If you want more than one "top" result, instead of stats first you'd need to do streamstats count to add rank order and do a where to limit results by this rank value.


Should I interpret this as in more than one "top" result across the first bin or across all bins?

The query you provided almost  gets me there. I was hoping it would get the top value across all bins. This seems to just return the highest count value for the first bin.

0 Karma

PickleRick
SplunkTrust
SplunkTrust

Hmm. That should have given you top values for every bin. At least that's how I understood your specs.

Drop the last stats and see what you get.

You should get a table of counted x_forwarded_for values sorted by count, groupped by time (divided into 5-minute bins)

0 Karma

tcouture37
Explorer

@PickleRick wrote:

Hmm. That should have given you top values for every bin. At least that's how I understood your specs.

Drop the last stats and see what you get.

You should get a table of counted x_forwarded_for values sorted by count, groupped by time (divided into 5-minute bins)


 

x_forwarded_for_timecount
x.x.x.12022-02-10 13:20:00191
x.x.x.22022-02-10 13:20:0077
x.x.x.32022-02-10 13:20:0066

every ip in the table is in the same 13:20:00 _time bin when I remove the last stat.

One thing that might be interesting to note, when I do execute the query with stats first(*) by _time, halfway through the execution, it seems like there's a chart with a row for every bin with an x_forwarded_for value and count associated with that _time bin. By the end of the execution, that list is compressed to a single row.

0 Karma

PickleRick
SplunkTrust
SplunkTrust

Ok. If you drop the last stats you should get the table like you showed.

After all the rows with decreasing counts at 2022-02-10 13:20:00, you should get similarily sorted entries from 13:25, then from 13:30 and so on (that's where stats first picks from).

Look at this:

Screenshot_20220210-225051_Firefox Beta.jpg

0 Karma

tcouture37
Explorer

Is there a way to only display the top 10 results from each bin? With my dataset, there are so many different x_forwarded_for that there are 100 pages (max return) with 100 x_forwarded_for ips per page for just the first bin in the lookback window.

0 Karma

PickleRick
SplunkTrust
SplunkTrust

As I said, if you want top X from each bin, you have to count and filter 🙂

So instead of the | stats first... you do

| streamstats count as rank by _time
| where rank<=X

Oh, and if you have so many results, add limit of 0 to sort. Like instead of

| sort _time - count

 do

| sort 0 _time - count

(i keep forgetting about it)

0 Karma

tcouture37
Explorer

Ohhhh, you really meant creating rank as a variable and using that. This is working like a charm!

You are a rockstar and have helped me better understand bin and TIL streamstats. Thank you very much PickleRick.

0 Karma

PickleRick
SplunkTrust
SplunkTrust

I'm happy I could have been of help 🙂

Have much fun with your splunk!

0 Karma
Get Updates on the Splunk Community!

Developer Spotlight with Paul Stout

Welcome to our very first developer spotlight release series where we'll feature some awesome Splunk ...

State of Splunk Careers 2024: Maximizing Career Outcomes and the Continued Value of ...

For the past four years, Splunk has partnered with Enterprise Strategy Group to conduct a survey that gauges ...

Data-Driven Success: Splunk & Financial Services

Splunk streamlines the process of extracting insights from large volumes of data. In this fast-paced world, ...