How to modify my search to find IP addresses that hit exactly one URL?

sfrazer
Explorer

I'm trying to find IP addresses that hit one specific URL and no other. I tried to use set diff, but it's not returning the results I expect.

This search gives the IP addresses of everyone who hit url_a; let's say it returns 447 results:

sourcetype=weblogs request="GET /url_a/ HTTP*" | dedup ip | table ip | sort ip

And this search gives the IP addresses of everyone who hit a URL underneath it; let's say it returns 314 results:

sourcetype=weblogs | regex request="^GET /url_a/[0-9a-z].* HTTP.*" | dedup ip | table ip | sort ip

I'm trying to find the list of IPs in the first list that are not in the second. set diff will also return items in the second search that aren't in the first, which is not what I want.

The other thing I tried was a subsearch like this:

sourcetype=weblogs request="GET /url_a/ HTTP*"  NOT [ search sourcetype=weblogs  | regex request="^GET /url_a/[0-9a-z].* HTTP.*" | dedup ip | table ip | sort ip] | dedup ip | table ip | sort ip

But this returns entries that are also in the second search, so it cannot be correct. Does anyone know of an effective way to do this?

Thanks!

mhpark
Path Finder

Maybe something like this?
But do you really need the regex?

sourcetype=weblogs
request="GET /url_a/*"
| regex request="^GET /url_a/([0-9a-z].*)? HTTP"
| stats values(request) as requests by ip
| search requests="GET /url_a/ HTTP*"
| sort ip

sfrazer
Explorer

I'm using the regex because request="GET /url_a/*" will match both of the following URLs:

GET /url_a/
GET /url_a/url_b/

and I only want it to match the second of those two entries. In this case url_b could be any of a number of URLs that start with a-z or 0-9.

Your search is getting me closer. The stats values() piece seems to make a collection of URLs for each IP, correct? The issue is that I'm still getting results that have multiple URLs in their collection, something like this:

ip  requests
1.1.1.1 
    GET /url_a/ HTTP/1.1
    GET /url_a/ad/ HTTP/1.1
    GET /url_a/do/ HTTP/1.1
    GET /url_a/ho/ HTTP/1.1
    GET /url_a/ju/ HTTP/1.1
    GET /url_a/of/ HTTP/1.1
1.1.1.2 
    GET /url_a/ HTTP/1.1
1.1.1.3 
    GET /url_a/ HTTP/1.1
    GET /url_a/di/ HTTP/1.1
1.1.1.4 
    GET /url_a/ HTTP/1.1
1.1.1.5 
    GET /url_a/ HTTP/1.1
    GET /url_a/al/ HTTP/1.1
    GET /url_a/ba/ HTTP/1.1
    GET /url_a/bu/ HTTP/1.1
    GET /url_a/gr/ HTTP/1.1
    GET /url_a/wh/ HTTP/1.1
1.1.1.6 
    GET /url_a/ HTTP/1.1
1.1.1.7 
    GET /url_a/ HTTP/1.0
    GET /url_a/bl/ HTTP/1.0

Out of those results I really only want 1.1.1.2, 1.1.1.4 and 1.1.1.6.
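
To check that reading of stats values(): it should produce one row per ip, with requests holding the distinct request strings seen for that ip as a multivalue field. A minimal sketch that makes the size of each collection visible, assuming the same request and ip fields as above (request_count is a hypothetical field name introduced only for illustration):

```spl
sourcetype=weblogs request="GET /url_a/*"
| stats values(request) as requests by ip
| eval request_count=mvcount(requests)
| sort ip
```

With the sample output above, 1.1.1.2 would show a request_count of 1 and 1.1.1.1 would show 6, so the wanted IPs are the ones whose collection holds only the bare /url_a/ request.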

sfrazer
Explorer

Aha, this helped me. And you're correct that the regex isn't needed in the code snippet you gave, but I did need it to do what I wanted. Here's the final form:

sourcetype=weblogs
request="GET /url_a/*"
| stats values(request) as requests by ip
| search requests="GET /url_a/ HTTP*"
| regex requests!="^GET /url_a/[0-9a-z]"
| sort ip

Thanks for your help!
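
For anyone who would rather not filter on a multivalue field, here is a rough alternative sketch that flags each event as a hit on the bare URL or on something underneath it, then sums the flags per ip. It assumes the same request and ip fields; is_root, is_sub, root_hits and sub_hits are hypothetical names made up for this example, and it has not been tested against the original data:

```spl
sourcetype=weblogs request="GET /url_a/*"
| eval is_root=if(match(request, "^GET /url_a/ HTTP"), 1, 0)
| eval is_sub=if(match(request, "^GET /url_a/[0-9a-z]"), 1, 0)
| stats sum(is_root) as root_hits sum(is_sub) as sub_hits by ip
| where root_hits > 0 AND sub_hits == 0
| sort ip
```

This keeps only the IPs that hit /url_a/ at least once and never hit anything below it, in a single search with no subsearch or set diff.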

mhpark
Path Finder

You might rather go with

| search requests="GET /url_a/ HTTP*"
| where mvcount(requests) == 1

because regex costs a lot. Note that mvcount() is an eval function, so it has to go in where rather than search.
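
One caveat when counting values: the request string includes the protocol version, and the sample output earlier in the thread shows both HTTP/1.0 and HTTP/1.1. An IP that fetched only /url_a/ but over both versions would have two distinct values and be dropped. A minimal sketch that extracts just the path before grouping, assuming the request always has the form GET <path> HTTP/<version> (path and paths are hypothetical field names introduced here):

```spl
sourcetype=weblogs request="GET /url_a/*"
| rex field=request "^GET (?<path>\S+) HTTP"
| stats values(path) as paths by ip
| where mvcount(paths) == 1 AND paths == "/url_a/"
| sort ip
```

Query strings still count as part of the path in this sketch, so /url_a/?x=1 would be treated as a different URL from /url_a/.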
