I have fields for user and URL parsed into Splunk from a proxy log, and I am trying to build a table of deduplicated users who have visited at least two of four or five URLs. E.g.:
user1 - URL1, URL2, URL3
user2 - URL2, URL5
etc...
What would be the best way of accomplishing this? I am not sure if I should be trying to transform or just format or something else entirely.
Hi @Lewis1 ...
Some more details are needed about how users and URLs appear in your logs.
Maybe give this a try:
your search
| eval urls=split("/url/one,/url/two,/url/three", ",")
| table user url
| chart count OVER user BY url
Sorry for the lack of details, I am at that place of not even quite understanding what I'm asking for yet lol!
Let me try to describe it: I have a bunch of data ingested from a proxy log that I'm mostly not interested in, but two of the accelerated fields are user and url, which both populate from access logs (beyond this, I am not sure what extra detail would be helpful). I want to search for URLs which are IOCs (I have a known short list of them), and then separately compile a report of whether any user has interacted with more than one of these URLs.
I don't really need the volume of access, just the usernames of any users who have hit more than one of the listed URLs. I had started creating searches/reports individually for each URL and was going to make a lookup table from the results, which I could then produce a further report from, but this feels overly complicated for what seems like a simple enough task. Thanks for your help!
@Lewis1 Some important details that can affect the Splunk approach include whether user and URL appear in the same events, what form the "known short list" takes, whether the URL in the event contains extra characters that could complicate matching against the known list, and so on.
Now, assume that URL and user are exactly as you want them. Further assume that your "known short list" is compiled in a lookup file or KV store called "knownshortlist" in the following form:
URL | IOC |
url1 | ioc1 |
url2 | ioc2 |
url3 | ioc3 |
url4 | ioc4 |
url5 | ioc5 |
You can then do something like
| lookup knownshortlist URL OUTPUT IOC
| where isnotnull(IOC)
| stats values(URL) as URLs values(IOC) as IOCs by user
| where mvcount(IOCs) > 1
This is really helpful, thank you. My last question: how much more complex would this be if my URL IOCs are specific domains but the logs might contain any number of variants of these (e.g. google.com, google.com/search?q=....), and I want to merge all of these potentially different items of traffic into one "hit"?
Can I just use wildcards to allow more flexibility in the search results?
Wildcard matching can be configured on a lookup (match_type = WILDCARD(<field>) in the lookup definition). It may not be the best option, depending on the actual data. (Wildcard lookup has its own quirks, although for long-ish strings like URLs it is perhaps OK.) If the IOCs only concern domains, why not just extract the domain from URL?
| rex field=URL "https?://(?<domain>[^/]+)"
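Putting the two suggestions together, the whole search might look roughly like the sketch below. It assumes the event field is named URL (the original post calls it url; field names are case-sensitive, so use whichever your data has), that the lookup is defined as knownshortlist with the bare domain in its URL column, and that the scheme may be missing from the logged value, which is why the rex here makes it optional. The names domains and ioc_count are only illustrative.

```spl
your search
| rex field=URL "^(?:https?://)?(?<domain>[^/:?]+)"
| lookup knownshortlist URL AS domain OUTPUT IOC
| where isnotnull(IOC)
| stats values(domain) as domains dc(IOC) as ioc_count by user
| where ioc_count > 1
```

Because the rex keeps only the host part, google.com and google.com/search?q=.... collapse into the same domain value and count as a single hit. Subdomains such as www.google.com would not match a lookup row of google.com unless the list includes them or the extraction is adjusted.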