How to search suspicious user-agent in web request logs?

j0hnn1ck · 11-16-2020

I put web request logs into Splunk.

I created a lookup CSV file that lists suspicious user-agent strings, like below.

bad_user_agent

nmap

python

java

...

I need an alert if the user_agent field in a web request log contains any word in the CSV file.

How can I write a query for this?

Example:

user_agent="Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" --> no alert

user_agent="Java/14.0.2" --> ALERT

user_agent="Mozilla/5.0 (compatible; Nmap Scripting Engine; https://nmap.org/book/nse.html)" --> ALERT

Thank you.

j0hnn1ck · 11-23-2020

Thanks.

Now the error is gone, but my query does not show any results.

I tested by adding the word "java" to my bad_user_agent list.

The result count is 0 even though the user_agent field contains the word "java".

Or if you have another solution for my task, please feel free to tell me.

Thank you.

j0hnn1ck · 11-22-2020

It always shows this error.

Lookup definition:

Lookup table file:

Lookup file content:

I already extracted user_agent field from the log.

Richfez · 11-23-2020

I think you are almost there.

You don't have two columns in your CSV file, so ... you can either add one, or you can just OUTPUT the original field as "found".

index=X
| lookup bad_user_agent user_agent OUTPUT user_agent AS found
| search found=*

Do note that there are definitely other ways to do this, too. Most of them have more side effects than this way does, though, or are more fiddly and finicky, or are less scalable.
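One thing worth checking if the search still returns nothing: a plain CSV lookup compares the whole field value, and by default the comparison is case sensitive, so an entry like java will not match Java/14.0.2. A minimal sketch of one way around that follows. It assumes several things that are not confirmed in this thread: the lookup definition is named bad_user_agent, the CSV column is bad_user_agent (as in the question), the entries have been rewritten with wildcards (*nmap*, *python*, *java*), and the lookup definition's advanced options set the match type to WILDCARD(bad_user_agent) with case-sensitive matching turned off. The makeresults part only fabricates the three sample user agents from the question so the lookup can be tried without real data.

```
| makeresults count=3
| streamstats count AS n
| eval user_agent=case(
    n=1, "Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36",
    n=2, "Java/14.0.2",
    n=3, "Mozilla/5.0 (compatible; Nmap Scripting Engine; https://nmap.org/book/nse.html)")
| lookup bad_user_agent bad_user_agent AS user_agent OUTPUT bad_user_agent AS found
| search found=*
| table user_agent found
```

Under those assumptions, only the Java and Nmap rows should survive the final search. Once that looks right, the first three lines can be swapped for the real base search (index=X).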

Richfez · 11-17-2020

There's a common pattern to doing this.

Assuming we have no problems with lookups with special characters in them (I *think* that forward slashes won't bother anything? That semicolon may mess it up, but I'm not sure... anyway, this is all testable, and able to be worked around if it causes problems!)...

And, I hope you have at least one other field in that lookup? Let's assume for a second that you have a second field in there; you can use it like so:

index=foo sourcetype=weblog extra_search_stuff_goes_here
| lookup <yourlookupname> user_agent OUTPUT <fieldX> AS found
| search found=*

That would snag all your web logs, then run a lookup against them using your lookup, using the user_agent as the key, and would output the contents of that other field you had into a new field named "found". Lastly, just search for where, after all that, the "found" field is there and set to something.

If you don't have a second field in the CSV lookup yet, you can add a field to it and make it easy for yourself by calling it "found" and setting it to 1 everywhere in the CSV. Then your search is a little simpler.

index=foo sourcetype=weblog extra_search_stuff_goes_here
| lookup <yourlookupname> user_agent OUTPUT found
| search found=*

Right? Because if found is already a field in the CSV, then ... we just output that.
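If editing the CSV by hand is a nuisance, the extra column can probably be added from the search bar instead. This is only a sketch for maintaining the file, not for the matching itself, and bad_user_agent.csv is a hypothetical file name since the thread never shows the real one:

```
| inputlookup bad_user_agent.csv
| eval found=1
| outputlookup bad_user_agent.csv
```

This reads every row of the lookup file, sets found to 1 on each, and writes the result back over the same file, so it is worth keeping a copy of the original first.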

If you find that, say, forward slashes cause problems, you can remove them with rex from your data before doing the lookup, and in that case just remove them from the lookup too before resaving the csv file.
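A rough sketch of that cleanup step, with two caveats. It uses eval's replace() function rather than rex, purely because it leaves the original user_agent field untouched and writes the cleaned value to a new field; ua_clean is a made-up field name. It also assumes the lookup's key column is called user_agent and that a found column exists, as in the searches above, and that the slashes have already been removed from the CSV entries.

```
index=foo sourcetype=weblog extra_search_stuff_goes_here
| eval ua_clean=replace(user_agent, "/", "")
| lookup <yourlookupname> user_agent AS ua_clean OUTPUT found
| search found=*
```

The AS clause tells the lookup to compare its user_agent column against the cleaned field instead of the raw one.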

In any case, hopefully this answers your question. If it doesn't, then by all means let me know what we've missed and we can try again.

Also if this didn't answer your question, I can tell you in no uncertain terms that the solution will NOT involve shenanigans with `inputlookup` or `join`. So don't fall for the bad answers out there trying to get you to use those.

Happy Splunking,

Rich
