Alerting

How to update my regular expression to find special characters or long strings in source?

rachaelcrook89
Explorer

I have an Apache Tomcat web server that logs a file each time an authentication attempt is made. The name of the file includes the username, a time stamp, and then ends in .xml. I would like to set up an alert that searches the source name and detects if a file name is out of the norm.

I've tried the following search and it just returns the full source path instead of showing filenames with special characters.

index="MyIndex"  source="D:\\software\\dev\\current\\logs\\*\\*"  | rex field=source "(?<filename>.*[^\w^\\^.\^:].*.xml)"| table filename

So for example, these are two log files that would be normal.

D:\software\dev\current\logs\2017_01_23\JSMITH1_1485182355831_0.xml
D:\software\dev\current\logs\2017_01_22\JDOE2_1485128364222_0.xml

This would not be considered normal and should trigger an alert.

D:\software\dev\current\logs\2017_01_22\SELECT * () from 2832 ^&%_1485128364222_0.xml

Any help is greatly appreciated!

somesoni2
Revered Legend

Try something like this. This is a faster search that gets all the source values available in MyIndex for the selected time range and returns the sources whose file name does not follow the pattern \w+_\d+_\w+\.xml (username_epochtimestamp_something.xml).

| tstats count WHERE index=MyIndex source="D:\\software\\dev\\current\\logs\\*\\*"  by source|  regex source!="\w:(\\\[^\\\]+)+\\\(\w+_\d+_\w+\.xml)"
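
For anyone who wants to sanity-check the pattern outside Splunk, here is a minimal Python sketch (not part of the original answer). It assumes Python's re module, where a literal backslash needs only \\ in a raw string rather than the \\\ used inside the SPL quoted string. The function name is_abnormal is made up for illustration, and the sample paths are the three from the question.

```python
import re

# Same pattern as the SPL regex above, written with Python escaping.
NORMAL_SOURCE = re.compile(r"\w:(\\[^\\]+)+\\(\w+_\d+_\w+\.xml)")

def is_abnormal(source: str) -> bool:
    """Return True when the path does not contain a normal file name.

    Mirrors `regex source!=...`: the match is unanchored, so the
    pattern only has to be found somewhere in the source value.
    """
    return NORMAL_SOURCE.search(source) is None

samples = [
    r"D:\software\dev\current\logs\2017_01_23\JSMITH1_1485182355831_0.xml",
    r"D:\software\dev\current\logs\2017_01_22\JDOE2_1485128364222_0.xml",
    r"D:\software\dev\current\logs\2017_01_22\SELECT * () from 2832 ^&%_1485128364222_0.xml",
]

for s in samples:
    print(is_abnormal(s), s)
```

Only the third path should be flagged, because the space after SELECT breaks the \w+_\d+_\w+ part of the file name.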

DalJeanis
SplunkTrust

somesoni2 - what's the purpose of the second parenthesis in that regular expression?

rachaelcrook89
Explorer

Thank you, this worked! You solved my regex dilemma. I just added this to the end to only show results that are over the normal length.

| eval length=len(source) | search length>73 | table source count length

somesoni2
Revered Legend

It's just my habit to enclose the regex in parentheses. It's easier when you're converting them to field extractions.

DalJeanis
SplunkTrust

This seems to work.

| makeresults 
| eval source="D:\software\dev\current\logs\2017_01_23\JSMITH1_1485182355831_0.xml"
| append [|makeresults | eval source=" D:\software\dev\current\logs\2017_01_22\JDOE2_1485128364222_0.xml"]
| append [|makeresults | eval source=" D:\software\dev\current\logs\2017_01_22\SELECT * () from 2832 ^&%_1485128364222_0.xml"]
| rex field=source "(?<filename>[\w]:.*\.xml)"
| rex field=source "(?<testname>[\w]:[\w\\\\]*\.xml)"
| eval nameokay = if(filename=testname,1,0)
| table filename testname nameokay

Add | search nameokay=0 before the table command and you get only your bad results.
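
The same two-regex comparison can be tried outside Splunk. The following is a rough Python sketch, not a translation Splunk itself provides: it assumes Python's re module, uses (?P<name>...) for the named groups, and treats a missing match as nameokay=0, which is my assumption about how the eval behaves when testname is null. The function name name_okay is hypothetical.

```python
import re

# Loose pattern: drive letter, anything, then .xml
FILENAME = re.compile(r"(?P<filename>\w:.*\.xml)")
# Strict pattern: only word characters and backslashes before .xml
TESTNAME = re.compile(r"(?P<testname>\w:[\w\\]*\.xml)")

def name_okay(source: str) -> int:
    """Return 1 when the loose and strict extractions agree, else 0."""
    loose = FILENAME.search(source)
    strict = TESTNAME.search(source)
    # Assumption: a field that failed to extract counts as "not okay".
    if loose is None or strict is None:
        return 0
    return 1 if loose.group("filename") == strict.group("testname") else 0

sources = [
    r"D:\software\dev\current\logs\2017_01_23\JSMITH1_1485182355831_0.xml",
    r" D:\software\dev\current\logs\2017_01_22\JDOE2_1485128364222_0.xml",
    r" D:\software\dev\current\logs\2017_01_22\SELECT * () from 2832 ^&%_1485128364222_0.xml",
]

for s in sources:
    print(name_okay(s), s.strip())
```

The strict pattern cannot get past the first space or special character in the file name, so the third source has no testname and is the only one reported as not okay.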
