In my search results, I need to hide some specific events from the output. I am currently running a search to find out whether any credit card data is present in the logs, using a Luhn lookup to fetch the results. But this creates hundreds of false positive alerts a day. Most of the false positives come from the session id or support id, which we know for sure are not card numbers.
As a solution, I extracted the digits of the session id and support id into a field named CARD using rex. Then I use that field to check whether the result is a false positive, and if it is, I don't display it.
Here is my logic. Could you please correct it, as it is not working?
Note: the session id and support id are 18-20 alphanumeric characters, say u8956742397238567a. If the digits pass the PAN (Luhn) check, they trigger a false positive alert.
rex field=orig_raw "\sessiond_id:\s<[a-z]?(?<CARD>\d+) [a-z]?>" | eval falsepositive = if (PAN == CARD, "0" ,"1") | where falsepositive = "0"
not sure where you're getting this, I used this example:
| stats count | eval foo="tom,sally,foo" | makemv delim="," foo | eval bar="tom, bob, world" | makemv delim="," bar | mvexpand foo | mvexpand bar | eval equals=if(foo==bar,"1","0") | where equals="1"
this gives the result:
count bar equals foo
0 tom 1 tom
you should also be able to skip a step and do:
| stats count | eval foo="tom,sally,foo" | makemv delim="," foo | eval bar="tom, bob, world" | makemv delim="," bar | mvexpand foo | mvexpand bar | where foo=bar
and forget the eval. If you need to escape the field name (for example, because it contains spaces or special characters), wrap it in single quotes:
| stats count | eval foo="tom,sally,foo" | makemv delim="," foo | eval bar="tom, bob, world" | makemv delim="," bar | mvexpand foo | mvexpand bar | where foo='bar'
seems like your rex may not be mirroring the fields.
Hi bbingham, thanks for your prompt response. There was a typo in my earlier post. My actual search is this:
NOT sourcetype=stash | get_integer_seq | lookup luhnlookup _raw OUTPUTNEW pii,piiclean | eval piilength=len(piiclean) | lookup iinlookup iin as piiclean,length as piilength OUTPUTNEW iinissuer | search iinissuer=* | `geteventid` | rename eventid as origeventid | eval origraw=_raw | fields - _raw | fields + origeventid,origraw,host,pii,iinissuer | eval piihash=sha1(pii) | eval origtime=_time | rex field=origraw "\sessiondid:\s<[a-z]?(?<CARD>\d+) [a-z]?>" | eval falsepositive = if (PAN == CARD, "0" ,"1") | where falsepositive = "0" | rename pii as PAN, orig_raw AS "Event log"
Events which are generating false positive alerts:
2017-02-16T08:30:25+00:00 abc-f5-01 crit dcc................... sessionid:
2017-02-16T09:30:25+00:00 abc-f5-01 crit dcc................... sessionid:
2017-02-16T10:30:25+00:00 abc-f5-01 crit dcc................... sessionid:<745648609869456bc>
2017-02-16T11:30:25+00:00 abc-f5-01 crit dcc................... sessionid:<765756896789367389>
As per your reply I tried this (correct me if I am wrong):
rex field=origraw "\sessiondid:\s<(?<char1>[a-z]?)(?<CARD>\d+)(?<char2>[a-z]?)>" | eval CARD="char1,CARD,char2" | makemv CARD delim="" allowempty=t |
Here I am confused about which one I need to compare against.
OK, so what I did was make "fake" events; I was just proving that your logic functions. All of this: "| stats count | eval foo="tom,sally,foo" | makemv delim="," foo | eval bar="tom, bob, world" | makemv delim="," bar | mvexpand foo | mvexpand bar |" is just to make fake events, then I run the where clause as you had it. In your case, I have a feeling the where clause is breaking because something in the event part of the search is not extracting the fields right. Can you send a couple of events that have the fields you're looking for broken out? Like:
timestamp host level pid session
2017-02-16T08:30:25+00:00 abc-f5-01 crit dcc................... session_id
and then explain what makes it a false positive? Is it the fact that there is no session id?
Thanks a lot for your prompt response. I will explain in detail. We are using the luhn_lookup lookup to check whether any given log contains a 16-digit credit card number (the 16 digits must be in sequence, like the 16-digit number on a credit card). To check whether a 16-digit number is a credit card number, use this URL: https://planetcalc.com/2465/
For example, if a URL contains a 16-digit number that passes the credit card check (the Luhn algorithm behind luhnlookup), it creates an alert. This generates hundreds of false positive alerts a day. It is not restricted to URLs; it also happens with session ids/support ids in a log and with kernel errors.
Currently, I am interested in the sessionid in logs generated by F5 devices. The sessionid comes in 4 forms:
1. starts with a letter, then only digits, like a68789758978678
2. starts and ends with a letter, with digits in the middle, like h8676589765897608976c
3. starts with digits and ends with a letter, like 979679067896078609784a
4. only digits, like 4586749867496749409
For any of the above 4 types, if the luhn lookup finds that the digits match a credit card number, it triggers an alert.
Our business units agreed that this is a false alert, and as a team we have to make sure that for the F5 logs, if the alert is triggered by the sessionid, we suppress it. If the match is in anything other than the sessionid field, we need to display it as a notable event.
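To make the four sessionid shapes concrete, here is a small Python sketch (not SPL, just an illustration of the pattern). The pattern and the function name session_digits are my own assumptions, not something from the F5 documentation; it allows an optional leading letter, a run of digits, and optional trailing letters inside the angle brackets.

```python
import re

# Assumed pattern (hypothetical): optional leading letter, a run of digits,
# optional trailing letters, wrapped in <...> after "sessionid:".
SESSION_ID = re.compile(r"sessionid:\s*<([a-z]?)(\d+)([a-z]*)>", re.IGNORECASE)

def session_digits(event):
    """Return the digit run of every sessionid field found in the event text."""
    return [m.group(2) for m in SESSION_ID.finditer(event)]

if __name__ == "__main__":
    for sample in ("sessionid:<a68789758978678>",
                   "sessionid:<h8676589765897608976c>",
                   "sessionid:<979679067896078609784a>",
                   "sessionid:<4586749867496749409>"):
        print(sample, "->", session_digits(sample))
```

The digit run returned here is what the CARD field in the rex is meant to capture.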
As an example: 2017-02-16T08:30:25+00:00 abc-f5-01 crit dcc:0131000 [secsv] Request violations: HTTP protocol compliance failed........ HTTP1.1\r\n\Content-type:text:xml; charset=uf.,username:, sessionid:<340586286825705c>
luhnlookup matches the above sessionid (if you paste these digits into the planetcalc.com link, it says the issuer is American Express). But we know this is a sessionid and is not related to any credit card number, so it's a false positive.
One more: sessionid:<5178081683798099> is also a false positive.
If I get sessionid:<0123456789101112>, this won't trigger any alert, as it is not a credit card number according to luhnlookup.
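For anyone who wants to check these three values without the website, here is a minimal Python sketch of the standard Luhn checksum. It is only an illustration; I am assuming luhnlookup does the equivalent check, and the function name luhn_valid is mine.

```python
def luhn_valid(number):
    """Standard Luhn check: double every second digit from the right,
    subtract 9 from any result above 9, and require the sum to be a multiple of 10."""
    digits = [int(ch) for ch in number if ch.isdigit()]  # ignore letters such as the trailing "c"
    if not digits:
        return False
    total = 0
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:  # every second digit, counting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

if __name__ == "__main__":
    for value in ("340586286825705c", "5178081683798099", "0123456789101112"):
        print(value, luhn_valid(value))
```

The first two session ids pass the check and the third does not, which matches what the alerts are doing.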
Hope this helps.
Where is falsepositive coming from? And are you extracting "origraw" correctly? I think you might be using rex wrong. You need to specify which field to run rex on with "field=", and that's probably _raw if you haven't extracted origraw. Then you'll want to specify a field name in your capture group, e.g. "\sessiond_id:\s<[a-z]?(?<CARD>\d+) [a-z]?>".
What I would do is identify the session Id at the very beginning, and then sed it out of the _raw so that none of the other algorithms need to be consulted. Probably save you a whole lot of lookup and cleaning time.
Do something like this at the very front of the search -
| eval origraw=_raw
| rex field=_raw mode=sed "s/sessionid:\s*<\w+>//g"
Thanks DalJeanis. Valid point, but it was initially turned down by the Security team. I forgot the actual reason. That's why I am going through this painful process.
Um, I'm not saying to change the actual _raw. I'm saying that the session ID is not relevant to what you are trying to detect, so remove the session ID from the data to be inspected before you pass it through the credit card number detector.
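As a rough illustration of that order of operations (scrub a copy first, then scan), here is a Python sketch. It is not SPL and not how luhnlookup is implemented; the regexes, the 13 to 19 digit range, the function names, and the "card=" test event are all assumptions made up for the example.

```python
import re

# Assumed patterns (hypothetical): the sessionid field as shown in the sample
# events, and a standalone run of 13 to 19 digits as a card-number candidate.
SESSION_FIELD = re.compile(r"sessionid:\s*<\w+>", re.IGNORECASE)
DIGIT_RUN = re.compile(r"(?<!\d)\d{13,19}(?!\d)")

def luhn_valid(number):
    """Standard Luhn checksum over the digits of the string."""
    total = 0
    for position, ch in enumerate(reversed(number)):
        digit = int(ch)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

def card_candidates(raw):
    """Scan a scrubbed copy of the event; the original string is left untouched."""
    scrubbed = SESSION_FIELD.sub("", raw)
    return [run for run in DIGIT_RUN.findall(scrubbed) if luhn_valid(run)]

if __name__ == "__main__":
    event = "abc-f5-01 crit dcc:0131000 username:, sessionid:<340586286825705c>"
    print(card_candidates(event))                               # session id is ignored
    print(card_candidates("card=5178081683798099 " + event))    # other fields still scanned
```

The point is that a Luhn-valid number anywhere else in the event is still reported, while the original event text stays available for display.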
If Security has a problem with that strategy, just ask, bluntly, "What method would be used to hide a credit card number within a session ID, that could not be used to hide the credit card more thoroughly than that?"
If a bad actor has the ability to use the session ID or other fields for encryption, then they could use ANYTHING for encryption and you will never detect the card number.
If you were to share representative events, I am sure that we could create an optimal search that does what you need.