I am filtering some logs that came from Nessus in order to identify vulnerable machines based on their OS. The issue I have is that when a host's OS is not adequately identified, the event ends up with many "os" fields. An example is below:
start_time="Mon Feb 16 03:56:07 2015"
end_time="Mon Feb 16 03:57:42 2015"
os="Microsoft Windows 2000" os="Microsoft Windows XP for Embedded Systems" os="Microsoft
The query that I created for this (which only works properly when a single OS is found) is the following:
sourcetype=nessus severity!=informational | rex "start_time=\"(?<start>.*)\"\send.*\s\sos=\"(?<OS>.*)\""
What I would ideally like to do is find a way to filter out the " (double quote) symbol from within the extracted field. This is because, apart from Windows machines, there are also printers and access points that get identified as a mix of several other OSs.
So, it should be something like this:
sourcetype=nessus severity!=informational | rex "start_time=\"(?<start>.*)\"\send.*\s\sos=\"(?<OS>.*[^\"])\""
but it doesn't work.
Okay, here we are.
I guess I haven't stated my problem correctly. I do not actually want to remove the double quotes; I want to keep only the first occurrence of the os field in the rare cases where more than one appears!
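To make the goal concrete, here is a minimal sketch of the behaviour I am after, written in Python rather than SPL so it can be run outside Splunk. The event string is hypothetical, modelled on the sample above with only two os values, and the variable names are my own. I am assuming that greedy matching and negated character classes behave the same way in Splunk's rex as they do in Python's re module.

```python
import re

# Hypothetical event modelled on the Nessus sample; only two os values are used for illustration.
event = (
    'start_time="Mon Feb 16 03:56:07 2015"\n'
    'end_time="Mon Feb 16 03:57:42 2015"\n'
    'os="Microsoft Windows 2000" os="Microsoft Windows XP for Embedded Systems"\n'
)

# Greedy: .* runs to the last double quote on the line, so every os value is swallowed.
greedy = re.search(r'os="(?P<OS>.*)"', event)

# Negated class: [^"]* cannot cross a double quote, so the match stops at the first closing quote.
first_only = re.search(r'os="(?P<OS>[^"]*)"', event)

print(greedy.group("OS"))
print(first_only.group("OS"))
```

In this sketch the greedy pattern captures both values plus the text between them, while the negated-class pattern captures only Microsoft Windows 2000, which is the result I want for OS.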
Here is what I managed to do with sed, but I am not quite there yet.
sourcetype=nessus severity!=informational earliest=-5w@w1 latest=now|rex field=os mode=sed "s/.*\(os=\"[^\"]*\"\).*$/\1/g" | rex ".*os=\"(?P<OS>.*)\"\s.*\""
Any suggestions? 🙂