I have some Nessus vulnerability scanner exports I am trying to properly parse in Splunk. The output is CSV (I know there's an app for Nessus and I have reasons for not using it). I'm grabbing KV pairs via props.conf. However, the OS signature is not extracting well.
What I would like to do is capture the text after "Remote operating system : " until the end of the line.
Here's the relevant props.conf line: EXTRACT-OS = Remote\soperating\ssystem\s:\s(?.*)Confidence
The word "Confidence" is in there because the regex in props.conf does not work with the usual end-of-line notations, like $, \r\n, or \n. Basically, I'm seeing that props.conf includes line breaks in matches automatically, but does not let me match on them explicitly. This works for most OS signatures, except those that list multiple OSes.
Here's a relevant section from the CSV:
Remote operating system : CISCO IOS 15
CISCO IOS 12
Cisco IOS XE
CISCO PIX
Confidence Level : 69
Method : SSH
When I open the original CSV in Notepad++, it shows that each line ends with an "LF" (line feed, I believe), and at the very end of the whole field there is an "LF" and a "CR". The regex seems to treat "$" as that final "CR". How do I get my props.conf line to stop reading at the very first "LF", i.e. right after "CISCO IOS 15" in this specific example?
This works for me using your example data.
Remote\soperating\ssystem\s:\s([\w ]*)
This worked perfectly. I'm curious as to why, though, especially putting in a space after the \w. I didn't think PCRE would even recognize that, much less use it. And why would the line breaks be included without the \w?
Apparently, there is an implicit (?s) flag (do you have SHOULD_LINEMERGE set?), which means the dot specifier will match line ends. Using [\w ] will match word characters and spaces, but not newlines, so the capture stops at the first line break. You can put any characters within brackets (some need to be escaped) and the regex will match any one of them.
Check out regex101 for a great regex test tool.
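If you want to see the difference outside of Splunk, here is a minimal sketch in Python. It assumes Python's re module with re.DOTALL behaves like the implicit (?s) described above (Splunk itself uses PCRE, so treat this as an illustration, not a reproduction of Splunk's behavior). The group name "os" and the helper extract_os are made up for this example. The last pattern, [^\r\n]*, is not from this thread; it is an alternative I'd assume is useful when an OS name contains punctuation that [\w ] would stop on.

```python
import re

# Sample taken from the CSV excerpt in the question; lines end with LF.
SAMPLE = (
    "Remote operating system : CISCO IOS 15\n"
    "CISCO IOS 12\n"
    "Cisco IOS XE\n"
    "CISCO PIX\n"
    "Confidence Level : 69\n"
    "Method : SSH\n"
)

PREFIX = r"Remote\soperating\ssystem\s:\s"


def extract_os(pattern, text, flags=0):
    """Return the hypothetical 'os' named group, or None if nothing matches."""
    match = re.search(pattern, text, flags)
    return match.group("os") if match else None


# With DOTALL (the (?s) behavior), '.' also matches line feeds, so the
# capture runs across every OS line until it reaches "Confidence".
dotall_capture = extract_os(PREFIX + r"(?P<os>.*)Confidence", SAMPLE, re.DOTALL)

# [\w ] matches only word characters and spaces, so it stops at the first LF.
first_line = extract_os(PREFIX + r"(?P<os>[\w ]*)", SAMPLE)

# Assumed alternative: match anything except CR or LF.
first_line_alt = extract_os(PREFIX + r"(?P<os>[^\r\n]*)", SAMPLE)

print(repr(dotall_capture))
print(repr(first_line))
print(repr(first_line_alt))
```

The first print shows all four OS lines with their line feeds included, while the other two show only "CISCO IOS 15".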