EDIT: Solved. I used a regex to isolate the printable portion first, then converted that to ASCII.
For a couple of dashboards, I'm using the following to display hex data as plain text:
[search] | eval ascii=urldecode(ltrim(replace(data,"([A-F0-9]{2})","%\1"),"0x")) | table ascii
This works well for almost everything.
However, when I use it on Snort's ET POLICY ZIP file download events, it returns nothing.
Any ideas on why this fails specifically for these alerts?
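For anyone who wants to check a payload outside Splunk, here is a minimal Python sketch of the same idea: put a % in front of every hex pair, strip a leading 0x, then percent-decode. The function name hex_to_text is made up for illustration, and showing non-printable bytes as dots is my own choice, not something Splunk does.

```python
import re
from urllib.parse import unquote_to_bytes

def hex_to_text(data):
    """Mirror the eval above: prefix hex pairs with %, trim 0x, percent-decode."""
    # Same substitution as replace(data, "([A-F0-9]{2})", "%\1").
    encoded = re.sub(r"([A-F0-9]{2})", r"%\1", data)
    # Like ltrim(..., "0x"): strip any leading '0' or 'x' characters.
    encoded = encoded.lstrip("0x")
    # Decode to raw bytes so non-UTF-8 sequences cannot break the decode.
    raw = unquote_to_bytes(encoded)
    # Assumption: render anything outside printable ASCII as a dot.
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)

print(hex_to_text("43617074757265322E706E67"))
```

Working on bytes rather than text keeps binary-heavy payloads from failing the decode step, which may or may not be what trips up the Splunk version.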
Things I'm aware of:
ZIP files are not plaintext. The filenames within them, however, are. The plan is to use a regex to locate and extract the filenames afterwards.
Things I've confirmed:
The relevant field is labeled "data" in both the working and non-working examples.
The data field contains ONLY hex data.
No lowercase, spaces, dashes, etc. are used in the data field.
The data fields do contain the strings I'm trying to extract.
Ok, I think I've got a solution:
[search]
| rex field=data "504B.{56}(?<target>.{2,100}2E.{6})"
| [previous urldecode solution]
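The .{56} skips the 28 header bytes that sit between the PK signature and the file name. That should detect any path/filename.ext with a name of up to 50 characters plus a 3-character extension.
It will still fail to detect files without an extension, but I'm at least at a 90% solution.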
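To sanity-check the pattern offline, here is a rough Python sketch of the same regex. The hex string is an excerpt of the representative sample posted in this thread, and extract_name is a hypothetical helper name. Python writes named groups as (?P<name>...) rather than (?<name>...).

```python
import re

# Same pattern as the rex above, with Python's named-group syntax.
PATTERN = re.compile(r"504B.{56}(?P<target>.{2,100}2E.{6})")

# Excerpt of the representative sample from this thread.
SAMPLE = ("0049454E44AE426082504B03041400020008003D6E4B5421BFC68AF27C0100"
          "D97C01000C00000043617074757265322E706E67009F4C60B389504E47")

def extract_name(hex_data):
    """Return the decoded file name, or None if the pattern does not match."""
    match = PATTERN.search(hex_data)
    if match is None:
        return None
    target = match.group("target")
    # The regex is not byte-aligned, so guard against an odd-length capture.
    if len(target) % 2:
        return None
    # Assumption: names are single-byte text; latin-1 never fails to decode.
    return bytes.fromhex(target).decode("latin-1")

print(extract_name(SAMPLE))  # Capture2.png
```

One thing to watch: .{2,100} is greedy, so if another 2E shows up in the binary data within range, the capture will run to the last one. A lazy .{2,100}? would stop at the first 2E instead.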
Not an actual production example for obvious reasons, but this would be representative:
0049454E44AE426082504B03041400020008003D6E4B5421BFC68AF27C0100D97C01000C00000043617074757265322E706E67009F4C60B389504E470D0A1A0A0000000D4948445200000204000000F10806000000B4C2AF15000000017352474200AECE1CE90000000467414D410000B18F0BFC6105000000097048597300000EC300000EC301C76FA86400000021744558744372656174696F6E2054696D6500323032323A30323A31312031333A34373A32357BE61C240000FF7849444154785EECFD079C6DD955DF89AF9B2BD7CBAF5F87D7E9756ED1516AB5720689248960F81BF3071BCF60ECB1FD99F90FC6FECC7804181B86F10C607B666CE380B10D260983050809E5DC
What I'd expect to get is:
.IEND®B`.PK........=nKT!¿Æ.ò|..Ù|......Capture2.png..L`³.PNG
.
...
IHDR.......ñ.....´Â¯.....sRGB.®Î.é....gAMA..±..üa.... pHYs...Ã...Ã.Ço¨d...!tEXtCreation Time.2022:02:11 13:47:25{æ.$..ÿxIDATx^ìý..mÙUß.¯.+×˯_.×éunÑQjµr..$.`ø.ó..Ï`ì±ý.ù.ÆþÌx....ñ.`{flã.±
& ... åÜ
From that I can offset from the PK signature and pull the file name, Capture2.png.
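If the extraction ever moves outside Splunk, offsetting from PK can be done exactly, because the ZIP local file header stores the name length: the header is 30 bytes, with a 2-byte little-endian name length at offset 26 and the name itself starting at offset 30. A minimal Python sketch, assuming the data field is plain uppercase hex with no 0x prefix (zip_local_filename is a made-up helper name):

```python
import struct

# Excerpt of the representative sample from this thread.
SAMPLE = ("0049454E44AE426082504B03041400020008003D6E4B5421BFC68AF27C0100"
          "D97C01000C00000043617074757265322E706E67009F4C60B389504E47")

def zip_local_filename(hex_data):
    """Read the file name from the first ZIP local file header, if any."""
    raw = bytes.fromhex(hex_data)
    start = raw.find(b"PK\x03\x04")  # local file header signature
    if start < 0 or len(raw) < start + 30:
        return None
    # Offsets 26 and 28: file name length and extra field length (little-endian).
    name_len, _extra_len = struct.unpack_from("<HH", raw, start + 26)
    name = raw[start + 30:start + 30 + name_len]
    if len(name) < name_len:
        return None  # the captured payload was cut off mid-name
    # Assumption: decode as latin-1; real archives use CP437 or UTF-8 names.
    return name.decode("latin-1")

print(zip_local_filename(SAMPLE))  # Capture2.png
```

Because it reads the length field instead of hunting for a dot, this also covers files without an extension.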
Can you give examples of working and non-working contents of the data field?
Looks like I replied to my own post rather than yours.
I think it's the excessive number of non-printable characters that's breaking it. I'm going to try a regex to trim the data down to the target first.