ok, so I am trying to pull some fields from the following log file entry:
"",11/21/2019 8:19:49 PM,11/21/2019 8:19:49 PM,"\CS\Projects\Sample\Development Environment",10429,"Config","Info","7016943","local:{d597da58-6b69-4a9a-b494-0e97e49a43b8}","31C6E90FC53FAAE9B1273378DB1FF34D2338195D","0","0","SIGNING_AUDIT","745","{""Algorithm"":""SHA256"",""CommandLine"":""\""C:\\Program Files\\Microsoft Office\\Root\\Office16\\WINWORD.EXE\"" \/n \""C:\\Users\\tb\\Documents\\Evaluation Guide Supplement.docx"",""Executable"":""C:\\Program Files\\Microsoft Office\\Root\\Office16\\WINWORD.EXE"",""ExecutableHash"":""A5EE905C1E7372904AF2BFD2695337B1214440D0DB89033D26BD070360838905"",""ExecutableSigner"":""CN=Microsoft Corporation, O=Microsoft Corporation, L=Redmond, S=Washington, C=US"",""ExecutableSize"":1951728,""Key"":""31C6E90FC53FAAE9B1273378DB1FF34D2338195D"",""Machine"":""07WKSWIN150536"",""PlaintextBase64"":""DslN3Fo9lTUEJZkwGdYQ1uua+9zkVsji9nZJD3M1qV4="",""PrefixedUniversal"":""local:{d597da58-6b69-4a9a-b494-0e97e49a43b8}"",""WindowsUser"":""ad\\tb""}","CS - Signing Successful","A signing request with key 31C6E90FC53FAAE9B1273378DB1FF34D2338195D from user tb@redacted.com was successfully completed.
Code Signing Audit record:
Key: 31C6E90FC53FAAE9B1273378DB1FF34D2338195D
Artifact: {0E, C9, 4D, DC, 5A, 3D, 95, 35, 04, 25, 99, 30, 19, D6, 10, D6, EB, 9A, FB, DC, E4, 56, C8, E2, F6, 76, 49, 0F, 73, 35, A9, 5E}
Hashing Algorithm: SHA256
Machine: 07WKSWIN150536
Remote Account: tony.hadfield
Authenticated User: tb@redacted.com
Command: ""C:\Program Files\Microsoft Office\Root\Office16\WINWORD.EXE"" /n ""C:\Users\tb\Documents\Evaluation Guide Supplement.docx
Application Hash: A5EE905C1E7372904AF2BFD2695337B1214440D0DB89033D26BD070360838905
The regex I am using in my transforms.conf works fine on regex101.com.
Here is my transforms.conf:
REGEX = (?:\"\")(\w+)(?:\"\":)(?:\"\")(.*?)(?<!\\\\)(?:\"\")
FORMAT = $1::$2
And my props.conf:
KV_MODE = none
category = custom
pulldown_type = true
TRANSFORMS-MyCustomType = MyStringValues
The issue I am having is that the matches are only partially working. It is pulling out a bunch of stuff not related to my regex and destroying my regex results. Here is what is pulled out into the index:
Algorithm = SHA256C=US = CommandLine = \Corporation, = Corporation, = Executable = C:\ProgramExecutableHash = A5EE905C1E7372904AF2BFD2695337B1214440D0DB89033D26BD070360838905ExecutableSigner = CN=MicrosoftFiles\Microsoft = Key = 31C6E90FC53FAAE9B1273378DB1FF34D2338195DL=Redmond, = Machine = 07WKSWIN150536O=Microsoft = Office\Root\Office16\WINWORD.EXE = PlaintextBase64 = DslN3Fo9lTUEJZkwGdYQ1uua+9zkVsji9nZJD3M1qV4=PrefixedUniv
Notice it's pulling a bunch of "= " garbage values. It's completely confused by the escaped quotes within the file paths. Any ideas what I am doing wrong?
Here is my regex approach:
Note: It will not capture values that are not escaped (e.g. ExecutableSize"":1951728). For those values I would write a new extraction.
I have had bad experiences with Splunk regex and lookahead/lookbehind before.
Marko P.
Thanks to4kawa, but if you don't mind me asking - how would I use this? I see how well it works in the search window, but how would I set this up for ongoing use? For example, I want to create an app or source type that does this each time. How would this be used? Any hints or videos/articles to get this figured out would be appreciated.
Have you read about the collect command?
You can output the results to a summary index using Reports.
Your dashboard can then search index=your_summary_index
Splunk Knowledge Object: Detail discussion on Summary Index@youtube
Use summary indexing@Splunk>docs
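As a rough sketch of the summary-index idea, a scheduled report could run the JSON part of the extraction and hand the result to collect. The names your_index, your_sourcetype and your_summary_index are placeholders (not from this thread), the summary index is assumed to exist already, and the field list is simply the JSON keys visible in the sample event:

```
index=your_index sourcetype=your_sourcetype "SIGNING_AUDIT"
| rex "(?s)(?<json>\"{\".+?\"}\"),(?<message>.+)"
| eval json=trim(replace(json,"\"\"","\""),"\"")
| spath input=json
| table _time Algorithm Executable ExecutableHash ExecutableSigner ExecutableSize Key Machine WindowsUser
| collect index=your_summary_index
```

The table step is there on purpose: as far as I know, collect stores _raw unchanged when it is present, and only builds key=value pairs from the fields when _raw is absent. Please verify this against the collect documentation for your Splunk version before scheduling it.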
| makeresults
| eval _raw="\"\",11/21/2019 8:19:49 PM,11/21/2019 8:19:49 PM,\"\\CS\\Projects\\Sample\\Development Environment\",10429,\"Config\",\"Info\",\"7016943\",\"local:{d597da58-6b69-4a9a-b494-0e97e49a43b8}\",\"31C6E90FC53FAAE9B1273378DB1FF34D2338195D\",\"0\",\"0\",\"SIGNING_AUDIT\",\"745\",\"{\"\"Algorithm\"\":\"\"SHA256\"\",\"\"CommandLine\"\":\"\"\\\"\"C:\\\\Program Files\\\\Microsoft Office\\\\Root\\\\Office16\\\\WINWORD.EXE\\\"\" \\/n \\\"\"C:\\\\Users\\\\tb\\\\Documents\\\\Evaluation Guide Supplement.docx\"\",\"\"Executable\"\":\"\"C:\\\\Program Files\\\\Microsoft Office\\\\Root\\\\Office16\\\\WINWORD.EXE\"\",\"\"ExecutableHash\"\":\"\"A5EE905C1E7372904AF2BFD2695337B1214440D0DB89033D26BD070360838905\"\",\"\"ExecutableSigner\"\":\"\"CN=Microsoft Corporation, O=Microsoft Corporation, L=Redmond, S=Washington, C=US\"\",\"\"ExecutableSize\"\":1951728,\"\"Key\"\":\"\"31C6E90FC53FAAE9B1273378DB1FF34D2338195D\"\",\"\"Machine\"\":\"\"07WKSWIN150536\"\",\"\"PlaintextBase64\"\":\"\"DslN3Fo9lTUEJZkwGdYQ1uua+9zkVsji9nZJD3M1qV4=\"\",\"\"PrefixedUniversal\"\":\"\"local:{d597da58-6b69-4a9a-b494-0e97e49a43b8}\"\",\"\"WindowsUser\"\":\"\"ad\\\\tb\"\"}\",\"CS - Signing Successful\",\"A signing request with key 31C6E90FC53FAAE9B1273378DB1FF34D2338195D from user tb@redacted.com was successfully completed.
Code Signing Audit record:
Key: 31C6E90FC53FAAE9B1273378DB1FF34D2338195D
Artifact: {0E, C9, 4D, DC, 5A, 3D, 95, 35, 04, 25, 99, 30, 19, D6, 10, D6, EB, 9A, FB, DC, E4, 56, C8, E2, F6, 76, 49, 0F, 73, 35, A9, 5E}
Hashing Algorithm: SHA256
Machine: 07WKSWIN150536
Remote Account: tony.hadfield
Authenticated User: tb@redacted.com
Command: \"\"C:\\Program Files\\Microsoft Office\\Root\\Office16\\WINWORD.EXE\"\" /n \"\"C:\\Users\\tb\\Documents\\Evaluation Guide Supplement.docx
Application Hash: A5EE905C1E7372904AF2BFD2695337B1214440D0DB89033D26BD070360838905"
| rex "(?s)(?<json>\"{\".+?\"}\"),(?<message>.+)"
| eval json=trim(replace(json,"\"\"","\""),"\"")
| spath input=json
| rex "^(?<clientip>[^,]+),(?<ctime>[^,]+),(?<atime>[^,]+),(?<project>[^,]+)"
| appendpipe
[eval message=split(message,"
")
| mvexpand message
| rex max_match=20 field=message "(?im)\s+(?<fieldname>[A-Z].+): (?<unit>.+$)"
| eval {fieldname}=unit
| stats values(*) as *
| fields - fieldname unit]
| selfjoin Machine
| fields - _raw _time json message
REGEX = (?:\"\")(\w+)(?:\"\":)(\d+|((?:\"\")(.+?)(?:\"\")))(?:,|})
FORMAT = $1::$4
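For reference, here is one way the REGEX/FORMAT pair above might be wired into complete stanzas. This is only a sketch under assumptions: the stanza name MyStringValues comes from the TRANSFORMS- line in the question, my_sourcetype is a placeholder, and REPEAT_MATCH / WRITE_META are the settings I understand index-time extractions need in order to match more than once and write the fields to _meta.

```
# transforms.conf
[MyStringValues]
REGEX = (?:\"\")(\w+)(?:\"\":)(\d+|((?:\"\")(.+?)(?:\"\")))(?:,|})
FORMAT = $1::$4
# Index-time only: keep matching after the first hit and write fields to _meta
REPEAT_MATCH = true
WRITE_META = true

# props.conf ("my_sourcetype" is a placeholder, not from this thread)
[my_sourcetype]
KV_MODE = none
TRANSFORMS-MyCustomType = MyStringValues
```

If search-time extraction is acceptable, the usual alternative is REPORT-MyCustomType = MyStringValues in props.conf, with REPEAT_MATCH and WRITE_META left out of the transform. Note that with FORMAT = $1::$4 an unquoted value such as ExecutableSize matches through the \d+ branch but leaves group 4 empty, so it would still need its own extraction.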
I tried a lot of approaches, but eventually came to the conclusion that it was better to do the extraction in transforms.conf.
Extracting at search time is also useful,
so instead of doing it in transforms.conf,
there is also a way to run my query and write the results to a summary index with collect.
Thanks to4kawa, this looks fantastic and is exactly the type of output I was hoping to see. How would you take this same approach for doing this at time of ingestion or index? Any pointers to either video or tutorial, I am pretty new at this... 🙂
I amended my answer, please confirm.