ok, so I am trying to pull some fields from the following log file entry:
"127.0.0.1",11/21/2019 8:19:49 PM,11/21/2019 8:19:49 PM,"\CS\Projects\Sample\Development Environment",10429,"Config","Info","7016943","local:{d597da58-6b69-4a9a-b494-0e97e49a43b8}","31C6E90FC53FAAE9B1273378DB1FF34D2338195D","0","0","SIGNING_AUDIT","745","{""Algorithm"":""SHA256"",""CommandLine"":""\""C:\\Program Files\\Microsoft Office\\Root\\Office16\\WINWORD.EXE\"" \/n \""C:\\Users\\tb\\Documents\\Evaluation Guide Supplement.docx"",""Executable"":""C:\\Program Files\\Microsoft Office\\Root\\Office16\\WINWORD.EXE"",""ExecutableHash"":""A5EE905C1E7372904AF2BFD2695337B1214440D0DB89033D26BD070360838905"",""ExecutableSigner"":""CN=Microsoft Corporation, O=Microsoft Corporation, L=Redmond, S=Washington, C=US"",""ExecutableSize"":1951728,""Key"":""31C6E90FC53FAAE9B1273378DB1FF34D2338195D"",""Machine"":""07WKSWIN150536"",""PlaintextBase64"":""DslN3Fo9lTUEJZkwGdYQ1uua+9zkVsji9nZJD3M1qV4="",""PrefixedUniversal"":""local:{d597da58-6b69-4a9a-b494-0e97e49a43b8}"",""WindowsUser"":""ad\\tb""}","CS - Signing Successful","A signing request with key 31C6E90FC53FAAE9B1273378DB1FF34D2338195D from user tb@redacted.com was successfully completed.
Code Signing Audit record:
Key: 31C6E90FC53FAAE9B1273378DB1FF34D2338195D
Artifact: {0E, C9, 4D, DC, 5A, 3D, 95, 35, 04, 25, 99, 30, 19, D6, 10, D6, EB, 9A, FB, DC, E4, 56, C8, E2, F6, 76, 49, 0F, 73, 35, A9, 5E}
Hashing Algorithm: SHA256
Machine: 07WKSWIN150536
Remote Account: tony.hadfield
Authenticated User: tb@redacted.com
Command: ""C:\Program Files\Microsoft Office\Root\Office16\WINWORD.EXE"" /n ""C:\Users\tb\Documents\Evaluation Guide Supplement.docx
Application Hash: A5EE905C1E7372904AF2BFD2695337B1214440D0DB89033D26BD070360838905
"
The regex I am using in my transforms.conf works fine on regex101.com:
(?:\"\")(\w+)(?:\"\":)(\"\".*?(?<!\\)\"\")
Here is my transforms.conf:
[MyStringValues]
REGEX = (?:\"\")(\w+)(?:\"\":)(?:\"\")(.*?)(?<!\\\\)(?:\"\")
FORMAT = $1::$2
REPEAT_MATCH = true
WRITE_META = true
And my props.conf:
[myCustomType]
KV_MODE = none
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = true
category = custom
pulldown_type = true
TRANSFORMS-MyCustomType = MyStringValues
The issue I am having is that the matches are only partially working. It is pulling out a bunch of stuff not related to my regex and destroying my regex results. Here is what is pulled out into the index:
Algorithm = SHA256C=US = CommandLine = \Corporation, = Corporation, = Executable = C:\ProgramExecutableHash = A5EE905C1E7372904AF2BFD2695337B1214440D0DB89033D26BD070360838905ExecutableSigner = CN=MicrosoftFiles\Microsoft = Key = 31C6E90FC53FAAE9B1273378DB1FF34D2338195DL=Redmond, = Machine = 07WKSWIN150536O=Microsoft = Office\Root\Office16\WINWORD.EXE = PlaintextBase64 = DslN3Fo9lTUEJZkwGdYQ1uua+9zkVsji9nZJD3M1qV4=PrefixedUniv
Notice it's pulling a bunch of "= " garbage values. It's completely confused by the escaped quotes within the file paths. Any idea what I am doing wrong?
Hi,
Here is my regex approach:
(?:\"\")(\w+)(?:\"\":)(\"\"[\w\W]+?\"\")(?:,|})
Note: It will not capture values that are not wrapped in escaped quotes (e.g. ExecutableSize"":1951728). For those values I would write a separate extraction.
I have had bad experiences with Splunk regex and lookahead/lookbehind before.
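For anyone who wants to check this pattern outside Splunk, here is a minimal Python sketch. The sample string is a shortened, hypothetical version of the JSON portion of the event, and Python's re module is only a stand-in for Splunk's regex engine, so treat it as an illustration of the pattern rather than of index-time behavior.

```python
import re

# Shortened, hypothetical sample of the CSV-escaped JSON portion of the event.
sample = (
    r'{""Algorithm"":""SHA256"",'
    r'""CommandLine"":""\""C:\\Office16\\WINWORD.EXE\"" \/n \""C:\\Users\\tb\\a.docx"",'
    r'""ExecutableSize"":1951728,'
    r'""WindowsUser"":""ad\\tb""}'
)

# Same pattern as above, written without the backslashes in front of the quotes.
pattern = re.compile(r'""(\w+)"":(""[\w\W]+?"")(?:,|})')

# Map each key to its captured value (the value keeps its doubled quotes).
pairs = {m.group(1): m.group(2) for m in pattern.finditer(sample)}
for key, value in pairs.items():
    print(key, "=", value)
```

In this sketch the quoted values keep their surrounding doubled quotes, and ExecutableSize is skipped because its value is not quoted.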
BR,
Marko P.
Thanks to4kawa, but if you don't mind me asking - how would I use this? I see how well it works in the search window, but how would I set this up for ongoing use? For example, I want to create an app or source type that does this each time. How would this be used? Any hints or videos/articles to get this figured out would be appreciated?
Have you read the collect docs?
Please output the results to a summary index using Reports.
Your dashboard can then search index=your_summary_index
cf.
Splunk Knowledge Object: Detail discussion on Summary Index@youtube
Use summary indexing@Splunk>docs
UPDATE:
| makeresults
| eval _raw="\"127.0.0.1\",11/21/2019 8:19:49 PM,11/21/2019 8:19:49 PM,\"\\CS\\Projects\\Sample\\Development Environment\",10429,\"Config\",\"Info\",\"7016943\",\"local:{d597da58-6b69-4a9a-b494-0e97e49a43b8}\",\"31C6E90FC53FAAE9B1273378DB1FF34D2338195D\",\"0\",\"0\",\"SIGNING_AUDIT\",\"745\",\"{\"\"Algorithm\"\":\"\"SHA256\"\",\"\"CommandLine\"\":\"\"\\\"\"C:\\\\Program Files\\\\Microsoft Office\\\\Root\\\\Office16\\\\WINWORD.EXE\\\"\" \\/n \\\"\"C:\\\\Users\\\\tb\\\\Documents\\\\Evaluation Guide Supplement.docx\"\",\"\"Executable\"\":\"\"C:\\\\Program Files\\\\Microsoft Office\\\\Root\\\\Office16\\\\WINWORD.EXE\"\",\"\"ExecutableHash\"\":\"\"A5EE905C1E7372904AF2BFD2695337B1214440D0DB89033D26BD070360838905\"\",\"\"ExecutableSigner\"\":\"\"CN=Microsoft Corporation, O=Microsoft Corporation, L=Redmond, S=Washington, C=US\"\",\"\"ExecutableSize\"\":1951728,\"\"Key\"\":\"\"31C6E90FC53FAAE9B1273378DB1FF34D2338195D\"\",\"\"Machine\"\":\"\"07WKSWIN150536\"\",\"\"PlaintextBase64\"\":\"\"DslN3Fo9lTUEJZkwGdYQ1uua+9zkVsji9nZJD3M1qV4=\"\",\"\"PrefixedUniversal\"\":\"\"local:{d597da58-6b69-4a9a-b494-0e97e49a43b8}\"\",\"\"WindowsUser\"\":\"\"ad\\\\tb\"\"}\",\"CS - Signing Successful\",\"A signing request with key 31C6E90FC53FAAE9B1273378DB1FF34D2338195D from user tb@redacted.com was successfully completed.
Code Signing Audit record:
Key: 31C6E90FC53FAAE9B1273378DB1FF34D2338195D
Artifact: {0E, C9, 4D, DC, 5A, 3D, 95, 35, 04, 25, 99, 30, 19, D6, 10, D6, EB, 9A, FB, DC, E4, 56, C8, E2, F6, 76, 49, 0F, 73, 35, A9, 5E}
Hashing Algorithm: SHA256
Machine: 07WKSWIN150536
Remote Account: tony.hadfield
Authenticated User: tb@redacted.com
Command: \"\"C:\\Program Files\\Microsoft Office\\Root\\Office16\\WINWORD.EXE\"\" /n \"\"C:\\Users\\tb\\Documents\\Evaluation Guide Supplement.docx
Application Hash: A5EE905C1E7372904AF2BFD2695337B1214440D0DB89033D26BD070360838905
\""
| rex "(?s)(?<json>\"{\".+?\"}\"),(?<message>.+)"
| eval json=trim(replace(json,"\"\"","\""),"\"")
| spath input=json
| rex "^(?<clientip>[^,]+),(?<ctime>[^,]+),(?<atime>[^,]+),(?<project>[^,]+)"
| appendpipe
[eval message=split(message,"
")
| mvexpand message
| rex max_match=20 field=message "(?im)\s+(?<fieldname>[A-Z].+): (?<unit>.+$)"
| eval {fieldname}=unit
| stats values(*) as *
| fields - fieldname unit]
| selfjoin Machine
| fields - _raw _time json message
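The JSON part of the search above (the first rex, the replace/trim eval, and spath) can be sketched outside Splunk roughly like this. It is only an illustration in Python: the event is a shortened, hypothetical one, and json.loads stands in for spath.

```python
import json
import re

# Shortened, hypothetical event in the same CSV-with-embedded-JSON shape.
raw = (
    r'"127.0.0.1","SIGNING_AUDIT","745",'
    r'"{""Algorithm"":""SHA256"",'
    r'""CommandLine"":""\""C:\\Office16\\WINWORD.EXE\"" \/n \""C:\\Users\\tb\\a.docx"",'
    r'""ExecutableSize"":1951728,'
    r'""WindowsUser"":""ad\\tb""}",'
    r'"CS - Signing Successful","A signing request was completed."'
)

# Step 1: grab the quoted CSV field that holds the JSON object.
match = re.search(r'"\{".+?"\}"', raw, flags=re.S)

# Step 2: undo the CSV quote doubling, then drop the outer quotes.
blob = match.group(0).replace('""', '"').strip('"')

# Step 3: parse the result as ordinary JSON.
fields = json.loads(blob)
for key, value in fields.items():
    print(key, "=", value)
```

Once the doubled quotes are collapsed, the backslash-escaped quotes inside CommandLine are plain JSON escapes, so the parser handles them without any lookbehind.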
transforms.conf
[MyStringValues]
REGEX = (?:\"\")(\w+)(?:\"\":)(\d+|((?:\"\")(.+?)(?:\"\")))(?:,|})
FORMAT = $1::$4
REPEAT_MATCH = true
WRITE_META = true
https://regex101.com/r/P613Br/1
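To see which capture group holds what, here is a small Python sketch of the regex above, run against a shortened, hypothetical sample. Python's re module is only a stand-in for Splunk's engine, so this shows how the groups behave, not what Splunk writes to the index.

```python
import re

# Shortened, hypothetical sample of the CSV-escaped JSON portion of the event.
sample = (
    r'{""Algorithm"":""SHA256"",'
    r'""CommandLine"":""\""C:\\Office16\\WINWORD.EXE\"" \/n \""C:\\Users\\tb\\a.docx"",'
    r'""ExecutableSize"":1951728,'
    r'""WindowsUser"":""ad\\tb""}'
)

# Group 1: key. Group 2: digits or quoted value. Group 4: quoted value without quotes.
pattern = re.compile(r'""(\w+)"":(\d+|(""(.+?)""))(?:,|})')

rows = [(m.group(1), m.group(2), m.group(4)) for m in pattern.finditer(sample)]
for key, group2, group4 in rows:
    print(key, "| $2 =", group2, "| $4 =", group4)
```

In this sketch group 4 is only set for quoted values; the numeric ExecutableSize lands in group 2 and leaves group 4 unset. How Splunk handles FORMAT = $1::$4 for that key is not checked here.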
I tried a lot, but eventually came to the conclusion that it was better to cut it in transforms.conf.
spath is useful for extracting at search time, so instead of doing it in transforms.conf, there is also a way to run my query and write the results to a summary index with collect.
Thanks to4kawa, this looks fantastic and is exactly the type of output I was hoping to see. How would you take this same approach for doing this at time of ingestion or index? Any pointers to either video or tutorial, I am pretty new at this... 🙂
@thadfield
I amended my answer, please confirm.