Hello Gurus, I have a log file which is almost structured . I need to extract all the fields from it. Its working fine for few of the fields but not all the fields are not present in the interesting field corner. I need to extract fields like (PID , TID , PROC , INSTANCE )
Below is the log.
2020-01-27-15.00.10.349880-480 I930031A600 LEVEL: Error
PID : 30868490 TID : 180042 PROC : db2sysc 0
INSTANCE: db2prd2 NODE : 000 DB : PRODDW_2
APPHDL : 0-55088 APPID: 170.2.78.74.45949.200127223832
UOWID : 101 ACTID: 1
AUTHID : DWFLDREP HOSTNAME: db2udb04.us164.corpintra.net
EDUID : 180042 EDUNAME: db2agnts (PRODDW_2) 0
FUNCTION: DB2 UDB, runtime interpreter, sqlrisrt, probe:3312
DATA #2 : Hexdump, 4 bytes
0x0A000000A83FD4C4 : 800F 0003 ....
Collapse
host = ip-172-31-46-255.us-east-2.compute.internalsource = db2diag.sample.logsourcetype = Swaroop_task
I tried using Regex but was not able to as its not working for all. Can someone please help.
Hi @Nilesh3110,
regexes is the easiest way to extract fields for a structured log as your.
This is a regex from your sample to check because I saw that it isn't regular (sometimes there's a space betweeen field names and two dots and sometimes not).
Anyway try this regex that you can test at https://regex101.com/r/aZHqxZ/1
| rex "(?ms)LEVEL:\s+(?<LEVEL>.*)PID\s+:\s(?<PID>\d+)\s+TID\s:\s+(?<TID>\d+)\s+PROC\s+:\s+(?<PROC>\w+)\s+\d+\s+INSTANCE:\s+(?<INSTANCE>\w+)\s+NODE\s+:\s+(?<NODE>\d+)\s+DB\s+:\s+(?<DB>\w+)\s+APPHDL\s+:\s+(?<APPHDL>[^ ]+)\s+APPID:\s+(?<APPID>[^ ]+)UOWID\s+:\s+(?<UOWID>\d+)\s+ACTID:\s+(?<ACTID>\d+)\s+AUTHID\s+:\s+\w+\s+HOSTNAME:\s+(?<HOSTNAME>[^ ]+)\s+EDUID\s+:\s+(?<EDUID>\d+)\s+EDUNAME:\s+(?<EDUNAME>.*)\s+FUNCTION:\s+(?<FUNCTION>\w+)"
Ciao.
Giuseppe
Hi
Check this
| makeresults
| eval test="2020-01-27-15.00.10.349880-480 I930031A600 LEVEL: ErrorPID : 30868490 TID : 180042 PROC : db2sysc 0 INSTANCE: db2prd2 NODE : 000 DB : PRODDW_2 APPHDL : 0-55088 APPID: 170.2.78.74.45949.200127223832 UOWID : 101 ACTID: 1 AUTHID : DWFLDREP HOSTNAME: db2udb04.us164.corpintra.netr EDUID : 180042 EDUNAME: db2agnts (PRODDW_2) 0 FUNCTION: DB2 UDB, runtime interpreter, sqlrisrt, probe:3312 DATA #2 : Hexdump, 4 bytes 0x0A000000A83FD4C4 : 800F 0003"
| rex field=test max_match=0 "(?P<temp>\s{0,}\w+\s{0,}:\s{0,}\w+)"
| mvexpand temp
| rex field=temp "(?P<key>\w+)\s{0,}:\s{0,}(?P<value>\w+)"
| table key value
| eval key=trim(key),value=trim(value)
| transpose 0 header_field=key
| fields - column
Try this. Note that this may extract part of values for the fields values containing spaces and '\n' (like PROC, EDUNAME). You can use rex to extract them.
| makeresults | eval _raw=replace("2020-01-27-15.00.10.349880-480 I930031A600 LEVEL: Error
PID : 30868490 TID : 180042 PROC : db2sysc 0
INSTANCE: db2prd2 NODE : 000 DB : PRODDW_2
APPHDL : 0-55088 APPID: 170.2.78.74.45949.200127223832
UOWID : 101 ACTID: 1
AUTHID : DWFLDREP HOSTNAME: db2udb04.us164.corpintra.net
EDUID : 180042 EDUNAME: db2agnts (PRODDW_2) 0
FUNCTION: DB2 UDB, runtime interpreter, sqlrisrt, probe:3312
DATA #2 : Hexdump, 4 bytes
0x0A000000A83FD4C4 : 800F 0003", "\s*:\s*", ":") | extract pairdelim=" \n" kvdelim=":" | rex field=_raw "PROC:(?<PROC>.*)\sINSTANCE"
Hi @Nilesh3110,
regexes is the easiest way to extract fields for a structured log as your.
This is a regex from your sample to check because I saw that it isn't regular (sometimes there's a space betweeen field names and two dots and sometimes not).
Anyway try this regex that you can test at https://regex101.com/r/aZHqxZ/1
| rex "(?ms)LEVEL:\s+(?<LEVEL>.*)PID\s+:\s(?<PID>\d+)\s+TID\s:\s+(?<TID>\d+)\s+PROC\s+:\s+(?<PROC>\w+)\s+\d+\s+INSTANCE:\s+(?<INSTANCE>\w+)\s+NODE\s+:\s+(?<NODE>\d+)\s+DB\s+:\s+(?<DB>\w+)\s+APPHDL\s+:\s+(?<APPHDL>[^ ]+)\s+APPID:\s+(?<APPID>[^ ]+)UOWID\s+:\s+(?<UOWID>\d+)\s+ACTID:\s+(?<ACTID>\d+)\s+AUTHID\s+:\s+\w+\s+HOSTNAME:\s+(?<HOSTNAME>[^ ]+)\s+EDUID\s+:\s+(?<EDUID>\d+)\s+EDUNAME:\s+(?<EDUNAME>.*)\s+FUNCTION:\s+(?<FUNCTION>\w+)"
Ciao.
Giuseppe