Splunk Search

Extract all fields from a log file

Nilesh3110
Explorer

Hello Gurus, I have a log file which is almost structured . I need to extract all the fields from it. Its working fine for few of the fields but not all the fields are not present in the interesting field corner. I need to extract fields like (PID , TID , PROC , INSTANCE )
Below is the log.

2020-01-27-15.00.10.349880-480 I930031A600 LEVEL: Error
PID : 30868490 TID : 180042 PROC : db2sysc 0
INSTANCE: db2prd2 NODE : 000 DB : PRODDW_2
APPHDL : 0-55088 APPID: 170.2.78.74.45949.200127223832
UOWID : 101 ACTID: 1
AUTHID : DWFLDREP HOSTNAME: db2udb04.us164.corpintra.net
EDUID : 180042 EDUNAME: db2agnts (PRODDW_2) 0
FUNCTION: DB2 UDB, runtime interpreter, sqlrisrt, probe:3312
DATA #2 : Hexdump, 4 bytes
0x0A000000A83FD4C4 : 800F 0003 ....
Collapse
host = ip-172-31-46-255.us-east-2.compute.internalsource = db2diag.sample.logsourcetype = Swaroop_task

I tried using Regex but was not able to as its not working for all. Can someone please help.

0 Karma
1 Solution

gcusello
SplunkTrust
SplunkTrust

Hi @Nilesh3110,
regexes is the easiest way to extract fields for a structured log as your.
This is a regex from your sample to check because I saw that it isn't regular (sometimes there's a space betweeen field names and two dots and sometimes not).
Anyway try this regex that you can test at https://regex101.com/r/aZHqxZ/1

| rex "(?ms)LEVEL:\s+(?<LEVEL>.*)PID\s+:\s(?<PID>\d+)\s+TID\s:\s+(?<TID>\d+)\s+PROC\s+:\s+(?<PROC>\w+)\s+\d+\s+INSTANCE:\s+(?<INSTANCE>\w+)\s+NODE\s+:\s+(?<NODE>\d+)\s+DB\s+:\s+(?<DB>\w+)\s+APPHDL\s+:\s+(?<APPHDL>[^ ]+)\s+APPID:\s+(?<APPID>[^ ]+)UOWID\s+:\s+(?<UOWID>\d+)\s+ACTID:\s+(?<ACTID>\d+)\s+AUTHID\s+:\s+\w+\s+HOSTNAME:\s+(?<HOSTNAME>[^ ]+)\s+EDUID\s+:\s+(?<EDUID>\d+)\s+EDUNAME:\s+(?<EDUNAME>.*)\s+FUNCTION:\s+(?<FUNCTION>\w+)"

Ciao.
Giuseppe

View solution in original post

vnravikumar
Champion

Hi

Check this

| makeresults 
| eval test="2020-01-27-15.00.10.349880-480 I930031A600 LEVEL: ErrorPID : 30868490 TID : 180042 PROC : db2sysc 0 INSTANCE: db2prd2 NODE : 000 DB : PRODDW_2 APPHDL : 0-55088 APPID: 170.2.78.74.45949.200127223832 UOWID : 101 ACTID: 1 AUTHID : DWFLDREP HOSTNAME: db2udb04.us164.corpintra.netr EDUID : 180042 EDUNAME: db2agnts (PRODDW_2) 0 FUNCTION: DB2 UDB, runtime interpreter, sqlrisrt, probe:3312 DATA #2 : Hexdump, 4 bytes 0x0A000000A83FD4C4 : 800F 0003" 
| rex field=test max_match=0 "(?P<temp>\s{0,}\w+\s{0,}:\s{0,}\w+)" 
| mvexpand temp 
| rex field=temp "(?P<key>\w+)\s{0,}:\s{0,}(?P<value>\w+)" 
| table key value 
| eval key=trim(key),value=trim(value) 
| transpose 0 header_field=key 
| fields - column
0 Karma

manjunathmeti
Champion

Try this. Note that this may extract part of values for the fields values containing spaces and '\n' (like PROC, EDUNAME). You can use rex to extract them.

| makeresults | eval _raw=replace("2020-01-27-15.00.10.349880-480 I930031A600 LEVEL: Error
PID : 30868490 TID : 180042 PROC : db2sysc 0
INSTANCE: db2prd2 NODE : 000 DB : PRODDW_2
APPHDL : 0-55088 APPID: 170.2.78.74.45949.200127223832
UOWID : 101 ACTID: 1
AUTHID : DWFLDREP HOSTNAME: db2udb04.us164.corpintra.net
EDUID : 180042 EDUNAME: db2agnts (PRODDW_2) 0
FUNCTION: DB2 UDB, runtime interpreter, sqlrisrt, probe:3312
DATA #2 : Hexdump, 4 bytes
0x0A000000A83FD4C4 : 800F 0003", "\s*:\s*", ":") | extract pairdelim=" \n" kvdelim=":" | rex field=_raw "PROC:(?<PROC>.*)\sINSTANCE"
0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @Nilesh3110,
regexes is the easiest way to extract fields for a structured log as your.
This is a regex from your sample to check because I saw that it isn't regular (sometimes there's a space betweeen field names and two dots and sometimes not).
Anyway try this regex that you can test at https://regex101.com/r/aZHqxZ/1

| rex "(?ms)LEVEL:\s+(?<LEVEL>.*)PID\s+:\s(?<PID>\d+)\s+TID\s:\s+(?<TID>\d+)\s+PROC\s+:\s+(?<PROC>\w+)\s+\d+\s+INSTANCE:\s+(?<INSTANCE>\w+)\s+NODE\s+:\s+(?<NODE>\d+)\s+DB\s+:\s+(?<DB>\w+)\s+APPHDL\s+:\s+(?<APPHDL>[^ ]+)\s+APPID:\s+(?<APPID>[^ ]+)UOWID\s+:\s+(?<UOWID>\d+)\s+ACTID:\s+(?<ACTID>\d+)\s+AUTHID\s+:\s+\w+\s+HOSTNAME:\s+(?<HOSTNAME>[^ ]+)\s+EDUID\s+:\s+(?<EDUID>\d+)\s+EDUNAME:\s+(?<EDUNAME>.*)\s+FUNCTION:\s+(?<FUNCTION>\w+)"

Ciao.
Giuseppe

Get Updates on the Splunk Community!

Get Operational Insights Quickly with Natural Language on the Splunk Platform

In today’s fast-paced digital world, turning data into actionable insights is essential for success. With ...

Stay Connected: Your Guide to August Tech Talks, Office Hours, and Webinars!

What are Community Office Hours?Community Office Hours is an interactive 60-minute Zoom series where ...

Unleash the Power of Splunk MCP and AI, Meet Us at .Conf 2025, and Find Even More New ...

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...