Splunk Search

Field extraction from lines with different fields from same source

ashabc
Contributor

I have a source from which I am collecting logs via syslog. My challenge is that the log lines sent by the same source are not consistent in terms of fields. Please see below.

Mar 11 21:19:08 10.10.10.10  11/03/2016:10:12:47  APAP-XXXX01 0-PPE-0 : TCP CONN_TERMINATE 5454405 0 :  Source 10.20.20.20:80 - Destination 10.30.30.30:4172 - Start Time 11/03/2016:10:12:47  - End Time 11/03/2016:10:12:47  - Total_bytes_send 0 - Total_bytes_recv 1 

Mar 11 19:55:23 10.10.10.10  11/03/2016:08:49:02  APAP-XXXX01 0-PPE-0 : SNMP TRAP_SENT 5441806 0 :  entityup (entityName = "server_svc_NSSVC_DNS_10.50.50.50:53(nameserve...", sysIpAddress = 10.10.10.10)

Mar 11 19:55:23  10.10.10.10  11/03/2016:08:49:02  APAP-XXXX01 0-PPE-0 : EVENT DEVICEUP 5441805 0 :  Device "server_svc_NSSVC_DNS_10.50.50.50:53(nameserver_10.50.50.50_53)" - State UP

Mar 11 21:18:57 10.10.10.10  11/03/2016:10:12:36  APAP-XXXX01 0-PPE-0 : TCP CONN_DELINK 5454373 0 :  Source 10.10.20.20:64920 - Vserver 10.20.10.30:443 - NatIP 127.0.0.2:25769 - Destination 127.0.0.1:80 - Delink Time 11/03/2016:10:12:36  - Total_bytes_send 336 - Total_bytes_recv 9066

My question is:
Is there a way to select lines based on keywords (e.g. CONN_TERMINATE, CONN_DELINK, SNMP, EVENT, and so on) and then apply a specific regex pattern in my props.conf file to extract fields from those lines? Lines containing keywords such as CONN_TERMINATE or CONN_DELINK seem to have similar fields, but the others do not.

1 Solution

somesoni2
Revered Legend

You can set up a different field extraction for each type of log line in your props.conf (on the search heads), like this:

[yoursourcetype]
EXTRACT-conn_terminate=CONN_TERMINATE\s+[^:]+:\s+Source\s+(?<src>\S+)\s+-\s+Destination\s+(?<dst>\S+)\s+-\s+Start\s+Time\s+(?<start_time>\S+)\s+-\s+End\s+Time\s+(?<end_time>\S+)\s+-\s+Total_bytes_send\s+(?<bytes_sent>\d+)\s+-\s+Total_bytes_recv\s+(?<bytes_recv>\d+)
EXTRACT-event=EVENT\s+DEVICEUP\s+\d+\s+\d+\s+:\s+Device\s+"(?<device>\S+)"\s+-\s+State\s+(?<state>\w+)
...other types of lines
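
For anyone who wants to check patterns like these outside Splunk first, here is a minimal Python sketch that runs the two extractions against the sample lines from the question. It is only an approximation of what Splunk does at search time: the `extract` helper and the `PATTERNS` dict are hypothetical names for this illustration, the groups are written as `(?P<name>...)` because Python requires that form, and the event pattern uses `\d+` for the numeric IDs (an assumption that they vary from event to event).

```python
import re

# Sample lines copied from the question.
TERMINATE = (
    'Mar 11 21:19:08 10.10.10.10  11/03/2016:10:12:47  APAP-XXXX01 0-PPE-0 : '
    'TCP CONN_TERMINATE 5454405 0 :  Source 10.20.20.20:80 - '
    'Destination 10.30.30.30:4172 - Start Time 11/03/2016:10:12:47  - '
    'End Time 11/03/2016:10:12:47  - Total_bytes_send 0 - Total_bytes_recv 1 '
)
DEVICEUP = (
    'Mar 11 19:55:23  10.10.10.10  11/03/2016:08:49:02  APAP-XXXX01 0-PPE-0 : '
    'EVENT DEVICEUP 5441805 0 :  Device '
    '"server_svc_NSSVC_DNS_10.50.50.50:53(nameserver_10.50.50.50_53)" - State UP'
)

# One pattern per line type, mirroring one EXTRACT-<class> setting each.
PATTERNS = {
    "conn_terminate": re.compile(
        r'CONN_TERMINATE\s+[^:]+:\s+Source\s+(?P<src>\S+)\s+-\s+'
        r'Destination\s+(?P<dst>\S+)\s+-\s+Start\s+Time\s+(?P<start_time>\S+)\s+-\s+'
        r'End\s+Time\s+(?P<end_time>\S+)\s+-\s+Total_bytes_send\s+(?P<bytes_sent>\d+)\s+-\s+'
        r'Total_bytes_recv\s+(?P<bytes_recv>\d+)'
    ),
    "event": re.compile(
        r'EVENT\s+DEVICEUP\s+\d+\s+\d+\s+:\s+Device\s+"(?P<device>\S+)"\s+-\s+'
        r'State\s+(?P<state>\w+)'
    ),
}

def extract(line):
    """Try every pattern; only the ones that match contribute fields."""
    fields = {}
    for pattern in PATTERNS.values():
        match = pattern.search(line)
        if match:
            fields.update(match.groupdict())
    return fields

print(extract(TERMINATE))
print(extract(DEVICEUP))
```

Each pattern only matches its own line type, so a line that fits none of them simply yields no fields.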

ashabc
Contributor

Managed to get it working. Thank you for pointing me in the right direction.


ashabc
Contributor

Thank you for your response.

How can I match "CONN_TERMINATE" and then go back to the beginning of the line to capture fields like node_ip (10.10.10.10) or node_name (APAP-XXXX01)? These field values appear before CONN_TERMINATE in the same line.
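
To make the idea concrete, one possible shape is a single pattern that captures the leading fields first and only then requires the keyword, since a regex reads left to right and has no need to "go back". The Python sketch below assumes the header layout seen in the four sample lines (syslog timestamp, IP, device timestamp, device name, a module token, then the message type); `HEADER_THEN_KEYWORD` and `node_fields` are hypothetical names for this illustration.

```python
import re

# Sample lines copied from the first post.
TERMINATE = (
    'Mar 11 21:19:08 10.10.10.10  11/03/2016:10:12:47  APAP-XXXX01 0-PPE-0 : '
    'TCP CONN_TERMINATE 5454405 0 :  Source 10.20.20.20:80 - '
    'Destination 10.30.30.30:4172 - Start Time 11/03/2016:10:12:47  - '
    'End Time 11/03/2016:10:12:47  - Total_bytes_send 0 - Total_bytes_recv 1 '
)
DELINK = (
    'Mar 11 21:18:57 10.10.10.10  11/03/2016:10:12:36  APAP-XXXX01 0-PPE-0 : '
    'TCP CONN_DELINK 5454373 0 :  Source 10.10.20.20:64920 - Vserver 10.20.10.30:443 - '
    'NatIP 127.0.0.2:25769 - Destination 127.0.0.1:80 - '
    'Delink Time 11/03/2016:10:12:36  - Total_bytes_send 336 - Total_bytes_recv 9066'
)

# Header captures come first; the keyword at the end decides whether the
# whole pattern applies to this line. Layout is assumed from the samples.
HEADER_THEN_KEYWORD = re.compile(
    r'^\w{3}\s+\d+\s+[\d:]+\s+'            # syslog timestamp, e.g. "Mar 11 21:19:08"
    r'(?P<node_ip>\d+\.\d+\.\d+\.\d+)\s+'  # the IP I call node_ip
    r'\S+\s+'                              # device timestamp
    r'(?P<node_name>\S+)\s+'               # the name I call node_name
    r'\S+\s+:\s+\w+\s+'                    # e.g. "0-PPE-0 : TCP "
    r'CONN_TERMINATE\b'                    # keyword that selects the line type
)

def node_fields(line):
    """Return node_ip/node_name only for CONN_TERMINATE lines."""
    match = HEADER_THEN_KEYWORD.search(line)
    return match.groupdict() if match else {}

print(node_fields(TERMINATE))
print(node_fields(DELINK))
```

The CONN_DELINK line yields nothing here because the keyword at the end of the pattern does not match, even though its header has the same layout.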


ashabc
Contributor

OK, I came up with the following EXTRACT settings in my props.conf for the respective sourcetype. However, they do not work. Any idea why?

The regexes seem to work fine when I test them against my sample data in an online regex tester.

EXTRACT-CONN_TERMINATE =^(?P<capture_1>.*)(?P<messagetype>CONN_TERMINATE) (\d+).*Source (?P<src>\d+\.\d+\.\d+\.\d+):(?P<srcport>\d+).*Destination (?P<dst>\d+\.\d+\.\d+\.\d+):(?P<dstport>\d+).*Start Time (?P<startdate>\d+\/\d+\/\d+):(?P<starttime>\d+:\d+:\d+).*End Time (?P<enddate>\d+\/\d+\/\d+):(?P<endtime>\d+:\d+:\d+).*send (?P<bytessend>\d+).*recv (?P<bytesreceived>\d+).*\n

EXTRACT-CONN_DELINK =^(?P<capture_1>.*)(?P<messagetype>CONN_DELINK) (\d+).*Source (?P<src>\d+\.\d+\.\d+\.\d+):(?P<srcport>\d+).*Vserver (?P<vs>\d+\.\d+\.\d+\.\d+):(?P<vsport>\d+).*NatIP (?P<natip>\d+\.\d+\.\d+\.\d+):(?P<natport>\d+).*Destination (?P<dst>\d+\.\d+\.\d+\.\d+):(?P<dstport>\d+).*Time (?P<delinktdate>\d+\/\d+\/\d+):(?P<delinktime>\d+:\d+:\d+).*send (?P<bytessend>\d+).*recv (?P<bytesreceived>\d+).*\n

EXTRACT-EVENT =^(?P<capture_1>.*)(?P<messagetype>EVENT) (?P<event>\w+) \d+ \d+ \:  \w+ (?P<message>.*-).*State (?P<state>\w+).*\n

EXTRACT-SNMP =^(?P<capture_1>.*)(?P<messagetype>SNMP).*?:  (?P<snmpstatus>\w+) (?P<message>.*).*\n

If I can get field extraction working, then I can use another search-time regex on capture_1 to extract the remaining fields within it.
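
One possible cause, which the thread does not confirm: every pattern above ends in `.*\n`, so it only matches text that contains a newline after the fields. Sample data pasted into an online tester has one line per event and therefore includes newlines, whereas a single-line event as stored by Splunk would typically not end in one (an assumption about default line breaking). The Python sketch below reproduces that difference with the EXTRACT-EVENT regex; `try_match` is a hypothetical helper for this illustration.

```python
import re

# DEVICEUP sample line from the first post, stored without a trailing newline.
EVENT_LINE = (
    'Mar 11 19:55:23  10.10.10.10  11/03/2016:08:49:02  APAP-XXXX01 0-PPE-0 : '
    'EVENT DEVICEUP 5441805 0 :  Device '
    '"server_svc_NSSVC_DNS_10.50.50.50:53(nameserver_10.50.50.50_53)" - State UP'
)

# The EXTRACT-EVENT regex exactly as posted (ends in ".*\n").
AS_POSTED = re.compile(
    r'^(?P<capture_1>.*)(?P<messagetype>EVENT) (?P<event>\w+) \d+ \d+ \:  \w+ '
    r'(?P<message>.*-).*State (?P<state>\w+).*\n'
)

# The same regex with only the trailing ".*\n" removed.
WITHOUT_NEWLINE = re.compile(
    r'^(?P<capture_1>.*)(?P<messagetype>EVENT) (?P<event>\w+) \d+ \d+ \:  \w+ '
    r'(?P<message>.*-).*State (?P<state>\w+)'
)

def try_match(pattern, text):
    """Return the named groups, or None when the pattern does not match."""
    match = pattern.search(text)
    return match.groupdict() if match else None

print(try_match(AS_POSTED, EVENT_LINE))           # no newline in the text
print(try_match(AS_POSTED, EVENT_LINE + "\n"))    # newline present, as in a tester
print(try_match(WITHOUT_NEWLINE, EVENT_LINE))     # trailing ".*\n" dropped
```

If that is the cause here, dropping the trailing `.*\n` from each EXTRACT would be the first thing to try.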
