Field extraction from lines with different fields from same source

Contributor

I have a source from which I collect logs via syslog. My challenge is that the log lines sent by this source are not consistent in terms of fields. Please see below.

Mar 11 21:19:08 10.10.10.10  11/03/2016:10:12:47  APAP-XXXX01 0-PPE-0 : TCP CONN_TERMINATE 5454405 0 :  Source 10.20.20.20:80 - Destination 10.30.30.30:4172 - Start Time 11/03/2016:10:12:47  - End Time 11/03/2016:10:12:47  - Total_bytes_send 0 - Total_bytes_recv 1 

Mar 11 19:55:23 10.10.10.10  11/03/2016:08:49:02  APAP-XXXX01 0-PPE-0 : SNMP TRAP_SENT 5441806 0 :  entityup (entityName = "server_svc_NSSVC_DNS_10.50.50.50:53(nameserve...", sysIpAddress = 10.10.10.10)

Mar 11 19:55:23  10.10.10.10  11/03/2016:08:49:02  APAP-XXXX01 0-PPE-0 : EVENT DEVICEUP 5441805 0 :  Device "server_svc_NSSVC_DNS_10.50.50.50:53(nameserver_10.50.50.50_53)" - State UP

Mar 11 21:18:57 10.10.10.10  11/03/2016:10:12:36  APAP-XXXX01 0-PPE-0 : TCP CONN_DELINK 5454373 0 :  Source 10.10.20.20:64920 - Vserver 10.20.10.30:443 - NatIP 127.0.0.2:25769 - Destination 127.0.0.1:80 - Delink Time 11/03/2016:10:12:36  - Total_bytes_send 336 - Total_bytes_recv 9066

My question is:
Is there a way to select lines based on keywords, e.g. CONN_TERMINATE, CONN_DELINK, SNMP, EVENT and so on, and then apply a specific regex pattern in my props.conf file to extract fields from those lines? Lines containing keywords such as CONN_TERMINATE or CONN_DELINK seem to have similar fields, but the others do not.

1 Solution

Revered Legend

You can set up a separate field extraction for each type of log line in your props.conf, like this (on the search heads):

[yoursourcetype]
EXTRACT-conn_terminate=CONN_TERMINATE\s+[^:]+:\s+Source\s+(?<src>\S+)\s+-\s+Destination\s+(?<dst>\S+)\s+-\s+Start\s+Time\s+(?<start_time>\S+)\s+-\s+End\s+Time\s+(?<end_time>\S+)\s+-\s+Total_bytes_send\s+(?<bytes_sent>\d+)\s+-\s+Total_bytes_recv\s+(?<bytes_recv>\d+)
EXTRACT-event=EVENT\s+DEVICEUP\s+\d+\s+\d+\s+:\s+Device\s+"(?<device>\S+)"\s+-\s+State\s+(?<state>\w+)
...other types of lines
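
As a hedged sketch of one further line type, the same approach could be applied to the CONN_DELINK sample from the question. The field names vserver, nat_ip and delink_time are illustrative assumptions rather than names from the answer, and the pattern is based only on the single CONN_DELINK line shown in the question.

```ini
# props.conf on the search heads; [yoursourcetype] is the same placeholder as above
[yoursourcetype]
# Assumed field names (not from the original answer): vserver, nat_ip, delink_time
EXTRACT-conn_delink = CONN_DELINK\s+[^:]+:\s+Source\s+(?<src>\S+)\s+-\s+Vserver\s+(?<vserver>\S+)\s+-\s+NatIP\s+(?<nat_ip>\S+)\s+-\s+Destination\s+(?<dst>\S+)\s+-\s+Delink\s+Time\s+(?<delink_time>\S+)\s+-\s+Total_bytes_send\s+(?<bytes_sent>\d+)\s+-\s+Total_bytes_recv\s+(?<bytes_recv>\d+)
```

Because each pattern starts with its own keyword, an extraction only produces fields for events of that type; events of other types are left without them.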

Contributor

Managed to get it working. Thank you for pointing me in the right direction.

Contributor

Thank you for your response.

How can I match "CONN_TERMINATE" and then come back to the beginning of the line to capture fields like node_ip, which is 10.10.10.10, or node_name, which is APAP-XXXX01? These field values appear before CONN_TERMINATE in the same line.
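
One possible approach, sketched here under assumptions: each EXTRACT setting is evaluated on its own against the raw event, so there is no need to "come back" to the start of the line. A separate extraction anchored at ^ can capture the common header for every line type. node_ip and node_name are the names used in the question; module and messagetype are illustrative names, and the pattern assumes the syslog prefix shown in the samples is part of the indexed event.

```ini
# props.conf on the search heads; [yoursourcetype] is a placeholder
[yoursourcetype]
# Common header shared by all sample lines:
#   <Mon> <day> <hh:mm:ss> <node_ip> <device timestamp> <node_name> <0-PPE-0> : <module> <messagetype> ...
# "module" and "messagetype" are assumed names for the two tokens after the first colon.
EXTRACT-header = ^\w{3}\s+\d+\s+[\d:]+\s+(?<node_ip>\d+\.\d+\.\d+\.\d+)\s+\S+\s+(?<node_name>\S+)\s+\S+\s+:\s+(?<module>\w+)\s+(?<messagetype>\w+)
```

With this in place, the keyword-specific extractions only need to cover the part of the line after the keyword, and the header fields are available on TCP, SNMP and EVENT lines alike.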

Contributor

OK, I came up with the following EXTRACT settings in my props.conf for the respective sourcetype. However, they do not work. Any ideas?

The regexes seem to work fine when I test them against my sample data in an online regex testing tool.

EXTRACT-CONN_TERMINATE =^(?P<capture_1>.*)(?P<messagetype>CONN_TERMINATE) (\d+).*Source (?P<src>\d+\.\d+\.\d+\.\d+):(?P<srcport>\d+).*Destination (?P<dst>\d+\.\d+\.\d+\.\d+):(?P<dstport>\d+).*Start Time (?P<startdate>\d+\/\d+\/\d+):(?P<starttime>\d+:\d+:\d+).*End Time (?P<enddate>\d+\/\d+\/\d+):(?P<endtime>\d+:\d+:\d+).*send (?P<bytessend>\d+).*recv (?P<bytesreceived>\d+).*\n

EXTRACT-CONN_DELINK =^(?P<capture_1>.*)(?P<messagetype>CONN_DELINK) (\d+).*Source (?P<src>\d+\.\d+\.\d+\.\d+):(?P<srcport>\d+).*Vserver (?P<vs>\d+\.\d+\.\d+\.\d+):(?P<vsport>\d+).*NatIP (?P<natip>\d+\.\d+\.\d+\.\d+):(?P<natport>\d+).*Destination (?P<dst>\d+\.\d+\.\d+\.\d+):(?P<dstport>\d+).*Time (?P<delinktdate>\d+\/\d+\/\d+):(?P<delinktime>\d+:\d+:\d+).*send (?P<bytessend>\d+).*recv (?P<bytesreceived>\d+).*\n

EXTRACT-EVENT =^(?P<capture_1>.*)(?P<messagetype>EVENT) (?P<event>\w+) \d+ \d+ \:  \w+ (?P<message>.*-).*State (?P<state>\w+).*\n

EXTRACT-SNMP =^(?P<capture_1>.*)(?P<messagetype>SNMP).*?:  (?P<snmpstatus>\w+) (?P<message>.*).*\n

If I can get these field extractions working, I can then apply another search-time regex to capture_1 to extract the remaining fields it contains.
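
One possible cause, offered as an assumption rather than a confirmed diagnosis: each pattern ends in .*\n, and a single-line event normally has no trailing newline in its raw text, so the regex may never match inside Splunk even though it matches a multi-line sample pasted into an online tester. A minimal sketch of the CONN_TERMINATE extraction with the trailing .*\n dropped and the greedy .* replaced by lazy .*? might look like this; the field names are the ones used in the post above.

```ini
# props.conf on the search heads; [yoursourcetype] is a placeholder
[yoursourcetype]
# Same fields as the CONN_TERMINATE attempt above, minus the trailing ".*\n".
# Lazy ".*?" keeps each gap as short as possible between the labelled values.
EXTRACT-conn_terminate = ^(?<capture_1>.*?)\s(?<messagetype>CONN_TERMINATE)\s+\d+.*?Source\s+(?<src>\d+\.\d+\.\d+\.\d+):(?<srcport>\d+).*?Destination\s+(?<dst>\d+\.\d+\.\d+\.\d+):(?<dstport>\d+).*?Start\s+Time\s+(?<startdate>\d+/\d+/\d+):(?<starttime>\d+:\d+:\d+).*?End\s+Time\s+(?<enddate>\d+/\d+/\d+):(?<endtime>\d+:\d+:\d+).*?Total_bytes_send\s+(?<bytessend>\d+).*?Total_bytes_recv\s+(?<bytesreceived>\d+)
```

The same change (removing the final .*\n) would apply to the CONN_DELINK, EVENT and SNMP patterns if this is indeed the cause.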
