I have a source from which I am collecting logs via syslog. My challenge is that the log lines sent by the same source are not consistent in terms of fields. Please see below.
Mar 11 21:19:08 10.10.10.10 11/03/2016:10:12:47 APAP-XXXX01 0-PPE-0 : TCP CONN_TERMINATE 5454405 0 : Source 10.20.20.20:80 - Destination 10.30.30.30:4172 - Start Time 11/03/2016:10:12:47 - End Time 11/03/2016:10:12:47 - Total_bytes_send 0 - Total_bytes_recv 1
Mar 11 19:55:23 10.10.10.10 11/03/2016:08:49:02 APAP-XXXX01 0-PPE-0 : SNMP TRAP_SENT 5441806 0 : entityup (entityName = "server_svc_NSSVC_DNS_10.50.50.50:53(nameserve...", sysIpAddress = 10.10.10.10)
Mar 11 19:55:23 10.10.10.10 11/03/2016:08:49:02 APAP-XXXX01 0-PPE-0 : EVENT DEVICEUP 5441805 0 : Device "server_svc_NSSVC_DNS_10.50.50.50:53(nameserver_10.50.50.50_53)" - State UP
Mar 11 21:18:57 10.10.10.10 11/03/2016:10:12:36 APAP-XXXX01 0-PPE-0 : TCP CONN_DELINK 5454373 0 : Source 10.10.20.20:64920 - Vserver 10.20.10.30:443 - NatIP 127.0.0.2:25769 - Destination 127.0.0.1:80 - Delink Time 11/03/2016:10:12:36 - Total_bytes_send 336 - Total_bytes_recv 9066
My question is:
Is there a way to select lines based on keywords (e.g. CONN_TERMINATE, CONN_DELINK, SNMP, EVENT, and so on) and then apply a specific regex pattern in my props.conf file to extract fields from those lines? Lines containing keywords such as CONN_TERMINATE or CONN_DELINK seem to have similar fields, but the others do not.
You can set up a different field extraction for each type of log line in your props.conf like this (on search heads). Each regex starts from its keyword, so it only matches lines of that type:
[yoursourcetype]
EXTRACT-conn_terminate=CONN_TERMINATE\s+[^:]+:\s+Source\s+(?<src>\S+)\s+-\s+Destination\s+(?<dst>\S+)\s+-\s+Start\s+Time\s+(?<start_time>\S+)\s+-\s+End\s+Time\s+(?<end_time>\S+)\s+-\s+Total_bytes_send\s+(?<bytes_sent>\d+)\s+-\s+Total_bytes_recv\s+(?<bytes_recv>\d+)
EXTRACT-event=EVENT\s+DEVICEUP\s+\d+\s+\d+\s+:\s+Device\s+"(?<device>\S+)"\s+-\s+State\s+(?<state>\w+)
...other types of lines
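To sanity-check patterns like these outside Splunk, here is a minimal Python sketch that runs them against two of the sample lines from the question. It is only an approximation of search-time extraction, not Splunk itself. Assumptions: the helper name extract_fields is made up for illustration, the named groups are written as (?P<name>...) because Python's re module requires that form, and the event pattern uses \d+ for the two numeric IDs so it is not tied to a single sample line.

```python
import re

# Python's `re` requires (?P<name>...) for named groups.
PATTERNS = {
    "conn_terminate": re.compile(
        r"CONN_TERMINATE\s+[^:]+:\s+Source\s+(?P<src>\S+)\s+-\s+"
        r"Destination\s+(?P<dst>\S+)\s+-\s+"
        r"Start\s+Time\s+(?P<start_time>\S+)\s+-\s+"
        r"End\s+Time\s+(?P<end_time>\S+)\s+-\s+"
        r"Total_bytes_send\s+(?P<bytes_sent>\d+)\s+-\s+"
        r"Total_bytes_recv\s+(?P<bytes_recv>\d+)"
    ),
    "event": re.compile(
        r'EVENT\s+DEVICEUP\s+\d+\s+\d+\s+:\s+Device\s+"(?P<device>\S+)"\s+-\s+'
        r"State\s+(?P<state>\w+)"
    ),
}


def extract_fields(line):
    """Apply every pattern to the line (hypothetical helper).

    Only the pattern whose keyword appears in the line matches, so each
    line type ends up with its own set of fields.
    """
    fields = {}
    for pattern in PATTERNS.values():
        match = pattern.search(line)
        if match:
            fields.update(match.groupdict())
    return fields


# Sample lines copied from the question.
TERMINATE_LINE = (
    "Mar 11 21:19:08 10.10.10.10 11/03/2016:10:12:47 APAP-XXXX01 0-PPE-0 : "
    "TCP CONN_TERMINATE 5454405 0 : Source 10.20.20.20:80 - "
    "Destination 10.30.30.30:4172 - Start Time 11/03/2016:10:12:47 - "
    "End Time 11/03/2016:10:12:47 - Total_bytes_send 0 - Total_bytes_recv 1"
)
EVENT_LINE = (
    "Mar 11 19:55:23 10.10.10.10 11/03/2016:08:49:02 APAP-XXXX01 0-PPE-0 : "
    'EVENT DEVICEUP 5441805 0 : Device '
    '"server_svc_NSSVC_DNS_10.50.50.50:53(nameserver_10.50.50.50_53)" - State UP'
)

if __name__ == "__main__":
    print(extract_fields(TERMINATE_LINE))
    print(extract_fields(EVENT_LINE))
```

The keyword at the start of each regex is what does the line selection, so no separate filtering step is needed.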
Managed to get it working. Thank you for pointing me in the right direction.
Thank you for your response.
How can I match "CONN_TERMINATE" and then go back to the beginning of the line to capture fields such as node_ip (10.10.10.10) or node_name (APAP-XXXX01)? These field values appear before CONN_TERMINATE in the same line.
OK, I came up with the following EXTRACT entries in my props.conf for the respective sourcetype. However, they do not work. Any ideas?
The regexes seem to work fine when I test them against my sample data in an online regex testing tool.
EXTRACT-CONN_TERMINATE =^(?P<capture_1>.*)(?P<messagetype>CONN_TERMINATE) (\d+).*Source (?P<src>\d+\.\d+\.\d+\.\d+):(?P<srcport>\d+).*Destination (?P<dst>\d+\.\d+\.\d+\.\d+):(?P<dstport>\d+).*Start Time (?P<startdate>\d+\/\d+\/\d+):(?P<starttime>\d+:\d+:\d+).*End Time (?P<enddate>\d+\/\d+\/\d+):(?P<endtime>\d+:\d+:\d+).*send (?P<bytessend>\d+).*recv (?P<bytesreceived>\d+).*\n
EXTRACT-CONN_DELINK =^(?P<capture_1>.*)(?P<messagetype>CONN_DELINK) (\d+).*Source (?P<src>\d+\.\d+\.\d+\.\d+):(?P<srcport>\d+).*Vserver (?P<vs>\d+\.\d+\.\d+\.\d+):(?P<vsport>\d+).*NatIP (?P<natip>\d+\.\d+\.\d+\.\d+):(?P<natport>\d+).*Destination (?P<dst>\d+\.\d+\.\d+\.\d+):(?P<dstport>\d+).*Time (?P<delinktdate>\d+\/\d+\/\d+):(?P<delinktime>\d+:\d+:\d+).*send (?P<bytessend>\d+).*recv (?P<bytesreceived>\d+).*\n
EXTRACT-EVENT =^(?P<capture_1>.*)(?P<messagetype>EVENT) (?P<event>\w+) \d+ \d+ \: \w+ (?P<message>.*-).*State (?P<state>\w+).*\n
EXTRACT-SNMP =^(?P<capture_1>.*)(?P<messagetype>SNMP).*?: (?P<snmpstatus>\w+) (?P<message>.*).*\n
If I can get these field extractions working, I can then run another search-time regex on capture_1 to extract the remaining fields inside it.
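As an alternative to a catch-all capture_1 group, the leading fields can be named directly in the same regex: anchor at the start of the line, capture node_ip and node_name on the way, and require the keyword further along. The Python sketch below tries this on the CONN_TERMINATE sample line. It also shows that a pattern ending in .*\n cannot match a string that has no trailing newline. Whether that is what breaks the extraction in Splunk is an assumption on my part, and nothing in this thread confirms it. The names PREFIX, WITH_NEWLINE and WITHOUT_NEWLINE are made up for illustration, and the two shortened patterns keep only the beginning of the EXTRACT-CONN_TERMINATE regex above.

```python
import re

# Sample line copied from the question (no trailing newline).
LINE = (
    "Mar 11 21:19:08 10.10.10.10 11/03/2016:10:12:47 APAP-XXXX01 0-PPE-0 : "
    "TCP CONN_TERMINATE 5454405 0 : Source 10.20.20.20:80 - "
    "Destination 10.30.30.30:4172 - Start Time 11/03/2016:10:12:47 - "
    "End Time 11/03/2016:10:12:47 - Total_bytes_send 0 - Total_bytes_recv 1"
)

# Walk the line from the start: syslog timestamp, node IP, device timestamp,
# node name, module, ':', protocol, then the keyword that selects the line type.
PREFIX = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+(?P<node_ip>\S+)\s+\S+\s+(?P<node_name>\S+)"
    r"\s+\S+\s+:\s+\w+\s+(?P<messagetype>CONN_TERMINATE)\s"
)

# Shortened form of the pattern from the post, with and without the
# trailing ".*\n" that requires a newline character in the event text.
WITH_NEWLINE = re.compile(
    r"^(?P<capture_1>.*)(?P<messagetype>CONN_TERMINATE) (\d+).*\n"
)
WITHOUT_NEWLINE = re.compile(
    r"^(?P<capture_1>.*)(?P<messagetype>CONN_TERMINATE) (\d+)"
)

prefix_fields = PREFIX.search(LINE).groupdict()

if __name__ == "__main__":
    print(prefix_fields)
    print("with trailing \\n:", WITH_NEWLINE.search(LINE) is not None)
    print("without trailing \\n:", WITHOUT_NEWLINE.search(LINE) is not None)
```

If the trailing newline does turn out to be the problem, dropping the final .*\n from each EXTRACT would be the first thing to try. Online testers often match because pasted sample data usually has a line break at the end of each line.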