I have multiple lines to extract and break down into separate fields, I have a search I've been using, but I am running into issues trying to extract multiple fields.
The Data:
<30>2018-11-29T15:59:59.128110+00:00 myserver.iskindof.broken.com 111.222.333.444 ["Nov 29 10:59:59" myserver] the_log[5930]: #012#012Alert Name: Inactive nic Since. Alert Description: nics Inactive for a given period of time.#012Current value: 3.0 #012Base query value: 0.0 #012Threshold: 0.0 #012Query period: 11/29/18 - 11/29/18 #012Alert Classification: #012Category: #012Severity: INFO#012#012Recommended Action:#012#012 #012#012Alert Details#012S-TAP Host DB Server Type Status Last Response Primary Host Name KTAP Installed TEE Installed Shared Memory Driver Installed DB2 Shared Memory Driver Installed LHMON Driver Installed Named Pipes Driver Installed Hunter DBS App Server Installed Count
#012111.111.111.110 ORACLE Inactive 2018-08-29 12:46:00 111.111.111.200 Yes No No No Yes No No 1
#012111.111.111.110:FAM Inactive 2018-10-27 03:25:00 111.111.111.200 No No No No Yes No No 1
#012111.111.111.110 ORACLE Inactive 2018-08-27 08:59:14 111.111.111.200 Yes No No No Yes No NULL No 1
#012111.111.111.110 ORACLE Inactive 2018-10-11 19:09:46 111.111.111.200 No No No No Yes No No 1
#012111.111.111.110 ORACLE Inactive 2018-10-11 17:31:04 111.111.111.200 No No No No Yes No No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:14 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:14 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:14 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:14 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:14 111.111.111.200 Yes No No No No Yes No 1
#012111.111.111.110 ORACLE Inactive 2018-08-27 08:59:14 111.111.111.200 No No No No Yes No NULL No 1
#012111.111.111.110 ORACLE Inactive 2018-08-27 08:59:14 111.111.111.200 No No No No Yes No NULL No 1
#012111.111.111.110 ORACLE Inactive 2018-08-27 08:59:14 111.111.111.200 No No No No Yes No NULL No 1
#012111.111.111.110 ORACLE Inactive 2018-08-27 08:59:14 111.111.111.200 No No No No Yes No NULL No 1
#012111.111.111.110:FAM Inactive 2018-10-20 09:18:01 111.111.111.200 No No No No Yes No No 1
#012111.111.111.110:FAM Inactive 2018-11-28 01:39:26 111.111.111.200 No No No No Yes No No 1
#012111.111.111.110 ORACLE Inactive 2018-08-27 08:59:13 111.111.111.200 No No No No Yes No No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110:FAM Inactive 2018-08-23 15:16:36 111.111.111.200 Yes No No No Yes No No 1
#012111.111.111.110:FAM Inactive 2018-08-23 15:16:41 111.111.111.200 Yes No No No Yes No No 1
#012111.111.111.110:FAM Inactive 2018-10-20 09:23:01 111.111.111.200 No No No No Yes No No 1
The Search:
index=database* Inactive
| rex field=_raw "^[^ \n]* (?P[^ ]+)"
| rex field=_raw "^(?:[^ \n]* ){2}(?P[^ ]+)"
| rex field=_raw "(?\s)"
| rex field=_raw "[#]\d{3}(?\d\s+\w{8}\s\d+\S\d+\S\d+\s\d+\S\d+\S\d+(\S\d)?\s+(\d+\S\d+\S\d+\S\d+)?)" max_match=0
| rex field=_raw "[#]\d{3}(?\w+\S\w{13}\S\w{3}\s+\w{8}\s\d+\S\d+\S\d+\s\d+\S\d+\S\d+(\S\d)?\s+(\d+\S\d+\S\d+\S\d+)?)" max_match=0
| rex field=_raw "[#]\d{3}(?\d+\S\d+\S\d+\S\d+\S*\w*\s+\w*\s*\w{8}\s\d+\S\d+\S\d+\s\d+\S\d+\S\d+(\S\d)?\s+(\d+\S\d+\S\d+\S\d+)?)" max_match=0
| mvexpand Alert
| rex field=Alert "(?P\d)\s+(?\w{8}\s(?\d+)\S(?\d+)\S(?\d+)\s(?\d+)\S(?\d+)\S(?\d+)(\S\d)?)\s+(?\d+\S\d+\S\d+\S\d+)?"
| rex field=Alert "(?P\w+\S\w{13}\S\w{3})\s+(?\w{8}\s(?\d+)\S(?\d+)\S(?\d+)\s(?\d+)\S(?\d+)\S(?\d+)(\S\d)?)\s+(?\d+\S\d+\S\d+\S\d+)?"
| rex field=Alert "(?P\d+\S\d+\S\d+\S\d+)\S*\w*\s+\w*\s*(?\w{8}\s(?\d+)\S(?\d+)\S(?\d+)\s(?\d+)\S(?\d+)\S(?\d+)(\S\d)?)\s+(?\d+\S\d+\S\d+\S\d+)?"
| eval nic=case(isnull(nic),"Nothing",1=1,nic)
| eval db=case(isnull(db),"Nothing",1=1,db)
| lookup dnslookup clientip as db OUTPUT clienthost as Host
| eval Hostname=case(isnull(Host),"Unknown",1=1,Host)
| table server, ip, db, type, Host, date_down, nic
What I am trying to get, but separated into individual lines:
server IP db type Inactive date nic
1brokenserver.com 100.00.000.1 111.111.111.110:FAM oracle or NULL Inactive 2018-10-20 09:23:01 111.111.111.200
2brokenserver.com 100.00.000.1 111.111.111.110 oracle or NULL Inactive 2018-10-20 09:23:01 111.111.111.200
abrokenserver.com 100.00.000.1 111.111.111.110:FAM oracle or NULL Inactive 2018-10-20 09:23:01 111.111.111.200
etc..etc..
Is there something I am missing or is there a book for idiots that could assist me or does someone have advice?
I think that this will get you most of the way there:
| makeresults
| eval _raw="<30>2018-11-29T15:59:59.128110+00:00 myserver.iskindof.broken.com 111.222.333.444 [\"Nov 29 10:59:59\" myserver] the_log[5930]: #012#012Alert Name: Inactive nic Since. Alert Description: nics Inactive for a given period of time.#012Current value: 3.0 #012Base query value: 0.0 #012Threshold: 0.0 #012Query period: 11/29/18 - 11/29/18 #012Alert Classification: #012Category: #012Severity: INFO#012#012Recommended Action:#012#012 #012#012Alert Details#012S-TAP Host DB Server Type Status Last Response Primary Host Name KTAP Installed TEE Installed Shared Memory Driver Installed DB2 Shared Memory Driver Installed LHMON Driver Installed Named Pipes Driver Installed Hunter DBS App Server Installed Count
#012111.111.111.110 ORACLE Inactive 2018-08-29 12:46:00 111.111.111.200 Yes No No No Yes No No 1
#012111.111.111.110:FAM Inactive 2018-10-27 03:25:00 111.111.111.200 No No No No Yes No No 1
#012111.111.111.110 ORACLE Inactive 2018-08-27 08:59:14 111.111.111.200 Yes No No No Yes No NULL No 1
#012111.111.111.110 ORACLE Inactive 2018-10-11 19:09:46 111.111.111.200 No No No No Yes No No 1
#012111.111.111.110 ORACLE Inactive 2018-10-11 17:31:04 111.111.111.200 No No No No Yes No No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:14 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:14 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:14 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:14 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:14 111.111.111.200 Yes No No No No Yes No 1
#012111.111.111.110 ORACLE Inactive 2018-08-27 08:59:14 111.111.111.200 No No No No Yes No NULL No 1
#012111.111.111.110 ORACLE Inactive 2018-08-27 08:59:14 111.111.111.200 No No No No Yes No NULL No 1
#012111.111.111.110 ORACLE Inactive 2018-08-27 08:59:14 111.111.111.200 No No No No Yes No NULL No 1
#012111.111.111.110 ORACLE Inactive 2018-08-27 08:59:14 111.111.111.200 No No No No Yes No NULL No 1
#012111.111.111.110:FAM Inactive 2018-10-20 09:18:01 111.111.111.200 No No No No Yes No No 1
#012111.111.111.110:FAM Inactive 2018-11-28 01:39:26 111.111.111.200 No No No No Yes No No 1
#012111.111.111.110 ORACLE Inactive 2018-08-27 08:59:13 111.111.111.200 No No No No Yes No No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110 MSSQL Inactive 2018-08-27 08:59:13 111.111.111.200 Yes No No No Yes Yes No 1
#012111.111.111.110:FAM Inactive 2018-08-23 15:16:36 111.111.111.200 Yes No No No Yes No No 1
#012111.111.111.110:FAM Inactive 2018-08-23 15:16:41 111.111.111.200 Yes No No No Yes No No 1
#012111.111.111.110:FAM Inactive 2018-10-20 09:23:01 111.111.111.200 No No No No Yes No No 1 "
| rename COMMENT AS "Everything above generates sample event data; everything below is your solution"
| rex max_match=0 "[\r\n]+\#\d{3}(?<raw>[^\r\n]+)"
| rename host AS server
| table server raw
| mvexpand raw
| rename raw AS _raw
| rex max_match=0 "(?<nic>\S+)\s+(?:(?<db>\S+)\s+)?(?<type>\S+)\s+(?<date>\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2})\s+(?<IP>\S+)\s+(?<bools>(Yes|No).*?)\s*$"
| fields - _raw
Two tips:
- looks like this data is being passed through rsyslog? You might want to configure rsyslog with escapecontrolcharactersonreceive off
such that those #012 are just printed as the newlines they are meant to be. If you don't have control over the syslog daemon, then I guess SEDCMD-fixnewlines = s/#012/\n/g
(in props.conf) should also do the trick.
This should make your data look like this in Splunk:
<30>2018-11-29T15:59:59.128110+00:00 myserver.iskindof.broken.com 111.222.333.444 ["Nov 29 10:59:59" myserver] the_log[5930]:
Alert Name: Inactive nic Since. Alert Description: nics Inactive for a given period of time.
Current value: 3.0
Base query value: 0.0
Threshold: 0.0
Query period: 11/29/18 - 11/29/18
Alert Classification:
Category:
Severity: INFO
Recommended Action:
Alert Details
S-TAP Host DB Server Type Status Last Response Primary Host Name KTAP Installed TEE Installed Shared Memory Driver Installed DB2 Shared Memory Driver Installed LHMON Driver Installed Named Pipes Driver Installed Hunter DBS App Server Installed Count
111.111.111.110 ORACLE Inactive 2018-08-29 12:46:00 111.111.111.200 Yes No No No Yes No No 1
111.111.111.110:FAM Inactive 2018-10-27 03:25:00 111.111.111.200 No No No No Yes No No 1
111.111.111.110 ORACLE Inactive 2018-08-27 08:59:14 111.111.111.200 Yes No No No Yes No NULL No 1
111.111.111.110 ORACLE Inactive 2018-10-11 19:09:46 111.111.111.200 No No No No Yes No No 1
| multikv forceheader=12
(to use line 12 as the header line). If you make sure you extract all the header fields from above the table before you perform the multikv command, those general fields will be copied over to all the undividually split events generated by multikv.That would help but I do not have access to the .conf. I actually have to generate a SED syntax, but when I do that it bricks my search. This was a search I generated a year ago and I am unable to remember how I set this up. That's why I am trying to get assistance in rebuilding it, since I am being asked to generate a different result export to include the STAP, Last Response, db IP, dbhost, type, total time down(new request).
| rex mode=sed " s/#012/\n/g"
That should fix the #012 junk.
Then see what your data looks like and try applying | multikv
. I'm getting some mixed results testing it with your sample data, but that might be because of how this has been copy pasted from your system to this discussion, to my system.
Yes, Lines 1 through 31 are all as one event. That's why I am trying to extract each part of the event line by line.
Device Device IP Inactive Date db IP db Host TAP total time down
server1 0.0.0.0 aug 1 1900 0.0.0.0 dbserver 0.0.0.0 7 days
'' '' '' '' '' '' '' '' '' '' '' '' 2 weeks
'' '' '' '' '' '' '' '' '' '' '' '' 1 month
'' '' '' '' '' '' '' '' '' '' '' '' 4 hours
etc etc
While trying that with your sample data, it seems the multikv command doesn't work very nicely with the column names that contain a space. So you might need to do a little SED preprocessing to fix that.
Hey,
Your data looks kind of mixed (first line in comparison to the rest).
How much information do you need from line #1?
The Events, starting from line #2 look structured enough to let splunk auto-extract fields. On top of those events, some spl should yield into the desired result.
None of line one is needed. However the auto extract does not allow me to me pull the multiple fields and individualize the data I want to extract. Also the rex generator will not allow me to extract what is needed. I keep getting "The extraction failed. If you are extracting multiple fields, try removing one or more fields. Start with extractions that are embedded within longer text strings." message. But I don't have any extractions in the first place. Please advise.
Okay ... as for me, there are two approaches:
(A): dynamically use the header within your first line (I think it is the text after the last occurance of #012
) at index time (not detailled here)
(B): Use rex
on _raw
in a slightly different way. I used \S
(capitalized) to match everything but white spaces and \s
(lower case) to match white space. There is one odd field (third to the end) that sometimes is filled with NULL, sometimes not at all. Moreover the "DB Server Type" is not always present.
I used your raw data from above where I had a leading space in each line. I think that is due to copying it from here. That is the reason for me starting the regex with ^\s{0,1}...
.
Using this spl/regex I have the data in fields to start working with:
index="main" sourcetype="..."
| rex field=_raw "^\s{0,1}(?<field_01>#\d{3})(?<field_02>\S+)\s{6}(?<field_03>\S+)?\s+(?<field_04>\S+)\s+(?<field_05>\S+)\s+(?<field_06>\S+)\s+(?<field_07>\S+)\s+(?<field_08>\S+)\s+(?<field_09>\S+)\s+(?<field_10>\S+)\s+(?<field_11>\S+)\s+(?<field_12>\S+)\s+(?<field_13>\S+)\s+(?<field_14>NULL\s+)?(?<field_15>\S+)\s+(?<field_16>\S+)\s+$"
| eval "Last Response" = field_05." ".field_06
| rename
field_01 as "PREFIX"
field_02 as "S-TAP Host"
field_03 as "DB Server Type"
field_04 as "Status"
field_07 as "Primary Host Name"
field_08 as "KTAP Installed"
field_09 as "TEE Installed"
field_10 as "Shared Memory Driver Installed"
field_11 as "DB2 Shared Memory Driver Installed"
field_12 as "LHMON Driver Installed"
field_13 as "Named Pipes Driver Installed"
field_14 as "UNKNOWN"
field_15 as "Hunter DBS App Server Installed"
field_16 as "Count"
| table "PREFIX" "S-TAP Host" "DB Server Type" "Status" "Last Response" "Primary Host Name" "KTAP Installed" "TEE Installed" "Shared Memory Driver Installed" "DB2 Shared Memory Driver Installed" "LHMON Driver Installed" "Named Pipes Driver Installed" "UNKNOWN" "Hunter DBS App Server Installed" "Count"
I tried this and it worked but it displayed nothing in the lines but the columns were relabeled. Could it be I need to have a fillnull for each column? Or would I need to plugin the original query as well or make another rex for each item I am trying to extract?
Most probably your the values are tab separated. I assumed spaces. You could verify this in an good text editor or by using cat
:
cat -T yourfile.txt
... tabs will be displayed as ^I
Some are tab others are individualized. But since all of the data is in one event, I am still trying to figure out how to separate the data in each event.