I am trying to import a WS_FTP log, but I am having trouble parsing the log data. Either no fields are extracted at all, or I end up with hundreds of events for each entry because the data does not break properly.
Sample log data:
<?xml version="1.0" encoding="utf-8" ?>
<log>
<entry>
<log_time>20170214-10:59:58</log_time>
<description><![CDATA[Authentication succeed]]></description>
<service>COM_API</service>
<sessionid>00000001</sessionid>
<type>0</type> <severity>1</severity>
<user>test</user>
<host>ftp</host>
<cmd>Login</cmd>
<sguid>278AA2E9-04A9-4484-9EAC-DF1EACBDF372</sguid>
</entry>
<entry>
<log_time>20170214-11:01:39</log_time>
<description><![CDATA[Created user test on host ftp]]></description>
<service>COM_API</service>
<sessionid>00000001</sessionid>
<type>0</type> <severity>1</severity>
<user>test</user>
<host>ftp</host>
<cmd>CreateUser</cmd>
<sguid>278AA2E9-04A9-4484-9EAC-DF1EACBDF372</sguid>
</entry>
<entry>
<log_time>20170214-11:01:39</log_time>
<description><![CDATA[User test sysadmin set to TRUE on host ftp]]></description>
<service>COM_API</service>
<sessionid>00000001</sessionid>
<type>0</type> <severity>1</severity>
<user>test</user>
<host>ftp</host>
<cmd>SetSysAdmin</cmd>
<sguid>278AA2E9-04A9-4484-9EAC-DF1EACBDF372</sguid>
</entry>
</log>
Based on some other posts, I created this props.conf:
[WS_FTP]
TIME_PREFIX = \<log_time\>
TIME_FORMAT = %Y\%m\%d-%H:%M:%S
SHOULD_LINEMERGE = false
LINE_BREAKER = \>\s*(?=\<entry\>)
REPORT-xmlext = xml-extr
and a transforms.conf:
[xml-extr]
REGEX = <([^>]+)>([^<]*)<\/\1>
FORMAT = $1::$2
MV_ADD = true
REPEAT_MATCH = true
I need each entry listed as one event with its associated data, as opposed to what I am getting now, where there is a separate event for each value: , 278AA2E9-04A9-4484-9EAC-DF1EACBDF372, etc.
It seems to be almost right, but something is still not working. I have also tried using only that props.conf without the transforms, but that yields similar results. Any help in getting each "entry" properly extracted would be much appreciated.
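For anyone who wants to check the transforms.conf REGEX outside Splunk, here is a minimal Python sketch. It assumes Python's re module treats this pattern the same way Splunk's regex engine does, which should hold for a pattern this simple but is not guaranteed. The names ENTRY, TAG_VALUE and extract_pairs are made up for the illustration.

```python
import re

# One <entry> block copied from the sample log above.
ENTRY = """<entry>
<log_time>20170214-10:59:58</log_time>
<description><![CDATA[Authentication succeed]]></description>
<service>COM_API</service>
<sessionid>00000001</sessionid>
<type>0</type> <severity>1</severity>
<user>test</user>
<host>ftp</host>
<cmd>Login</cmd>
<sguid>278AA2E9-04A9-4484-9EAC-DF1EACBDF372</sguid>
</entry>"""

# Same pattern as REGEX in the [xml-extr] stanza.
TAG_VALUE = re.compile(r"<([^>]+)>([^<]*)</\1>")


def extract_pairs(text):
    """Return (tag, value) pairs, roughly what FORMAT = $1::$2 would emit."""
    return TAG_VALUE.findall(text)


for name, value in extract_pairs(ENTRY):
    print(f"{name} = {value}")
```

In this sketch the pattern picks up the simple tags but skips description, because the CDATA wrapper contains a "<" that [^<]* cannot cross. So even with correct event breaking, the transform on its own would likely miss that field.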
Try using KV_MODE=xml in props.conf and remove your transforms.conf. This appears to be working fine for me:
[WS_FTP]
BREAK_ONLY_BEFORE = <entry>
DATETIME_CONFIG =
NO_BINARY_CHECK = true
TIME_FORMAT = %Y%m%d-%H:%M:%S
TIME_PREFIX = \<log_time\>
KV_MODE = xml
You'll end up with fields prefixed with the "entry" label, like "entry.description", "entry.sguid", etc. You'll want to play with the line breaking if you want to get rid of the "entry." prefix.
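To illustrate where the "entry." prefix comes from, here is a rough Python sketch that flattens one event into dotted field names. This is not how Splunk implements KV_MODE = xml; it only mimics the naming, and EVENT and flatten are names invented for the example.

```python
import xml.etree.ElementTree as ET

# A shortened event that still contains the enclosing <entry> element.
EVENT = """<entry>
<log_time>20170214-10:59:58</log_time>
<description><![CDATA[Authentication succeed]]></description>
<cmd>Login</cmd>
<sguid>278AA2E9-04A9-4484-9EAC-DF1EACBDF372</sguid>
</entry>"""


def flatten(element, prefix=""):
    """Map each leaf element to a dotted path such as entry.cmd."""
    path = f"{prefix}.{element.tag}" if prefix else element.tag
    children = list(element)
    if not children:
        return {path: (element.text or "").strip()}
    fields = {}
    for child in children:
        fields.update(flatten(child, path))
    return fields


for name, value in flatten(ET.fromstring(EVENT)).items():
    print(f"{name} = {value}")
```

Because every value sits under the entry element, every field name inherits that parent in its path.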
Edit: Using your LINE_BREAKER, kind of like this:
[WS_FTP]
DATETIME_CONFIG =
KV_MODE = xml
LINE_BREAKER = \>\s*(?=\<entry\>)
NO_BINARY_CHECK = true
TIME_FORMAT = %Y%m%d-%H:%M:%S
TIME_PREFIX = \<log_time\>
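As a sanity check on the settings above, here is a small Python sketch showing where the LINE_BREAKER pattern matches in the sample and how the TIME_PREFIX / TIME_FORMAT pair reads the timestamp. It is only an approximation: re.split is not Splunk's event breaker and Splunk's own breaking rules are not reproduced here. RAW, split_events and event_time are hypothetical names.

```python
import re
from datetime import datetime

# Trimmed-down version of the sample log with two entries.
RAW = """<?xml version="1.0" encoding="utf-8" ?>
<log>
<entry>
<log_time>20170214-10:59:58</log_time>
<cmd>Login</cmd>
</entry>
<entry>
<log_time>20170214-11:01:39</log_time>
<cmd>CreateUser</cmd>
</entry>
</log>"""

BREAK = re.compile(r">\s*(?=<entry>)")   # same pattern as LINE_BREAKER
TIME_PREFIX = re.compile(r"<log_time>")  # same pattern as TIME_PREFIX
TIME_FORMAT = "%Y%m%d-%H:%M:%S"          # same value as TIME_FORMAT


def split_events(raw):
    """Split at each match and keep only chunks that carry a timestamp."""
    return [chunk for chunk in BREAK.split(raw) if "<log_time>" in chunk]


def event_time(chunk):
    """Parse the 17-character timestamp that follows the prefix."""
    start = TIME_PREFIX.search(chunk).end()
    return datetime.strptime(chunk[start:start + 17], TIME_FORMAT)


for event in split_events(RAW):
    print(event_time(event), "|", event.splitlines()[0])
```

The format string has no backslashes between the date parts, matching the corrected TIME_FORMAT rather than the one in the original question.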
Awesome work, the LINE_BREAKER works. Before you posted that, I also figured out how to do it with the FIELDALIAS and then EVAL null the original values, like this:
#[WS_FTP]
#BREAK_ONLY_BEFORE = <entry>
#DATETIME_CONFIG =
#NO_BINARY_CHECK = true
#TIME_FORMAT = %Y%m%d-%H:%M:%S
#TIME_PREFIX = \<log_time\>
#FIELDALIAS-rootfields = entry.log_time as Time entry.description as Description entry.user as User entry.cmd as Command entry.service as Service entry.severity as Severity entry.sguid as SGUID entry.host as Host entry.sessionid as Session_ID entry.type as Type
#EVAL-entry.log_time = null
#EVAL-entry.description = null
#EVAL-entry.user = null
#EVAL-entry.cmd = null
#EVAL-entry.service = null
#EVAL-entry.severity = null
#EVAL-entry.sguid = null
#EVAL-entry.host = null
#EVAL-entry.sessionid = null
#EVAL-entry.type = null
#KV_MODE = xml
In the end, both seem to work, but the key was the main extraction, so thanks a million.
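A toy Python model of the alias-then-null approach above, assuming field aliases are applied before calculated (EVAL-) fields at search time. The function name and the sample dictionaries are made up for the illustration.

```python
def apply_alias_then_null(fields, aliases):
    """Copy each original field to its alias, then drop the original."""
    out = dict(fields)
    # Step 1: like FIELDALIAS, add the alias next to the original.
    for original, alias in aliases.items():
        if original in out:
            out[alias] = out[original]
    # Step 2: like the EVAL lines, null out (remove) the originals.
    for original in aliases:
        out.pop(original, None)
    return out


extracted = {"entry.cmd": "Login", "entry.user": "test"}
aliases = {"entry.cmd": "Command", "entry.user": "User"}
print(apply_alias_then_null(extracted, aliases))
```

Without the second step you are left with both entry.cmd and Command, which is the duplication you see when only the alias is configured.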
Very cool! I've run into this with a few logs and this might help me clean them up a bit.
Thanks!
Super awesome, this worked. The fields were extracted as you noted.
I added the below field alias, but then I ended up with both the formatted and unformatted fields. Not sure if I will keep it or just add the search time rename as noted below.
FIELDALIAS-rootfields = entry.log_time as Time entry.description as Description entry.user as User entry.cmd as Command entry.service as Service entry.severity as Severity entry.sguid as SGUID
For the record, I want to make sure I put this into the correct props.conf file. I added it under system as there is no app for ws_ftp to add it to. Or should it go under the search app?
If you use the LINE_BREAKER from my edit, it will automatically remove the line you are breaking the events on, which in turn removes the nested "entry.$field$" naming (because that parent no longer exists in the event).
System should be fine. I personally create a new app on a per-system basis for organizational purposes, but in search or system should be fine. If you place it in search, it's possible you won't be able to use the extractions within other apps -- in that case you'll need to share the objects in the search app globally.
If you want to get rid of all of the entry prefixes in the field names, you can always do:
| rename entry.* AS *
Should strip off the prefix.
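For clarity, this is roughly what the wildcard rename does to the field names, sketched in Python. The helper rename_prefix is hypothetical and only models the name change, not Splunk's rename command itself.

```python
def rename_prefix(fields, prefix="entry."):
    """Strip the prefix from matching field names and leave the rest alone."""
    return {
        (name[len(prefix):] if name.startswith(prefix) else name): value
        for name, value in fields.items()
    }


extracted = {"entry.description": "Authentication succeed", "entry.cmd": "Login", "host": "ftp01"}
print(rename_prefix(extracted))
```

Fields that do not start with the prefix, such as host here, pass through unchanged.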
I was curious: is there any way to make the search-time rename permanent? I set up the field alias, but then I end up with both entry.description and Description.
I looked at the docs, but nothing was obvious. I am hoping this too is something simple.