I'm trying to ingest AirWatch syslog events, but not all fields are searchable: only those written as Field=Value in the message are. The syslog events contain two different kv formats, Field: Value and Field=Value. In the first half of the message the values run straight into the next field name, which I believe is the part Splunk is having trouble with.
In my example the first kv pair is Event Type: Console. With my field extractions in place the fields and values appear correct; however, searching for EventType=Console yields no results.
I've implemented the regex extractions as inline extractions, both as a single field extraction with one capture group per field and with each field as a separate extraction. I also tried delimiters via the field extraction wizard. That lets me search as normal, but the values include the next field name, just as in the logs, so it needs regex or something further to tell Splunk where each value should end. I figure I'm missing something here; I'm just not sure whether using regex for search-time field extractions this way is correct, or whether some other piece is required.
Mar 9 13:52:33 10.0.2.24 March 09 19:52:33 AirWatch AirWatch Syslog Details are as follows Event Type: ConsoleEvent: DeviceDataModfiedUser: domain\JoeUser Source: ServerEvent Module: DashboardEvent Category: DeviceEvent Data: Device=User iPhone iOS 10.2.1 X0XX;DeviceData=OwnerGroup;LoginSessionID=xx1xxx0xxxxx
My greedy regexes:
EXTRACT-Device = Device=(?P<Device>[^ ]+);
EXTRACT-DeviceData = DeviceData=(?P<DeviceData>[^ ]+);
EXTRACT-Event = Event:\s+(?P<Event>\w+)User
EXTRACT-EventCategory = Category:\s+(?P<EventCategory>\w+)Event
EXTRACT-EventModule = Module:\s+(?P<EventModule>\w+)Event
EXTRACT-EventType = Event Type:\s(?P<EventType>\S+)Event
EXTRACT-User = User:\s+(?P<User>[^ ]+)Event
EXTRACT-EventSource = Source:\s+(?P<EventSource>\w+)Event
I finally fixed it with a fields.conf in etc/system/local that sets INDEXED_VALUE = false.
I had been putting fields.conf in the local folder of my app and NOT in etc/system/local. That's the answer.
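For anyone landing here later, a minimal sketch of what such a fields.conf might look like. The stanza names are assumptions taken from the EXTRACT field names earlier in this thread; the actual file was not posted. The likely reason it helps: by default Splunk treats a searched field value as a term that exists in the index, so EventType=Console also looks for the term Console. In these events the raw text only ever contains ConsoleEvent, so nothing matches. INDEXED_VALUE = false tells Splunk not to make that assumption for the field.

```ini
# fields.conf (placed in etc/system/local, per the fix described above)
# Stanza names are assumed from the EXTRACT-* field names in this thread.
# Only the fields whose values run into the next key should need this.

[EventType]
INDEXED_VALUE = false

[Event]
INDEXED_VALUE = false

[User]
INDEXED_VALUE = false

[EventSource]
INDEXED_VALUE = false

[EventModule]
INDEXED_VALUE = false

[EventCategory]
INDEXED_VALUE = false
```

Running btool against fields instead of props (./splunk btool fields list --debug) should show whether these stanzas are being picked up and from which file.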
Hi,
One approach you could try is to rule out the 'unknown characters' problem by trying your extractions as rex commands first.
I tried this with:
| makeresults | fields - _time
| eval _raw="Mar 9 13:52:33 10.0.2.24 March 09 19:52:33 AirWatch AirWatch Syslog Details are as follows Event Type: ConsoleEvent: DeviceDataModfiedUser: domain\JoeUser Source: ServerEvent Module: DashboardEvent Category: DeviceEvent Data: Device=User iPhone iOS 10.2.1 X0XX;DeviceData=OwnerGroup;LoginSessionID=xx1xxx0xxxxx"
| rex "Device=(?P<Device>[^;]+);"
| rex "DeviceData=(?P<DeviceData>[^;]+);"
| rex "Event:\s+(?P<Event>\w+)User"
| rex "Category:\s+(?P<EventCategory>\w+)Event"
| rex "Module:\s+(?P<EventModule>\w+)Event"
| rex "Event\sType:\s+(?P<EventType>\w+)Event"
| rex "User:\s+(?P<User>[^\s]+)"
| rex "Source:\s+(?P<EventSource>\w+)Event"
A number of your regex statements were slightly off, so I've modified them. This now returns:
I'm not 100% sure it would be the cause, but I'd avoid using a literal space character in the regex, especially when you put it in props.conf. I always go with \s just to be explicit.
For example, I'd avoid this:
EXTRACT-EventType = Event Type:\s+(?P<EventType>\S+)Event
And go instead with:
EXTRACT-EventType = Event\sType:\s+(?P<EventType>\w+)Event
It should be fine, it's more just force of habit for me.
If the rex commands work but the extractions don't once you move them over to the props.conf file, you can check the config for that sourcetype with btool.
For example, if the sourcetype for your data was 'extract-test' you could run the command:
./splunk btool props list extract-test --debug
This will give you all of the props config for that sourcetype and which file the config is coming from. It would look a bit like this:
/Splunk/etc/system/local/props.conf [extract-test]
/Splunk/etc/system/default/props.conf ANNOTATE_PUNCT = True
/Splunk/etc/system/default/props.conf AUTO_KV_JSON = true
/Splunk/etc/system/default/props.conf BREAK_ONLY_BEFORE =
/Splunk/etc/system/default/props.conf BREAK_ONLY_BEFORE_DATE = True
/Splunk/etc/system/default/props.conf CHARSET = UTF-8
/Splunk/etc/system/default/props.conf DATETIME_CONFIG = /etc/datetime.xml
/Splunk/etc/system/local/props.conf EXTRACT-Device = Device=(?P<Device>[^;]+);
/Splunk/etc/system/local/props.conf EXTRACT-DeviceData = DeviceData=(?P<DeviceData>[^;]+);
/Splunk/etc/system/local/props.conf EXTRACT-Event = Event:\s+(?P<Event>\w+)User
/Splunk/etc/system/local/props.conf EXTRACT-EventCategory = Category:\s+(?P<EventCategory>\w+)Event
/Splunk/etc/system/local/props.conf EXTRACT-EventModule = Module:\s+(?P<EventModule>\w+)Event
/Splunk/etc/system/local/props.conf EXTRACT-EventSource = Source:\s+(?P<EventSource>\w+)Event
/Splunk/etc/system/local/props.conf EXTRACT-EventType = Event\sType:\s+(?P<EventType>\w+)Event
/Splunk/etc/system/local/props.conf EXTRACT-User = User:\s+(?P<User>[^\s]+)
/Splunk/etc/system/default/props.conf HEADER_MODE =
/Splunk/etc/system/default/props.conf LEARN_MODEL = true
/Splunk/etc/system/default/props.conf LEARN_SOURCETYPE = true
/Splunk/etc/system/default/props.conf LINE_BREAKER_LOOKBEHIND = 100
/Splunk/etc/system/default/props.conf MATCH_LIMIT = 100000
/Splunk/etc/system/default/props.conf MAX_DAYS_AGO = 2000
/Splunk/etc/system/default/props.conf MAX_DAYS_HENCE = 2
/Splunk/etc/system/default/props.conf MAX_DIFF_SECS_AGO = 3600
/Splunk/etc/system/default/props.conf MAX_DIFF_SECS_HENCE = 604800
/Splunk/etc/system/default/props.conf MAX_EVENTS = 256
/Splunk/etc/system/default/props.conf MAX_TIMESTAMP_LOOKAHEAD = 128
/Splunk/etc/system/default/props.conf MUST_BREAK_AFTER =
/Splunk/etc/system/default/props.conf MUST_NOT_BREAK_AFTER =
/Splunk/etc/system/default/props.conf MUST_NOT_BREAK_BEFORE =
/Splunk/etc/system/default/props.conf SEGMENTATION = indexing
/Splunk/etc/system/default/props.conf SEGMENTATION-all = full
/Splunk/etc/system/default/props.conf SEGMENTATION-inner = inner
/Splunk/etc/system/default/props.conf SEGMENTATION-outer = outer
/Splunk/etc/system/default/props.conf SEGMENTATION-raw = none
/Splunk/etc/system/default/props.conf SEGMENTATION-standard = standard
/Splunk/etc/system/default/props.conf SHOULD_LINEMERGE = True
/Splunk/etc/system/default/props.conf TRANSFORMS =
/Splunk/etc/system/default/props.conf TRUNCATE = 10000
/Splunk/etc/system/default/props.conf detect_trailing_nulls = false
/Splunk/etc/system/default/props.conf maxDist = 100
/Splunk/etc/system/default/props.conf priority =
/Splunk/etc/system/default/props.conf sourcetype =
Have a go with some of the tweaked regular expressions via rex first, then move them into props.conf, restart, then check with btool and see where you're at.
Your approach is fine and should work, there's just going to be a small niggle tripping you up somewhere!
With the regex and props.conf above, I think I get the result you're looking for in my test:
Thanks for the input. I've been testing this out with rex commands in search, but the results are still different from what I get with props.conf and transforms.conf.
I'm currently using the following regex pattern in transforms, with field aliases for Event and User, since the pattern below cuts off the first letter on those fields. Regardless, I am still unable to search for some values. In this example, searching for domain\JoeUser has been working with my regex, but in another set of events where User=sysadmin it's not working, and those make up 99% of the logs I have. It's strange to see different behavior where I'd expect the same outcome.
(?<_KEY_1>\w+)(:\s)(?<_VAL_1>[a-zA-Z0-9\\]+)(U|E)
I just tested with your props.conf and noticed the User field was grabbing the string "Event" from the data following the user value. When I scrubbed the data before pasting it to Answers, I removed the trailing "Event" string from the user value. Here's another example with no sensitive data:
Mar 9 13:52:32 10.0.2.24 March 09 19:52:32 AirWatch AirWatch Syslog Details are as follows Event Type: DeviceEvent: RemoveProfileRequestedUser: sysadminEvent Source: ServerEvent Module: DashboardEvent Category: CommandEvent Data: Profile=iOS Visual Privacy Webclip
Notice that AirWatch puts no spaces between one value and the next field/key. It's like this by default in the AirWatch syslog settings:
{Event Type}{Event}{User}{Event Source}{Event Module}{Event Category}{Event Data}
I added this to your EXTRACT-User line:
EXTRACT-User = User:\s+(?P<User>[^\s]+)Event
Now adding User=sysadmin to my search gives me no results, when I should have results.
Hi,
Looks like the logging format for these events is a bit of a pain!
I'm not 100% sure that your:
(?<_KEY_1>\w+)(:\s)(?<_VAL_1>[a-zA-Z0-9\\]+)(U|E)
is going to work for you. What if one of the values you're trying to capture (like a username) ends in a U or an E?
I also noticed that the Keys in the 'Event Data' are different in the two event samples. You're also going to want to cater for this and the associated spaces. The default Splunk Key/Value extractor is only going to get you:
Profile=iOS
Whereas you probably want:
Profile=iOS Visual Privacy Webclip
Ultimately you may just need to keep refining your regex(s) until you cater for 100% of your data. I get that this is pretty dull...
And whilst I'd always go with the 'keep it simple' wherever possible, you may end up with something slightly more complex.
For example, I don't like it, but this below works for both of your sample data events:
| makeresults
| eval _raw="Mar 9 13:52:33 10.0.2.24 March 09 19:52:33 AirWatch AirWatch Syslog Details are as follows Event Type: ConsoleEvent: DeviceDataModfiedUser: domain\JoeUserEvent Source: ServerEvent Module: DashboardEvent Category: DeviceEvent Data: Device=User iPhone iOS 10.2.1 X0XX;DeviceData=OwnerGroup;LoginSessionID=xx1xxx0xxxxx"
| append
[| makeresults
| eval _raw="Mar 9 13:52:32 10.0.2.24 March 09 19:52:32 AirWatch AirWatch Syslog Details are as follows Event Type: DeviceEvent: RemoveProfileRequestedUser: sysadminEvent Source: ServerEvent Module: DashboardEvent Category: CommandEvent Data: Profile=iOS Visual Privacy Webclip"]
| fields - _time
| rex "Event\sType:\s(?<EventType>[^\s]+)Event:\s(?<Event>[^\s]+)User:\s(?<User>[^\s]+)Event\sSource:\s(?<EventSource>[^\s]+)Event\sModule:\s(?<EventModule>[^\s]+)Event\sCategory:\s(?<EventCategory>[^\s]+)Event\sData:\s(?<EventData>(?:(?:Device=(?<Device>[^;\r\n]+);?)|(?:DeviceData=(?<DeviceData>[^;\r\n]+);?)|(?:LoginSessionID=(?<LoginSessionID>[^;\r\n]+);?)|(?:Profile=(?<Profile>[^;\r\n]+);?))*)"
| table EventType Event User EventSource EventModule EventCategory EventData Device DeviceData LoginSessionID Profile
You can play and test with it here: https://regex101.com/r/GEpAOP/1/
I.e. you may need to expand the (?<EventData>) grouping to cater for different keys in that section.
Good luck - I hope you find a much simpler way!
Again, rex gives different results than using the EXTRACTs in props and the REGEX patterns in transforms. I understand you're using it for testing, and I appreciate the effort, but for some strange reason it's not the same in reality. I can extract the fields using your patterns all day, but I'm not able to search these events on the fields extracted using props and transforms.
I think I'm going to remove all the field extractions and add rex to all my saved searches for this data. Extractions aside, the whole point is to search on the extracted fields, which is my issue with Splunk at the moment, and I have been able to search a field once it's been rex'ed.
I'm pretty sure you should be able to search an extracted field, so this has got to be a bug.
So I just tested using only rex: search works, and after fixing some typos I have the results in an email alert. As I expected, rex will do to get these alerts, and this is my workaround.
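A minimal sketch of that kind of rex-only search, for reference. The index and sourcetype names are hypothetical (neither appears in this thread), and the pattern reuses the "Event\sSource:" anchor from the combined regex above to mark where the User value ends:

```
index=airwatch sourcetype=airwatch:syslog
| rex "User:\s+(?<User>\S+?)Event\sSource:"
| search User=sysadmin
```

Because the field is created by rex inside the search pipeline, the final filter is applied to the extracted value itself, which would explain why this works where the props.conf extraction did not.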
Hi,
I'm pleased to hear that you've got a workaround.
But you should still be able to get the EXTRACTs working; there's nothing wrong with your approach.
As such, it might be worth raising a Support Case with Splunk to try and get to the bottom of what's going on, just in case there is a bug or something else is happening within your setup. You don't want to come across it down the line!
You can set up delimiter-based extraction by updating props.conf and transforms.conf. See this:
https://www.splunk.com/blog/2008/02/12/delimiter-based-key-value-pair-extraction/
props.conf (on search heads)
[yoursourcetype]
REPORT-colonfields = colon_delimited_fields
transforms.conf (on search heads)
[colon_delimited_fields]
DELIMS = " ", ":"
Seems like there must be some special characters in there that are not appearing in your question. Perhaps tab characters?
Event Type: ConsoleEvent: DeviceDataModfiedUser: domain\JoeUser Source: ServerEvent Module: DashboardEvent Category: DeviceEvent Data: Device=User iPhone iOS 10.2.1 X0XX;DeviceData=OwnerGroup;LoginSessionID=xx1xxx0xxxxx
parses to my eyes as
Event Type: ((null))
ConsoleEvent: ((null))
DeviceDataModfiedUser: domain\JoeUser
Source: ServerEvent
Module: DashboardEvent
Category: DeviceEvent
Data: Device=User iPhone iOS 10.2.1 X0XX;DeviceData=OwnerGroup;LoginSessionID=xx1xxx0xxxxx
It seems likely that there is a special character or whitespace before User:, and between Console and Event:. Putting single spaces in those spots would result in this parse...
Event Type: Console
Event: DeviceDataModfied
User: domain\JoeUser
Source: ServerEvent
Module: DashboardEvent
Category: DeviceEvent
Data: Device=User iPhone iOS 10.2.1 X0XX;DeviceData=OwnerGroup;LoginSessionID=xx1xxx0xxxxx
And, I'm not sure whether the system would index "Event Type" or "Type".
So I would format the logs at ingestion? And then I would be indexing fields, correct?
I was hoping to avoid index-time operations if possible, but if the log is garbage I can see how I'm limited to begin with. I agree that if there were spaces we could use a delimiter to parse it, no problem.