It appears that Splunk is truncating FireEye (7.4) ext JSON messages. The message has 90 lines, but Splunk only extracts 81 of them.
I don't see the missing fields in the "All Fields" section either. I need help getting the other 9 lines indexed.
The fields below are not being indexed:
},
"occurred": "2015-11-05 20:48:26+00",
"id": "9200",
"action": "notified",
"dst": {
"ip": "127.0.0.20",
"mac": "00:22:44:66:88:aa",
"port": "20"
},
"name": "infection-match"
}
}
We are using the latest FireEye Add-on (3.0.7).
Our props.conf settings:
[fe_json]
TRUNCATE=0
SHOULD_LINEMERGE = false
LINE_BREAKER = ((?!))
KV_MODE = JSON
TIME_PREFIX = \"occurred\":\s
TIME_FORMAT = \"%Y-%m-%d %H:%M:%S+00\"
TZ = UTC
FIELDALIAS-dest_ip_for_fireeye_app = alert.dst.ip as dest_ip
FIELDALIAS-dest_for_fireeye = alert.dst.ip as dest
FIELDALIAS-dest_port_for_fireeye = alert.dst.port as dest_port
FIELDALIAS-dest_mac_for_fireeye = alert.dst.mac as dest_mac
Did we get anywhere?
It appears it's breaking on timestamps.
TIME_PREFIX = "occurred": s
#looks like a typo.
...And you may need a line breaker too.
LINE_BREAKER = <regex>
LINE_BREAKER = <regular expression>
* Specifies a regex that determines how the raw text stream is broken into
initial events, before line merging takes place. (See the SHOULD_LINEMERGE
attribute, below)
* Defaults to ([\r\n]+), meaning data is broken into an event for each line,
delimited by any number of carriage return or newline characters.
* The regex must contain a capturing group -- a pair of parentheses which
defines an identified subcomponent of the match.
* Wherever the regex matches, Splunk considers the start of the first
capturing group to be the end of the previous event, and considers the end
of the first capturing group to be the start of the next event.
* The contents of the first capturing group are discarded, and will not be
present in any event. You are telling Splunk that this text comes between
lines.
* NOTE: You get a significant boost to processing speed when you use
LINE_BREAKER to delimit multiline events (as opposed to using
SHOULD_LINEMERGE to reassemble individual lines into multiline events).
* When using LINE_BREAKER to delimit events, SHOULD_LINEMERGE should be set
to false, to ensure no further combination of delimited events occurs.
* Using LINE_BREAKER to delimit events is discussed in more detail in the web
documentation at the following url:
http://docs.splunk.com/Documentation/Splunk/latest/Data/indexmulti-lineevents
** Special considerations for LINE_BREAKER with branched expressions **
When using LINE_BREAKER with completely independent patterns separated by
pipes, some special issues come into play.
EG. LINE_BREAKER = pattern1|pattern2|pattern3
Note, this is not about all forms of alternation, eg there is nothing
particular special about
example: LINE_BREAKER = ([\r\n])+(one|two|three)
where the top level remains a single expression.
A caution: Relying on these rules is NOT encouraged. Simpler is better, in
both regular expressions and the complexity of the behavior they rely on.
If possible, it is strongly recommended that you reconstruct your regex to
have a leftmost capturing group that always matches.
It may be useful to use non-capturing groups if you need to express a group
before the text to discard.
EG. LINE_BREAKER = (?:one|two)([\r\n]+)
* This will match the text one, or two, followed by any amount of
newlines or carriage returns. The one-or-two group is non-capturing
via the ?: prefix and will be skipped by LINE_BREAKER.
* A branched expression can match without the first capturing group
matching, so the line breaker behavior becomes more complex.
Rules:
1: If the first capturing group is part of a match, it is considered the
linebreak, as normal.
2: If the first capturing group is not part of a match, the leftmost
capturing group which is part of a match will be considered the linebreak.
3: If no capturing group is part of the match, the linebreaker will assume
that the linebreak is a zero-length break immediately preceding the match.
Example 1: LINE_BREAKER = end(\n)begin|end2(\n)begin2|begin3
* A line ending with 'end' followed by a line beginning with 'begin' would
match the first branch, and the first capturing group would have a match
according to rule 1. That particular newline would become a break
between lines.
* A line ending with 'end2' followed by a line beginning with 'begin2'
would match the second branch and the second capturing group would have
a match. That second capturing group would become the linebreak
according to rule 2, and the associated newline would become a break
between lines.
* The text 'begin3' anywhere in the file at all would match the third
branch, and there would be no capturing group with a match. A linebreak
would be assumed immediately prior to the text 'begin3' so a linebreak
would be inserted prior to this text in accordance with rule 3. This
means that a linebreak will occur before the text 'begin3' at any
point in the text, whether a linebreak character exists or not.
Example 2: Example 1 would probably be better written as follows. This is
not equivalent for all possible files, but for most real files
would be equivalent.
LINE_BREAKER = end2?(\n)begin(2|3)?
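The three rules quoted above can be tried outside Splunk to see where a branched pattern would break. This is only a rough sketch in Python: `split_events` is a hypothetical helper written for this thread, Python's `re` module stands in for Splunk's regex engine, and chunking and LINE_BREAKER_LOOKBEHIND are ignored.

```python
import re

def split_events(pattern, text):
    """Split text into events using the three branched-expression rules."""
    regex = re.compile(pattern)
    events, start = [], 0
    for m in regex.finditer(text):
        # Rules 1 and 2: use the leftmost capturing group that took part
        # in this match (group 1 if it matched, otherwise the next one).
        span = next(
            (m.span(i) for i in range(1, regex.groups + 1)
             if m.group(i) is not None),
            None,
        )
        if span is None:
            # Rule 3: no group matched, so assume a zero-length break
            # immediately before the match.
            span = (m.start(), m.start())
        events.append(text[start:span[0]])  # text before the break
        start = span[1]                     # break text itself is discarded
    events.append(text[start:])
    return events

# Example 1 from the spec text above, run against made-up input.
pattern = r"end(\n)begin|end2(\n)begin2|begin3"
text = "a end\nbegin b end2\nbegin2 c begin3 d"
events = split_events(pattern, text)
for event in events:
    print(repr(event))
```

With this input the two newlines are dropped as breaks, and a break is also inserted in front of `begin3` even though no newline is there, which is the rule 3 behaviour the spec warns about.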
LINE_BREAKER_LOOKBEHIND = <integer>
* When there is leftover data from a previous raw chunk,
LINE_BREAKER_LOOKBEHIND indicates the number of bytes before the end of
the raw chunk (with the next chunk concatenated) that Splunk applies the
LINE_BREAKER regex. You may want to increase this value from its default
if you are dealing with especially large or multiline events.
* Defaults to 100 (bytes).
# Use the following attributes to specify how multiline events are handled.
SHOULD_LINEMERGE = [true|false]
* When set to true, Splunk combines several lines of data into a single
multiline event, based on the following configuration attributes.
* Defaults to true.
# When SHOULD_LINEMERGE is set to true, use the following attributes to
# define how Splunk builds multiline events.
BREAK_ONLY_BEFORE_DATE = [true|false]
* When set to true, Splunk creates a new event only if it encounters a new
line with a date.
* Note, when using DATETIME_CONFIG = CURRENT or NONE, this setting is not
meaningful, as timestamps are not identified.
* Defaults to true.
BREAK_ONLY_BEFORE = <regular expression>
* When set, Splunk creates a new event only if it encounters a new line that
matches the regular expression.
* Defaults to empty.
MUST_BREAK_AFTER = <regular expression>
* When set and the regular expression matches the current line, Splunk
creates a new event for the next input line.
* Splunk may still break before the current line if another rule matches.
* Defaults to empty.
MUST_NOT_BREAK_AFTER = <regular expression>
* When set and the current line matches the regular expression, Splunk does
not break on any subsequent lines until the MUST_BREAK_AFTER expression
matches.
* Defaults to empty.
MUST_NOT_BREAK_BEFORE = <regular expression>
* When set and the current line matches the regular expression, Splunk does
not break the last event before the current line.
* Defaults to empty.
MAX_EVENTS = <integer>
* Specifies the maximum number of input lines to add to any event.
* Splunk breaks after the specified number of lines are read.
* Defaults to 256 (lines).
Not sure what happened to the TIME_PREFIX in the question. I double-checked my props.conf and it is
"TIME_PREFIX = \"occurred\":\s"; I'm not sure why it was pasted as "TIME_PREFIX = \"occurred\":s". So I assume it is correct as "TIME_PREFIX = \"occurred\":\s".
I have an elementary understanding of how to write regular expressions to capture data in our other systems (flat files, etc.), but I'm not sure how to write an expression for end of file (EOF) in Splunk.
As for the line breaker, the default in our props.conf is "((?!))". This is a negative lookahead, but it doesn't make sense to me because there is nothing inside it to look ahead for.
Would something like this be better: "(\$(?!}))"?
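Both patterns can be tried against a piece of the alert outside Splunk. The sketch below uses Python's `re` module as a stand-in for Splunk's regex engine (an assumption; for these two simple patterns the syntax is the same), and `SAMPLE` is a trimmed copy of the alert in this thread.

```python
import re

# Trimmed copy of the alert from this thread, used only as test input.
SAMPLE = '''{
"msg": "extended",
"alert": {
"occurred": "2015-11-05 20:48:26+00",
"dst": {
"ip": "127.0.0.20"
},
"name": "infection-match"
}
}'''

never = re.compile(r"((?!))")        # LINE_BREAKER from the props.conf in the question
proposed = re.compile(r"(\$(?!}))")  # pattern proposed in this post

# (?!) is a negative lookahead around an empty pattern. The empty pattern
# matches at every position, so the lookahead fails at every position.
never_hits = [m.start() for m in never.finditer(SAMPLE)]

# \$ is an escaped, literal dollar sign, not an end-of-line anchor, so this
# only matches a "$" character that is not followed by "}".
proposed_hits = [m.start() for m in proposed.finditer(SAMPLE)]
literal_hits = [m.start() for m in proposed.finditer("cost: $5 {$}")]

print(never_hits, proposed_hits, literal_hits)
```

Neither pattern finds a break point in the alert: the first can never match anything, and the second needs a literal dollar sign, which the alert does not contain.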
Below is the alert sent to Splunk:
{
"msg": "extended",
"product": "Web MPS",
"version": "7.4.0.254758",
"appliance": "my-fireeye-pri.company.net",
"alert": {
"src": {
"mac": "00:00:00:00:00:00",
"ip": "169.250.0.1",
"host": "IM-testing.fe-notify-examples.com",
"vlan": "0",
"port": "10"
},
"severity": "minr",
"alert-url": "https://127.0.0.1/event_stream/events_for_bot?ev_id=9200&lms_iden=00:24:91:7A:5D:F4",
"explanation": {
"malware-detected": {
"malware": {
"name": "FireEye-TestEvent-SIG-IM",
"stype": "bot-command",
"sid": "30"
}
},
"protocol": "tcp",
"analysis": "content"
},
"occurred": "2015-11-05 20:48:26+00",
"id": "9200",
"action": "notified",
"dst": {
"ip": "127.0.0.20",
"mac": "00:44:44:66:44:BB",
"port": "20"
},
"name": "infection-match"
}
}
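To see which field names should show up once the whole event is indexed, the alert can be flattened outside Splunk. This is a rough sketch that assumes nested keys are joined with dots, the way the FIELDALIAS lines in the question reference `alert.dst.ip`. `flatten` is a hypothetical helper, not part of Splunk or the add-on, and `ALERT` is a trimmed copy of the alert above.

```python
import json

# Trimmed copy of the alert above; the tail is the part reported as missing.
ALERT = '''{
"msg": "extended",
"alert": {
"src": {"ip": "169.250.0.1", "port": "10"},
"occurred": "2015-11-05 20:48:26+00",
"id": "9200",
"action": "notified",
"dst": {"ip": "127.0.0.20", "mac": "00:44:44:66:44:BB", "port": "20"},
"name": "infection-match"
}
}'''

def flatten(obj, prefix=""):
    """Return a dict mapping dotted paths to leaf values."""
    fields = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            fields.update(flatten(value, path))  # descend into nested objects
        else:
            fields[path] = value
    return fields

fields = flatten(json.loads(ALERT))
for name in sorted(fields):
    print(name, "=", fields[name])
```

If names such as `alert.occurred`, `alert.dst.ip` or `alert.name` are absent from a search result, the tail of the event most likely never made it into that event.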
Just make this your time prefix and let's see what happens:
"occurred"
or
occurred
According to the documentation, this is where Splunk starts looking for date patterns, and it should automatically skip the colon and quotes while looking for a date on the same line where the prefix is found.
Do I have to restart splunkd every time I make a change to the FireEye add-on's props.conf file?
Change from
TIME_PREFIX = \"occurred\"\:\s
To
TIME_PREFIX = occurred
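For comparison, here is what each prefix leaves in front of the timestamp on the "occurred" line. This is a sketch in Python only; it does not reproduce Splunk's timestamp processor, and `text_after_prefix` is a hypothetical helper. The third prefix is the one shown in the original question with the backslash lost.

```python
import re
from datetime import datetime

LINE = '"occurred": "2015-11-05 20:48:26+00",'
# TIME_FORMAT from props.conf with the \" escapes written as plain quotes.
TIME_FORMAT = '"%Y-%m-%d %H:%M:%S+00"'

def text_after_prefix(prefix, line):
    """Return the text that follows the first match of prefix, or None."""
    m = re.search(prefix, line)
    return None if m is None else line[m.end():]

escaped = text_after_prefix(r'\"occurred\"\:\s', LINE)   # current setting
bare = text_after_prefix(r'occurred', LINE)              # suggested setting
stripped = text_after_prefix(r'\"occurred\":s', LINE)    # backslash lost

print(repr(escaped))
print(repr(bare))
print(repr(stripped))

# With the current prefix the remaining text starts at the opening quote,
# which is what TIME_FORMAT expects.
parsed = datetime.strptime(escaped.rstrip(","), TIME_FORMAT)
print(parsed)
```

The version without the backslash does not match the line at all, because it requires a literal "s" after the colon.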
Good question. I should have mentioned this sooner.
The props changes we're doing affect search extraction. So the first thing to note is that some of these changes to props/transforms belong on the search heads and some belong on the forwarders / indexers. You can use the same props/transforms in both locations and be fine, but you do need them in both locations.
Another thing I didn't mention is that when you change these, sometimes even after a restart it takes a few minutes for the change to start working. You can run this search to help force the reload / update, but even that doesn't work all the time.
| extract reload=T
Also, this URL is the mother of all refreshes:
http[s]://[splunkweb hostname]:[splunkweb port]/debug/refresh
I generally push props changes, then sit on the server running a search with | extract reload=T for a few minutes until the changes show up.
Not sure what's going on, but the "\" keeps getting removed when I post.
"\" there is one backslash in these quotes. "\" there are two backslashes in these quotes.
I don't understand the "s" in your time prefix... I guess you're using
\s
but it's not showing the \ because you're not in code blocks.
Use the code block button (101010) in the menu bar, or press space 5 times on a new line before typing.
Are the timestamps right on the data? If so, then this isn't the problem and you need line breaks instead. It's been a while since I did JSON in Splunk, and it wasn't fun.
Try this in your props as well:
BREAK_ONLY_BEFORE = {.*"msg"
Timestamps are correct, and I will add the BREAK_ONLY_BEFORE setting as well.
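One thing worth checking before relying on it: the spec text quoted earlier says BREAK_ONLY_BEFORE is tested against each new line and is used when SHOULD_LINEMERGE is true, while the stanza in the question sets SHOULD_LINEMERGE = false. The sketch below tests the suggested pattern line by line in Python, as a stand-in for Splunk's regex engine. It assumes the alert arrives pretty-printed as pasted above; the single-line variant and the `^\{` pattern are hypothetical, included only for comparison.

```python
import re

# First lines of the alert as pasted in this thread (one JSON token per line).
PRETTY_LINES = ['{', '"msg": "extended",', '"product": "Web MPS",', '"alert": {']
# Hypothetical compact form of the same alert on a single line.
COMPACT_LINE = '{"msg": "extended", "product": "Web MPS"}'

suggested = re.compile(r'{.*"msg"')  # pattern suggested above
line_start = re.compile(r'^\{')      # hypothetical alternative: "{" at line start

def matching_lines(regex, lines):
    """Return the indexes of the lines that the regex matches."""
    return [i for i, line in enumerate(lines) if regex.search(line)]

pretty_suggested = matching_lines(suggested, PRETTY_LINES)
pretty_line_start = matching_lines(line_start, PRETTY_LINES)
compact_suggested = matching_lines(suggested, [COMPACT_LINE])

print(pretty_suggested, pretty_line_start, compact_suggested)
```

In the pretty-printed form the opening brace and "msg" sit on different lines, so no single line matches the suggested pattern; it only matches when the event is on one line.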