Splunk Search

Multiple events get aggregated into a single event

skippylou
Communicator

I've noticed this mainly with snort logs so far, but it appears that when events from the same source host have the same HH:MM:SS it aggregates them into a single event. From a view standpoint it's nice, but from a 'top' or statistics standpoint it is "hiding" data. Is this default behavior or is there some way to work around this?

I've included a sample 'single splunk event' which aggregated two snort events into one:

48  7/16/10
7:42:12.375 AM  

[**] [1:1419:12] SNMP trap udp [**]
[Classification: Attempted Information Leak] [Priority: 2] 
07/16-07:42:12.375683 A.B.C.D:61421 -> E.F.G.H:162
UDP TTL:64 TOS:0x0 ID:18253 IpLen:20 DgmLen:155
Len: 127
[Xref => http://cve.mitre.org/cgi-bin/cvename.cgi?name=2002-0013][Xref => http://cve.mitre.org/cgi-bin/cvename.cgi?name=2002-0012][Xref => http://www.securityfocus.com/bid/4132][Xref => http://www.securityfocus.com/bid/4089][Xref => http://www.securityfocus.com/bid/4088]
[**] [1:1419:12] SNMP trap udp [**]
[Classification: Attempted Information Leak] [Priority: 2] 
07/16-07:42:12.379017 A.B.C.D:61421 -> E.F.G.H:162
UDP TTL:64 TOS:0x0 ID:18256 IpLen:20 DgmLen:177
Len: 149
[Xref => http://cve.mitre.org/cgi-bin/cvename.cgi?name=2002-0013][Xref => http://cve.mitre.org/cgi-bin/cvename.cgi?name=2002-0012][Xref => http://www.securityfocus.com/bid/4132][Xref => http://www.securityfocus.com/bid/4089][Xref => http://www.securityfocus.com/bid/4088]

Thanks,

Scott

1 Solution

Lowell
Super Champion

Yes, it's always recommended to get your individual events broken properly when you initially set up Splunk. If you want to re-join your events later on for viewing convenience, you can always do that at search time using the transaction command, for example. So it is very important to get this kind of index-time logic set up correctly as early as possible.

Are your events showing up with the sourcetype of "snort" or some other value? If they are not, then it makes sense to first try Splunk's built-in "snort" sourcetype. One way to force your log files to be picked up as a specific sourcetype is by adding an entry like this in your props.conf config file:

[source::/var/log/path/snort*.log]
# Adjust the pattern so it matches your local snort path
sourcetype = snort

If you need your own sourcetype (if the "snort" sourcetype doesn't work for you), then, based purely on your example given (in other words, I don't have any logs like this myself, so I'm making a best guess here), you could try using a custom sourcetype definition like this:

[my_sourcetype]
SHOULD_LINEMERGE = True
TIME_FORMAT = %m/%d-%H:%M:%S.%6N
MAX_TIMESTAMP_LOOKAHEAD = 100
BREAK_ONLY_BEFORE_DATE = False
BREAK_ONLY_BEFORE = ^\[\*\*\] \[[\d:]+\] \S+ trap

Someone who understands this sourcetype better could probably give you a better sourcetype definition. But perhaps this will be enough to get you started.


Update: Now that I know you are using the "snort" sourcetype, I did some more looking, and it appears that the defined line breaker is a string that looks like: =+=+=+=+=+ which I don't see in your example. That would explain why events are getting combined together inappropriately. So I suggest that you either create your own sourcetype and use it, or update Splunk's "snort" sourcetype (the first option is probably better). I think the example I gave above will work for you, but it may require some tweaking.
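
If you want to sanity-check a BREAK_ONLY_BEFORE pattern before touching props.conf, here is a rough Python sketch. It only approximates Splunk's line merging (start a new event whenever a line matches the pattern), the helper name split_events is made up for illustration, and the sample is a trimmed copy of the two alerts from the question. Python's re module is not the regex engine Splunk uses, so treat this as a quick check and not as proof of how the indexer will behave.

```python
import re

# Pattern taken from the custom sourcetype above (BREAK_ONLY_BEFORE).
BREAK_ONLY_BEFORE = re.compile(r"^\[\*\*\] \[[\d:]+\] \S+ trap")

def split_events(raw, pattern):
    """Hypothetical helper: begin a new event at every line matching pattern."""
    events = []
    for line in raw.splitlines():
        if pattern.search(line) or not events:
            events.append([line])      # matching line opens a new event
        else:
            events[-1].append(line)    # other lines join the current event
    return ["\n".join(lines) for lines in events]

# Trimmed copy of the sample from the question (Xref lines left out).
SAMPLE = """[**] [1:1419:12] SNMP trap udp [**]
[Classification: Attempted Information Leak] [Priority: 2]
07/16-07:42:12.375683 A.B.C.D:61421 -> E.F.G.H:162
UDP TTL:64 TOS:0x0 ID:18253 IpLen:20 DgmLen:155
Len: 127
[**] [1:1419:12] SNMP trap udp [**]
[Classification: Attempted Information Leak] [Priority: 2]
07/16-07:42:12.379017 A.B.C.D:61421 -> E.F.G.H:162
UDP TTL:64 TOS:0x0 ID:18256 IpLen:20 DgmLen:177
Len: 149"""

events = split_events(SAMPLE, BREAK_ONLY_BEFORE)
print(len(events))  # number of events after splitting
```

With this pattern the sample should come out as two separate events instead of one merged event. Note that the pattern only matches rule names containing "trap", so other alert types would need a more general expression.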



Michael
Contributor

Same problem here, with Suricata. Happening with all output: http.log, tls.log, access_log, etc..

I posted about this many months ago, without resolution. I'm surprised that hardly anyone else is seeing this. I wonder if it's something to do with the output files themselves. Suricata and Snort are so similar, it's possible...

If you discover a fix, please post it!

Mike

0 Karma

skippylou
Communicator

In case anyone runs into this, I accomplished better results by overriding the BREAK_ONLY_BEFORE settings in etc/system/default/props.conf by including the following in etc/system/local/props.conf:

[snort] 
BREAK_ONLY_BEFORE = ^\[\**\] .* \[\**\]$

This applies to snort alerts generated in alert_full mode.

Scott

skippylou
Communicator

Ha, was adding the anchors as you were adding a comment - agreed. Gotcha, on the greedy, will do.

0 Karma

Lowell
Super Champion

Also, it would probably be better to use a non-greedy form of ".*", like ".*?". So my final recommendation would look like: ^\[\*+\] .*? \[\*+\] (The formatting in my last comment got screwed up; text formatting on here and regex don't play well together...)
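
To illustrate the greedy versus non-greedy difference, here is a small Python sketch. The test line with three markers is invented for the example (real alert_full headers have only two), and Python's re module stands in for Splunk's regex engine, so this is only an approximation.

```python
import re

GREEDY = r"^\[\*+\] .* \[\*+\]"
LAZY = r"^\[\*+\] .*? \[\*+\]"

# Invented line with three markers, to make the difference visible.
line = "[**] first [**] second [**]"
greedy = re.match(GREEDY, line)
lazy = re.match(LAZY, line)
print(greedy.group(0))  # greedy form runs to the last marker
print(lazy.group(0))    # lazy form stops at the first closing marker

# Real header from the question: both forms match the same text.
header = "[**] [1:1419:12] SNMP trap udp [**]"
print(re.match(GREEDY, header).group(0) == re.match(LAZY, header).group(0))
```

For deciding where to break an event, both forms fire on the same header line. The non-greedy form just stops scanning as soon as it finds the first closing marker.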

0 Karma

Lowell
Super Champion

Keep in mind that \[\**\] also matches just "[]". You may want to use \[\*+\] to require at least one "*", or you could use \[\*{2}\] to match only the literal "[**]". You may get better performance if you stick a "^" at the front as well, which tells the regex engine that your match has to be at the start of the line; otherwise it will search the entire line.
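
A quick way to see how the three bracket patterns differ is to try them against a few strings. This is a minimal sketch in Python; the test strings are made up, and Python's re module is assumed to treat these simple quantifiers the same way Splunk's engine does.

```python
import re

star = re.compile(r"\[\**\]")     # zero or more asterisks
plus = re.compile(r"\[\*+\]")     # one or more asterisks
exact = re.compile(r"\[\*{2}\]")  # exactly two asterisks

results = {}
for text in ["[]", "[*]", "[**]", "[***]"]:
    results[text] = (
        bool(star.fullmatch(text)),
        bool(plus.fullmatch(text)),
        bool(exact.fullmatch(text)),
    )
    print(text, results[text])
```

Only the first pattern accepts the empty "[]", and only "[**]" is accepted by all three.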

0 Karma

skippylou
Communicator

Yep, looks perfect, thanks. The 48 and date/time are the splunk event one as you suspected.

0 Karma

Lowell
Super Champion

I've tried to clean up the formatting on this. Does the above look correct? I assume the "48" and date/time were copied from the Splunk web interface (and are therefore not part of your event). Is the event on multiple lines like this, or on one longer line? It makes a big difference when you go to index the data.

0 Karma

Michael
Contributor

Question for you (Lowell or skippylou),

Are you editing the props.conf on the sending system, forwarders, indexers, or where? I have a clustered environment...

Thanks!
Mike

0 Karma

skippylou
Communicator

Even stranger is that each alert is separated by a new line... no =+ combos anywhere to be found. Guess I'll add a new one with just the new line.

0 Karma

Lowell
Super Champion

If you're talking about this post: http://answers.splunk.com/questions/4350/set-sourcetype-by-source-with-props-conf-not-working then I would guess that was a custom setup and not a Splunk built-in default. It's very common to set up your own custom sourcetypes. You could always post a comment to the author and ask if he or she would be willing to share his/her props.conf settings. Best of luck!

0 Karma

skippylou
Communicator

Interesting, thanks for the info. In my internet search I came across a mention that there may have been a snort_alert_full sourcetype at one time in Splunk, which I would have thought would match snort's alert_full output, but I couldn't find it in 4.1.3, so I thought the default snort one would work just fine. I'll give it a go.

Thanks.

0 Karma

Lowell
Super Champion

I've updated my answer based on your response. It doesn't look like the built-in "snort" sourcetype is what you want to be using. It doesn't match what you posted in your sample event.

0 Karma

skippylou
Communicator

Yep, showing up as sourcetype="snort". I am in fact using the sourcetype of snort in the inputs on the light forwarder - which I assume is how it is getting applied.

0 Karma