Splunk Search

Need assistance dedup'ing data

mjshoaf
New Member

I need help figuring out how to correctly dedup the data below. The 10 log messages below represent 4 distinct events (data circuit outages). How can I dedup this data so that the counts are accurate? If I dedup by router name (e.g. rrw01p), I get only 1 event for rrw01p. The counts I expect are:

rrw01p - 2 events
rrw02p - 1 event
intrw02p - 1 event

EDITED: I've grouped the messages to better reflect what I'm trying to communicate. I hope this helps to clarify.

These 4 messages represent 1 outage event:
Jan 2 18:29:29 rrw01p 2001: Jan 2 18:29:28: %BGP-5-ADJCHANGE: neighbor 199.x.x.13 Down BFD adjacency down
Jan 2 18:29:29 rrw01p 1999: Jan 2 18:29:28: %BGP-5-ADJCHANGE: neighbor 152.x.x.73 vpn vrf SIP Down BFD adjacency down
Jan 2 18:29:29 rrw01p 1997: Jan 2 18:29:28: %BGP-5-ADJCHANGE: neighbor 68.x.x.133 Down BFD adjacency down
Jan 2 18:29:29 rrw01p 1995: Jan 2 18:29:28: %BGP-5-ADJCHANGE: neighbor 68.x.x.249 Down BFD adjacency down

This 1 message represents 1 outage event:
Jan 2 18:29:29 rrw02p 2158: Jan 2 18:29:29: %BGP-5-ADJCHANGE: neighbor 199.x.x.249 Down BFD adjacency down

These 4 messages represent 1 outage event:
Dec 7 15:46:57 rrw01p 1959: Dec 7 15:46:56: %BGP-5-ADJCHANGE: neighbor 152.x.x.73 vpn vrf SIP Down BFD adjacency down
Dec 7 15:46:56 rrw01p 1956: Dec 7 15:46:56: %BGP-5-ADJCHANGE: neighbor 199.x.x.13 Down BFD adjacency down
Dec 7 15:46:56 rrw01p 1954: Dec 7 15:46:56: %BGP-5-ADJCHANGE: neighbor 68.x.x.133 Down BFD adjacency down
Dec 7 15:46:56 rrw01p 1952: Dec 7 15:46:56: %BGP-5-ADJCHANGE: neighbor 68.x.x.249 Down BFD adjacency down

This 1 message represents 1 outage event:
Dec 7 15:46:57 intrw02p 2761: Dec 7 15:46:56: %BGP-5-ADJCHANGE: neighbor 4.x.x.249 Down BFD adjacency down


niketn
Legend

@mjshoaf, have you tried dedup by _time and host? For example:

<YourBaseSearch>
| dedup _time host

Otherwise, you can try the following run-anywhere search, which pulls the date from the beginning of the event and overrides _time with it. It also extracts the router name as host, but you should already have that field extracted:

| makeresults
| eval data="Jan 2 18:29:29 rrw01p 2001: Jan 2 18:29:28: %BGP-5-ADJCHANGE: neighbor 199.x.x.13 Down BFD adjacency down;Jan 2 18:29:29 rrw02p 2158: Jan 2 18:29:29: %BGP-5-ADJCHANGE: neighbor 199.x.x.249 Down BFD adjacency down;Jan 2 18:29:29 rrw01p 1999: Jan 2 18:29:28: %BGP-5-ADJCHANGE: neighbor 152.x.x.73 vpn vrf SIP Down BFD adjacency down;Jan 2 18:29:29 rrw01p 1997: Jan 2 18:29:28: %BGP-5-ADJCHANGE: neighbor 68.x.x.133 Down BFD adjacency down;Jan 2 18:29:29 rrw01p 1995: Jan 2 18:29:28: %BGP-5-ADJCHANGE: neighbor 68.x.x.249 Down BFD adjacency down;Dec 7 15:46:57 rrw01p 1959: Dec 7 15:46:56: %BGP-5-ADJCHANGE: neighbor 152.x.x.73 vpn vrf SIP Down BFD adjacency down;Dec 7 15:46:57 intrw02p 2761: Dec 7 15:46:56: %BGP-5-ADJCHANGE: neighbor 4.x.x.249 Down BFD adjacency down;Dec 7 15:46:56 rrw01p 1956: Dec 7 15:46:56: %BGP-5-ADJCHANGE: neighbor 199.x.x.13 Down BFD adjacency down;Dec 7 15:46:56 rrw01p 1954: Dec 7 15:46:56: %BGP-5-ADJCHANGE: neighbor 68.x.x.133 Down BFD adjacency down;Dec 7 15:46:56 rrw01p 1952: Dec 7 15:46:56: %BGP-5-ADJCHANGE: neighbor 68.x.x.249 Down BFD adjacency down"
| makemv data delim=";" 
| mvexpand data
| rename data as _raw
| rex "(?<date>[^\s]+\s[^\s]+\s[^\:]+\:[^\:]+\:[^\s]+\s)(?<host>[^\s]+)\s"
| eval _time=strptime(date,"%b %d %H:%M:%S")
| dedup _time host
____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
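
One caveat, going by the sample data: the Dec 7 messages for rrw01p carry leading timestamps of both 15:46:56 and 15:46:57, so a dedup on the exact second would keep two rows for that one outage. A minimal sketch of one way around this is to round _time into a bucket before the dedup. The 1-minute span is an assumption, not something from the original post, and should be tuned to how long a burst of messages lasts:

```spl
<YourBaseSearch>
| bin _time span=1m
| dedup _time host
```

Fixed buckets can still split a burst that straddles a bucket boundary (for example messages at 18:29:59 and 18:30:00), so treat this as a starting point rather than a complete fix.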

damiensurat
Contributor

It would be good to understand which fields are being extracted from each event, as well as what the string in the actual event data is. For instance, if the timestamp isn't in the event itself, then you can dedup on router name, _raw...

If there are other fields being extracted, such as neighbor, status (e.g. Down), or message (e.g. "Down BFD adjacency down" or "vpn vrf SIP Down BFD adjacency down"), you can dedup on those as well. If you don't see those field extractions, you can create them with the rex command.

This may help you get started to extract fields to allow you to dedup properly:

yoursearch | rex field=_raw "(?<change_type>%.*):\sneighbor\s(?<neighbor_ip>\d{1,3}.(\d{1,3}|x).(\d{1,3}|x).\d{1,3}|x)\s(?<router_message>.*$)" | dedup change_type, neighbor_ip, router_message


mjshoaf
New Member

Base search is 'index = network ADJCHANGE bgp_state="Down"'
The router name is the 'host' field. Other extracted fields are 'bgp_neighbor' and 'bgp_state'.
Each router has multiple bgp neighbors. That's why I'm seeing multiple log messages for one event. In other words, the circuit goes down and each down neighbor results in a separate log message. But I want to count it as just one event. The router name (host) is the common field (along with time) that distinguishes one event from another.
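
Given that description (one outage = one burst of messages from the same host at about the same time), a rough sketch of counting outages per router is to bucket _time, collapse each host/bucket pair into one row, and then count rows per host. It reuses the base search and the host and bgp_neighbor fields named above; the 1-minute span and the output field names neighbors_down and outages are assumptions chosen for illustration:

```spl
index=network ADJCHANGE bgp_state="Down"
| bin _time span=1m
| stats dc(bgp_neighbor) AS neighbors_down BY _time host
| stats count AS outages BY host
```

Against the 10 sample messages this should give 2 outages for rrw01p and 1 each for rrw02p and intrw02p, provided _time is parsed from the leading syslog timestamp. A burst that crosses a minute boundary would be counted twice, so the span may need adjusting.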


damiensurat
Contributor

great... so now all you need to dedup on is the message:

yoursearch | rex field=_raw "%.*(\d{1,3}.(\d{1,3}|x).(\d{1,3}|x).\d{1,3}|x)\s(?<bgp_message>.*)"

The above should produce an extraction of the end of the event, e.g.:
Down BFD adjacency down
vpn vrf SIP Down BFD adjacency down
etc.

search with dedup:

yoursearch | rex field=_raw "%.*(\d{1,3}.(\d{1,3}|x).(\d{1,3}|x).\d{1,3}|x)\s(?<bgp_message>.*)"| dedup host, bgp_state, bgp_neighbor, bgp_message

damiensurat
Contributor

On a side note, in the rex extraction I include the x in the pattern for the IP address:
(\d{1,3}.(\d{1,3}|x).(\d{1,3}|x).\d{1,3}|x)

I wasn't sure if this was part of the actual message or if you were obfuscating the ip with the x's. If you are unfamiliar with regular expressions the above looks for:

\d - any single digit
\d{1,3} - a run of 1 to 3 digits
(\d{1,3}|x) - a run of 1 to 3 digits, or a lowercase x

I'm not sure whether you will have to use an additional \ to escape any special characters, e.g. the \d may need to become \\d; I don't have any data to test with. Hope this helps!
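
For what it's worth, rex patterns in a double-quoted string are normally written with single backslashes; the run-anywhere search earlier in this thread uses \s and \: that way. Below is a small self-contained check against one of the sample lines. It is a variation on the pattern above, not the exact original: the dots are escaped so they only match a literal ".", the alternatives for each octet are grouped, and the field names neighbor_ip and bgp_message are borrowed from the searches in this thread:

```spl
| makeresults
| eval _raw="Jan 2 18:29:29 rrw01p 1999: Jan 2 18:29:28: %BGP-5-ADJCHANGE: neighbor 152.x.x.73 vpn vrf SIP Down BFD adjacency down"
| rex field=_raw "neighbor\s(?<neighbor_ip>\d{1,3}\.(?:\d{1,3}|x)\.(?:\d{1,3}|x)\.(?:\d{1,3}|x))\s(?<bgp_message>.*)"
| table neighbor_ip bgp_message
```

If the pattern behaves as intended, neighbor_ip comes back as 152.x.x.73 and bgp_message as "vpn vrf SIP Down BFD adjacency down". If the x's are only there to obfuscate the addresses, the "|x" alternatives can be dropped for real data.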


kmaron
Motivator

It looks like the message at the end is what makes 2 events for rrw01p. So if you dedup on router name and that message, you should get what you want.


BearMormont
Path Finder

What are the individual field names? You mentioned router number.

What constitutes a unique event? Each of these lines looks distinct to me.


mjshoaf
New Member

The router name (rrw01p, rrw02p) is the key field. I can dedup by that, but I need a time element as well. If I just dedup the above data by router name, it will result in 1 event for rrw01p when there were actually 2 events for rrw01p (18:29 on Jan 2 and 15:46 on Dec 7).
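
One possible way to bring in that time element, sketched under assumptions: the transaction command can group each router's messages that arrive within a few seconds of one another into a single outage, which can then be counted per host. The base search is the one quoted elsewhere in this thread; the 5-second maxpause and the outages field name are illustrative guesses, not values from the original posts:

```spl
index=network ADJCHANGE bgp_state="Down"
| transaction host maxpause=5s
| stats count AS outages BY host
```

Because the grouping is based on the gap between consecutive messages rather than on fixed time buckets, a burst is not split just because it crosses a minute boundary. transaction tends to be more expensive than stats on large result sets, so it may be worth narrowing the time range first.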


tiagofbmm
Influencer

Hey

What connects your events? Is there a field you can use to tie them together?


mjshoaf
New Member

The router name (rrw01p, rrw02p) is the key field.
