Splunk Search

joining across field matrix

rizzo75
Path Finder

Hi -
I am trying to wrap my head around the following search - looking at join, appendcols and map commands to get the job done, but I am at a loss.

I have about 3000 IP address pairs(endpoints of IP connection) I want to join to network device logs. I would like to find the earliest match in the device logs. The join needs to happen across a matrix of fields to capture all events.

IP address pairs
Time,Id,HostA,HostB
12/14/2013 05:01:00,1,1.1.1.1,2.2.2.1
12/14/2013 06:02:00,2,1.1.1.2,2.2.2.2
12/14/2013 07:03:00,3,2.2.2.3,1.1.1.3
12/14/2013 08:03:00,4,1.1.1.4,2.2.2.4
...

Fields from network device
TimeSeen,LocalIP,RemoteIP,OtherFields
12/14/2013 05:01:11,1.1.1.1,2.2.2.1,foo
12/14/2013 05:02:22,2.2.2.2,1.1.1.2,bar
12/14/2013 05:03:33,1.1.1.3,2.2.2.3,foobar
12/14/2013 05:01:05,2.2.2.1,1.1.1.1,bar
...

How could I join both data sources across the fields with IP data? The logic would need to compare 2 different field sets.
IE - HostA=LocalIP HostB=RemoteIP OR HostA=RemoteIP HostB=LocalIP

And output a single event for each of the IP address pairs with the earliest event found in the network device logs, with fields from the both sources?
IE -
Time,Id, HostA, HostB,TimeSeen,LocalIP, RemoteIP, OtherFields
"12/14/2013:05:01:00",1,1.1.1.1,2.2.2.1,"12/14/2013:05:01:05",2.2.2.1,1.1.1.1,bar

Maybe I need to make a multi-value field out of each IP pair and join on that.

I am also unclear on how to find the earliest event based on 3 fields(TimeSeen,LocalIP,RemoteIP), then output all fields in the event. stats earliest() only seems to accept 1 field.

Any help is greatly appreciated.

Thanks,
Joe

Tags (1)
0 Karma
1 Solution

rizzo75
Path Finder

I found a solution by creating a multi-value field with the IP address pairs, then joining on that field.

When joining on multi-value fields, the order matters. I used the answer in this question to sort the values in the multi-value field. http://answers.splunk.com/answers/11394/is-it-possible-to-sort-or-reorder-a-multivalue-field

|inputcsv testing/scorecard.csv | eval ips=HostA+","+HostB | makemv ips delim="," | mvdedup +ips | join type=left ips [search index=testingdevices | sort + _time | eval ips=LocalIP+","+RemoteIP | makemv ips delim="," | mvdedup +ips] | fields Time Id HostA HostB TimeSeen LocalIP RemoteIP OtherFields | fillnull

       Time         Id  HostA   HostB       TimeSeen       LocalIP RemoteIP OtherFields
------------------- -- ------- ------- ------------------- ------- -------- -----------
12/14/2013 05:01:00  1 1.1.1.1 2.2.2.1 12/14/2013 05:01:05 2.2.2.1 1.1.1.1  bar
12/14/2013 06:02:00  2 1.1.1.2 2.2.2.2 12/14/2013 05:02:22 2.2.2.2 1.1.1.2  bar
12/14/2013 07:03:00  3 2.2.2.3 1.1.1.3 12/14/2013 05:03:33 1.1.1.3 2.2.2.3  foobar
12/14/2013 08:03:00  4 1.1.1.4 2.2.2.4 0                   0       0        0

I am happy with the results but wonder if there is a "better" way of doing this.

Thanks,
Joe

View solution in original post

0 Karma

rizzo75
Path Finder

I found a solution by creating a multi-value field with the IP address pairs, then joining on that field.

When joining on multi-value fields, the order matters. I used the answer in this question to sort the values in the multi-value field. http://answers.splunk.com/answers/11394/is-it-possible-to-sort-or-reorder-a-multivalue-field

|inputcsv testing/scorecard.csv | eval ips=HostA+","+HostB | makemv ips delim="," | mvdedup +ips | join type=left ips [search index=testingdevices | sort + _time | eval ips=LocalIP+","+RemoteIP | makemv ips delim="," | mvdedup +ips] | fields Time Id HostA HostB TimeSeen LocalIP RemoteIP OtherFields | fillnull

       Time         Id  HostA   HostB       TimeSeen       LocalIP RemoteIP OtherFields
------------------- -- ------- ------- ------------------- ------- -------- -----------
12/14/2013 05:01:00  1 1.1.1.1 2.2.2.1 12/14/2013 05:01:05 2.2.2.1 1.1.1.1  bar
12/14/2013 06:02:00  2 1.1.1.2 2.2.2.2 12/14/2013 05:02:22 2.2.2.2 1.1.1.2  bar
12/14/2013 07:03:00  3 2.2.2.3 1.1.1.3 12/14/2013 05:03:33 1.1.1.3 2.2.2.3  foobar
12/14/2013 08:03:00  4 1.1.1.4 2.2.2.4 0                   0       0        0

I am happy with the results but wonder if there is a "better" way of doing this.

Thanks,
Joe

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Announcing Modern Navigation: A New Era of Splunk User Experience

We are excited to introduce the Modern Navigation feature in the Splunk Platform, available to both cloud and ...

Modernize your Splunk Apps – Introducing Python 3.13 in Splunk

We are excited to announce that the upcoming releases of Splunk Enterprise 10.2.x and Splunk Cloud Platform ...

Step into “Hunt the Insider: An Splunk ES Premier Mystery” to catch a cybercriminal ...

After a whole week of being on call, you fell asleep on your keyboard, and you hit a sequence of buttons that ...