Splunk Search

How to find same values in different sources?

Momgineer
Engager

Our login page is developed by team1 and the main home page (After login) is developed by team2. The event logs from each use completely different structures. I strongly suspect unique system identifiers in the login logs may be carried into the home page logs, but I don't know which fields (out of 20-50 fields in each log) may contain similar values. 

Is there a method to find fields that have the same value in both sources if I don't know which fields to match on? 

(index=A sourcetype="login" colA="apple", colB="ABC123" , colC="purple")
(index=B sourcetype="home" field1="yellow", field2="orange", ..., field20="ABC123", field21="Monkey")

How can I search both sources to identify ( login.colB == home.field20) if I don't know in advance those fields match? I may not find ANY common values...

Labels (3)
0 Karma
1 Solution

JacekF
Path Finder

If you just want to find fields with the same value, you can use something like that:

(index="A" sourcetype="login") OR (index="B" sourcetype="home")
|  eval kv = "---"
| foreach *
 [|  eval kv = mvappend(kv, index + ":" + sourcetype + ":<<FIELD>>" + "|" + '<<FIELD>>')]
 | eval kv=mvfilter(!match(kv, "---"))
 | fields kv
 | mvexpand kv
 | makemv kv delim="|"
 | eval field=mvindex(kv,0)
 | eval value=mvindex(kv,1)
 | stats values(field) as fields by value
 | where mvcount(fields) > 1

This should give you a table of values and a fields list this value appears in.

For example, the below query:

|  makeresults 1
| eval index="A", sourcetype="login", colA="apple", colB="ABC123" , colC="purple"
| append 
    [| makeresults 1
    | eval index="B", sourcetype="home", field1="yellow", field2="orange", field3="lemon", field4="tomato", field5="potato", field20="ABC123", field21="Monkey"]
|  eval kv = "---"
| foreach *
 [|  eval kv = mvappend(kv, index + ":" + sourcetype + ":<<FIELD>>" + "|" + '<<FIELD>>')]
 | eval kv=mvfilter(!match(kv, "---"))
 | fields kv
 | mvexpand kv
 | makemv kv delim="|"
 | eval field=mvindex(kv,0)
 | eval value=mvindex(kv,1)
 | stats values(field) as fields by value
 | where mvcount(fields) > 1

produces the following results:

valuefields
ABC123

A:login:colB
B:home:field20

which means that value "ABC123" appears in index A sourcetype login colB and index B sourcetype home field20

View solution in original post

0 Karma

JacekF
Path Finder

If you just want to find fields with the same value, you can use something like that:

(index="A" sourcetype="login") OR (index="B" sourcetype="home")
|  eval kv = "---"
| foreach *
 [|  eval kv = mvappend(kv, index + ":" + sourcetype + ":<<FIELD>>" + "|" + '<<FIELD>>')]
 | eval kv=mvfilter(!match(kv, "---"))
 | fields kv
 | mvexpand kv
 | makemv kv delim="|"
 | eval field=mvindex(kv,0)
 | eval value=mvindex(kv,1)
 | stats values(field) as fields by value
 | where mvcount(fields) > 1

This should give you a table of values and a fields list this value appears in.

For example, the below query:

|  makeresults 1
| eval index="A", sourcetype="login", colA="apple", colB="ABC123" , colC="purple"
| append 
    [| makeresults 1
    | eval index="B", sourcetype="home", field1="yellow", field2="orange", field3="lemon", field4="tomato", field5="potato", field20="ABC123", field21="Monkey"]
|  eval kv = "---"
| foreach *
 [|  eval kv = mvappend(kv, index + ":" + sourcetype + ":<<FIELD>>" + "|" + '<<FIELD>>')]
 | eval kv=mvfilter(!match(kv, "---"))
 | fields kv
 | mvexpand kv
 | makemv kv delim="|"
 | eval field=mvindex(kv,0)
 | eval value=mvindex(kv,1)
 | stats values(field) as fields by value
 | where mvcount(fields) > 1

produces the following results:

valuefields
ABC123

A:login:colB
B:home:field20

which means that value "ABC123" appears in index A sourcetype login colB and index B sourcetype home field20

0 Karma

Momgineer
Engager
Spoiler
This is great! Exactly what I was hoping for. Thank you!
0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Do the unique system identifiers have a known and consistent format e.g. AAANNN as in your example? If so, you could try extracting a new field with rex based on this pattern

| rex max_match=0 "\"(?<uid>[a-zA-Z]{3}\d{3})\""
0 Karma

Momgineer
Engager

Thank you for the input. Unfortunately, they are just random system-generated UIDs: 

1d1d33c5-0e5a-4cbe-afc4-c8c514ff62f4
d9be033a-703c-4f6b-a5de-c514f80e1a47

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

OK Thanks - that would have been useful to know up front though!

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

[Puzzles] Solve, Learn, Repeat: Character substitutions with Regular Expressions

This challenge was first posted on Slack #puzzles channelFor BORE at .conf23, we had a puzzle question which ...

Splunk Community Badges!

  Hey everyone! Ready to earn some serious bragging rights in the community? Along with our existing badges ...

[Puzzles] Solve, Learn, Repeat: Matching cron expressions

This puzzle (first published here) is based on matching timestamps to cron expressions.All the timestamps ...