Solved: How to reference same field name from 2 different ...

neerajs_81 · ‎05-01-2022

Hi All,

I need to correlate data from 2 different Indexes wherein the field name is common.

Index=idx1 ( This index has general user info)
Field Name: sys_created_by
Value: <email id of the user>
Other fields in idx1 of interest: login_time

Index=idx2 ( This is the Index which has URLs accessed by the user)
Field Name: sys_created_by
Value: <email id of the user>

The url information is stored in a field called "url" in idx2.

Use case is to take the sys_created_by field from IDX1 and lookup/search for all urls in IDX2 accessed by the sys_created_by coming from idx1. I cannot rely on sys_created_by field from idx2 alone as it doesn't have all the other user attributes that are in IDX1 such as login_time. Hence i need to correlate data across the two indexes.

Do i need to do which will merge the sys_created_by from both indexes ?

eval common_field = coalesce(sys_created_by, sys_created_by)

I tried something like :

(index=idx1 sys_created_by!="") OR (index=idx2 sys_created_by!="" url!="")
| stats values(url) values(login_time) BY sys_created_by

But this doesn't show results as expected.
Is there a way to reference my common field like shown below in BY ; to tell Splunk which idx it needs to refer ?

| stats values(url), values(login_time) BY ( idx2.sys_created_by)

bowesmana · ‎05-01-2022

Coalesce is not needed, there is effectively only one field 'sys_created_by' and it will have a value depending on which index it comes from

Coalesce is used in this scenario, where the field name is different, e.g.

| eval common_field=coalesce(field_from_idx1, field_from_idx2)
| stats xxxx by common_field

but you don't need that.

View solution in original post

bowesmana · ‎05-01-2022

Your approach of

(index=idx1 sys_created_by!="") OR (index=idx2 sys_created_by!="" url!="")
| stats values(url) values(login_time) BY sys_created_by

should be exactly what you need. It does not matter that the data has the same field in both indexes, splunk does not care which index the event is from. If sys_created_by="XX" it will be the same for both indexes, so your stats values command will group all urls and login times for XX

What results did it show and why was it unexpected?

neerajs_81 · ‎05-01-2022

Thanks for responding. But do i need to use the coalesce command before the stats command ? Or is the coalesce not required at all and i can directly run the stats ?

ITWhisperer · ‎05-01-2022

coalesce works on different fields in the same event not the same field in different events

bowesmana · ‎05-01-2022

Although in the case of two data sets, it would be different fields in different events if only one of the fields is present in each data set. Thereby creating the 'join' field.

bowesmana · ‎05-01-2022

Coalesce is not needed, there is effectively only one field 'sys_created_by' and it will have a value depending on which index it comes from

Coalesce is used in this scenario, where the field name is different, e.g.

| eval common_field=coalesce(field_from_idx1, field_from_idx2)
| stats xxxx by common_field

but you don't need that.

neerajs_81 · ‎05-02-2022

bowesmana / @ITWhisperer
what about the case, when the common_field changes values across indexes ?
For example:
In Idx1 , i have sys_created_by = johnk@domain.com
In idx2, sys_created_by = some_other_user@domain.com but user_email=johnk@domain.com

If i want to do a stats grouping for johnk@domain.com , how to go about this ? The below coalesce command does not show desired results.

| eval common_field= coalesce(sys_created_by, user_email)

common_field ends up showing the value of some_other_user@domain.com .

But my requirement is to grab johnk@domain.com into some common field and leverage it further to stats group by .
Pls advise.

bowesmana · ‎05-02-2022

@neerajs_81

You need to decide which data (john@) is important and which field that comes from.

If your data has

idx1 (sys_created_by)

idx2 (sys_created_by and user_email)

then IFF user_email contains the DATA that is important to your grouping, then as idx1 does NOT contain that field, you would use any one of these three constructs (all will produce the same outcome)

| eval group_by_field=coalesce(user_email, sys_created_by)
| eval group_by_field=if(isnotnull(user_email), user_email, sys_created_by)
| eval group_by_field=if(index="idx2", user_email, sys_created_by)

Hope this helps

How to reference same field name from 2 different indexes when correlating ?

fields

stats

subsearch

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Monitoring AI Agents with Splunk Observability Cloud

[Puzzles] Solve, Learn, Repeat: Tiling

SOK it to Me: Top 3 Benefits of Using Splunk Operator on Kubernetes that’ll Make ...

Join the Conversation