Solved: Join fields together

JandrevdM · ‎10-29-2024

Good day,

Is there a way to join all my rows into one?

My simple query

index=collect_identities sourcetype=ldap:query user
| dedup email
| table email extensionAttribute10 extensionAttribute11 first last identity

Shows results as, as I have more than one email

email	extensionAttribute10	extensionAttribute11	first	last	identity
user@domain.com		user@consultant.com	User	Surname	USurname
userT1@domain.com	user@domain.com	user@domain.com	User	Surname	USurname
userT0@domain.com	user@domain.com	user@domain.com	User	Surname	USurname

I want to add a primary key that searches for "user@domain.com" and display all their email addresses that they have in one row.
Example

email	extensionAttribute10	extensionAttribute11	first	last	identity	email2	email3
user@domain.com	user@domain.com	user@consultant.com	User	Surname	USurname	userT1@domain.com	userT0@domain.com

gcusello · ‎10-29-2024

Hi @JandrevdM ,

you must find a common key between the records, if identity is your key, you could try something like this:

<your_search>
| stats 
     values(email) AS email
     values(extensionAttribute10) AS extensionAttribute10
     values(extensionAttribute119 AS extensionAttribute11
     values(first) AS first
     values(last) AS last
     BY identity

Ciao.

Giuseppe

View solution in original post

yuanliu · ‎10-30-2024

The problem here is unclear requirement: What is the logic to collapse the three rows after dedup into that single row?

As @gcusello speculates, the three rows have common values of identity. Is this correct? Such should be be stated explicitly.
The mock data also shows identical first and last for the three rows. Is this always true? Such should be stated explicitly, too.
More intricately, the mock data contains different values of extensionAttribute11 and extensionAttribute10. What are the criteria of choosing one or another from these differing values in the collapsed table? Volunteers here cannot read minds.
extensionAttribute10 in one of the three rows is blank; that in the rest rows is the same value. One can reasonably speculate that you want the non-blank value to be used in the collapsed table. But is this speculation correct? Are all non-blank values identical? Again, do not make volunteers read your mind.
Additionally, what is the logic to determine which value remains with field name email, which goes to email2, email3, etc.?

In the following example, I'll take arbitrary selection among emails (5), take every value of extensionAttribute11 (3), and take affirmative in (4). You get

email	extensionAtttribute10	extensionAttribute11	first	last	identity	email2	email3
user@domain.com	user@domain.com	user@consultant.com user@domain.com	User	Surname	USurname	userT0@domain.com	userT1@domain.com

This the search

index=collect_identities sourcetype=ldap:query user
| stats values(*) as * by first last identity
| eval idx = mvrange(1, mvcount(email))
| eval json = json_object()
| foreach idx mode=multivalue
    [eval ordinal = <<ITEM>> + 1, json = json_set(json, "email" . ordinal, mvindex(email, <<ITEM>>))]
| spath input=json
| eval email = mvindex(email, 0)
| table email extension* first last identity email*

(Of course, you can reduce extensionAttribute11 to one value if you know the logic.) Here is an emulation. Play with it and compare with real data.

| makeresults format=csv data="email,	extensionAttribute10,	extensionAttribute11,	first,	last,	identity
user@domain.com,	 ,	user@consultant.com,	User,	Surname,	USurname
userT1@domain.com,	user@domain.com,	user@domain.com,	User,	Surname,	USurname
userT0@domain.com,	user@domain.com,	user@domain.com,	User,	Surname,	USurname"
``` the above emulates
index=collect_identities sourcetype=ldap:query user
```

gcusello · ‎10-29-2024

Hi @JandrevdM ,

you must find a common key between the records, if identity is your key, you could try something like this:

<your_search>
| stats 
     values(email) AS email
     values(extensionAttribute10) AS extensionAttribute10
     values(extensionAttribute119 AS extensionAttribute11
     values(first) AS first
     values(last) AS last
     BY identity

Ciao.

Giuseppe

JandrevdM · ‎10-29-2024

Thanks is there any way to split it, I tried this but it is not working

index=collect_identities sourcetype=ldap:query 
| dedup email
| eval identity=replace(identity, "Adm0", "")
| eval identity=replace(identity, "Adm", "")
| eval identity=lower(identity)
| stats 
    values(email) AS email
    values(extensionAttribute10) AS extensionAttribute10
    values(extensionAttribute11) AS extensionAttribute11
    values(first) AS first
    values(last) AS last
    BY identity
| eval email=split(email, "")
| eval extensionAttribute10=split(extensionAttribute10, "")
| eval extensionAttribute11=split(extensionAttribute11, "")
| eval first=split(first, "")
| eval last=split(last, "")
| mvexpand email
| mvexpand extensionAttribute10
| mvexpand extensionAttribute11
| mvexpand first
| mvexpand last

gcusello · ‎10-29-2024

Hi @JandrevdM ,

what do you mean with "split"?

if you have a multivalue field, you could try with mvexpand to have a row for each value or nomv to have all the values in one row.

Ciao.

Giuseppe

JandrevdM · ‎10-29-2024

Instead of having

email

user@domain.com
userT1@domain.com

I would like to then split it

email 1	email 2
user@domain.com	userT2@domain.com

PickleRick · ‎10-29-2024

Well, you probably can using the foreach command and the {} notation (or alternatively using some mv* magic with kv command in the rnd) but the question is - what for. It's usually the other way around that's the problem - to normalize your data when you have multiple fields holding "the same" data.

gcusello · ‎10-29-2024

Hi @JandrevdM ,

do you know the max number of emails?

if they are fixed (always the same quantity), you could use nomv and then a regex to divide the emails.

Ciao.

Giuseppe

JandrevdM · ‎10-29-2024

I do not know the number of max emails. But I believe the first answer will be sufficient and I will try and work around it on my join

gcusello · ‎10-29-2024

Hi @JandrevdM ,

good for you, see next time!

let us know if we can help you more, or, please, accept one answer for the other people of Community.

Ciao and happy splunking

Giuseppe

P.S.: Karma Points are appreciated by all the contributors 😉

Join fields together

other

Enterprise Security Content Update (ESCU) | New Releases

Why am I not seeing the finding in Splunk Enterprise Security Analyst Queue?

Index This | What are the 12 Days of Splunk-mas?