Re: How to table a single event field joined with ...

mkulicke · ‎02-17-2022

Hi. I am having trouble figuring out how to execute this, although it's probably simple:

search 1 | field 1 | join [ search 2 | field 2 ] | table field 1, field 2

each instance of field 1 will return multiple values for field 2. i want to table both fields such that every value for field 2 is printed next to its corresponding value for field 1. The left column will have some duplicate values, the right column will have only unique values

I want a table that looks like this:

FIELD 1	FIELD 2
value A	value 1
value A	value 2
value A	value 3
value B	value 1
value B	value 2
value C	value 1
value C	value 2
value C	value 3

this is my actual search:

index=soe_app_retail sourcetype="vg:hvlm" source="*prd/vpa*" "*NumberOfRules*" | rex field=_raw "poid=(?<field_1>\d+)"
| join type=inner uid [ search index=soe_app_retail sourcetype="vg:hvlm" source="*prd/vpa*" "*upper*" |rename message as field_2]
| table field_1, field_2

right now i am getting only 1 row for each field_1 value, even though I know there are multiple values for field_2 for each field_1. I think it involves MV expand but I can't figure it out

ITWhisperer · ‎02-22-2022

Try

search 2 | field 2 | join [ search 1 | field 1 ] | table field 1, field 2

You only get one result from the join per event in the first search so try switching the searches around

ITWhisperer · ‎02-17-2022

Have you tried

| mvexpand field_2

mkulicke · ‎02-18-2022

Yes, that does not achieve the desired result

yuanliu · ‎02-18-2022

Could you describe the data characteristics more precisely? Did you determine that there must be multiple values of field_2 for each field_1 just because there are more values of field_1 than that of field_2, or are there some additional observations that you can disclose? More specifically, are there rows in which field_2 is null? In other words, does this filter give any result?

| where isnull(field_2)

Alternatively, add the following to study if some field_1 values has no corresponding field_2

| dedup field_1

In regard to the two searches, it looks like the only difference between them is the one string. The cost of join may not be worth it unless the two searches have to be conducted in different time periods. Consider a simple search like

index=soe_app_retail sourcetype="vg:hvlm" source="*prd/vpa*" ("*NumberOfRules*" OR "*upper*")
| rex field=_raw "poid=(?<field_1>\d+)"
| rename message as field_2

| stats values(field_1) as field_1 values(field_2) as field_2 by uid ``` join values of the two fields by uid ```
| mvexpand field_1

The above assumes that there is no row that contains poid AND NumberOfRules but NOT upper, or row that contains message AND upper but NOT NumberOfRules. If these are not true, you can filter them out like

index=soe_app_retail sourcetype="vg:hvlm" source="*prd/vpa*" ("*NumberOfRules*" OR "*upper*")
| rex field=_raw "poid=(?<field_1>\d+)"
| rex "(?<upper>upper)" | eval field_1=if(isnull(upper),null(),field_1) ``` these two are only necessary if poid exists in rows that contain NumberOfRules but not upper ```
| rename message as field_2
| rex "(?<NumberOfRules>NumberOfRules)" | eval field_2=if(isnull(NumberOfRules),null(),field_2) ``` these two are only neccessary if message exists in rows that contain upper but not NumberOfRules ```

| stats values(field_1) as field_1 values(field_2) as field_2 by uid ``` join values of the two fields by uid ```
| mvexpand field_1

mkulicke · ‎02-22-2022

@yuanliu Thank you for the in-depth reply. Let me see if I can rephrase the issue with more specificity ( i have to be cautious with divulging data, forgive me if It makes my reply obtuse).

In my first search, I am looking for all examples of the string "numberOfRules" and then extracting a different substring (poid) which I am naming field_1:

index=soe_app_retail sourcetype="vg:hvlm" source="*prd/vpa*" "*NumberOfRules*" | rex field=_raw "poid=(?<field_1>\d+)"

There will NEVER be duplicate values of this field, and I want to table it. Each event returned by this search has another unique value (uid) which also will never be duplicated, and I use this field to join to search 2:

index=soe_app_retail sourcetype="vg:hvlm" source="*prd/vpa*" "*upper*" | rename message as field_2]

There will often, but not always be multiple events (messages) returned for each unique uid.

So, if I search:

index=soe_app_retail sourcetype="vg:hvlm" source="*prd/vpa*" "*NumberOfRules*" "*UNIQUE_UID_A*" | rex field=_raw "poid=(?<field_1>\d+)" | table field 1, uid

I will only get 1 event, which will table like this:

field_A (abcde)

uid_A (1234)

if I search this:

index=soe_app_retail sourcetype="vg:hvlm" source="*prd/vpa*" "*upper*" "*UNIQUE_UID_A*"| rename message as field_2] | table uid, field_2

I will get this table:

uid_A (1234)	unique_message (first message w/ 1234)
uid_A (1234)	unique_message second message w/ 1234)
uid_A (1234)	unique_message third message w/ 1234)

what I want is:

field_A (abcde)	uid_A (1234)	unique_message (first message w/ 1234)
field_A (abcde)	uid_A (1234)	unique_message (second message w/ 1234)
field_B (fghij)	uid_B (5678)	unique_message_Y (first message w/ 5678)
field_B (fghij)	uid_B (5678)	unique_message (second message w/ 5678)
field_C (klmno)	uid_C (9012)	unique_message (first message w/9012

what i get is:

field_A (abcde)	uid_A (1234)	unique_message (first message w/ 1234)
field_B (fghij)	uid_B (5678)	unique_message_Y (first message w/ 5678)
field_C (klmno)	uid_C (9012)	unique_message (first message w/9012

this is my current search:

index=soe_app_retail sourcetype="vg:hvlm" source="*prd/vpa*" "*NumberOfRules*" | rex field=_raw "poid=(?<field_1>\d+)"
| join type=inner uid [ search index=soe_app_retail sourcetype="vg:hvlm" source="*prd/vpa*" "*upper*" |rename message as field_2]
| table field_1, uid, field_2

@yuanliu what am i doing wrong?

yuanliu · ‎02-22-2022

The final table will only match the number of rows returned in the outer search, namely 3 in your example. You need to return field_2 in the inter search in multivalue before match.

More suggestions about supply sample data and test searches:

Illustrate sample raw data is preferred to table, even in synthetic form, unless your question concentrates in field manipulation such as stats.
If you want to use table, include table head so other people can understand how columns correspond to your explanations, sample searches, etc.
Make sure synthetic data is congruent with test search. In your example, I assume that "field_A (abcde)" is a value of field_1 aka poid, but it won't match your term \d+. This makes troubleshooting so much more difficult.

Again, I strongly suggest that join be avoided in this use case. I also spotted an error in my previous illustration. Now that I get some sense of the data, it is easy to correct.

index=soe_app_retail sourcetype="vg:hvlm" source="*prd/vpa*" ("*NumberOfRules*" OR "*upper*")
| rex field=_raw "poid=(?<field_1>\d+)
| rename message as field_2
| rex "(?<upper>upper)" | eval field_2=if(isnull(upper),null(),field_2)

| stats values(field_1) as field_1 values(field_2) as field_2 by uid ``` join values of the two fields by uid ```
| stats count by uid field_1 field_2 ``` one stats rather than two mvexpand ```

Here is the test. Given raw data

uid_A (1234) poid=9876 NumberOfRules

uid_A (1234) ... upper ... message=unique_message (first message w/ 1234)

uid_A (1234) ... upper ... message=unique_message (second message w/ 1234)

uid_A (1234) ... upper ... message=unique_message (third message w/ 1234)

uid_B (5678) poid=5432 NumberOfRules

uid_B (5678) ... upper ... message=unique_message (first message w/ 5678)

uid_B (5678) ... upper ... message=unique_message (second message w/ 5678)

uid_B (5678) ... upper ... message=unique_message (third message w/ 5678)

The above search will return

uid	field_1	field_2	count
uid_A (1234)	9876	unique_message (first message w/ 1234)	1
uid_A (1234)	9876	unique_message (second message w/ 1234)	1
uid_A (1234)	9876	unique_message (third message w/ 1234)	1
uid_B (5678)	5432	unique_message (first message w/ 5678)	1
uid_B (5678)	5432	unique_message (second message w/ 5678)	1
uid_B (5678)	5432	unique_message (third message w/ 5678)	1

How to table a single event field joined with a multi-event field?

field extraction

fields

join

table

Update Your SOAR Apps for Python 3.13: What Community Developers Need to Know

October Community Champions: A Shoutout to Our Contributors!

Automatic Discovery Part 2: Setup and Best Practices

Are you a member of the Splunk Community?