Solved: How do I combine and filter searches on a common f...

jrjrjrjrjr · ‎05-15-2019

Hello, my data look like this:

{
    correlationId: "1",
    field1: "something **flagged**",
    field2: "alkjsd"
},
{
    correlationId:"1",
    info:"<id>A</id>"
},
{
    correlationId: "2",
    field1: "Hello world",
    field2: "nothing to see"
},
{
    correlationId:"2",
    info:"<id>B</id>"
},
{
    correlationId: "3",
    field1: "abc123",
    field2: "**flagged** things"
},
{
    correlationId:"3",
    info:"<id>C</id>"
}

I want to find all of the entries containing **flagged** values and output a list of ids that have the same correlationId as a **flagged** entry. In this case the output would be something like

A
C

I can output a list of all ids like this:

index=myindex | rex field=info "<id>(?<idvalue>[^\<]+)" | stats values(idvalue)

I can find the correlationId of **flagged** messages like this:

index=myindex "**flagged**" | stats values(correlationId)

How do I combine these into a single search that will give me only the ids that match the **flagged** correlationIds?

jrjrjrjrjr · ‎05-15-2019

I am sorry, I think I might have confused things with my formatting. I am not trying to parse a single event. Each object in my example represents an event, so I am trying to represent six events:

Event 1:

{
    correlationId: "1"
    field1: "something **flagged**"
    field2: "alkjsd"
 }

Event 2:

{
    correlationId: "1"
    info:"<id>A</id>"
 }

Event 3:

{
    correlationId: "2"
    field1: "Hello world"
    field2: "nothing to see"
 }

Event 4:

{
    correlationId: "2"
    info:"<id>B</id>"
 }

Event 5:

{
    correlationId: "3"
    field1: "abc123"
    field2: "**flagged** things"
 }

Event 6:

{
    correlationId: "3"
    info:"<id>C</id>"
 }

I can search by field, so I can get a list of all correlationId values from events that contain **flagged** with this search:

index=myindex "**flagged**" | stats values(correlationId)

Output:

1
3

Separately, I can get a list of all ids with this search:

index=myindex | rex field=info "<id>(?<idvalue>[^\<]+)" | stats values(idvalue)

Output:

A
B
C

What I want is to get the subset of ids whose correlationId value matches a correlationId from the first search, so ultimately, I want something that will output

A
C

Any ideas?

dmarling · ‎05-15-2019

I'd suggest converting this answer to a comment. You can do what you described with the search below:

[ search index=myindex "**flagged**" 
    | stats count by correlationId 
    | fields - count 
    | format] index=myindex 
| rex field=info "<id>(?<idvalue>[^\<]+)" 
| stats values(idvalue)

The bracketed subsearch runs first and collects the correlationId of every **flagged** event. The format command turns those values into a search filter that is fed into the outer search, so the outer search only sees events with a matching correlationId and produces the list of idvalues for them.
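
To make the expansion concrete, here is a sketch of roughly what the outer search looks like once the subsearch has run against the six sample events. It assumes correlationId is available as a search-time field and that the subsearch returns the values 1 and 3, which the format command joins into an OR expression:

```spl
index=myindex ( ( correlationId="1" ) OR ( correlationId="3" ) )
| rex field=info "<id>(?<idvalue>[^<]+)"
| stats values(idvalue)
```

Only events with correlationId 1 or 3 reach the rex command, so the info value B from correlationId 2 is never extracted and the result should be A and C.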

If this comment/answer was helpful, please up vote it. Thank you.

jrjrjrjrjr · ‎05-15-2019

That is excellent.

Thank you, hero.

dmarling · ‎05-15-2019

Using your example data, you'll need to manually extract all of the fields of interest with regular expressions, since it's not true JSON. You can extract those fields, then run stats by correlationId with values() of each field, and finally search the fields that carry the flagged text to limit the results to the flagged rows. As a small hack, I made the copied and pasted data multivalued so I could use mvexpand to turn each curly-bracketed section into its own row:

| makeresults count=1 
| eval data="{
     correlationId: \"1\",
     field1: \"something **flagged**\",
     field2: \"alkjsd\"
 },
 {
     correlationId:\"1\",
     info:\"<id>A</id>\"
 },
 {
     correlationId: \"2\",
     field1: \"Hello world\",
     field2: \"nothing to see\"
 },
 {
     correlationId:\"2\",
     info:\"<id>B</id>\"
 },
 {
     correlationId: \"3\",
     field1: \"abc123\",
     field2: \"**flagged** things\"
 },
 {
     correlationId:\"3\",
     info:\"<id>C</id>\"
 }" 
| fields - _time
| rex mode=sed field=data "s/\},/}█/g"
| makemv delim="█" data
| mvexpand data
| eval data=trim(data)
| rex field=data "correlationId:\s?\"(?<correlationId>[^\"]+)"
| rex field=data "field1: \"(?<field1>[^\"]+)"
| rex field=data "field2: \"(?<field2>[^\"]+)"
| rex field=data "info:\s?\"(?<info>[^\"]+)"
| stats values(field1) as field1 values(field2) as field2 values(info) as info by correlationId
| search field1="*flagged*" OR field2="*flagged*"
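
Against the real indexed events, the same stats-then-filter idea might look like the sketch below. It assumes that correlationId, field1, field2, and info are already extracted at search time, so the makeresults and rex scaffolding above is not needed, and it reuses the idvalue extraction from the question. Adjust the field names to your data.

```spl
index=myindex
| rex field=info "<id>(?<idvalue>[^<]+)"
| stats values(field1) as field1 values(field2) as field2 values(idvalue) as idvalue by correlationId
| search field1="*flagged*" OR field2="*flagged*"
| stats values(idvalue) as idvalue
```

For the sample data this should return A and C. Because it runs as a single search, it avoids the default result and runtime limits that apply to subsearches, at the cost of reading every event in the index for the time range rather than only the flagged ones.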

