Splunk Search

How do I combine and filter searches on a common field?

jrjrjrjrjr
Explorer

Hello, my data look like this:

{
    correlationId: "1",
    field1: "something **flagged**",
    field2: "alkjsd"
},
{
    correlationId:"1",
    info:"<id>A</id>"
},
{
    correlationId: "2",
    field1: "Hello world",
    field2: "nothing to see"
},
{
    correlationId:"2",
    info:"<id>B</id>"
},
{
    correlationId: "3",
    field1: "abc123",
    field2: "**flagged** things"
},
{
    correlationId:"3",
    info:"<id>C</id>"
}

I want to find all of the entries containing **flagged** values and output a list of ids that have the same correlationId as a **flagged** entry. In this case the output would be something like

A
C

I can output a list of all ids like this:

index=myindex | rex field=info "<id>(?<idvalue>[^\<]+)" | stats values(idvalue)

I can find the correlationId of **flagged** messages like this:

index=myindex "**flagged**" | stats values(correlationId)

How do I combine these into a single search that will give me only the ids that match the **flagged** correlationIds?

1 Solution

jrjrjrjrjr
Explorer

I am sorry, I think I might have confused things with my formatting. I am not trying to parse a single event. Each object in my example represents an event, so I am trying to represent six events:
Event 1:

{
    correlationId: "1"
    field1: "something **flagged**"
    field2: "alkjsd"
 }

Event 2:

{
    correlationId: "1"
    info:"<id>A</id>"
 }

Event 3:

{
    correlationId: "2"
    field1: "Hello world"
    field2: "nothing to see"
 }

Event 4:

{
    correlationId: "2"
    info:"<id>B</id>"
 }

Event 5:

{
    correlationId: "3"
    field1: "abc123"
    field2: "**flagged** things"
 }

Event 6:

{
    correlationId: "3"
    info:"<id>C</id>"
 }

I can search by field, so I can get a list of all correlationId values from events that contain **flagged** with this search:

index=myindex "**flagged**" | stats values(correlationId)

Output:

1
3

Separately, I can get a list of all ids with this search:

index=myindex | rex field=info "<id>(?<idvalue>[^\<]+)" | stats values(idvalue)

Output:

A
B
C

What I want is to get the subset of ids whose correlationId value matches a correlationId from the first search, so ultimately, I want something that will output

A
C

Any ideas?

dmarling
Builder

I'd suggest converting this solution to a comment. You can do what you described with the logic below:

[ search index=myindex "**flagged**" 
    | stats count by correlationId 
    | fields - count 
    | format] index=myindex 
| rex field=info "<id>(?<idvalue>[^\<]+)" 
| stats values(idvalue)

The bracketed subsearch pulls the correlationId values of all flagged events and feeds them into the top of the outer search, which then produces the list of idvalues from events whose correlationId matches one that the subsearch passed in.
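
To see what the subsearch actually hands to the outer search, it can help to run it on its own. This is only a sketch: it assumes the six sample events from the question and that correlationId is already an extracted field.

```
index=myindex "**flagged**"
| stats count by correlationId
| fields - count
| format
```

With the sample data this should return a single field named search whose value looks something like ( ( correlationId="1" ) OR ( correlationId="3" ) ). That string takes the place of the bracketed section, so the outer search only retrieves events with correlationId 1 or 3 before rex and stats run. Subsearches are subject to result-count and runtime limits, so if a very large number of correlationIds are flagged the list may be truncated.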

If this comment/answer was helpful, please up vote it. Thank you.

jrjrjrjrjr
Explorer

That is excellent.

Thank you, hero.


dmarling
Builder

Using your example data, you'll need to extract the fields of interest manually with regular expressions, since it's not true JSON. Once the fields are extracted, run stats by correlationId with values() of each field, then search the fields that carry the flagged information to limit the results to the flagged rows. I used a small hack to make the copy/pasted data multivalued so I could run mvexpand on it and turn each curly-bracketed section into its own row:

| makeresults count=1 
| eval data="{
     correlationId: \"1\",
     field1: \"something **flagged**\",
     field2: \"alkjsd\"
 },
 {
     correlationId:\"1\",
     info:\"<id>A</id>\"
 },
 {
     correlationId: \"2\",
     field1: \"Hello world\",
     field2: \"nothing to see\"
 },
 {
     correlationId:\"2\",
     info:\"<id>B</id>\"
 },
 {
     correlationId: \"3\",
     field1: \"abc123\",
     field2: \"**flagged** things\"
 },
 {
     correlationId:\"3\",
     info:\"<id>C</id>\"
 }" 
| fields - _time
| rex mode=sed field=data "s/\},/}█/g"
| makemv delim="█" data
| mvexpand data
| eval data=trim(data)
| rex field=data "correlationId:\s?\"(?<correlationId>[^\"]+)"
| rex field=data "field1: \"(?<field1>[^\"]+)"
| rex field=data "field2: \"(?<field2>[^\"]+)"
| rex field=data "info:\s?\"(?<info>[^\"]+)"
| stats values(field1) as field1 values(field2) as field2 values(info) as info by correlationId
| search field1="*flagged*" OR field2="*flagged*"
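
Against real indexed events (rather than the makeresults mock-up), the same stats-by-correlationId idea might be sketched without a subsearch. This assumes correlationId and info are already extracted fields, as in the original question. The is_flagged field is a hypothetical helper name that does not appear in the thread.

```
index=myindex
| rex field=info "<id>(?<idvalue>[^<]+)"
| eval is_flagged=if(like(_raw, "%flagged%"), 1, 0)
| stats max(is_flagged) as is_flagged values(idvalue) as idvalue by correlationId
| where is_flagged=1
| stats values(idvalue)
```

Each event is marked as flagged or not, the marks and ids are rolled up per correlationId, and only groups with at least one flagged event are kept. Note that like() is case-sensitive, unlike a bare search term, and that this version reads every event in the time range, so whether it beats the subsearch approach depends on your data volume.
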