Hello, my data look like this:
{
correlationId: "1",
field1: "something **flagged**",
field2: "alkjsd"
},
{
correlationId:"1",
info:"<id>A</id>"
},
{
correlationId: "2",
field1: "Hello world",
field2: "nothing to see"
},
{
correlationId:"2",
info:"<id>B</id>"
},
{
correlationId: "3",
field1: "abc123",
field2: "**flagged** things"
},
{
correlationId:"3",
info:"<id>C</id>"
}
I want to find all of the entries containing **flagged** values and output a list of ids that have the same correlationId as a **flagged** entry. In this case the output would be something like:
A
C
I can output a list of all ids like this:
index=myindex | rex field=info "<id>(?<idvalue>[^\<]+)" | stats values(idvalue)
I can find the correlationId of **flagged** messages like this:
index=myindex "**flagged**" | stats values(correlationId)
How do I combine these into a single search that will give me only the ids that match the **flagged** correlationIds?
I am sorry, I think I might have confused things with my formatting. I am not trying to parse a single event. Each object in my example represents an event, so I am trying to represent six events:
Event 1:
{
correlationId: "1"
field1: "something **flagged**"
field2: "alkjsd"
}
Event 2:
{
correlationId: "1"
info:"<id>A</id>"
}
Event 3:
{
correlationId: "2"
field1: "Hello world"
field2: "nothing to see"
}
Event 4:
{
correlationId: "2"
info:"<id>B</id>"
}
Event 5:
{
correlationId: "3"
field1: "abc123"
field2: "**flagged** things"
}
Event 6:
{
correlationId: "3"
info:"<id>C</id>"
}
I can search by field, so I can get a list of all correlationId values from events that contain **flagged** with this search:
index=myindex "**flagged**" | stats values(correlationId)
Output:
1
3
Separately, I can get a list of all ids with this search:
index=myindex | rex field=info "<id>(?<idvalue>[^\<]+)" | stats values(idvalue)
Output:
A
B
C
What I want is to get the subset of ids whose correlationId value matches a correlationId from the first search, so ultimately, I want something that will output
A
C
Any ideas?
I'd suggest converting this answer to a comment. You can do what you described with the logic below:
[ search index=myindex "**flagged**"
| stats count by correlationId
| fields - count
| format] index=myindex
| rex field=info "<id>(?<idvalue>[^\<]+)"
| stats values(idvalue)
The bracketed subsearch pulls all of the correlationId values from the **flagged** events and feeds them into the start of the outer search, which then produces the list of idvalues whose correlationId matches one that the subsearch passed in.
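To make that concrete: with the six sample events, the subsearch returns correlationId values 1 and 3, and format turns them into a search expression. Roughly, the outer search ends up running as the sketch below. The exact parenthesization that format emits may differ, so treat this as an illustration rather than literal output.

```spl
( ( correlationId="1" ) OR ( correlationId="3" ) ) index=myindex
| rex field=info "<id>(?<idvalue>[^\<]+)"
| stats values(idvalue)
```

Only events 1, 2, 5 and 6 match that filter, and only events 2 and 6 have an info field, so the result is A and C.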
That is excellent.
Thank you, hero.
Using your example data, you'll need to manually extract all of the fields of interest with regular expressions, since it's not true JSON. You can extract those fields and then run a stats by correlationId with values() of each of those fields. You then search the fields that hold the flagged information to limit your results to the flagged rows. I used a little hack to make the copy/pasted data multivalued so I could run mvexpand on it and turn each curly-bracketed section into its own row:
| makeresults count=1
| eval data="{
correlationId: \"1\",
field1: \"something **flagged**\",
field2: \"alkjsd\"
},
{
correlationId:\"1\",
info:\"<id>A</id>\"
},
{
correlationId: \"2\",
field1: \"Hello world\",
field2: \"nothing to see\"
},
{
correlationId:\"2\",
info:\"<id>B</id>\"
},
{
correlationId: \"3\",
field1: \"abc123\",
field2: \"**flagged** things\"
},
{
correlationId:\"3\",
info:\"<id>C</id>\"
}"
| fields - _time
| rex mode=sed field=data "s/\},/}█/g"
| makemv delim="█" data
| mvexpand data
| eval data=trim(data)
| rex field=data "correlationId:\s?\"(?<correlationId>[^\"]+)"
| rex field=data "field1: \"(?<field1>[^\"]+)"
| rex field=data "field2: \"(?<field2>[^\"]+)"
| rex field=data "info:\s?\"(?<info>[^\"]+)"
| stats values(field1) as field1 values(field2) as field2 values(info) as info by correlationId
| search field1="*flagged*" OR field2="*flagged*"
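If you want to apply the same stats-by-correlationId idea directly to the indexed events and end up with just the id list, something like the sketch below might work. It assumes correlationId is already an extracted field (as in the original searches) and that matching the raw event text for "flagged" is good enough. The field name flagged is just a helper I made up for the example.

```spl
index=myindex
| rex field=info "<id>(?<idvalue>[^\<]+)"
| eval flagged=if(like(_raw, "%flagged%"), 1, 0)
| stats max(flagged) as flagged values(idvalue) as idvalue by correlationId
| where flagged=1
| stats values(idvalue)
```

With the sample data, correlationIds 1 and 3 end up with flagged=1, so the final stats should list A and C. This avoids a subsearch, which can matter when the number of flagged correlationIds is large, since subsearches are subject to result-count and runtime limits.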