Using the Splunk addon for AWS to collect ec2 instance metadata I get an array called tags with key/value pairs such as below. What I want to do is extract the cluster name as a distinct var so that I can search on it or even better aggregate on it. Thoughts?
{ [-]
Key: hostname
Value: elasticsearch001
}
{ [-]
Key: cluster
Value: systemlogs
Hi @yuanliu ,
Thanks for your answer. It looks interesting.
But I'm having problems getting the result I wanted.
Sample data looks like below(I probably haven't closed brackets properly, but you get the idea)
detail: { [-]
resource: { [-]
instanceDetails: { [-]
tags: [
{ [-]
key: AWS_Backup
value: Yes
}
{ [-]
key: Schedule
value: pre-hours
}
]
}
}
my search looks like:
index=example_index
sourcetype="aws:cloudwatch:guardduty"
| spath
| rename detail.resource{}.instanceDetails{}.tags{}.* as tags_*
| eval host_name = mvindex(tags_value, mvfind(tags_key, "Name"))
After hitting the search button, nothing came up. If I add table host_name the value is empty.
It you could please point out errors, that would be awesome.
Thank you.
Please post sample JSON data in raw text format. "Translating" Splunk's default JSON view, especially with collapsable structures, back into JSON is very hard.
Thank you for the prompt reply.
Please see if below works for you:
[
{
"Resource": {
"InstanceDetails": {
"Tags": [
{
"Key": "AWS_Backup",
"Value": "Yes"
},
{
"Key": "Schedule",
"Value": "pre-hours"
},
{
"Key": "Name",
"Value": "HOST-NAME"
},
{
"Key": "Application Owner",
"Value": "XXX"
},
{
"Key": "Environment",
"Value": "XXX"
},
{
"Key": "Eco System",
"Value": "XXX"
},
{
"Key": "Patch Group",
"Value": "XXX"
},
{
"Key": "ScheduleMessage",
"Value": "XXX"
},
{
"Key": "Application",
"Value": "MuleSoftESB"
},
{
"Key": "Cost Centre",
"Value": "XXX"
}
]
},
"ResourceType": "Instance"
}
}
]
So, your raw event is a JSON array? In other words, you have fields like {}.Resource.InstanceDetails.Tags{}.Key , {}.Resource.InstanceDetails.Tags{}.Value, etc. In this case you will need to calculate against each element in this array. Like this:
| spath path={}
| mvexpand {}
| spath input={}
| rename Resource.InstanceDetails.Tags{}.* as Tags_*
| eval host_name = mvindex(Tags_Value, mvfind(Tags_Key, "Name"))
A more traditional method is to mvexpand the Tags{} array, then use where command to pick up the . Just for illustration,
| spath path={}
| mvexpand {}
| spath input={} path=Resource.InstanceDetails.Tags{} output=Tags
| mvexpand Tags
| spath input=Tags
| where Key == "Name"
| rename Value AS host_name
Hope this helps.
Hi @yuanliu ,
Thank you for your help so far. You have been amazing!
We do have two fields called:
detail.resource.instanceDetails.tags{}.key
detail.resource.instanceDetails.tags{}.value
The value looks like below:
And I have ran the search likes below, nothing returned. I also tried to remove .value in line 4. It didn't make a difference, have I done any thing wrong?
Cheers,
Freddy
A little linguistics here. In JSON, square brackets [] denote an array of JSON object, whereas curly brackets {} denote a list of key-value pairs. A JSON object can be an array or a list of key-value pairs; a JSON value can also be an array or a list of key-value pairs.
Splunk doesn't have a nested notation. So, SPL flattens JSON paths by concatenating various JSON keys with dots (".") and curly brackets ("{}") to form Splunk field names. Significantly, the string "{}" in SPL signifies an array; in JSON, that means that the value of the key preceding "{}" is enclosed by [].
In your text posting of sample data, the entire event is enclosed by []. That is why I asked if Splunk gives fields like {}.Resource.InstanceDetails.Tags{}.Key, i.e., every field name is preceded by "{}". (This is extremely rare in real life.) If Splunk gives you fields like detail.resource.instanceDetails.tags{}.key, that means that the events are not enclosed by [], but more like
{
"Resource": {
"InstanceDetails": {
"Tags": [
{
"Key": "AWS_Backup",
"Value": "Yes"
},
{
"Key": "Schedule",
"Value": "pre-hours"
},
{
"Key": "Name",
"Value": "HOST-NAME"
},
{
"Key": "Application Owner",
"Value": "XXX"
},
{
"Key": "Environment",
"Value": "XXX"
},
{
"Key": "Eco System",
"Value": "XXX"
},
{
"Key": "Patch Group",
"Value": "XXX"
},
{
"Key": "ScheduleMessage",
"Value": "XXX"
},
{
"Key": "Application",
"Value": "MuleSoftESB"
},
{
"Key": "Cost Centre",
"Value": "XXX"
}
]
},
"ResourceType": "Instance"
}
}
If this is the case, the original code should work, except you need to make letter cases match what Splunk shows. (SPL's case tolerance can be lax in some commands, but generally, field names are always case sensitive.)
| rename Resource.InstanceDetails.Tags{}.* as Tags_*
| eval host_name = mvindex(Tags_Value, mvfind(Tags_Key, "Name"))
Or, if you want to go the more traditional route, extract the array Resource.InstanceDetails.Tags{} first.
| spath path=Resource.InstanceDetails.Tags{}
| mvexpand Resource.InstanceDetails.Tags{}
| spath input=Resource.InstanceDetails.Tags{}
| where Key == "Name"
| rename Value AS Host_name
Here is an emulation of your sample data. You can play with it and compare with real data.
| makeresults
| eval _raw = "{
\"Resource\": {
\"InstanceDetails\": {
\"Tags\": [
{
\"Key\": \"AWS_Backup\",
\"Value\": \"Yes\"
},
{
\"Key\": \"Schedule\",
\"Value\": \"pre-hours\"
},
{
\"Key\": \"Name\",
\"Value\": \"HOST-NAME\"
},
{
\"Key\": \"Application Owner\",
\"Value\": \"XXX\"
},
{
\"Key\": \"Environment\",
\"Value\": \"XXX\"
},
{
\"Key\": \"Eco System\",
\"Value\": \"XXX\"
},
{
\"Key\": \"Patch Group\",
\"Value\": \"XXX\"
},
{
\"Key\": \"ScheduleMessage\",
\"Value\": \"XXX\"
},
{
\"Key\": \"Application\",
\"Value\": \"MuleSoftESB\"
},
{
\"Key\": \"Cost Centre\",
\"Value\": \"XXX\"
}
]
},
\"ResourceType\": \"Instance\"
}
}"
``` data emulation above ```
Hi @yuanliu
You are a true legend. It worked.
The reason why it was not working due to key and value needs to be lower case.
| eval server_name = mvindex(Tags_value, mvfind(Tags_key, "Name"))
Now the magic server_name is populated now.
Thank you.
There have been several recent discussions similar to this, like How to Extract JSON format as fields? and How to extract JSON with key name specified as a key. My goto method is spath. Once multivalued tags subfields are extracted, you can use mvfind to find "cluster"s index, and mvindex to extract cluster's name.
Assuming your data is like
{"tags": [ { "Key": "hostname", "Value": "elasticsearch001" }, { "Key": "cluster", "Value": "systemlogs" } ]}
The following code should extract cluster_name
| spath
| rename tags{}.* as tags_*
| eval cluster_name = mvindex(tags_Value, mvfind(tags_Key, "cluster"))
The sample data will give this output
_raw | cluster_name | tags_Key | tags_Value |
{"tags": [ { "Key": "hostname", "Value": "elasticsearch001" }, { "Key": "cluster", "Value": "systemlogs" } ]} | systemlogs | hostname cluster | elasticsearch001 systemlogs |