Solved: Self join and override MV fields

akshaysaraf · ‎09-21-2020

Need some help with a search

{
	"id": "123",
	"start_time": "2020-08-01 15:00:00",
	"end_time": "2020-08-01 16:00:00",
	"status": "FAIL",
        "details": [
                    {
                     "sub_id": 1,
                     "status": "PASS"
                    },
                    {
                     "sub_id": 2,
                     "status": "FAIL"
                    }
                   ]
}

{
	"id": "124",
	"start_time": "2020-08-01 16:05:00",
	"end_time": "2020-08-01 16:30:00",
	"status": "PASS",
	"original_id": "123",
        "details": [
                    {
                     "sub_id": 1,
                     "status": "PASS"
                    },
                    {
                     "sub_id": 3,
                     "status": "PASS"
                    }
                   ]
}

These two events can be joined with id and original_id field. The output below shows data from id:123 but overrides some fields like, end_time, status, sub_id and sub_status from second event.

The Tabular output I expect is :

id	start_time	end_time	status	sub_id	sub_status
123	2020-08-01 15:00:00	2020-08-01 16:30:00	PASS	1	PASS
123	2020-08-01 15:00:00	2020-08-01 16:30:00	PASS	3	PASS

Any help is appreciated, thanks

to4kawa · ‎09-21-2020

index=_internal | head 1 | fields _raw _time | eval _raw="{
	\"id\": \"123\",
	\"start_time\": \"2020-08-01 15:00:00\",
	\"end_time\": \"2020-08-01 16:00:00\",
	\"status\": \"FAIL\",
        \"details\": [
                    {
                     \"sub_id\": 1,
                     \"status\": \"PASS\"
                    },
                    {
                     \"sub_id\": 2,
                     \"status\": \"FAIL\"
                    }
                   ]
}"
| appendpipe [ eval _raw="{
	\"id\": \"124\",
	\"start_time\": \"2020-08-01 16:05:00\",
	\"end_time\": \"2020-08-01 16:30:00\",
	\"status\": \"PASS\",
	\"original_id\": \"123\",
        \"details\": [
                    {
                     \"sub_id\": 1,
                     \"status\": \"PASS\"
                    },
                    {
                     \"sub_id\": 3,
                     \"status\": \"PASS\"
                    }
                   ]
}"]
| rename COMMENT as "this is sample. from here, the logic."
| spath
| eval start_time=strptime(start_time,"%F %T"), end_time=strptime(end_time,"%F %T")
| eval ID=coalesce(original_id,id) 
| eventstats min(start_time) as start_time max(end_time) as end_time by ID
| streamstats count as session
| eventstats max(session) as last_session by ID
| where last_session=session
| foreach *_time [ eval <<FIELD>>=strftime('<<FIELD>>',"%F %T")]
| spath details{} output=details
| stats values(*) as * by details
| rename status as Status
| spath input=details
| rename status as sub_status
| table ID start_time end_time Status sub_*

View solution in original post

to4kawa · ‎09-21-2020

index=_internal | head 1 | fields _raw _time | eval _raw="{
	\"id\": \"123\",
	\"start_time\": \"2020-08-01 15:00:00\",
	\"end_time\": \"2020-08-01 16:00:00\",
	\"status\": \"FAIL\",
        \"details\": [
                    {
                     \"sub_id\": 1,
                     \"status\": \"PASS\"
                    },
                    {
                     \"sub_id\": 2,
                     \"status\": \"FAIL\"
                    }
                   ]
}"
| appendpipe [ eval _raw="{
	\"id\": \"124\",
	\"start_time\": \"2020-08-01 16:05:00\",
	\"end_time\": \"2020-08-01 16:30:00\",
	\"status\": \"PASS\",
	\"original_id\": \"123\",
        \"details\": [
                    {
                     \"sub_id\": 1,
                     \"status\": \"PASS\"
                    },
                    {
                     \"sub_id\": 3,
                     \"status\": \"PASS\"
                    }
                   ]
}"]
| rename COMMENT as "this is sample. from here, the logic."
| spath details{} output=details
| table _raw details
| mvexpand details
| spath input=details
| rename status as sub_status
| spath
| table id start_time end_time status sub_*

akshaysaraf · ‎09-21-2020

You helped me in past with question : https://community.splunk.com/t5/Splunk-Search/Self-Join-and-override-Values/m-p/512453

In this question the only difference is there is a MV field as well. I need the MV field of the latest() record. Coalesce doesn't seem to work for it.

akshaysaraf · ‎09-21-2020

Hello,

Thank you for your response, This would return all 4 nested records. I dont need those. I need only 2, which are overridden by the later record. I added some explanation in the question, and an expected output.

to4kawa · ‎09-21-2020

index=_internal | head 1 | fields _raw _time | eval _raw="{
	\"id\": \"123\",
	\"start_time\": \"2020-08-01 15:00:00\",
	\"end_time\": \"2020-08-01 16:00:00\",
	\"status\": \"FAIL\",
        \"details\": [
                    {
                     \"sub_id\": 1,
                     \"status\": \"PASS\"
                    },
                    {
                     \"sub_id\": 2,
                     \"status\": \"FAIL\"
                    }
                   ]
}"
| appendpipe [ eval _raw="{
	\"id\": \"124\",
	\"start_time\": \"2020-08-01 16:05:00\",
	\"end_time\": \"2020-08-01 16:30:00\",
	\"status\": \"PASS\",
	\"original_id\": \"123\",
        \"details\": [
                    {
                     \"sub_id\": 1,
                     \"status\": \"PASS\"
                    },
                    {
                     \"sub_id\": 3,
                     \"status\": \"PASS\"
                    }
                   ]
}"]
| rename COMMENT as "this is sample. from here, the logic."
| spath
| eval start_time=strptime(start_time,"%F %T"), end_time=strptime(end_time,"%F %T")
| eval ID=coalesce(original_id,id) 
| eventstats min(start_time) as start_time max(end_time) as end_time by ID
| streamstats count as session
| eventstats max(session) as last_session by ID
| where last_session=session
| foreach *_time [ eval <<FIELD>>=strftime('<<FIELD>>',"%F %T")]
| spath details{} output=details
| stats values(*) as * by details
| rename status as Status
| spath input=details
| rename status as sub_status
| table ID start_time end_time Status sub_*

akshaysaraf · ‎09-21-2020

Thank you so much. It worked!

I made a minor change, from :

| streamstats count as session

to:

| streamstats count as session by ID

I appreciate your help very much. 🙂

inventsekar · ‎09-21-2020

Hi, may we know your current SPL query please.. the info like index, source, st, etc are needed from you.

once you update us your current search query, we can further fine-tune your search query. thanks.

(i have received the 100 karma points giver badge, .. may we know how many karma points you have given?!)

akshaysaraf · ‎09-21-2020

Sorry, I dont understand. My current query to display those 2 JSON events is "index=abc" that's it.
The expected output I mentioned is a new implementation and I am seeking help for that.

I haven't given any karma points, I just like the answer that has worked for me in past questions. No karma given for this topic.

Self join and override MV fields

eval

search job inspector

stats

Earn a $35 Gift Card for Answering our Splunk Admins & App Developer Survey

Continuing Innovation & New Integrations Unlock Full Stack Observability For Your ...

Monitoring Amazon Elastic Kubernetes Service (EKS)