Splunk Search

spath command missing a few values from JSON array

premrajvs
Explorer

The data contains an array of 5 commit IDs, but for some reason the search returns only 3 of them. I'm not sure why 2 values are missing. I would appreciate a fresh set of eyes on this.

Query

index=XXXXX source="http:github-dev-token" eventtype="GitHub::Push" sourcetype="json_ae_git-webhook"
| spath output=commit_id path=commits.id

sourcetype definition

[ json_ae_git-webhook ]
AUTO_KV_JSON=false
CHARSET=UTF-8
KV_MODE=json
LINE_BREAKER=([\r\n]+)
NO_BINARY_CHECK=true
SHOULD_LINEMERGE=true
TRUNCATE=100000
category=Structured
description=JavaScript Object Notation format. For more information, visit http://json.org/
disabled=false
pulldown_type=true

Raw JSON data

{
"ref":"refs/heads/Dev",
"before":"d53e9b3cb6cde4253e05019295a840d394a7bcb0",
"after":"34c07bcbf557413cf42b601c1794c87db8c321d1",
"commits":[
{
"id":"a5c816a817d06e592d2b70cd8a088d1519f2d720",
"tree_id":"15e930e14d4c62aae47a3c02c47eb24c65d11807",
"distinct":false,
"message":"rrrrrrrrrrrrrrrrrrrrrr",
"timestamp":"2024-08-12T12:00:04-05:00",
"url":"https://github.com/xxxxxxxxxxxxxxx/AzureWorkload_A00008/commit/aaaaaaaaaaaa",
"author":{
"name":"aaaaaa aaaaaa",
"email":"101218171+aaaaaa@users.noreply.github.com",
"username":"aaaaaa"
},
"committer":{
"name":"aaaaaa aaaaaa",
"email":"101218171+aaaaaa@users.noreply.github.com",
"username":"aaaaaa"
},
"added":[

],
"removed":[

],
"modified":[
"asdafasdad.json"
]
},
{
"id":"a3b3b6f728ccc0eb9113e7db723fbfc4ad220882",
"tree_id":"3586aeb0a33dc5e236cb266c948f83ff01320a9a",
"distinct":false,
"message":"xxxxxxxxxxxxxxxxxxx",
"timestamp":"2024-08-12T12:05:40-05:00",
"url":"https://github.com/xxxxxxxxxxxxxxx/AzureWorkload_A00008/commit/a3b3b6f728ccc0eb9113e7db723fbfc4ad220882",
"author":{
"name":"aaaaaa aaaaaa",
"email":"101218171+aaaaaa@users.noreply.github.com",
"username":"aaaaaa"
},
"committer":{
"name":"aaaaaa aaaaaa",
"email":"101218171+aaaaaa@users.noreply.github.com",
"username":"aaaaaa"
},
"added":[

],
"removed":[

],
"modified":[
"sddddddf.json"
]
},
{
"id":"bdcd242d6854365ddfeae6b4f86cf7bc1766e028",
"tree_id":"8286c537f7dee57395f44875ddb8b2cdb7dd48b2",
"distinct":false,
"message":"Updating pipeline: pl_gwp_file_landing_check. Adding Sylvan Performance",
"timestamp":"2024-08-12T12:06:10-05:00",
"url":"https://github.com/xxxxxxxxxxxxxxx/AzureWorkload_A00008/commit/bdcd242d6854365ddfeae6b4f86cf7bc1766e028",
"author":{
"name":"aaaaaa aaaaaa",
"email":"101218171+aaaaaa@users.noreply.github.com",
"username":"aaaaaa"
},
"committer":{
"name":"aaaaaa aaaaaa",
"email":"101218171+aaaaaa@users.noreply.github.com",
"username":"aaaaaa"
},
"added":[

],
"removed":[

],
"modified":[
"asadwefvdx.json"
]
},
{
"id":"108ebd4ff8ae9dd70e669e2ca49e293684d5c37a",
"tree_id":"5a6d71393611718b8576f8a63cdd34ce619f17dd",
"distinct":false,
"message":"asdrwerwq",
"timestamp":"2024-08-12T10:09:33-07:00",
"url":"https://github.com/xxxxxxxxxxxxxxx/AzureWorkload_A00008/commit/108ebd4ff8ae9dd70e669e2ca49e293684d5c37a",
"author":{
"name":"dfsd",
"email":"l.llllllllllll@aaaaaa.com",
"username":"aaaaaa"
},
"committer":{
"name":"lllllllllllll",
"email":"l.llllllllllll@abc.com",
"username":"aaaaaa"
},
"added":[

],
"removed":[

],
"modified":[
"A.json",
"A.json",
"A.json"
]
},
{
"id":"34c07bcbf557413cf42b601c1794c87db8c321d1",
"tree_id":"5a6d71393611718b8576f8a63cdd34ce619f17dd",
"distinct":true,
"message":"asadasd",
"timestamp":"2024-08-12T13:32:45-05:00",
"url":"https://github.com/xxxxxxxxxxxxxxx/AzureWorkload_A00008/commit/34c07bcbf557413cf42b601c1794c87db8c321d1",
"author":{
"name":"aaaaaa aaaaaa",
"email":"101218171+aaaaaa@users.noreply.github.com",
"username":"aaaaaa"
},
"committer":{
"name":"GitasdjwqaikHubasdqw",
"email":"noreply@gitskcaskadahuqwdqbqwdqaw.com",
"username":"wdkcszjkcsebwdqwdfqwdawsldqodqw"
},
"added":[

],
"removed":[

],
"modified":[
"a.json",
"A1.json",
"A1.json"
]
}
],
"head_commit":{
"id":"34c07bcbf557413cf42b601c1794c87db8c321d1",
"tree_id":"5a6d71393611718b8576f8a63cdd34ce619f17dd",
"distinct":true,
"message":"sadwad from xxxxxxxxxxxxxxx/IH-5942-Pipeline-Change\n\nIh 5asdsazdapeline change",
"timestamp":"2024-08-12T13:32:45-05:00",
"url":"https://github.com/xxxxxxxxxxxxxxx/AzureWorkload_A00008/commit/3weweeeeeeeee",
"author":{
"name":"askjas",
"email":"101218171+asfsfgwsrsd@users.noreply.github.com",
"username":"asdwasdcqwasfdc-qwgbhvcfawdqxaiwdaszxc"
},
"committer":{
"name":"GsdzvcweditHuscwsab",
"email":"noreply@gitasdcwedhub.com",
"username":"wefczeb-fwefvdszlow"
},
"added":[

],
"removed":[

],
"modified":[
"zzzzzzz.json",
"Azzzzz.json",
"zzzz.json"
]
}
}

premrajvs
Explorer

Thank you for all the inputs. Here is the final query

index=Github_Webhook source="http:github-dev-token" eventtype="GitHub::Push" sourcetype="json_ae_git-webhook"
| rename repository.name as RepoName
| spath path=commits{} output=commitscollection
| mvexpand commitscollection
| fields _time RepoName commitscollection
| spath input=commitscollection
| table RepoName id added{} modified{} removed{} author.name author.email message



| spath path=commits{} output=commitscollection --> Thanks to all the responders. This is the step that extracts the commits from the array.

The next challenge: if you extract all the other fields with the same approach, each one becomes its own multivalue field, and the values can no longer be matched up with one another across fields. To address this, use mvexpand to split the array into separate events, one per commit.

Once the array is split into separate events, a second spath (with input=commitscollection) extracts the fields of each commit within its own event.
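For anyone who wants to try the pattern without access to the index, here is a minimal sketch in the same makeresults style used elsewhere in this thread. The two-commit payload and its values (c1, c2, a.json, and so on) are made up for illustration; only the spath / mvexpand / spath sequence mirrors the final query above.

```spl
| makeresults
| eval _raw="{\"commits\":[{\"id\":\"c1\",\"message\":\"first\",\"modified\":[\"a.json\"]},{\"id\":\"c2\",\"message\":\"second\",\"modified\":[\"b.json\",\"c.json\"]}]}"
| spath path=commits{} output=commitscollection
| mvexpand commitscollection
| spath input=commitscollection
| table id message modified{}
```

If it behaves like the final query, this should produce one row per commit (two rows here), with id, message and modified{} staying aligned within each row. modified{} remains multivalued for a commit that touched more than one file.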

Hope this helps.

yuanliu
SplunkTrust

In addition to the mistaken path notation (an array needs {}) that @PickleRick pointed out, you also do not need an extra spath if all you want is a multivalued field named commit_id. Splunk should already have taken care of the extraction.

 

index=XXXXX source="http:github-dev-token" eventtype="GitHub::Push" sourcetype="json_ae_git-webhook"
| rename commits{}.id as commit_id

 

This is a full emulation

 

| makeresults format=json data="[{
\"ref\":\"refs/heads/Dev\",
\"before\":\"d53e9b3cb6cde4253e05019295a840d394a7bcb0\",
\"after\":\"34c07bcbf557413cf42b601c1794c87db8c321d1\",
\"commits\":[
{
\"id\":\"a5c816a817d06e592d2b70cd8a088d1519f2d720\",
\"tree_id\":\"15e930e14d4c62aae47a3c02c47eb24c65d11807\",
\"distinct\":false,
\"message\":\"rrrrrrrrrrrrrrrrrrrrrr\",
\"timestamp\":\"2024-08-12T12:00:04-05:00\",
\"url\":\"https://github.com/xxxxxxxxxxxxxxx/AzureWorkload_A00008/commit/aaaaaaaaaaaa\",
\"author\":{
\"name\":\"aaaaaa aaaaaa\",
\"email\":\"101218171+aaaaaa@users.noreply.github.com\",
\"username\":\"aaaaaa\"},
\"committer\":{
\"name\":\"aaaaaa aaaaaa\",
\"email\":\"101218171+aaaaaa@users.noreply.github.com\",
\"username\":\"aaaaaa\"},
\"added\":[],
\"removed\":[],
\"modified\":[\"asdafasdad.json\"]},
{
\"id\":\"a3b3b6f728ccc0eb9113e7db723fbfc4ad220882\",
\"tree_id\":\"3586aeb0a33dc5e236cb266c948f83ff01320a9a\",
\"distinct\":false,
\"message\":\"xxxxxxxxxxxxxxxxxxx\",
\"timestamp\":\"2024-08-12T12:05:40-05:00\",
\"url\":\"https://github.com/xxxxxxxxxxxxxxx/AzureWorkload_A00008/commit/a3b3b6f728ccc0eb9113e7db723fbfc4ad220...\",
\"author\":{
\"name\":\"aaaaaa aaaaaa\",
\"email\":\"101218171+aaaaaa@users.noreply.github.com\",
\"username\":\"aaaaaa\"},
\"committer\":{
\"name\":\"aaaaaa aaaaaa\",
\"email\":\"101218171+aaaaaa@users.noreply.github.com\",
\"username\":\"aaaaaa\"},
\"added\":[],
\"removed\":[],
\"modified\":[ \"sddddddf.json\"]},
{
\"id\":\"bdcd242d6854365ddfeae6b4f86cf7bc1766e028\",
\"tree_id\":\"8286c537f7dee57395f44875ddb8b2cdb7dd48b2\",
\"distinct\":false,
\"message\":\"Updating pipeline: pl_gwp_file_landing_check. Adding Sylvan Performance\",
\"timestamp\":\"2024-08-12T12:06:10-05:00\",
\"url\":\"https://github.com/xxxxxxxxxxxxxxx/AzureWorkload_A00008/commit/bdcd242d6854365ddfeae6b4f86cf7bc1766e...\",
\"author\":{
\"name\":\"aaaaaa aaaaaa\",
\"email\":\"101218171+aaaaaa@users.noreply.github.com\",
\"username\":\"aaaaaa\"},
\"committer\":{
\"name\":\"aaaaaa aaaaaa\",
\"email\":\"101218171+aaaaaa@users.noreply.github.com\",
\"username\":\"aaaaaa\"},
\"added\":[],
\"removed\":[],
\"modified\":[ \"asadwefvdx.json\"]},
{
\"id\":\"108ebd4ff8ae9dd70e669e2ca49e293684d5c37a\",
\"tree_id\":\"5a6d71393611718b8576f8a63cdd34ce619f17dd\",
\"distinct\":false,
\"message\":\"asdrwerwq\",
\"timestamp\":\"2024-08-12T10:09:33-07:00\",
\"url\":\"https://github.com/xxxxxxxxxxxxxxx/AzureWorkload_A00008/commit/108ebd4ff8ae9dd70e669e2ca49e293684d5c...\",
\"author\":{
\"name\":\"dfsd\",
\"email\":\"l.llllllllllll@aaaaaa.com\",
\"username\":\"aaaaaa\"},
\"committer\":{
\"name\":\"lllllllllllll\",
\"email\":\"l.llllllllllll@abc.com\",
\"username\":\"aaaaaa\"},
\"added\":[],
\"removed\":[],
\"modified\":[\"A.json\",\"A.json\",\"A.json\"]},{
\"id\":\"34c07bcbf557413cf42b601c1794c87db8c321d1\",
\"tree_id\":\"5a6d71393611718b8576f8a63cdd34ce619f17dd\",
\"distinct\":true,
\"message\":\"asadasd\",
\"timestamp\":\"2024-08-12T13:32:45-05:00\",
\"url\":\"https://github.com/xxxxxxxxxxxxxxx/AzureWorkload_A00008/commit/34c07bcbf557413cf42b601c1794c87db8c32...\",
\"author\":{
\"name\":\"aaaaaa aaaaaa\",
\"email\":\"101218171+aaaaaa@users.noreply.github.com\",
\"username\":\"aaaaaa\"},
\"committer\":{
\"name\":\"GitasdjwqaikHubasdqw\",
\"email\":\"noreply@gitskcaskadahuqwdqbqwdqaw.com\",
\"username\":\"wdkcszjkcsebwdqwdfqwdawsldqodqw\"},
\"added\":[],
\"removed\":[],
\"modified\":[ \"a.json\", \"A1.json\", \"A1.json\"]}],
\"head_commit\":{
\"id\":\"34c07bcbf557413cf42b601c1794c87db8c321d1\",
\"tree_id\":\"5a6d71393611718b8576f8a63cdd34ce619f17dd\",
\"distinct\":true,
\"message\":\"sadwad from xxxxxxxxxxxxxxx/IH-5942-Pipeline-Change\n\nIh 5asdsazdapeline change\",
\"timestamp\":\"2024-08-12T13:32:45-05:00\",
\"url\":\"https://github.com/xxxxxxxxxxxxxxx/AzureWorkload_A00008/commit/3weweeeeeeeee\",
\"author\":{
\"name\":\"askjas\",
\"email\":\"101218171+asfsfgwsrsd@users.noreply.github.com\",
\"username\":\"asdwasdcqwasfdc-qwgbhvcfawdqxaiwdaszxc\" },
\"committer\":{
\"name\":\"GsdzvcweditHuscwsab\",
\"email\":\"noreply@gitasdcwedhub.com\",
\"username\":\"wefczeb-fwefvdszlow\"},
\"added\":[],
\"removed\":[],
\"modified\":[\"zzzzzzz.json\",\"Azzzzz.json\",\"zzzz.json\" ]}}]"
| spath
``` the above emulates
index=XXXXX source="http:github-dev-token" eventtype="GitHub::Push" sourcetype="json_ae_git-webhook"
```
| rename commits{}.id as commit_id
| table commit_id

 

The output is

commit_id
a5c816a817d06e592d2b70cd8a088d1519f2d720
a3b3b6f728ccc0eb9113e7db723fbfc4ad220882
bdcd242d6854365ddfeae6b4f86cf7bc1766e028
108ebd4ff8ae9dd70e669e2ca49e293684d5c37a
34c07bcbf557413cf42b601c1794c87db8c321d1

PickleRick
SplunkTrust

Assuming you wanted to say

path=commits{}.id

it seems to work for me.

| makeresults 
| eval _raw="{
\"ref\":\"refs/heads/Dev\",
\"before\":\"d53e9b3cb6cde4253e05019295a840d394a7bcb0\",
\"after\":\"34c07bcbf557413cf42b601c1794c87db8c321d1\",
\"commits\":[
{
\"id\":\"a5c816a817d06e592d2b70cd8a088d1519f2d720\",
\"tree_id\":\"15e930e14d4c62aae47a3c02c47eb24c65d11807\",
\"distinct\":false,
\"message\":\"rrrrrrrrrrrrrrrrrrrrrr\",
\"timestamp\":\"2024-08-12T12:00:04-05:00\",
\"url\":\"https://github.com/xxxxxxxxxxxxxxx/AzureWorkload_A00008/commit/aaaaaaaaaaaa\",
\"author\":{
\"name\":\"aaaaaa aaaaaa\",
\"email\":\"101218171+aaaaaa@users.noreply.github.com\",
\"username\":\"aaaaaa\"},
\"committer\":{
\"name\":\"aaaaaa aaaaaa\",
\"email\":\"101218171+aaaaaa@users.noreply.github.com\",
\"username\":\"aaaaaa\"},
\"added\":[],
\"removed\":[],
\"modified\":[\"asdafasdad.json\"]},
{
\"id\":\"a3b3b6f728ccc0eb9113e7db723fbfc4ad220882\",
\"tree_id\":\"3586aeb0a33dc5e236cb266c948f83ff01320a9a\",
\"distinct\":false,
\"message\":\"xxxxxxxxxxxxxxxxxxx\",
\"timestamp\":\"2024-08-12T12:05:40-05:00\",
\"url\":\"https://github.com/xxxxxxxxxxxxxxx/AzureWorkload_A00008/commit/a3b3b6f728ccc0eb9113e7db723fbfc4ad220...\",
\"author\":{
\"name\":\"aaaaaa aaaaaa\",
\"email\":\"101218171+aaaaaa@users.noreply.github.com\",
\"username\":\"aaaaaa\"},
\"committer\":{
\"name\":\"aaaaaa aaaaaa\",
\"email\":\"101218171+aaaaaa@users.noreply.github.com\",
\"username\":\"aaaaaa\"},
\"added\":[],
\"removed\":[],
\"modified\":[ \"sddddddf.json\"]},
{
\"id\":\"bdcd242d6854365ddfeae6b4f86cf7bc1766e028\",
\"tree_id\":\"8286c537f7dee57395f44875ddb8b2cdb7dd48b2\",
\"distinct\":false,
\"message\":\"Updating pipeline: pl_gwp_file_landing_check. Adding Sylvan Performance\",
\"timestamp\":\"2024-08-12T12:06:10-05:00\",
\"url\":\"https://github.com/xxxxxxxxxxxxxxx/AzureWorkload_A00008/commit/bdcd242d6854365ddfeae6b4f86cf7bc1766e...\",
\"author\":{
\"name\":\"aaaaaa aaaaaa\",
\"email\":\"101218171+aaaaaa@users.noreply.github.com\",
\"username\":\"aaaaaa\"},
\"committer\":{
\"name\":\"aaaaaa aaaaaa\",
\"email\":\"101218171+aaaaaa@users.noreply.github.com\",
\"username\":\"aaaaaa\"},
\"added\":[],
\"removed\":[],
\"modified\":[ \"asadwefvdx.json\"]},
{
\"id\":\"108ebd4ff8ae9dd70e669e2ca49e293684d5c37a\",
\"tree_id\":\"5a6d71393611718b8576f8a63cdd34ce619f17dd\",
\"distinct\":false,
\"message\":\"asdrwerwq\",
\"timestamp\":\"2024-08-12T10:09:33-07:00\",
\"url\":\"https://github.com/xxxxxxxxxxxxxxx/AzureWorkload_A00008/commit/108ebd4ff8ae9dd70e669e2ca49e293684d5c...\",
\"author\":{
\"name\":\"dfsd\",
\"email\":\"l.llllllllllll@aaaaaa.com\",
\"username\":\"aaaaaa\"},
\"committer\":{
\"name\":\"lllllllllllll\",
\"email\":\"l.llllllllllll@abc.com\",
\"username\":\"aaaaaa\"},
\"added\":[],
\"removed\":[],
\"modified\":[\"A.json\",\"A.json\",\"A.json\"]},{
\"id\":\"34c07bcbf557413cf42b601c1794c87db8c321d1\",
\"tree_id\":\"5a6d71393611718b8576f8a63cdd34ce619f17dd\",
\"distinct\":true,
\"message\":\"asadasd\",
\"timestamp\":\"2024-08-12T13:32:45-05:00\",
\"url\":\"https://github.com/xxxxxxxxxxxxxxx/AzureWorkload_A00008/commit/34c07bcbf557413cf42b601c1794c87db8c32...\",
\"author\":{
\"name\":\"aaaaaa aaaaaa\",
\"email\":\"101218171+aaaaaa@users.noreply.github.com\",
\"username\":\"aaaaaa\"},
\"committer\":{
\"name\":\"GitasdjwqaikHubasdqw\",
\"email\":\"noreply@gitskcaskadahuqwdqbqwdqaw.com\",
\"username\":\"wdkcszjkcsebwdqwdfqwdawsldqodqw\"},
\"added\":[],
\"removed\":[],
\"modified\":[ \"a.json\", \"A1.json\", \"A1.json\"]}],
\"head_commit\":{
\"id\":\"34c07bcbf557413cf42b601c1794c87db8c321d1\",
\"tree_id\":\"5a6d71393611718b8576f8a63cdd34ce619f17dd\",
\"distinct\":true,
\"message\":\"sadwad from xxxxxxxxxxxxxxx/IH-5942-Pipeline-Change\n\nIh 5asdsazdapeline change\",
\"timestamp\":\"2024-08-12T13:32:45-05:00\",
\"url\":\"https://github.com/xxxxxxxxxxxxxxx/AzureWorkload_A00008/commit/3weweeeeeeeee\",
\"author\":{
\"name\":\"askjas\",
\"email\":\"101218171+asfsfgwsrsd@users.noreply.github.com\",
\"username\":\"asdwasdcqwasfdc-qwgbhvcfawdqxaiwdaszxc\" },
\"committer\":{
\"name\":\"GsdzvcweditHuscwsab\",
\"email\":\"noreply@gitasdcwedhub.com\",
\"username\":\"wefczeb-fwefvdszlow\"},
\"added\":[],
\"removed\":[],
\"modified\":[\"zzzzzzz.json\",\"Azzzzz.json\",\"zzzz.json\" ]}}"
| spath output=commit_id path=commits{}.id | table commit_id

This shows all 5 values (Splunk 9.3.0).
