
Duplicate fields in Splunk events

Yashvik
Explorer

Hi All,

When we run a Splunk search in our application (sh_app1), we notice that some fields are duplicated / doubled up (refer: sample_logs.png).

If we run the same search in another application (sh_welcome_app_ui), we do not see any duplication for the same fields.

cid = Perf-May06-9-151xxx
level = INFO
node_name = aks-application-xxx


SPL being used:

index=splunk_idx source=some_source
| rex field=log "level=(?<level>.*?),"
| rex field=log "\[CID:(?<cid>.*?)\]"
| rex field=log "message=(?<msg>.*?),"
| rex field=log "elapsed_time_ms=\"(?<elap>.*?)\""
| search msg="\"search pattern\""
| table cid, msg, elap

The event count remains the same whether we search inside that app or in any other app; only some fields are duplicated. We couldn't figure out where the actual issue is.
Can someone help?

1 Solution

gcusello
SplunkTrust

Hi @Yashvik,

this probably depends on the data you're using.

Anyway, try to group your data by a common key using the stats command instead of table, something like this:

index=splunk_idx source=some_source
| rex field=log "level=(?<level>.*?),"
| rex field=log "\[CID:(?<cid>.*?)\]"
| rex field=log "message=(?<msg>.*?),"
| rex field=log "elapsed_time_ms=\"(?<elap>.*?)\""
| search msg="\"search pattern\""
| stats values(msg) AS msg values(elap) AS elap BY cid

Ciao.

Giuseppe


Eider
Engager

In my case I was sending TCP info (JSON) through the REST API, and I had to recreate my sourcetype configuration like this:

  • Name: Whatever
  • Description: Whatever
  • Destination App: Whatever
  • Category: Whatever
  • Indexed extractions: json
  • Next, in the Advanced tab, you need to add this extra setting: KV_MODE = none


The reason is that the JSON I send via the API already contains the event attribute in the format Splunk expects, so KV_MODE (key-value mode) should be set to none; that way you avoid parsing the event JSON data twice.

{
 "sourcetype": "MyCustomSourceType",
 "index": "index-name",
 "event": {
  "a": "aa",
  "n": 1, .....
 }
}
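For reference, the same settings can also be applied directly in props.conf on the instance that parses the data; this is just a sketch, with the sourcetype name as a placeholder:

[MyCustomSourceType]
# parse the incoming payload as JSON (mirrors "Indexed extractions: json" above)
INDEXED_EXTRACTIONS = json
# disable search-time key-value extraction so the JSON fields are not extracted a second time
KV_MODE = none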


0 Karma


Yashvik
Explorer

Hi @gcusello 
Thanks for the reply. Using stats helps remove the duplicate values in the "Statistics" tab. However, the duplicate fields still appear in the "Events" tab. I don't understand how that's happening.

P.S. For some unknown reason, I can't attach images.

0 Karma

gcusello
SplunkTrust

Hi @Yashvik ,

the events are the ones you have; if you don't want duplicated events in the Events tab either, use the dedup command (https://docs.splunk.com/Documentation/SCS/current/SearchReference/DedupCommandOverview) to remove the duplicated ones.
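For example, something like this on top of your existing search (just a sketch, assuming cid is the key you consider unique per event):

index=splunk_idx source=some_source
| rex field=log "\[CID:(?<cid>.*?)\]"
| rex field=log "message=(?<msg>.*?),"
| dedup cid
| table cid, msg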

Ciao.

Giuseppe

0 Karma

Yashvik
Explorer

Hi @gcusello 
Thanks, however the actual issue is field duplication. Please find the attached screenshot; you will see that some fields contain duplicate values (cid, cluster, container_id, container_name, etc.).
I'd like to understand why they are showing two values instead of one.

0 Karma

gcusello
SplunkTrust

Hi @Yashvik,

as I said, these are your logs and we cannot change them; you can only display them once to avoid useless duplications.

In addition, this is very frequent with json logs.

For this reason, I suggest using stats to display your logs in Statistics (and dashboard panels), even if the raw logs have duplicated values in some fields.

You shouldn't modify your logs; they are what they are, and you use them by displaying what you need.

Ciao.

Giuseppe

0 Karma

Yashvik
Explorer

Hello @gcusello 
But the source doesn't contain any duplicate fields when sending to Splunk, and they appear only if we search within a particular app.
As said earlier, if I run the same query outside the app, I don't see these duplicate field values. My users don't have permission to run searches outside their app, so they see the duplicate entries every time.

0 Karma

gcusello
SplunkTrust

Hi @Yashvik,

your data seems to be json, which usually has duplicated field values.

Anyway, could you share a sample of your data (please, not a screenshot)?

About the behaviour in a particular app: maybe there are some calculated fields that process your values.
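For example, you could list the calculated fields visible from your app with something like this (just a sketch; the columns you want to display may differ):

| rest /servicesNS/-/-/data/props/calcfields
| table title eai:acl.app eai:acl.sharing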

Ciao.

Giuseppe


0 Karma

Yashvik
Explorer

Sure @gcusello 

Sample event:

{
   application: uslcc-nonprod
   cluster: AKS-SYD-NPDI1-ESE-2
   container_id: 9ae09dba5f0ca4c75dfxxxxxxb6b1824ec753663f02d832cf5bfb6f0dxxxxxxx
   container_image: acrsydnpdi1ese.azurecr.io/ms-usl-acct-maint:snapshot-a23584a1221b57xxxxxb437d80xxxxxxb6e65
   container_name: ms-usl-acct-maint
   level: INFO
   log: 2024-05-06 11:08:40.385 INFO 26 --- [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] [CID:Perf-May06-9-151615] l.AccountCreditLimitChangedKafkaListener : message="xxxxx listener 'account credit limit event enrichment'", elapsed_time_ms="124"
   namespace: uslcc-nonprod
   node_name: aks-application-3522xxxxx-vmss0000xl
   pod_ip: 10.209.82.xxx
   pod_name: ms-usl-acct-maint-ppte-7dc7xxxxxx-2fc58
   tenant: uslcc
   timestamp: 2024-05-06 11:08:40.385
}


Raw:  
{"log":"2024-05-06 11:08:40.385 INFO 26 --- [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] [CID:Perf-May06-9-151615] l.AccountCreditLimitChangedxxxxxListener : message=\"xxxxx listener 'account credit limit event enrichment'\", elapsed_time_ms=\"124\"","application":"uslcc-nonprod","cluster":"AKS-SYD-NPDI1-ESE-2","namespace":"uslcc-nonprod","tenant":"uslcc","timestamp":"2024-05-06 11:08:40.385","level":"INFO","container_id":"9ae09dba5xxxxxfd2724b6b1824ec753663f02dxxxxxf0d55d59940","container_name":"ms-usl-acct-maint","container_image":"acrsydnpdi1ese.azurecr.io/ms-usl-acct-maint:snapshot-a23584a1221b5749xxxxxd803eb2aabaxxxxx5","pod_name":"ms-usl-acct-maint-ppte-7dc7c9xxxxc58","pod_ip":"10.209.82.xxx","node_name":"aks-application-35229300-vmssxxxxxl"}

0 Karma

gcusello
SplunkTrust

Hi @Yashvik,

as I said, check the calculated fields in your app.
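For example, on the Search Head you could run something like this from the CLI (the sourcetype name and $SPLUNK_HOME are placeholders for your environment):

$SPLUNK_HOME/bin/splunk btool props list your_sourcetype --debug | grep -E "EVAL-|EXTRACT-|REPORT-|FIELDALIAS-"

You can also review them in Splunk Web under Settings > Fields > Calculated fields, filtering by the app your users search in.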

Ciao.

Giuseppe

0 Karma

Yashvik
Explorer

Thanks @gcusello, will get it checked.

0 Karma

gcusello
SplunkTrust

Hi @Yashvik,

good for you, see you next time!

Ciao and happy splunking

Giuseppe

P.S.: Karma Points are appreciated 😉

0 Karma