Solved: Re: CSV Input, duplicates values

miguel1423 · ‎04-08-2020

I am monitoring my PKI certificates with a PowerShell script which returns the number of valide days for each certificats in a Excel document on my splunk server.

My script is running each days to update my value "validity", the problem is I have multiple entry for each certificat with the validity diffrents on my dashboard.

How can I only update the value "validity" after that my scritp run and not have a new entry ?

My query:
index = "index_pki" sourcetype = "splunk_csv" | statistics count by ReqID, CN, Template, Validity, NotAfter, NotBefore, San, Tumbprint | where validity <30 | sort Validity

And the printscrenn

Thanks a lot

DalJeanis · ‎04-10-2020

You just need to dedup your results for the most recent value for each certificate, or you can ask for latest() for each field in your stats command.

This way should work, assuming that ReqID and CN are the key.

index = "index_pki" sourcetype = "splunk_csv" 
| fields _time ReqID  CN  Template  Validity  NotAfter  NotBefore  San  Tumbprint
| stats max(_time) as _time latest(*) as * by ReqID, CN
| where Validity < 30 
| sort Validity
| table _time ReqID  CN  Template  Validity  NotAfter  NotBefore  San  Tumbprint

This way should also work, assuming that ReqID and CN are the key.

index = "index_pki" sourcetype = "splunk_csv" 
| fields _time ReqID  CN  Template  Validity  NotAfter  NotBefore  San  Tumbprint
| dedup  ReqID CN
| where Validity < 30 
| sort Validity
| table _time ReqID  CN  Template  Validity  NotAfter  NotBefore  San  Tumbprint

Note - I did not correct any field names, but assumed your spelling was correct.

View solution in original post

to4kawa · ‎04-10-2020

 index = "index_pki" sourcetype = "splunk_csv" 
| stats max(Validity) as Validity by ReqID, CN, Template,NotAfter, NotBefore, San, Tumbprint
| where Validity < 30 
| sort Validity

How's this? stats by clause makes unique field values.

DalJeanis · ‎04-10-2020

You just need to dedup your results for the most recent value for each certificate, or you can ask for latest() for each field in your stats command.

This way should work, assuming that ReqID and CN are the key.

index = "index_pki" sourcetype = "splunk_csv" 
| fields _time ReqID  CN  Template  Validity  NotAfter  NotBefore  San  Tumbprint
| stats max(_time) as _time latest(*) as * by ReqID, CN
| where Validity < 30 
| sort Validity
| table _time ReqID  CN  Template  Validity  NotAfter  NotBefore  San  Tumbprint

This way should also work, assuming that ReqID and CN are the key.

index = "index_pki" sourcetype = "splunk_csv" 
| fields _time ReqID  CN  Template  Validity  NotAfter  NotBefore  San  Tumbprint
| dedup  ReqID CN
| where Validity < 30 
| sort Validity
| table _time ReqID  CN  Template  Validity  NotAfter  NotBefore  San  Tumbprint

Note - I did not correct any field names, but assumed your spelling was correct.

miguel1423 · ‎04-14-2020

Hello,

Thank you for your answer, your second solution works pretty well 😃 but what does "field" do ? | fields _time ReqID CN Template Validity NotAfter NotBefore San Tumbprint

Regards

DalJeanis · ‎04-14-2020

The fields command limits the results to exactly the fields that you list, making sure that those fields are extracted but saving the system from having to attempt to extract anything else. (Technically, the underscore field _time does not have to be listed, but I put it there anyway for documentation purposes.)

CSV Input, duplicates values

Data Management Digest – December 2025

Index This | What is broken 80% of the time by February?

Unlock Faster Time-to-Value on Edge and Ingest Processor with New SPL2 Pipeline ...

Join the Conversation

CSV Input, duplicates values

Data Management Digest – December 2025

Index This | What is broken 80% of the time by February?

Unlock Faster Time-to-Value on Edge and Ingest Processor with New SPL2 Pipeline ...