Splunk Search

Splunk to search only latest log file and not any historical data

shashankk
Communicator

My requirement is simple: I have created a certificate monitoring script and am feeding its log file to a Splunk dashboard. I want Splunk to read only the latest log file and not keep any historical data in the search events.

Below is a sample of the log file output (fields are separated by "|"):

ALERT|appu2.de.com|rootca12|/applications/hs_cert/cert/live/h_hcm.jks|Expired|2020-10-18
WARNING|appu2.de.com|key|/applications/hs_cert/cert/live/h_hcm.jks|Expiring Soon|2025-06-14
INFO|appu2.de.com|rootca13|/applications/hs_cert/cert/live/h_core.jks|Valid|2026-10-18
ALERT|appu2.de.com|rootca12|/applications/hs_cert/cert/live/h_core.jks|Expired|2020-10-18
WARNING|appu2.de.com|key|/applications/hs_cert/cert/live/h_core.jks|Expiring Soon|2025-03-22
ALERT|appu2.de.com|key|/applications/hs_cert/cert/live/h_mq.p12|Expired|2025-01-03

 

 

 
I have two questions:
1. How do I handle only the latest log file content (no history) in "inputs.conf"? What changes are needed?
2. Below is my sample SPL query. Kindly check it and suggest any changes.

 

index=test_event source=/applications/hs_cert/cert/log/cert_monitor.log
| rex field=_raw "(?<Severity>[^\|]+)\|(?<Hostname>[^\|]+)\|(?<CertIssuer>[^\|]+)\|(?<FilePath>[^\|]+)\|(?<Status>[^\|]+)\|(?<ExpiryDate>[^\|]+)"
| multikv forceheader=1
| table Severity Hostname CertIssuer FilePath Status ExpiryDate

 


@ITWhisperer - Kindly help


shashankk
Communicator

@marnall @yuanliu @gcusello First of all, thank you for your kind responses, and I apologize if my query confused you. I have added a (masked) screenshot of the raw events from Splunk (All Time). As the screenshot shows, the events appear as one complete block. I simply want to show the latest available log output in the Splunk dashboard. The queries shared above return only the first line of each event block, which seems to be incorrect output.

Please refer to the attached screenshot and suggest accordingly.

Thanks in advance 🙂


yuanliu
SplunkTrust

"If you refer the screenshot, it is showing events in one complete block. I simply want to show the latest available log output through the splunk dashboard. The queries shared above is only referring to the head 1 line of each event block. That seems to be an incorrect output."

This is even more confusing. The latest available log output is exactly what head 1 returns, which @marnall has already given you. Could you illustrate the difference between that search's output and the output you want? (No screenshots. Use text, a text table, etc.)

To help yourself, here are four golden rules that I call four commandments of asking an answerable question:

  • Illustrate data input (in raw text, anonymize as needed), whether they are raw events or output from a search (SPL that volunteers here do not have to look at).
  • Illustrate the desired output from illustrated data.
  • Explain the logic between illustrated data and desired output without SPL.
  • If you also illustrate attempted SPL, illustrate actual output and compare with desired output, explain why they look different to you if that is not painfully obvious.

shashankk
Communicator

@yuanliu @marnall @gcusello Kindly refer to the query and its output below:

index=test_event source=/applications/hs_cert/cert/log/cert_monitor.log host="appu1.com"
| head 1
| rex field=_raw "(?<Severity>[^\|]+)\|(?<Hostname>[^\|]+)\|(?<CertIssuer>[^\|]+)\|(?<FilePath>[^\|]+)\|(?<Status>[^\|]+)\|(?<ExpiryDate>[^\|\s]+)"
| multikv forceheader=1
| table Severity Hostname CertIssuer FilePath Status ExpiryDate

 

 

 

Actual Output:

Severity  Hostname   CertIssuer  FilePath                                    Status  ExpiryDate
INFO      appu1.com  rootca13    /applications/hs_cert/cert/live/h_core.jks  Valid   2026-10-18
INFO      appu1.com  rootca13    /applications/hs_cert/cert/live/h_core.jks  Valid   2026-10-18
INFO      appu1.com  rootca13    /applications/hs_cert/cert/live/h_core.jks  Valid   2026-10-18

 

Expected Output:

Severity  Hostname   CertIssuer  FilePath                                    Status   ExpiryDate
INFO      appu1.com  rootca13    /applications/hs_cert/cert/live/h_hcm.jks   Valid    2026-10-18
ALERT     appu1.com  rootca12    /applications/hs_cert/cert/live/h_hcm.jks   Expired  2020-10-18
INFO      appu1.com  key         /applications/hs_cert/cert/live/h_hcm.jks   Valid    2025-06-14
INFO      appu1.com  rootca13    /applications/hs_cert/cert/live/h_core.jks  Valid    2026-10-18
ALERT     appu1.com  rootca12    /applications/hs_cert/cert/live/h_core.jks  Expired  2020-10-18
INFO      appu1.com  key         /applications/hs_cert/cert/live/h_core.jks  Valid    2025-03-22


As the Actual Output section above shows, the search returns only the first line of the block, repeated three times. This is not the expected output for the event block I shared.

Can you please suggest whether there is a mistake in the rex I used? Please also refer to the scenario below:

index=test_event source=/applications/hs_cert/cert/log/cert_monitor.log host="appu1.com"
| rex field=_raw "(?<Severity>[^\|]+)\|(?<Hostname>[^\|]+)\|(?<CertIssuer>[^\|]+)\|(?<FilePath>[^\|]+)\|(?<Status>[^\|]+)\|(?<ExpiryDate>[^\|\s]+)"
| table Severity Hostname CertIssuer FilePath Status ExpiryDate

 

 

I am seeing only the first row from each event block, instead of all rows from the log file:

Severity  Hostname   CertIssuer  FilePath                                    Status  ExpiryDate
INFO      appu1.com  rootca13    /applications/hs_cert/cert/live/h_hcm.jks   Valid   2026-10-18
INFO      appu1.com  rootca13    /applications/hs_cert/cert/live/h_core.jks  Valid   2026-10-18


Kindly suggest. 


yuanliu
SplunkTrust

The problem arises, at least in part, from the missing header in the event data. If the illustrated raw event is a complete event in _raw, this is what you can do to add that header. No need for rex.

 

index=test_event source=/applications/hs_cert/cert/log/cert_monitor.log
| head 1
| eval _raw = "Severity,Hostname,CertIssuer,FilePath,Status,ExpiryDate
" . replace(_raw, "\|", ",")
| multikv forceheader=1
| table Severity Hostname CertIssuer FilePath Status ExpiryDate
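
Outside Splunk, the same header-then-parse idea can be sanity-checked with a short Python sketch. It only illustrates the logic of the eval above (prepend a header line, swap "|" for ","), not how multikv works internally. The function name parse_cert_block is hypothetical, and the sketch assumes no field contains a comma.

```python
import csv
import io

HEADER = "Severity,Hostname,CertIssuer,FilePath,Status,ExpiryDate"

def parse_cert_block(raw: str) -> list[dict[str, str]]:
    """Turn one multi-line, pipe-separated event block into a list of row dicts."""
    # Prepend the header and swap the delimiter, mirroring the eval step.
    text = HEADER + "\n" + raw.strip().replace("|", ",")
    return list(csv.DictReader(io.StringIO(text)))

sample = """ALERT|appu2.de.com|rootca12|/applications/hs_cert/cert/live/h_hcm.jks|Expired|2020-10-18
WARNING|appu2.de.com|key|/applications/hs_cert/cert/live/h_hcm.jks|Expiring Soon|2025-06-14
INFO|appu2.de.com|rootca13|/applications/hs_cert/cert/live/h_core.jks|Valid|2026-10-18"""

rows = parse_cert_block(sample)
for row in rows:
    print(row["Severity"], row["FilePath"], row["ExpiryDate"])
```

Each line of the block becomes its own row, which is the behaviour the dashboard table needs.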

 

Here is a complete emulation:

 

 

| makeresults
| eval _raw="ALERT|appu2.de.com|rootca12|/applications/hs_cert/cert/live/h_hcm.jks|Expired|2020-10-18
WARNING|appu2.de.com|key|/applications/hs_cert/cert/live/h_hcm.jks|Expiring Soon|2025-06-14
INFO|appu2.de.com|rootca13|/applications/hs_cert/cert/live/h_core.jks|Valid|2026-10-18
ALERT|appu2.de.com|rootca12|/applications/hs_cert/cert/live/h_core.jks|Expired|2020-10-18
WARNING|appu2.de.com|key|/applications/hs_cert/cert/live/h_core.jks|Expiring Soon|2025-03-22
ALERT|appu2.de.com|key|/applications/hs_cert/cert/live/h_mq.p12|Expired|2025-01-03"

``` the above emulates
index=test_event source=/applications/hs_cert/cert/log/cert_monitor.log
```
| head 1
| rex field=_raw "(?<Severity>[^\|]+)\|(?<Hostname>[^\|]+)\|(?<CertIssuer>[^\|]+)\|(?<FilePath>[^\|]+)\|(?<Status>[^\|]+)\|(?<ExpiryDate>[^\|\s]+)"
| multikv forceheader=1
| table Severity Hostname CertIssuer FilePath Status ExpiryDate

 

Output is

Severity  Hostname      CertIssuer  FilePath                                    Status         ExpiryDate
ALERT     appu2.de.com  rootca12    /applications/hs_cert/cert/live/h_hcm.jks   Expired        2020-10-18
WARNING   appu2.de.com  key         /applications/hs_cert/cert/live/h_hcm.jks   Expiring Soon  2025-06-14
INFO      appu2.de.com  rootca13    /applications/hs_cert/cert/live/h_core.jks  Valid          2026-10-18
ALERT     appu2.de.com  rootca12    /applications/hs_cert/cert/live/h_core.jks  Expired        2020-10-18
WARNING   appu2.de.com  key         /applications/hs_cert/cert/live/h_core.jks  Expiring Soon  2025-03-22
ALERT     appu2.de.com  key         /applications/hs_cert/cert/live/h_mq.p12    Expired        2025-01-03

 


yuanliu
SplunkTrust

Several things need to be clarified. First, as @marnall says, you cannot use inputs.conf to make Splunk handle only part of an event unless you can predetermine which row in that multi-row data is the "latest". Second, when you say "latest", people generally understand it to mean the latest event in the index. If you only want to SHOW the latest row based on ExpiryDate, that is easily achieved in search.

Third, your "simple" requirement statement omits an important qualifier: do you want the absolute largest ExpiryDate in the entire log, or the largest ExpiryDate per group by some criterion, e.g., by FilePath?

If it's the former, you can simply do

 

index=test_event source=/applications/hs_cert/cert/log/cert_monitor.log
| rex field=_raw "(?<Severity>[^\|]+)\|(?<Hostname>[^\|]+)\|(?<CertIssuer>[^\|]+)\|(?<FilePath>[^\|]+)\|(?<Status>[^\|]+)\|(?<ExpiryDate>[^\|]+)"
| multikv forceheader=1
| sort ExpiryDate
| tail 1
| table Severity Hostname CertIssuer FilePath Status ExpiryDate

 

If, on the other hand, you want the largest ExpiryDate by FilePath, which seems more practical to me, you could do

 

index=test_event source=/applications/hs_cert/cert/log/cert_monitor.log
| rex field=_raw "(?<Severity>[^\|]+)\|(?<Hostname>[^\|]+)\|(?<CertIssuer>[^\|]+)\|(?<FilePath>[^\|]+)\|(?<Status>[^\|]+)\|(?<ExpiryDate>[^\|]+)"
| multikv
| sort - FilePath ExpiryDate
| stats latest(*) as * by FilePath
| table Severity Hostname CertIssuer FilePath Status ExpiryDate

 

Output from this search using your sample data is

Severity  Hostname      CertIssuer  FilePath                                    Status         ExpiryDate
INFO      appu2.de.com  rootca13    /applications/hs_cert/cert/live/h_core.jks  Valid          2026-10-18
WARNING   appu2.de.com  key         /applications/hs_cert/cert/live/h_hcm.jks   Expiring Soon  2025-06-14
ALERT     appu2.de.com  key         /applications/hs_cert/cert/live/h_mq.p12    Expired        2025-01-03

 

This method relies on a side effect of the latest function's assumptions about event order. There are more resilient methods to do this, too.
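
One way to read "more resilient" is to compare ExpiryDate values explicitly instead of depending on row order. The Python sketch below illustrates only that selection logic on the sample data. It is not SPL, and the function name latest_expiry_per_file is hypothetical.

```python
FIELDS = ("Severity", "Hostname", "CertIssuer", "FilePath", "Status", "ExpiryDate")

def latest_expiry_per_file(lines):
    """Return, for each FilePath, the row with the largest ExpiryDate."""
    best = {}
    for line in lines:
        row = dict(zip(FIELDS, line.strip().split("|")))
        current = best.get(row["FilePath"])
        # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
        if current is None or row["ExpiryDate"] > current["ExpiryDate"]:
            best[row["FilePath"]] = row
    return best

sample = [
    "ALERT|appu2.de.com|rootca12|/applications/hs_cert/cert/live/h_hcm.jks|Expired|2020-10-18",
    "WARNING|appu2.de.com|key|/applications/hs_cert/cert/live/h_hcm.jks|Expiring Soon|2025-06-14",
    "INFO|appu2.de.com|rootca13|/applications/hs_cert/cert/live/h_core.jks|Valid|2026-10-18",
    "ALERT|appu2.de.com|rootca12|/applications/hs_cert/cert/live/h_core.jks|Expired|2020-10-18",
    "WARNING|appu2.de.com|key|/applications/hs_cert/cert/live/h_core.jks|Expiring Soon|2025-03-22",
    "ALERT|appu2.de.com|key|/applications/hs_cert/cert/live/h_mq.p12|Expired|2025-01-03",
]

best = latest_expiry_per_file(sample)
for path, row in sorted(best.items()):
    print(path, row["Severity"], row["ExpiryDate"])
```

Because the comparison is on the date value itself, the result does not change if the rows are shuffled.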

Here is an emulation of your sample data that you can play with and compare with real data:

 

| makeresults format=csv data="_raw
ALERT|appu2.de.com|rootca12|/applications/hs_cert/cert/live/h_hcm.jks|Expired|2020-10-18
WARNING|appu2.de.com|key|/applications/hs_cert/cert/live/h_hcm.jks|Expiring Soon|2025-06-14
INFO|appu2.de.com|rootca13|/applications/hs_cert/cert/live/h_core.jks|Valid|2026-10-18
ALERT|appu2.de.com|rootca12|/applications/hs_cert/cert/live/h_core.jks|Expired|2020-10-18
WARNING|appu2.de.com|key|/applications/hs_cert/cert/live/h_core.jks|Expiring Soon|2025-03-22
ALERT|appu2.de.com|key|/applications/hs_cert/cert/live/h_mq.p12|Expired|2025-01-03"
``` the above emulates
index=test_event source=/applications/hs_cert/cert/log/cert_monitor.log
```

 


gcusello
SplunkTrust

Hi @shashankk ,

If your logs arrive in blocks (with more or less the same timestamp), you could use a solution like this:

index=test_event source=/applications/hs_cert/cert/log/cert_monitor.log [ search index=test_event source=/applications/hs_cert/cert/log/cert_monitor.log | head 1 | eval earliest= _time-60, latest=_time+60 | fields earliest latest ]
| rex field=_raw "(?<Severity>[^\|]+)\|(?<Hostname>[^\|]+)\|(?<CertIssuer>[^\|]+)\|(?<FilePath>[^\|]+)\|(?<Status>[^\|]+)\|(?<ExpiryDate>[^\|]+)"
| multikv forceheader=1
| table Severity Hostname CertIssuer FilePath Status ExpiryDate

It works if each block of logs falls within a window of around 60 seconds.
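
The windowing idea behind the subsearch can be sketched in Python: find the newest event, then keep only events whose timestamp falls within the window around it. This is an illustration of the logic only. The function name latest_block and the epoch timestamps are made up for the example.

```python
def latest_block(events, window=60):
    """Keep only the events within `window` seconds of the newest event.

    `events` is a list of (epoch_seconds, raw_text) tuples.
    """
    if not events:
        return []
    newest = max(t for t, _ in events)
    # Same bounds as the subsearch: earliest = _time - window, latest = _time + window.
    return [(t, raw) for t, raw in events if newest - window <= t <= newest + window]

# Hypothetical timestamps: two runs of the monitoring script, one hour apart.
events = [
    (1000, "old run, line 1"),
    (1002, "old run, line 2"),
    (4600, "new run, line 1"),
    (4603, "new run, line 2"),
]

result = latest_block(events)
for t, raw in result:
    print(t, raw)
```

If the window is wider than the interval between runs, rows from the previous run leak in, which is why the window has to match how often the file is read.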

Ciao.

Giuseppe


shashankk
Communicator

@gcusello Thanks for your response. Yes, the log events are in one block. However, the query is showing incorrect results: it shows historical data as well, not just the latest block of events.

Can I handle this in "inputs.conf" so that only the latest log file is shown? I am not looking for any historical data.


gcusello
SplunkTrust

Hi @shashankk ,

No, you cannot manage this in inputs.conf.

Modify my search to use a time window that matches the frequency of your data.

If your file is read every 5 minutes, use:

| eval earliest= _time-60, latest=_time+60 

If it is read every minute, use:

| eval earliest= _time-30, latest=_time+30 

This way, the search should read only the latest file.

Ciao.

Giuseppe


marnall
Motivator

Splunk will store the indexed data until the end of the retention period in the index. You cannot tell Splunk to just store the latest copy from inputs.conf. You can, however, use searches to return only the latest indexed event.

By default, events are returned in reverse chronological order. So if your list of certificates is in a single event, you may be able to filter to only the latest one by using "head 1":

index=test_event source=/applications/hs_cert/cert/log/cert_monitor.log
| head 1
| rex field=_raw "(?<Severity>[^\|]+)\|(?<Hostname>[^\|]+)\|(?<CertIssuer>[^\|]+)\|(?<FilePath>[^\|]+)\|(?<Status>[^\|]+)\|(?<ExpiryDate>[^\|]+)"
| multikv forceheader=1
| table Severity Hostname CertIssuer FilePath Status ExpiryDate

If this is not the case, then perhaps you could post a sanitized screenshot of your events to give us a better idea of how they appear in your search interface.
