Event logs getting duplicated every collection

clintla
Contributor

I have a duplication problem with Splunk. I know why it happens, but not how to fix it.

I have an event log that I have to extract every hour (the entire log every time).

So say I get an error I'm searching for:

10:00am target event.

An hour later my script runs again and I get a few more events, but my timechart
counts the previous events a second time.

10:00am target event

11:05am target event.

Now my chart is incorrect: the previous error count, which was 1, is now 2 because that event has been re-indexed (3 events total). These logs go back a year (I can't clear them on the device every time, which would make this easy).
It should read 1 target event per hour, but every hour the previous events get indexed again, making
the chart incredibly inaccurate.

10:00am (2 events)

11:05am (1 event)

Seems like this should be easy to fix, but things like followTail don't seem to apply.

If I could dedup in a chart, that would be good, but I don't think that works in a chart. Does anyone have another solution?

kristian_kolb
Ultra Champion

I think you should rather try to fix your script so that it does not read the entire file each time it runs (if possible). By getting the input data correct, you don't have to worry about 'fixing' the output of your searches. Also, re-indexing the entire file each time consumes part of your license. That may not be a big issue if you read a small log file once an hour, but someday you might be asked to run the script once a minute....

/kristian
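A minimal sketch of what "not reading the entire file each time" could look like, assuming the collection script is (or can be wrapped in) Python and that the device log is append-only with a stable line order. The function names and the checkpoint file are hypothetical and only serve the illustration: the script still pulls the full log from the device, but remembers how many lines it already forwarded and passes on only the remainder.

```python
import json
from pathlib import Path


def load_checkpoint(path: Path) -> int:
    """Return the number of log lines already forwarded (0 on first run)."""
    if path.exists():
        return json.loads(path.read_text())["lines_sent"]
    return 0


def save_checkpoint(path: Path, lines_sent: int) -> None:
    """Persist how many lines have been forwarded so far."""
    path.write_text(json.dumps({"lines_sent": lines_sent}))


def new_events(full_log: list[str], checkpoint: Path) -> list[str]:
    """Return only the lines that were not forwarded on an earlier run.

    Assumes the log only ever grows by appending lines at the end.
    """
    already_sent = load_checkpoint(checkpoint)
    if already_sent > len(full_log):
        # The log is shorter than last time (cleared or rotated): start over.
        already_sent = 0
    fresh = full_log[already_sent:]
    save_checkpoint(checkpoint, len(full_log))
    return fresh
```

Only the lines returned by `new_events` would then be written to whatever file or input Splunk picks up, so each event is indexed once. If the device can reorder or trim old entries, a line count is not a safe checkpoint, and something like the timestamp of the last forwarded event would be a better fit.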

Takajian
Builder

Could you try the following dedup command? I think it removes duplicated events. Please let me know if this helps in your environment.

... | dedup _raw

Takajian
Builder

In your case, how does this work? The dedup has to come before the timechart command.

sourcetype="getlog" | dedup _raw | timechart count by host useother="f"

But the dedup command is a kind of workaround. The ideal solution is to fix your script.
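As a rough illustration of why the order matters, here is a small Python simulation (not Splunk code; the sample events and helper names are made up). Deduplicating compares the raw event text, which only exists before the events are collapsed into per-hour counts, so it has to happen first.

```python
from collections import Counter

# Hypothetical raw events after two hourly collections of the same log:
# the 10:00 event was indexed twice, the 11:05 event once.
indexed = [
    "10:00 target event",
    "10:00 target event",
    "11:05 target event",
]


def count_per_hour(events):
    """Count events per hour bucket, using the leading HH of each line."""
    return dict(Counter(line[:2] for line in events))


def dedup_raw(events):
    """Keep the first occurrence of each identical raw line, in order."""
    return list(dict.fromkeys(events))


print(count_per_hour(indexed))             # {'10': 2, '11': 1}
print(count_per_hour(dedup_raw(indexed)))  # {'10': 1, '11': 1}
```

Once the counting step has run, only the buckets and their counts are left, so there is no raw text for a later dedup to compare.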

clintla
Contributor

sourcetype="getlog" | timechart count by host useother="f" | dedup _raw

When I add "| dedup _raw", it goes from that sloped output to "No results". That seems like it should have worked, but I guess I don't fully understand the command.
