Getting Data In

Event logs getting duplicated every collection

clintla
Contributor

I have a duplication problem with Splunk. I know why it happens but not how to fix it.

I have an event log that I have to extract every hour (entire log every time).

So say I get an error I'm searching for:

10:00am target event.

An hour later my script runs and I get a few more events, but my timechart counts the previous
events again:

10:00am target event

11:05am target event.

Now my chart is incorrect: the count for the previous error, which should be 1, is now 2 because that event has been re-indexed (3 events total). These logs go back a year (I can't clear them on the
device every time, which would make it easy).
It should read 1 target event per hour, but every hour the previous events get indexed again, making
the chart incredibly inaccurate.

10:00am (2 events)

11:05am (1 event)

Seems like this should be easy to fix, but things like followTail don't seem to apply.

If I could dedup on a chart, that would be good. I don't think that works in a chart, though. Does anyone have another solution?


kristian_kolb
Ultra Champion

I think you should rather try to fix your script so that it does not read the entire file each time it runs (if possible). By getting the input data correct, you don't have to worry about 'fixing' the output of your searches. Also, re-indexing the entire file each time consumes part of your license. That may not be a big issue if you read a small log file once an hour, but someday you might be asked to run the script once a minute....
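One way this could look, as a rough sketch only: the collection script keeps a small checkpoint file recording how many events it has already handed to Splunk, and on each run passes along only the events beyond that point. This assumes the device log is append-only and always comes back in the same order. The function name `select_new_events` and the JSON checkpoint layout are made up for illustration and are not part of Splunk.

```python
import json
from pathlib import Path


def select_new_events(events, state_path):
    """Return only the events that earlier runs have not already emitted.

    events     -- the full log as a list of lines, oldest first
                  (assumption: the log is append-only and its order is stable)
    state_path -- hypothetical checkpoint file holding the count already emitted
    """
    state_file = Path(state_path)
    seen = 0
    if state_file.exists():
        seen = json.loads(state_file.read_text())["events_seen"]

    # If the log is shorter than the checkpoint, it was probably cleared
    # or rotated on the device, so start again from the beginning.
    if seen > len(events):
        seen = 0

    new_events = events[seen:]

    # Record the new position so the next run skips everything emitted so far.
    state_file.write_text(json.dumps({"events_seen": len(events)}))
    return new_events
```

The script would then write only the returned lines to whatever file or output Splunk is monitoring, so each event is indexed once and no license volume is spent on re-reading the year of history.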

/kristian

Takajian
Builder

Could you try the following dedup command? I think it will remove the duplicated events. Please let me know if this helps in your environment.

... | dedup _raw

Takajian
Builder

In your case, how about the following? The dedup command should come before the timechart command.

sourcetype="getlog" | dedup _raw | timechart count by host useother="f"

But the dedup command is only a kind of workaround. The ideal solution is to fix your script.


clintla
Contributor

sourcetype="getlog" | timechart count by host useother="f" | dedup _raw

When I add "| dedup _raw", it goes from that sloped output to "No results". That seems like it should have worked, but I guess I don't fully understand the command.
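A likely explanation for the "No results": timechart replaces the events with summary rows (a time bucket plus counts), and those rows no longer carry a _raw field. By default dedup drops rows where the field it is given is missing, so nothing is left. The toy Python model below only illustrates that ordering effect. It is not how Splunk is implemented, and the `dedup` and `count_by_hour` helpers are hypothetical stand-ins for the SPL commands.

```python
from collections import Counter


def dedup(rows, field):
    """Keep the first row for each distinct value of `field`.

    Rows that lack the field are dropped, mirroring dedup's default behaviour.
    """
    seen = set()
    kept = []
    for row in rows:
        if field not in row:
            continue
        if row[field] in seen:
            continue
        seen.add(row[field])
        kept.append(row)
    return kept


def count_by_hour(rows):
    """Stand-in for timechart: one summary row per hour, with no _raw field."""
    counts = Counter(row["hour"] for row in rows)
    return [{"hour": hour, "count": n} for hour, n in sorted(counts.items())]


events = [
    {"hour": "10:00", "_raw": "10:00am target event"},
    {"hour": "10:00", "_raw": "10:00am target event"},  # re-indexed copy
    {"hour": "11:00", "_raw": "11:05am target event"},
]

# dedup first, then count: one event per hour, as intended.
dedup_then_count = count_by_hour(dedup(events, "_raw"))

# count first, then dedup: the summary rows have no _raw, so all are dropped.
count_then_dedup = dedup(count_by_hour(events), "_raw")
```

Here `dedup_then_count` holds one row per hour with a count of 1, while `count_then_dedup` is an empty list, which corresponds to the "No results" outcome.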
