Getting Data In

Powershell script to extract events from a log file - many dupe events indexed in Splunk

a_splunk_user
Path Finder

Question - is there a CRC equivalent for data indexed from a Powershell function?

On a server, I have a log file generated everyday. There are many events in this daily log file, but I only need to index a specific event.

My approach was to create a powershell script which matches a specified string and returns the lines (the content, not the line number) that contain these events. Running the script from the Powershell CLE appears to work as expected.

I then created the following stanza in inputs.conf on the UF of the server:
[powershell://apache_search_test]
script = . "C:<subdirectories>\scanApacheLog_Splunk.ps1"
sourcetype = apache_test
schedule = 60
disabled = 0

My test condition was to match the string "30/Jun/2017:09:59:17 -0400" - I see that the four events with this timestamp are indexed:
xxx.xxx.xxx.xxx - - [30/Jun/2017:09:59:17 -0400] "HEAD /enterprise HTTP/1.0" 302 -
xxx.xxx.xxx.xxx - - [30/Jun/2017:09:59:17 -0400] "POST /u/xmlrpc HTTP/1.1" 200 120
xxx.xxx.xxx.xxx - - [30/Jun/2017:09:59:17 -0400] "POST /u/xmlrpc HTTP/1.1" 200 120
xxx.xxx.xxx.xxx - - [30/Jun/2017:09:59:17 -0400] "POST /u/xmlrpc HTTP/1.1" 200 120

The problem is these events are additionally indexed each time the schedule of the stanza runs. But I only want the four events indexed (or less if they are not unique).

Is there something missing in the stanza? Do I need to add a stanza to props.conf (in that it may not be recognizing the timestamp)?

Any help is appreciated!

0 Karma

koshyk
Super Champion

If I understand correctly the best way to diagnose, i'm thinking of few options
1. use your powershell script to write into a FILE. and let Splunk read the file. This way you can understand if the script creates duplicate events or NOT.
2. Write a unique ID in your messages (i.e session id or process id ), that way you are sure if they are same messages or not.
3. output timestamp to much precise values with milliseconds and timezone
4. Tackle at Splunk level by putting props for duplicate events (not preferred)

0 Karma
Get Updates on the Splunk Community!

Optimize Cloud Monitoring

  TECH TALKS Optimize Cloud Monitoring Tuesday, August 13, 2024  |  11:00AM–12:00PM PST   Register to ...

What's New in Splunk Cloud Platform 9.2.2403?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.2.2403! Analysts can ...

Stay Connected: Your Guide to July and August Tech Talks, Office Hours, and Webinars!

Dive into our sizzling summer lineup for July and August Community Office Hours and Tech Talks. Scroll down to ...