PowerShell script to extract events from a log file - many duplicate events indexed in Splunk

a_splunk_user
Path Finder

Question: is there a CRC equivalent for data indexed from a PowerShell input?

On a server, I have a log file that is generated every day. It contains many events, but I only need to index one specific type of event.

My approach was to create a PowerShell script that matches a specified string and returns the lines (the content, not the line numbers) that contain it. Running the script from the PowerShell command line appears to work as expected.

I then created the following stanza in inputs.conf on the UF of the server:
[powershell://apache_search_test]
script = . "C:<subdirectories>\scanApacheLog_Splunk.ps1"
sourcetype = apache_test
schedule = 60
disabled = 0

My test condition was to match the string "30/Jun/2017:09:59:17 -0400". I see that the four events with this timestamp are indexed:
xxx.xxx.xxx.xxx - - [30/Jun/2017:09:59:17 -0400] "HEAD /enterprise HTTP/1.0" 302 -
xxx.xxx.xxx.xxx - - [30/Jun/2017:09:59:17 -0400] "POST /u/xmlrpc HTTP/1.1" 200 120
xxx.xxx.xxx.xxx - - [30/Jun/2017:09:59:17 -0400] "POST /u/xmlrpc HTTP/1.1" 200 120
xxx.xxx.xxx.xxx - - [30/Jun/2017:09:59:17 -0400] "POST /u/xmlrpc HTTP/1.1" 200 120

The problem is that these events are indexed again every time the stanza's schedule fires, but I only want the four events indexed once (or fewer, if they are not unique).

Is there something missing in the stanza? Do I need to add a stanza to props.conf (in case Splunk is not recognizing the timestamp)?

Any help is appreciated!

koshyk
Super Champion

If I understand correctly, the best first step is to diagnose where the duplicates come from. I can think of a few options:
1. Have your PowerShell script write to a file, and let Splunk read the file. That way you can tell whether the script itself is producing the duplicate events.
2. Write a unique ID into your messages (e.g. a session ID or process ID), so you can be sure whether they are the same messages or not.
3. Output the timestamp with more precision, including milliseconds and the time zone.
4. Handle it at the Splunk level by adding props for duplicate events (not preferred).
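
Option 1 could look roughly like the following on the UF. This is a minimal sketch, assuming the script is changed to append its matches to a file instead of writing them to standard output. The path C:\splunk_staging\apache_matches.log is a made-up placeholder, and the sourcetype is the one from the stanza in the question.

```ini
# inputs.conf on the UF: a sketch of option 1, not a tested configuration.
# The path is a hypothetical placeholder for wherever the script writes its matches.
[monitor://C:\splunk_staging\apache_matches.log]
sourcetype = apache_test
disabled = 0
```

A monitor input keeps track of how far it has read into each file, so it should index each appended line only once. If the same four lines show up repeatedly in the file itself, that would point to the script re-emitting them on every run rather than to Splunk.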
