PowerShell script to extract events from a log file: many duplicate events indexed in Splunk

a_splunk_user
Path Finder

Question: Is there a CRC equivalent for data indexed from a PowerShell function?

On a server, I have a log file that is generated every day. There are many events in this daily log file, but I only need to index one specific kind of event.

My approach was to create a PowerShell script that matches a specified string and returns the lines (the content, not the line numbers) that contain it. Running the script from the PowerShell command line appears to work as expected.
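
The script itself is not shown in the post. A minimal sketch of that kind of filter might look like the following, assuming a literal (non-regex) match is wanted. The function name `Get-MatchingLines` and the log path are hypothetical and are not taken from the original script.

```powershell
# Hypothetical values for illustration; the real path is not given in the post.
$LogPath = 'C:\logs\access.log'
$Pattern = '30/Jun/2017:09:59:17 -0400'   # test string from the question

function Get-MatchingLines {
    param(
        [Parameter(Mandatory = $true)][string]$LogPath,
        [Parameter(Mandatory = $true)][string]$Pattern
    )

    # Nothing to do if the log file is not there.
    if (-not (Test-Path -LiteralPath $LogPath)) { return }

    # -SimpleMatch treats the pattern as plain text rather than a regex.
    # .Line is the text of each matching line, without the line number.
    Select-String -LiteralPath $LogPath -Pattern $Pattern -SimpleMatch |
        ForEach-Object { $_.Line }
}

Get-MatchingLines -LogPath $LogPath -Pattern $Pattern
```

Note that a filter like this returns every matching line in the file each time it is called, regardless of what it returned on earlier calls.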

I then created the following stanza in inputs.conf on the UF of the server:
[powershell://apache_search_test]
script = . "C:<subdirectories>\scanApacheLog_Splunk.ps1"
sourcetype = apache_test
schedule = 60
disabled = 0

My test condition was to match the string "30/Jun/2017:09:59:17 -0400". The four events with this timestamp are indexed:
xxx.xxx.xxx.xxx - - [30/Jun/2017:09:59:17 -0400] "HEAD /enterprise HTTP/1.0" 302 -
xxx.xxx.xxx.xxx - - [30/Jun/2017:09:59:17 -0400] "POST /u/xmlrpc HTTP/1.1" 200 120
xxx.xxx.xxx.xxx - - [30/Jun/2017:09:59:17 -0400] "POST /u/xmlrpc HTTP/1.1" 200 120
xxx.xxx.xxx.xxx - - [30/Jun/2017:09:59:17 -0400] "POST /u/xmlrpc HTTP/1.1" 200 120

The problem is that these events are indexed again each time the stanza's schedule fires. I only want the four events indexed once (or fewer if they are not unique).

Is there something missing in the stanza? Do I need to add a stanza to props.conf (in case Splunk is not recognizing the timestamp)?

Any help is appreciated!

koshyk
Super Champion

If I understand correctly, here are a few options for diagnosing this:
1. Have your PowerShell script write its output to a file, and let Splunk read that file. This way you can tell whether the script itself is producing the duplicate events.
2. Write a unique ID into each message (e.g. a session ID or process ID), so you can be sure whether two events are the same message or not.
3. Output the timestamp with more precision, including milliseconds and the time zone.
4. Handle it at the Splunk level by configuring props for duplicate events (not preferred).
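
As far as I can tell, the PowerShell input does not track how far a script has read in a file, so every scheduled run that prints the same lines sends them again. One possible workaround, sketched here under assumptions, is to have the script remember how many lines it has already processed and emit only matches that appear after that point. The function name `Get-NewMatchingLines`, the log path, and the checkpoint file are all hypothetical, and the line-count checkpoint is just one simple choice, not something Splunk provides.

```powershell
# Hypothetical values for illustration; adjust for the real environment.
$LogPath        = 'C:\logs\access.log'
$CheckpointPath = 'C:\logs\access.log.checkpoint'
$Pattern        = '30/Jun/2017:09:59:17 -0400'   # test string from the question

function Get-NewMatchingLines {
    param(
        [Parameter(Mandatory = $true)][string]$LogPath,
        [Parameter(Mandatory = $true)][string]$CheckpointPath,
        [Parameter(Mandatory = $true)][string]$Pattern
    )

    # Nothing to do if the log file is not there.
    if (-not (Test-Path -LiteralPath $LogPath)) { return }

    # Number of lines handled on earlier runs (0 on the first run).
    $done = 0
    if (Test-Path -LiteralPath $CheckpointPath) {
        $done = [int](Get-Content -LiteralPath $CheckpointPath -TotalCount 1)
    }

    $lines = @(Get-Content -LiteralPath $LogPath)

    # A file shorter than the checkpoint suggests a new or rotated log,
    # so start again from the top.
    if ($lines.Count -lt $done) { $done = 0 }

    # Emit only lines after the checkpoint that contain the literal string.
    $lines | Select-Object -Skip $done |
        Where-Object { $_.Contains($Pattern) }

    # Record how far this run got, for the next scheduled run.
    Set-Content -LiteralPath $CheckpointPath -Value $lines.Count
}

Get-NewMatchingLines -LogPath $LogPath -CheckpointPath $CheckpointPath -Pattern $Pattern
```

This sketch rereads the whole file on every run, which is fine for a small daily log but wasteful for a large one. It also assumes the log is only ever appended to between runs; if the daily file gets a new name each day, the checkpoint would need to be kept per file.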
