Getting Data In

help in reading my csv file

splunkn
Communicator

I want to monitor a csv file that is generated by a script and produces output like the sample below.

The file has 4 columns: one with an id, one with a date, one with a description, and one with an explanation containing some kind of XML data.

123,2016-07-07 05:00:00,gooddata,somexmldata
123,2016-07-07 06:00:00,baddata,somexmldata
123,2016-07-07 07:00:00,gooddata,somexmldata
123,2016-07-07 08:00:00,baddata,somexmldata

  1. How do I monitor this csv file? What do I need in my props & transforms (line breaking, timestamp rules, ...)?
  2. I want to filter out the rows whose third column (C) contains the string "baddata". How can I send these rows to the null queue?

Any help appreciated !! Thanks in advance


mattymo
Splunk Employee

Getting data in

http://docs.splunk.com/Documentation/Splunk/6.4.2/Data/Howdoyouwanttoadddata

The oft-overlooked Add Data wizard is a great tool for creating inputs and props via the GUI. It lets you play with the many props settings and see the result as you work, so you can make sure you get the desired outcome. The Getting Data In manual also covers much of what I tried below in more depth.


To start, I put your csv data into a text file, then uploaded it to my Splunk instance.


I began by selecting the default csv sourcetype to build off of, then defined the csv header schema explicitly, which you may or may not have to do depending on whether the headers exist in the file.
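For reference, what the wizard ends up writing is a props.conf stanza. A rough sketch of what that might look like for your data, assuming a hypothetical sourcetype name my_script_csv and no header row in the file:

    [my_script_csv]
    INDEXED_EXTRACTIONS = csv
    FIELD_DELIMITER = ,
    # no header line in the file, so name the columns explicitly
    FIELD_NAMES = id,date,description,explanation
    SHOULD_LINEMERGE = false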


Then I went to work on explicitly teaching splunk how to read the timestamp field.
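Under the hood, the wizard's timestamp settings for this data would look roughly like the following, added to the same hypothetical my_script_csv stanza (the format string matches sample values like 2016-07-07 05:00:00):

    [my_script_csv]
    # read the timestamp from the second column
    TIMESTAMP_FIELDS = date
    TIME_FORMAT = %Y-%m-%d %H:%M:%S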


You can then review your creation and export it to your clipboard for easy pasting into the CLI, or just save it to the local instance.


That should get the data coming in and indexed. A potential "gotcha" is the characters in your XML string: if it contains commas (not sure off the top of my head whether XML typically does), you will need to choose a different delimiter, or insert one during pre-processing, so the XML field doesn't get split.
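For example, if your script could emit the file with a pipe instead of a comma as the separator (a hypothetical change on the script side), the delimiter setting would simply follow suit:

    [my_script_csv]
    FIELD_DELIMITER = |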

Route & Filter

http://docs.splunk.com/Documentation/Splunk/6.4.2/Forwarding/Routeandfilterdatad

If you are in a distributed environment (using forwarders), be sure to put the props on your forwarder along with your inputs.conf (again, something you can use the GUI to create a template for; see the Getting Data In link above for file monitor options). This ensures you can nullQueue the values you don't want before they reach the indexer.
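A minimal inputs.conf monitor stanza might look like this (the file path, index, and sourcetype name are hypothetical placeholders for your actual script output and the sourcetype created above):

    [monitor:///opt/scripts/output/mydata.csv]
    sourcetype = my_script_csv
    index = main
    disabled = false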

    Caveats for routing and filtering structured data
    Splunk Enterprise does not parse structured data that has been forwarded to an indexer
    When you forward structured data to an indexer, Splunk Enterprise does not parse this data once it arrives at the indexer, even if you have configured props.conf on that indexer with INDEXED_EXTRACTIONS and its associated attributes. Forwarded data skips the following queues on the indexer, which precludes any parsing of that data on the indexer:

     parsing
     aggregation
     typing

The forwarded data must arrive at the indexer already parsed. To achieve this, you must also set up props.conf on the forwarder that sends the data. This includes configuration of INDEXED_EXTRACTIONS and any other parsing, filtering, anonymizing, and routing rules. Universal forwarders are capable of performing these tasks solely for structured data. See "Forward data extracted from header files".

There is a good example in the docs that should cover what you need:

Discard specific events and keep the rest
    This example discards all sshd events in /var/log/messages by sending them to nullQueue:

    1. In props.conf, set the TRANSFORMS-null attribute:

    [source::/var/log/messages]
    TRANSFORMS-null = setnull

    2. Create a corresponding stanza in transforms.conf. Set DEST_KEY to "queue" and FORMAT to "nullQueue":

    [setnull]
    REGEX = \[sshd\]
    DEST_KEY = queue
    FORMAT = nullQueue

I haven't tried it yet, but I am assuming a regex like the one in the documentation above would work for your scenario by matching whatever "baddata" looks like; there may also be a way to tighten it up by anchoring on the field position from your header. Would have to test, or have one of our other Splunk community superstars chime in. A rough adaptation is sketched below.
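As an untested sketch adapted to your data, assuming the hypothetical my_script_csv sourcetype from above and that "baddata" always sits in the third comma-separated column:

    # props.conf (on the forwarder, per the caveat above)
    [my_script_csv]
    TRANSFORMS-null = setnull_baddata

    # transforms.conf
    [setnull_baddata]
    # match events whose third comma-separated field is exactly "baddata"
    REGEX = ^[^,]*,[^,]*,baddata,
    DEST_KEY = queue
    FORMAT = nullQueue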

Hope this helps get you started! I will update if/when I have a chance to test the filtering in the lab.

- MattyMo