Getting Data In

With a Splunk forwarder, what are the best ways to control a format's data?

oxthon
New Member

Hello everyone,

I hope you are fine.

So I have a question about the indexing of data in Splunk and especially the control of the data.

My configuration is an indexer distributed with a forwarder.

I receive data from a remote mount. These are CSV files.

An example of structure:
date, host, ipv4, ipv6, dns, nb_packet, size, ....

line 125: ipv4=12.32.45.255 => right
line 356: ipv4= 42.hello!.84.125 => wrong so go in index=error please and hurry up 🙂

I would like to control the content of the data. For example, that the format of ipv4 is good.
is it possible for each field to control the format of its value in transform.conf or props.conf?

Today, I run my CSV python (panda) to control them.

Is Splunk able to do it?

If you have an example with a CSV with two or three fields, I'm interested.

I thank you a thousand times.

Oxthon.

0 Karma

MuS
SplunkTrust
SplunkTrust

Hi oxthon,

Well looking at this in a pure technical way it is of course possible to do this, does it make sense to do it ¯\_(ツ)_/¯ Most likely better to follow @ddrillic 's advice and make sure those events do not come into Splunk in the first place.

But back to your question, you can use a props & transforms setup to check if an event contains a valid IP and put it into index=right anything with an invalid IP will go into index=error. This setup can be based on source, sourcetype or host.

Try something like this:

props.conf

[your sourcetype name here]
TRANSFORMS - 000-sourcetypeName-routing-based-on-ip = 001-sourcetypeName-routing, 000-default-errorRouting

transforms.conf

[000-default-errorRouting]
REGEX = =\s[\d\w\.!]+\s
DEST_KEY =  _MetaData:index
FORMAT = error

[001-sourcetypeName-routing]
REGEX = (?:\d{1,3}\.){3}\d{1,3}
DEST_KEY =  _MetaData:index
FORMAT = right

These setting must go onto the parsing Splunk instance (HWF or indexer) and you need to restart this instance.

Hope this helps ...

cheers, MuS

ddrillic
Ultra Champion

-- control the content of the data.

It's not part of the product. Actually, the product prides itself in allowing any data through and therefore, I would place safeguards before the data is being ingested, meaning, before it reaches Splunk.

Get Updates on the Splunk Community!

Enterprise Security Content Update (ESCU) | New Releases

In the last month, the Splunk Threat Research Team (STRT) has had 2 releases of new security content via the ...

Announcing the 1st Round Champion’s Tribute Winners of the Great Resilience Quest

We are happy to announce the 20 lucky questers who are selected to be the first round of Champion's Tribute ...

We’ve Got Education Validation!

Are you feeling it? All the career-boosting benefits of up-skilling with Splunk? It’s not just a feeling, it's ...