Getting Data In

With a Splunk forwarder, what are the best ways to control a format's data?

oxthon
New Member

Hello everyone,

I hope you are doing well.

I have a question about indexing data in Splunk, specifically about controlling the content of the data.

My setup is a distributed deployment: an indexer with a forwarder.

I receive data from a remote mount, in the form of CSV files.

An example of structure:
date, host, ipv4, ipv6, dns, nb_packet, size, ....

line 125: ipv4=12.32.45.255 => right
line 356: ipv4= 42.hello!.84.125 => wrong, so it should go to index=error, please, and quickly 🙂

I would like to validate the content of the data, for example that the ipv4 field has the right format. Is it possible to check the format of each field's value in transforms.conf or props.conf?

Today, I validate the CSV files myself with a Python (pandas) script.
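
Roughly, my script does something like this (a minimal sketch; input.csv and the strict 0-255 octet check stand in for my real setup):

import re
import pandas as pd

# Strict dotted-quad check: four octets, each 0-255.
IPV4_RE = re.compile(
    r"^(?:(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)\.){3}"
    r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)$"
)

def is_valid_ipv4(value) -> bool:
    """Return True if value looks like a well-formed IPv4 address."""
    return bool(IPV4_RE.match(str(value).strip()))

df = pd.read_csv("input.csv")           # placeholder path for the remote mount
mask = df["ipv4"].apply(is_valid_ipv4)  # True for rows with a valid ipv4
print(f"{(~mask).sum()} invalid rows out of {len(df)}")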

Is Splunk able to do it?

If you have an example with a CSV with two or three fields, I'm interested.

I thank you a thousand times.

Oxthon.


MuS
Legend

Hi oxthon,

Well, looking at this in a purely technical way, it is of course possible to do this; whether it makes sense to do it is another question ¯\_(ツ)_/¯ It's most likely better to follow @ddrillic's advice and make sure those events do not get into Splunk in the first place.

But back to your question: you can use a props & transforms setup to check whether an event contains a valid IP and route it to index=right; anything with an invalid IP will go to index=error. This setup can be based on source, sourcetype, or host.

Try something like this:

props.conf

[your sourcetype name here]
TRANSFORMS-000-sourcetypeName-routing-based-on-ip = 001-sourcetypeName-routing, 000-default-errorRouting

transforms.conf

[000-default-errorRouting]
REGEX = =\s[\d\w\.!]+\s
DEST_KEY = _MetaData:index
FORMAT = error

[001-sourcetypeName-routing]
REGEX = (?:\d{1,3}\.){3}\d{1,3}
DEST_KEY = _MetaData:index
FORMAT = right
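
By the way, if you want to sanity-check the two regexes against your sample lines before deploying, a quick Python test works (Python's re is close enough to Splunk's PCRE for these patterns; the sample lines are my guess at your raw events). Since both transforms write to _MetaData:index and the error transform is listed last, it wins when both match, which the error-first check below mimics:

import re

# The two patterns from transforms.conf above.
ERROR_RE = re.compile(r"=\s[\d\w\.!]+\s")
VALID_RE = re.compile(r"(?:\d{1,3}\.){3}\d{1,3}")

# Assumed raw-event shapes, based on the two sample lines in the question.
samples = [
    "date host ipv4=12.32.45.255 nb_packet=10 ",
    "date host ipv4= 42.hello!.84.125 nb_packet=3 ",
]

for line in samples:
    if ERROR_RE.search(line):    # error transform overrides when both match
        print("index=error:", line)
    elif VALID_RE.search(line):
        print("index=right:", line)
    else:
        print("no route   :", line)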

These settings must go on the parsing Splunk instance (heavy forwarder or indexer), for example in $SPLUNK_HOME/etc/system/local/, and you need to restart that instance.

Hope this helps ...

cheers, MuS

ddrillic
Ultra Champion

-- control the content of the data.

It's not part of the product. Actually, the product prides itself on letting any data through, so I would place safeguards before the data is ingested, meaning before it reaches Splunk.
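
For example, a small pre-ingest script could split the CSV into a clean file and a reject file, and the forwarder would monitor only the clean one (a minimal sketch; incoming.csv, clean.csv, and rejected.csv are placeholder paths, and the ipv4 column name comes from the question):

import csv
import re

# Strict dotted-quad check: four octets, each 0-255.
IPV4_RE = re.compile(
    r"^(?:(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)\.){3}"
    r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)$"
)

with open("incoming.csv", newline="") as src, \
     open("clean.csv", "w", newline="") as good, \
     open("rejected.csv", "w", newline="") as bad:
    reader = csv.DictReader(src)
    good_w = csv.DictWriter(good, fieldnames=reader.fieldnames)
    bad_w = csv.DictWriter(bad, fieldnames=reader.fieldnames)
    good_w.writeheader()
    bad_w.writeheader()
    for row in reader:
        # Route each row by whether its ipv4 field is well formed.
        target = good_w if IPV4_RE.match(row["ipv4"].strip()) else bad_w
        target.writerow(row)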
