Getting Data In

How do I remove fields from the VMware Add-on before indexing?

AdamHolmes
New Member

I'm currently receiving more data than I need from the VMware app (sample below) and would like to keep only a few of the fields before the data is indexed. Is there a way to do this?

_raw: vm-1111 501170cc-8439-1cb3-04ba-8dc34434b33c 4001 20 0 0 0 0 0 0 0 21 0 0 0 0 21
Field Extractions:
p_average_net_bytesRx_kiloBytesPerSecond 0
p_average_net_bytesTx_kiloBytesPerSecond 0
p_average_net_received_kiloBytesPerSecond 0
p_average_net_transmitted_kiloBytesPerSecond 0
p_average_net_usage_kiloBytesPerSecond 0
p_summation_net_broadcastRx_number 21
p_summation_net_broadcastTx_number 0
p_summation_net_droppedRx_number 0
p_summation_net_droppedTx_number 0
p_summation_net_multicastRx_number 0
p_summation_net_multicastTx_number 0
p_summation_net_packetsRx_number 21
p_summation_net_packetsTx_number 0

For example, I'd like to keep only these fields before the data is indexed:

p_average_net_received_kiloBytesPerSecond 0
p_average_net_transmitted_kiloBytesPerSecond 0
p_summation_net_droppedRx_number 0
p_summation_net_droppedTx_number 0
p_summation_net_packetsRx_number 21
p_summation_net_packetsTx_number 0

0 Karma

solarboyz1
Builder

You can route events to the nullQueue based on patterns in the events you want to drop:

https://docs.splunk.com/Documentation/Splunk/latest/Forwarding/Routeandfilterdatad

The following would prevent any events containing the string p_average_net_ or p_summation_net_ from being indexed.

props.conf:

[vmware:sourcetype]
TRANSFORMS-null = drop_p_avg, drop_p_summation

transforms.conf:

[drop_p_avg]
REGEX = p_average_net_
DEST_KEY = queue
FORMAT = nullQueue

[drop_p_summation]
REGEX = p_summation_net_
DEST_KEY = queue
FORMAT = nullQueue

These settings need to be placed on your indexers (or on a heavy forwarder, if the data passes through one first).

0 Karma

AdamHolmes
New Member

I tried this approach with a test message (in JSON format, for example):
_raw: {"message": "Running ITBSA Common Module", "field1": "some text", "state": "OK"}

On the Search Head / Indexer (my test system is a combined one)
Updated file: /opt/splunk/etc/system/local/props.conf
[common]
TRANSFORMS-null = drop_message

Updated file: /opt/splunk/etc/system/local/transforms.conf
[drop_message]
REGEX = state
DEST_KEY = queue
FORMAT = nullQueue

I restarted splunkd and now no data is coming in.

On a forwarder, I have this in inputs.conf to generate test data:
[script://./bin/common.py]
source = monitoring::test
sourcetype = common

0 Karma

solarboyz1
Builder

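Is it ALL messages, or just the messages of sourcetype common?

If it's ALL messages, I'm not sure what the issue is.

If it's only messages of sourcetype common, the problem could be that your REGEX is matching more than expected. REGEX = state matches any event whose raw text contains the string "state", which includes your test message.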
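
To see why a nullQueue transform might drop more (or less) than intended, it can help to test the REGEX against sample raw events offline. The following is only a rough sketch in Python: Splunk applies PCRE to the raw event text by default, and Python's re module is used here as an approximation for these simple patterns. The function name routed_queue is hypothetical and is not part of Splunk.

```python
import re

def routed_queue(raw, regex):
    """Mimic a nullQueue transform: the REGEX is tested against the raw event text."""
    # A match sends the event to nullQueue (dropped); otherwise it is indexed.
    return "nullQueue" if re.search(regex, raw) else "indexQueue"

# Sample events copied from this thread.
json_event = '{"message": "Running ITBSA Common Module", "field1": "some text", "state": "OK"}'
vmware_event = "vm-1111 501170cc-8439-1cb3-04ba-8dc34434b33c 4001 20 0 0 0 0 0 0 0 21 0 0 0 0 21"

# "state" appears in the JSON test message, so that event would be dropped.
print(routed_queue(json_event, r"state"))

# The raw VMware event holds only values, not field names, so this pattern never matches it.
print(routed_queue(vmware_event, r"p_average_net_"))
```

The second call illustrates that a pattern built from extracted field names will not match an event whose raw text contains only the values.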

0 Karma

AdamHolmes
New Member

I need the raw input to go from
{"message": "Running ITBSA Common Module", "field1": "some text", "state": "OK"}
to
{"field1": "some text", "state": "OK"}

for the specified sourcetype. However, the catch is that the real incoming data does not fit that format; it looks like tab-separated data:
_raw: vm-1111 501170cc-8439-1cb3-04ba-8dc34434b33c 4001 20 0 0 0 0 0 0 0 21 0 0 0 0 21

0 Karma

solarboyz1
Builder

I misunderstood; I thought you were looking to get rid of the events, not specific fields.

If you want to get rid of specific fields, you probably want to look at SEDCMD:
http://docs.splunk.com/Documentation/Splunk/7.1.0/Admin/Propsconf#Field_extraction_configuration

You should be able to use sed-like syntax to remove the unwanted data.

Something like:

props.conf
[sourcetype]
SEDCMD-removeunwanted1 = s/\{"message": "[^"]*", /{/
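
A substitution like this can be tried against the test message before it goes into props.conf. The snippet below is a minimal sketch using Python's re.sub as a stand-in for the sed-style replacement. The pattern is an assumption written for this one sample message: it expects "message" to be the first key and its value to contain no escaped quotes.

```python
import re

# Test message copied from this thread.
raw = '{"message": "Running ITBSA Common Module", "field1": "some text", "state": "OK"}'

# Assumed pattern: match the opening brace plus the whole "message" key/value pair
# (including the trailing comma and space), then put back only the brace.
pattern = r'\{"message": "[^"]*", '
cleaned = re.sub(pattern, "{", raw, count=1)

print(cleaned)
```

If "message" can appear elsewhere in the event, or its value can contain quotes, the pattern would need to be adjusted.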

0 Karma

AdamHolmes
New Member

That's what I was leaning towards, but as the data is tab-separated I was unsure how the field extractions would handle that. I was hoping to just specify the names of the fields to exclude.
Writing the regex would also be 'fun'.
Here is some example data; I would need to remove the values in bold (if I have to deal with raw data):
vm-125620 5006a450-f3f4-3794-ecb7-a50b97a8bec4 vmnic5 20 0 0 0 0 0 0 0

vm-1111 501170cc-8439-1cb3-04ba-8dc34434b33c 4000 20 58738 1108 1108 0 379 0 379 11612 0 0 0 1487 0
vm-163268 5006d319-719a-d56c-3e3c-eb1cab4163de aggregated 20 656 2 2 0 91 0 91 1591 0 0 0 94 60

0 Karma

solarboyz1
Builder

Assuming the field at the front is the VMware eventId...you might need to create one per eventId:

SEDCMD-vm125620 = s/(vm-125620)\s(\w)\s(\w)\s(\w)\s(\w)\s(\w)\s(\w)\s(\w)\s(\w)\s(\w)\s(\w)/$1 $2 $3 $4 $5 $6 $7 $8 $10/

0 Karma

solarboyz1
Builder

Correction: the SEDCMD should use back-references (\1) instead of variables ($1), and each group should use \S+ so it matches a whole field rather than a single word character:

SEDCMD-vm125620 = s/(vm-125620)\s(\S+)\s(\S+)\s(\S+)\s(\S+)\s(\S+)\s(\S+)\s(\S+)\s(\S+)\s(\S+)\s(\S+)/\1 \2 \3 \4 \5 \6 \7 \8 \10/
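
Writing one capture group per column by hand is error-prone, so it may be easier to generate the pattern and check it against sample lines first. The following is a rough Python sketch of the same idea. The helper build_sed_parts and the list of kept positions are hypothetical, since the thread does not say which column holds which metric, and Python's \g<10> replacement syntax differs from the \10 form used in SEDCMD.

```python
import re

def build_sed_parts(total_fields, keep):
    """Return (pattern, replacement) that keeps only the 1-based field positions in keep."""
    # One capture group per whitespace-separated field, anchored to the whole line.
    pattern = r"^" + r"\s+".join([r"(\S+)"] * total_fields) + r"$"
    # \g<n> is Python's unambiguous form of a numbered back-reference.
    replacement = " ".join(rf"\g<{i}>" for i in keep)
    return pattern, replacement

# Sample line copied from this thread (17 whitespace-separated fields).
line = "vm-1111 501170cc-8439-1cb3-04ba-8dc34434b33c 4000 20 58738 1108 1108 0 379 0 379 11612 0 0 0 1487 0"

# Hypothetical choice of positions to keep; the real column-to-metric mapping is not known here.
pattern, replacement = build_sed_parts(total_fields=17, keep=[1, 2, 3, 4, 5, 12, 16])
result = re.sub(pattern, replacement, line)

print(result)
```

Because the pattern is anchored to an exact field count, lines with a different number of columns are left unchanged, which makes a mismatch easy to spot during testing.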

0 Karma