Solved: About the license when using SEDCMD

yutaka1005 · ‎12-10-2017

In my environment, I use SEDCMD to drop unneeded columns from the CSV data I ingest.

For example, the following configuration excludes the third column, "description".

Data
time,ipaddress,description
YYYY/mm/dd HH:MM:SS,192.x.x.x,this is ...
YYYY/mm/dd HH:MM:SS,172.x.x.x,this is ...
YYYY/mm/dd HH:MM:SS,10.x.x.x,this is ...

props.conf
SEDCMD-test = s/([^,]*),([^,]*),([^,]*)/\1,\2/g
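A rough way to preview what this substitution does to each raw event is to replay it outside Splunk. The sketch below is a Python approximation only: Splunk applies the expression with its own regex engine at index time, and the helper name `strip_third_column` is hypothetical. For a pattern this simple, Python's `re` should behave the same way.

```python
import re

# Python approximation of:
#   SEDCMD-test = s/([^,]*),([^,]*),([^,]*)/\1,\2/g
# Assumption: each event is a single line with exactly three columns.
PATTERN = re.compile(r"([^,]*),([^,]*),([^,]*)")


def strip_third_column(raw_event: str) -> str:
    """Keep the first two comma-separated values and drop the third."""
    return PATTERN.sub(r"\1,\2", raw_event)


# Sample rows copied from the Data block above.
sample_rows = [
    "time,ipaddress,description",
    "YYYY/mm/dd HH:MM:SS,192.x.x.x,this is ...",
]

for row in sample_rows:
    print(strip_third_column(row))
```

This only shows the effect on the raw text. It says nothing about fields that Splunk may already have extracted before the SEDCMD runs.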

When searching, the third column "description" is excluded from the displayed raw event, as expected.

However, "description" still appears in the field list, and its value is still present for each event.

As for the order of processing, I believe SEDCMD is applied before the license calculation.
But since the data of the excluded column still appeared to be captured at search time, I suspected that license usage might not change.

Can I reduce license usage by excluding data with SEDCMD?

supersleepwalke · ‎01-17-2018

I just tested this, and yes, it does reduce the license usage.

How I tested: take one log file and ingest it twice, once to a normal sourcetype and again to a sourcetype called "sed_yes2". Props for sed_yes2 are as follows:

[sed_yes2]
SEDCMD-yes = s/[^0-9]//g

This removes every character except digits. That way, I can see that the events still have content, but the lines are much smaller.
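To get a feel for how much smaller the events become, the same substitution can be replayed on a few lines outside Splunk and the byte counts compared. This is a minimal sketch, not the test itself: the function names and the sample log line are made up for illustration, and the real license figure is whatever Splunk reports.

```python
import re


def digits_only(raw_event: str) -> str:
    """Python approximation of SEDCMD-yes = s/[^0-9]//g."""
    return re.sub(r"[^0-9]", "", raw_event)


def byte_savings(lines):
    """Return (bytes_before, bytes_after) for a list of raw events."""
    before = sum(len(line.encode("utf-8")) for line in lines)
    after = sum(len(digits_only(line).encode("utf-8")) for line in lines)
    return before, after


# Hypothetical sample event, not taken from the tested log file.
sample = ["2018-01-17 10:00:00 user=alice status=200"]
before, after = byte_savings(sample)
print(f"before={before} bytes, after={after} bytes")
```

If SEDCMD runs before metering, the license usage for the sed sourcetype should track the smaller "after" size rather than the original.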

Here is the output from checking Splunk's license usage. (Ignore the sed_yes sourcetype. On my first try, I typo'ed the SEDCMD.)


FrankVl · ‎01-18-2018

The description field probably still shows up because you apply indexed CSV extractions (INDEXED_EXTRACTIONS) to the data, and that takes place before the SEDCMD is applied.

It's an interesting question whether those indexed extractions count against your license, or whether only the raw event counts...

yutaka1005 · ‎02-21-2018

Splunk support confirmed that SEDCMD does affect the amount of license usage.

Thank you for answering.

yutaka1005 · ‎01-17-2018

Thank you for the answer, supersleepwalker!

Yeah!
I tested it the same way, and the license usage did appear to decrease in the log!
I would like someone at Splunk to confirm which is right, if possible.

Elsurion · ‎12-19-2017

Hi Yutaka

One way to filter the events before they hit the index is to pass them through a heavy forwarder.

Your normal forwarder sends your CSV to this heavy forwarder, which strips the unwanted fields and then forwards the reduced data to the indexer.
Since I have a few garbage generators here, this is a good way to cut the garbage down to a suitable amount and save some volume on the daily license.

ledion · ‎12-10-2017

SEDCMD modifies the contents of the event before it hits the index, and therefore before it is counted against the license. Are you saying that the field is still present at search time for the events from which description was stripped? If so, what is the source type of the data: "csv", by any chance? If so, extra index-time processing is done for such data (csv and json). I'm not sure what counts against the license in that case; maybe someone from Splunk support can clarify?

yutaka1005 · ‎12-11-2017

Thank you for the answer.

Yes, I meant that the "description" column that was excluded by SEDCMD still appeared in the field list when I searched.

And the data format is "csv".
