Splunk Search

How can I get rid of thousands of automatically created sourcetypes?

markgo
Engager

I've had the misfortune of feeding 30K input files of Amazon CloudFront logs from S3 into my live Splunk instance without specifying a sourcetype.

This has caused a serious problem: the bizarre headers that Amazon likes to use have produced thousands of automatically created variants of sourcetype-too-small (note that the REAL data does not cause this issue).

As a result, performance has slowed to a crawl.

I've deleted the "bad" events, but is there something I can do about the bad automatically created sourcetypes?

As to why I didn't notice this: it didn't become a problem until the number of sourcetypes grew to a prodigious value. And since my searches excluded the bad events, I never noticed the sourcetypes.

MuS
SplunkTrust

Hi markgo

I recently fixed that by adding this to my props.conf & transforms.conf:

**props.conf**
[default]
TRANSFORMS-meta = fix_auto_source

**transforms.conf**
[fix_auto_source]
SOURCE_KEY = MetaData:Source
DEST_KEY = MetaData:Source
REGEX = ^(/.*|.:.*)
FORMAT = source::splunktcp://25000

This rewrites the source of every event whose source matches the regex, so all those automatically created sources become splunktcp://25000.

Hope this helps a bit, and don't forget to change the regex to match your pattern.
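Before editing the regex, it can help to check which source values the pattern would catch. Here is a rough Python sketch of that check, not Splunk itself: the sample sources are made up, `would_rewrite` is a hypothetical helper, and it assumes the pattern is applied to the bare source string. As far as I know, Splunk may present the MetaData:Source value with a `source::` prefix, in which case the anchored pattern would need to account for that.

```python
import re

# Pattern copied from the [fix_auto_source] stanza in transforms.conf.
AUTO_SOURCE = re.compile(r"^(/.*|.:.*)")


def would_rewrite(source: str) -> bool:
    """Return True if the pattern matches this (bare) source value."""
    return AUTO_SOURCE.search(source) is not None


# Hypothetical source values, for illustration only.
samples = [
    "/var/log/cloudfront/example.gz",   # Unix-style path
    r"C:\logs\cloudfront\example.log",  # Windows-style path
    "splunktcp://25000",                # already rewritten
    "s3://example-bucket/cf/example.gz",  # S3-style source
]

for s in samples:
    print(would_rewrite(s), s)
```

With these samples the two file paths match and the other two do not, so an S3-style source would need a different regex.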

regards
