Splunk Search

How can I get rid of thousands of automatically created sourcetypes

markgo
Engager

I've had the misfortune of feeding 30K input files from Amazon S3 Cloudfront logs into my live Splunk instance, without specifying a sourcetype.

This has created a serious problem: the bizarre headers that Amazon likes to use produced thousands of automatically created variants of sourcetype-too-small (note that the REAL data does not cause this issue).

As a result, performance has slowed to a crawl.

I've deleted the "bad" events, but is there something I can do about the bad automatically created sourcetypes?

As to why I didn't notice this: it didn't become a problem until the number of sourcetypes grew to a prodigious value. And since my searches excluded the bad events, I never noticed the sourcetypes.

MuS
SplunkTrust

Hi markgo

I recently fixed that by adding this to my props.conf & transforms.conf:

**props.conf**
[default]
TRANSFORMS-meta = fix_auto_source

**transforms.conf**
[fix_auto_source]
SOURCE_KEY = MetaData:Source
DEST_KEY = MetaData:Source
REGEX = ^(/.*|.:.*)
FORMAT = source::splunktcp://25000

This changes all those automatically created sources to splunktcp://25000.

Hope this helps a bit, and don't forget to change the regex to match your pattern.
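
Since the question is about sourcetypes rather than sources, the same kind of index-time rewrite can probably be pointed at the sourcetype metadata key instead. The following is an untested sketch under assumptions: the stanza name fix_auto_sourcetype, the `-too_small$` pattern, and the target sourcetype cloudfront_access are placeholders made up for illustration, so check them against the sourcetype names your instance actually created.

```ini
# props.conf
[default]
TRANSFORMS-fix_sourcetype = fix_auto_sourcetype

# transforms.conf
[fix_auto_sourcetype]
# Read the sourcetype that was assigned automatically
SOURCE_KEY = MetaData:Sourcetype
# Write the replacement back to the same metadata key
DEST_KEY = MetaData:Sourcetype
# Assumed pattern: auto-created names ending in -too_small
REGEX = -too_small$
# Hypothetical target sourcetype; the sourcetype:: prefix is required here
FORMAT = sourcetype::cloudfront_access
```

Like any index-time transform, this only affects events indexed after the change takes effect; events that are already indexed keep the sourcetype they were given.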

regards
