Index past logs under one source

gnovak
Builder

I was wondering: Is there a way to index past logs and still have them show up as just one source?

Example:

I have a directory with a bunch of logs in it. They look like:

BEK02132013.log
BEK02142013.log
BEK02152013.log
BEK02162013.log
BEK02172013.log
BEK02182013.log
...

So a new log is created every day with the date in its name. This means that if I set up a monitor input for this directory, every file is indexed and each one shows up as its own source, which makes my source list huge!

Since the logs contain timestamps, is there a way to put all of them under one source? Example: all of this data appears under the source BEK.log.

kristian_kolb
Ultra Champion

Yes, that can be done, but it will not alter already indexed data, only new data coming in.

Assuming you have an inputs.conf that looks like this:

[monitor:///var/logs/BEKLOGS]
index = blah
sourcetype = bek

you would want to have a props.conf entry like this:

[bek]
TRANSFORMS-foo = set_bek_source

and a transforms.conf entry like this:

[set_bek_source]
REGEX = .
DEST_KEY = MetaData:Source
FORMAT = source::BEK.log

For more examples, see:

http://splunk-base.splunk.com/answers/5544/override-source-tcpxxxx-of-a-tcp-input-using-transforms
http://docs.splunk.com/Documentation/Splunk/5.0.2/Admin/Transformsconf

Hope this helps,

Kristian

gkanapathy
Splunk Employee

If you're just setting the source to a static value, you can do it via transforms, as kristian.kolb suggested, but it might be simpler to just do:

[monitor:///my/log/path/BEK*.log]
sourcetype=BEKlogs
source=BEK.log

However, if you're going to do that, I'd suggest you simply ignore "source" and search on "sourcetype" instead. If, on the other hand, you want to preserve part of the source path, e.g., you are monitoring files like:

[monitor:///my/path/*/logs/BEK*.log]

and you want the source to read like /my/path/group1/logs/BEK.log, you would use kristian's method of a transform, but you would need a more complex REGEX and FORMAT to extract and use the appropriate parts of the source you want.
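
A rough sketch of what that could look like, assuming the BEKlogs sourcetype from the input above. The stanza name set_bek_source_keep_path and the exact regex are illustrative and untested, so adjust them to your real paths. Note that when a transform reads from MetaData:Source, the value it matches against carries a source:: prefix, which is why the regex starts with it.

```
# props.conf
[BEKlogs]
TRANSFORMS-bek_source = set_bek_source_keep_path

# transforms.conf (stanza name is hypothetical)
[set_bek_source_keep_path]
# Match against the current source instead of the raw event text
SOURCE_KEY = MetaData:Source
# Capture the directory part, drop the dated file name
REGEX = ^source::(/my/path/[^/]+/logs)/BEK[^/]*\.log$
DEST_KEY = MetaData:Source
# Rebuild the source as <captured directory>/BEK.log
FORMAT = source::$1/BEK.log
```

With this in place, a file such as /my/path/group1/logs/BEK02132013.log would be expected to end up with the source /my/path/group1/logs/BEK.log.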

kristian_kolb
Ultra Champion

Aah, I knew there was a simpler way... I've just never done much source overriding, only index, host, etc.

kristian_kolb
Ultra Champion

Just to clarify: if you use a heavy forwarder, the props.conf and transforms.conf entries should go there and not on the indexer.

For a universal or lightweight forwarder, the settings should be on the indexer.

gnovak
Builder

I have the inputs on the forwarder and made the entries in props.conf and transforms.conf on the indexer. At first the logs didn't show up, so I looked into it. I was getting a CRC salt error, so I added a crcSalt setting to the input, and now it's working! Thanks so much for your assistance.

gnovak
Builder

OK, I will try this. I assumed it used transforms but wasn't sure exactly how to go about it. Let me test this.
