Getting Data In

Large apache log files

mq20123167
New Member

Hello!

I'm new to Splunk and just getting my head around it all.

Our company is already using Splunk, and we are considering using it on an Apache server to gather web statistics in a similar fashion to AWStats.

We have enabled log rotation on our server and keep one month's worth of logs. My concern is that once the Apache server deletes logs older than one month, I assume we will no longer be able to search that old information through Splunk.

Ideally I would like 6-12 months' worth of data. We have already racked up 645,000 events in a single month.

If we saved our logs somewhere else and had Splunk review our 6-12 months of data, we would be looking at a few million events. Is Splunk the right tool for this job? Can it handle that number of events? Or is it mostly made for short-term log analysis?

1 Solution

Ayn
Legend

First, regarding your concern - your assumption is incorrect, because Splunk doesn't work directly on the source files. When you add a file or directory to be monitored by Splunk, its events are indexed - you could say they're copied into Splunk's index (its database). Once that's done, it doesn't matter what happens to the source file. The events are in the index and will stay there indefinitely (or at least for as long as you've told Splunk to keep events).
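For example, a monitor input for rotated Apache logs is just a stanza in inputs.conf. This is only a sketch - the path and index name below are assumptions, so adjust them to your environment:

```
# inputs.conf -- example only; the path and index name are placeholders
# The wildcard picks up rotated files (access.log, access.log.1, ...)
[monitor:///var/log/apache2/access.log*]
sourcetype = access_combined
index = web
disabled = false
```

Once a file is indexed, Splunk remembers where it left off, so rotation and eventual deletion of the source file don't affect the events already in the index.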

There's really no limit to how many events Splunk can handle. Many use it to analyze huge amounts of data spanning several years, and there are Splunk deployments out there indexing several terabytes of data each day. For that kind of deployment a single so-so specced indexer obviously won't cut it, but you can scale your deployment easily by adding more indexers and other Splunk instances as you go.
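How long events are kept is controlled per index in indexes.conf via frozenTimePeriodInSecs. As a hedged sketch for your 6-12 month goal (the index name "web" and the paths are assumptions, not anything from your setup):

```
# indexes.conf -- sketch only; index name and paths are placeholders
[web]
homePath   = $SPLUNK_DB/web/db
coldPath   = $SPLUNK_DB/web/colddb
thawedPath = $SPLUNK_DB/web/thaweddb
# Roughly 12 months in seconds; events older than this are rolled to
# frozen, which by default means they are deleted.
frozenTimePeriodInSecs = 31536000
```

At your volume (a few million events over a year of Apache logs) this is well within what a single indexer handles comfortably.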

mq20123167
New Member

Thanks Ayn, appreciate your help with this.
