How to index certain logs only during a certain time range (6am - 6pm)?

agoktas
Communicator

Hello,

I have 4 log files on one Host that I want to index/ingest.

Logs #1, #2, and #3 will be ingested 24 hours a day, but log file #4 is also written to by an evening batch process that generates 20 - 30 GB of events per evening. We don't need those events and don't want to pay for them, because I wouldn't use them at this point in time.

I want to avoid stopping the Splunk Universal Forwarder Windows service from 6pm to 6am, because that would mean logs #1, #2, and #3 would not be indexed. Also, I believe the events would just pool up and be picked up via the fishbucket anyway, which would nullify my effort to exclude log #4 from indexing between 6pm and 6am.

Any ideas how I can avoid indexing log #4 from 6pm to 6am (night time batch window)?

Thanks!

MuS
Legend

Hi agoktas,

there are multiple ways to achieve this:

Update:
Following up on all the comments, this was the final working config:

Here is the final config that looks to be working great (we forgot '00' for the midnight hour):
Indexer configuration:

props.conf

[AppInternal]
TRANSFORMS-null= Appsetnull

transforms.conf

#Discard all events between 6pm - 6am
[Appsetnull]
REGEX = (?:\d+\/\d+\/\d+|\d+-\d+-\d+)\s(18|19|20|21|22|23|00|01|02|03|04|05):
DEST_KEY = queue
FORMAT = nullQueue

Hope this helps ...

cheers, MuS
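
Before deploying a transform like this, it can help to sanity-check the pattern outside Splunk. The following is a minimal Python sketch, not Splunk itself: it assumes the REGEX is written with its backslash escapes (`\d`, `\/`, `\s`), that Python's `re` module treats this simple pattern the same way Splunk's regex engine does, and it uses made-up sample events in the two timestamp layouts from this thread. The `route` helper and the "indexQueue" label for kept events are illustrative names only.

```python
import re

# Same pattern as the REGEX line in the [Appsetnull] stanza, with backslashes.
NULL_HOURS = re.compile(
    r"(?:\d+\/\d+\/\d+|\d+-\d+-\d+)\s(18|19|20|21|22|23|00|01|02|03|04|05):"
)

def route(event: str) -> str:
    """Return the queue a matching transform would send the event to."""
    return "nullQueue" if NULL_HOURS.search(event) else "indexQueue"

# Hypothetical events in the two timestamp layouts seen in this thread.
samples = [
    "2015-12-09 11:43:10,801 DEBUG - daytime event",
    "2015-12-09 17:59:59,000 DEBUG - last minute before the window",
    "2015-12-09 18:00:00,000 DEBUG - first minute of the window",
    "12,user,OK,2015/12/10 00:15:02:477,0",
    "12,user,OK,2015/12/10 06:00:00:000,0",
]

for line in samples:
    print(route(line), "<-", line)
```

Events stamped from 18:00 through 05:59 should come back as nullQueue, and everything else as kept.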

vinnithenose
Loves-to-Learn

What would the Regex look like to discard events on just Saturday from 12AM to 4AM?

Thanks.

agoktas
Communicator

Hi MuS,

From my understanding, you can only blacklist files or a regex value for a source's/file's content.

But I don't see anything where you can configure blacklist time frames.

Am I just not seeing the documentation pertaining to this?

Have you set something like this up before?

Thanks!

MuS
Legend

Sorry, my bad. Go for the nullQueue filtering solution from the docs http://docs.splunk.com/Documentation/Splunk/6.3.1511/Forwarding/Routeandfilterdatad#Filter_event_dat... Hopefully you have some unique identifier for the unneeded events. Do this on the indexer and restart Splunk.

Otherwise, the only time-based solution would be an external cron job that stops the universal forwarder, checks log #4 for the end of the batch process, empties it with echo "" > log.4, and then restarts the universal forwarder.

agoktas
Communicator

I believe I'm on the last piece of this puzzle...

I have 2 servers involved, and only one of them needs to have the events for a particular set of logs sent to the nullQueue during the 6pm - 6am time window.

So that means I need to know how to specify the particular hostname + the log name in the example provided in http://docs.splunk.com/Documentation/Splunk/6.3.1511/Forwarding/Routeandfilterdatad#Filter_event_dat....

The example at the link above only specifies the log name/path as the source. How do I add the host as well?

Any ideas how I do this? Can you provide an example?

MuS
Legend

How about adding the host name to the regex? Because the props.conf stanza can be either source, sourcetype or host .....

agoktas
Communicator

In essence, I would need it to match both host and source instead of just one.

When both criteria are met, the transform would apply to those events.

Here is what I was thinking for props.conf:

[host::HOSTA]
[source::(?i)systemout.log.log$|systemoutServerA.log$]
TRANSFORMS-null= Applicationsetnull

But I'm not sure if that will work. Perhaps both stanza headers need to be on the same line? Comma-delimited? Is that even possible?

Thoughts?

Thanks.

MuS
Legend

The easiest way to achieve that would be to assign a different sourcetype to the log that needs to be excluded or, as stated before, use the host in the regex like in this example: http://docs.splunk.com/Documentation/Splunk/5.0/Data/Advancedsourcetypeoverrides#Example:_Assign_a_s...

agoktas
Communicator

You da man MuS!

Here is the final config that looks to be working great (we forgot '00' for the midnight hour):
Indexer configuration:

Props.conf:

[AppInternal]
TRANSFORMS-null= Appsetnull

Transforms.conf:

#Discard all events between 6pm - 6am
[Appsetnull]
REGEX = (?:\d+\/\d+\/\d+|\d+-\d+-\d+)\s(18|19|20|21|22|23|00|01|02|03|04|05):
DEST_KEY = queue
FORMAT = nullQueue

Thanks again for all your help! 🙂

MuS
Legend

You're welcome 🙂 I've updated the answer so please accept it - thx ! And don't forget to up-vote @rich7177 he started this with his idea to filter based on time 😉 !!

agoktas
Communicator

In this case, it will need to be host & source combined because I have 2 servers.

Host A will need to restrict 6pm - 6am for logging with Log A (systemout.log) & Log B (systemoutServerA.log).

Host B doesn't need the time restriction for Log A (systemout.log) & Log B (systemoutServerB.log).

If both servers had the same restriction needs, I would be home free. But since one of them can ingest 24 hours a day, it throws a wrench in the works.

Thanks.

agoktas
Communicator

That sounds perfect.

In fact, I'm now remembering a Splunk sales engineer mentioning this a while back for a similar situation. 🙂

I'll give this a shot. This should work just fine.

By any chance, would you happen to know the regex value for greater than 6pm & less than 6am?

Thanks!

MuS
Legend

Can you provide some examples of the events containing the time?

agoktas
Communicator

Absolutely.

There are actually 2 files that I will be dealing with (log #4 & log #5). Here are examples of each:

Log #4 example:
12,User:R_getStuff:1234567:id,com.company.demographics.app.inside.pf.Addid,user,OK,2015/12/09 11:42:48:477,2015/12/09 11:42:48:477,0

Log #5 example:
2015-12-09 11:43:10,801 DEBUG - _standard | Entering Summary2.inc | User: blah| Koid:CHOOSE_ACCOUNT:1234567:blah blah

The date/time stamps are positioned in different spots, but that shouldn't be a problem.
The date/time is formatted differently between the two, but that shouldn't matter because I'm only looking at the hour and minute, which are formatted the same of course. 🙂 So only one regex is needed for both stanzas, applying only to the hour and minute.

Thanks!

MuS
Legend

Based on the examples, and assuming the logs use 24-hour times(?), try this regex:

(?:\d+\/\d+\/\d+|\d+-\d+-\d+)\s(07|08|09|10|11|12|13|14|15|16|17|18):

The first group will match both possible date formats and the second group will match any hour from 07 til 18 ..... Does that make sense?
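
To see how the two groups behave on the sample events posted earlier, here is a small Python sketch. It keeps the date part of the pattern as written but swaps the hour list for a generic two-digit capture, so the hour pulled from each line is visible. The `first_hour` helper is a hypothetical name, and the sketch assumes Python's `re` handles this pattern the same way Splunk does.

```python
import re

# Date alternation as in the suggested regex; the hour list is replaced by a
# generic two-digit capture purely for illustration.
TS_HOUR = re.compile(r"(?:\d+\/\d+\/\d+|\d+-\d+-\d+)\s(\d{2}):")

# The two sample events from this thread.
log4 = ("12,User:R_getStuff:1234567:id,com.company.demographics.app.inside.pf.Addid,"
        "user,OK,2015/12/09 11:42:48:477,2015/12/09 11:42:48:477,0")
log5 = ("2015-12-09 11:43:10,801 DEBUG - _standard | Entering Summary2.inc | "
        "User: blah| Koid:CHOOSE_ACCOUNT:1234567:blah blah")

def first_hour(event):
    """Return the hour of the first timestamp found in the event, or None."""
    m = TS_HOUR.search(event)
    return m.group(1) if m else None

print(first_hour(log4))  # slash-separated date, timestamp mid-line
print(first_hour(log5))  # dash-separated date, timestamp at line start
```

Both layouts yield the same hour field, which is why one pattern can serve both logs.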

agoktas
Communicator

Since this value would send it to the nullQueue, I'm guessing we would do this instead (batch window):

(?:\d+\/\d+\/\d+|\d+-\d+-\d+)\s(19|20|21|22|23|01|02|03|04|05|06)

So the stanza would look like:

[setnull]
REGEX = (?:\d+\/\d+\/\d+|\d+-\d+-\d+)\s(19|20|21|22|23|01|02|03|04|05|06)
DEST_KEY = queue
FORMAT = nullQueue

Does that look right?
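
One way to check a draft like this is to list which hours it actually covers. The Python sketch below is only an illustration: it builds a made-up event for each hour of the day in the dash-separated layout and reports which ones the draft pattern matches. The `hours_matched` helper is a hypothetical name.

```python
import re

# The draft pattern from the stanza above (no trailing colon yet).
DRAFT = re.compile(
    r"(?:\d+\/\d+\/\d+|\d+-\d+-\d+)\s(19|20|21|22|23|01|02|03|04|05|06)"
)

def hours_matched(pattern):
    """Return the two-digit hours (00-23) whose timestamps the pattern matches."""
    hits = []
    for h in range(24):
        # Hypothetical event stamped at the top of hour h.
        stamp = f"2015-12-09 {h:02d}:00:00,000 DEBUG - test"
        if pattern.search(stamp):
            hits.append(f"{h:02d}")
    return hits

matched = hours_matched(DRAFT)
missed = [f"{h:02d}" for h in range(24) if f"{h:02d}" not in matched]
print("matched:    ", matched)
print("not matched:", missed)
```

For a 6pm - 6am window, the hours to look for in the "not matched" list are 18 and 00; the final config posted in this thread includes both and leaves out 06.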

MuS
Legend

HeHe, I keep messing up things here (the time range this time) 🙂
That's the transforms.conf and it looks good. Don't forget the props.conf to match it to the source, place both on either a heavy forwarder or an indexer, and restart Splunk after the change.

agoktas
Communicator

By the way, does the end of that regex value need to have a colon? I noticed you had it in your first example.

Can you verify this is correct?
(?:\d+\/\d+\/\d+|\d+-\d+-\d+)\s(19|20|21|22|23|01|02|03|04|05|06):

MuS
Legend

The colon is just there to make sure the pattern matches the timestamp: it is the : between the hour and the minute. If you are 1000% sure there are no other events containing something like 1234-12-12 20 foo or 1234/12/12 20 foo, you won't need it. Otherwise, add it.
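
The effect of the trailing colon can be shown with a short Python sketch. The two event lines are made up for illustration: one carries a real evening timestamp, the other is a daytime event whose message text happens to contain a date-like value followed by 20. It assumes Python's `re` behaves like Splunk's regex engine for this pattern.

```python
import re

HOURS = "19|20|21|22|23|01|02|03|04|05|06"
without_colon = re.compile(rf"(?:\d+\/\d+\/\d+|\d+-\d+-\d+)\s({HOURS})")
with_colon = re.compile(rf"(?:\d+\/\d+\/\d+|\d+-\d+-\d+)\s({HOURS}):")

# Hypothetical events: a genuine 20:15 timestamp, and an 11:43 event whose
# message contains a lookalike "1234-12-12 20 foo".
real_timestamp = "2015-12-09 20:15:00,801 DEBUG - batch step"
lookalike = "2015-12-09 11:43:10,801 DEBUG - order 1234-12-12 20 foo"

for name, pattern in (("without colon", without_colon), ("with colon", with_colon)):
    print(name,
          "| real timestamp:", bool(pattern.search(real_timestamp)),
          "| lookalike:", bool(pattern.search(lookalike)))
```

Without the colon, the daytime lookalike event would also match and be discarded; with it, only the real timestamp matches.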

agoktas
Communicator

Cool. I'll give this a shot.

We have 1 indexer/search head and I'll configure both the props.conf & transforms.conf there.

I'll restart the indexer and see how things go. I'll probably be doing this tomorrow and will update this thread on how it turns out.

Thanks so much for your help! 🙂
