Getting Data In

vix.input.1.et.regex - Log directory only contains year, month, and day. How are searches affected if there is no hour?

yahoohunk
Explorer

When I run the query "index=api" with a date range of, for example, 07/07/2015 - 07/07/2015, I only get results from just after midnight on 07/07/2015.

If I run the same query with the date range 07/07/2015 - 07/08/2015, I get more results from 07/07/2015, and they cover more than just midnight.

Does the lack of hours and minutes in the directory path affect the search? In Splunk, a date range of 07/07/2015 - 07/07/2015 returns all results for that day.

My HDFS logs are partitioned by year/month/day.

Virtual index configuration (indexes.conf):
[api]
vix.provider = testprovider
vix.input.1.path = /projects/test/test_logs/api/...
vix.input.1.accept = .
vix.input.1.et.regex = /projects/test/test_logs/.+/(\d\d\d\d)/(\d\d)/(\d\d)/.+
vix.input.1.et.format = yyyyMMdd
vix.input.1.et.offset = 0
vix.input.1.lt.regex = /projects/test/test_logs/.+/(\d\d\d\d)/(\d\d)/(\d\d)/.+
vix.input.1.lt.format = yyyyMMdd
vix.input.1.lt.offset = 86400


suarezry
Builder

Typically the events themselves would have timestamps. Did you configure timestamp recognition?

See "Configuring timestamp recognition" in the Splunk documentation.

For example, event contains:

2015-07-10T13:40:51Z syslog.tcp {"message":"<166>2015-07-10T13:40:51.076Z somehost.somedomain Vpxa: [FFEDEB90 verbose 'VpxaHalCnxHostagent' opID=WFU-c8713dc4] [WaitForUpdatesDone] Received callback","client_host":"10.6.50.104"}

props.conf:

[source::/my/source/...]
sourcetype = hadoop
priority = 100
ANNOTATE_PUNCT = false
SHOULD_LINEMERGE = false
MAX_TIMESTAMP_LOOKAHEAD = 30
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%dT%H:%M:%SZ
TZ = UTC

yahoohunk
Explorer

I have not configured timestamp recognition. I will try that, along with the timezone, and see if it works.

Ledion_Bitincka
Splunk Employee

Also, make sure to set the matching timezone in indexes.conf with vix.input.1.et.timezone and vix.input.1.lt.timezone. In many cases the data is in GMT while the search is run in a user-specified timezone, e.g. PST.
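
As a rough sketch, assuming the dates in the directory path are written in GMT (an assumption; use whatever zone your paths are actually in), the two settings would go in the existing [api] stanza next to the et/lt regex, format, and offset lines:

```ini
# indexes.conf (sketch; GMT is an assumed value, not taken from the poster's environment)
[api]
vix.input.1.et.timezone = GMT
vix.input.1.lt.timezone = GMT
```

The idea is that the day extracted from the path is then interpreted in that zone when it is compared against the search time range.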

jeffland
SplunkTrust

"Does not having hours and minutes in the directory affect the search" - which directory is that? If Splunk determines the timestamp for your events from the directory structure, then of course those are needed (and Splunk will give events a midnight timestamp if only the day is available).
Could you clarify how you run the three searches you mention above: 1) 07/07/2015 - 07/07/2015, 2) 07/07/2015 - 07/08/2015, and 3) "In Splunk"? Please also post some example timestamps from those results.

yahoohunk
Explorer

The directory and file look like the following in HDFS:
/projects/test/test_logs/api/2015/07/07/api_server.log.2015-07-07.gz

A line in the log looks like:
[Thu Jul 09 02:03:02 2015] [error] [client 127.0.0.1] log={"messages": "test"}

I ran the search index="api" in Smart Mode.

Search 1: used the "Date Range" option with Between "07/07/2015" and "07/07/2015".
Time column: 7/7/15 12:00:15.000 AM
Event column: [Tue Jul 07 00:00:15 2015] [error] log={"messages": "test"}
Events matched: 29,000
In the results listing, I don't see anything beyond 00:00.

Search 2: used the "Date Range" option with Between "07/07/2015" and "07/08/2015".
Time column: 7/7/15 12:00:15.000 AM
Event column: [Tue Jul 07 00:00:15 2015] [error] log={"messages": "test"}
Events matched: 76,000,000
In the results listing, I don't see anything beyond 00:00.

Question: It says 76 million events matched, but the results list only shows events from hour 0.

jeffland
SplunkTrust

That looks like a problem with your timestamp recognition.
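
A minimal sketch of what that could look like for the log line you posted ([Thu Jul 09 02:03:02 2015] ...), assuming the timestamp is always the first bracketed field on the line and that each event is a single line. The TZ value is a guess on my part; set it to the zone your servers actually log in.

```ini
# props.conf (sketch; stanza path taken from vix.input.1.path, TZ is an assumption)
[source::/projects/test/test_logs/api/...]
# Assumes one event per line
SHOULD_LINEMERGE = false
# Timestamp starts right after the opening bracket at the start of the line
TIME_PREFIX = ^\[
# Matches "Thu Jul 09 02:03:02 2015"
TIME_FORMAT = %a %b %d %H:%M:%S %Y
MAX_TIMESTAMP_LOOKAHEAD = 30
TZ = UTC
```

If the Time column then shows values that match the bracketed timestamp in each event, recognition is working.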
