
Why do I get different performance in Hunk based on how I specify time?

Builder

We have a setup with the MapR File System (MapRFS) and Hunk (Splunk Analytics for Hadoop). MapRFS is accessed through an NFS mount so that all logs are centralized.

We have our virtual index set up with this file directory format:
maprdata/sourcetype/year/month/day/hour/foo.log

I have included the indexes.conf file at the bottom, without the provider section.

When I run a basic search such as index=mapr1 | stats count over the last 3 hours in Splunk Web, it performs well: it finds the relevant events quickly and then finishes. This is the behavior I want.

When I run the same search for a specific hour in the past (say, 3 hours ago), either through Splunk Web or with "earliest=-3h@h latest=-2h@h", it scans a very large number of events before finally returning results, and the event count it returns is not even correct.

[mapr1]
vix.input.1.accept =
vix.input.1.et.format = yyyyMMddHH
vix.input.1.et.regex = /user/mapr/maprdata/.?/(\d+)/(\d+)/(\d+)/(\d+)/.
vix.input.1.lt.format = yyyyMMddHH
vix.input.1.lt.offset = 3600
vix.input.1.lt.regex = /user/mapr/maprdata/.?/(\d+)/(\d+)/(\d+)/(\d+)/.
vix.input.1.path = /user/mapr/maprdata/${sourcetype}/...
vix.provider = maproly
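
For readers unfamiliar with the et/lt settings above, here is a rough Python sketch of how they are generally understood to work: the regex capture groups are concatenated, parsed with the format string to get each directory's earliest time, and the latest time is that value plus the offset. A search can then skip directories whose range falls outside the search window. This is only an illustration, not Hunk's actual implementation. It assumes that the ".?" and "." in the posted regexes were originally ".*?" and ".*" (asterisks may have been lost in forum formatting), that directory names are zero-padded, and that times are UTC. The function names and the sample path are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: ".*?" and ".*" are guesses at what the posted ".?" and "." were.
PATH_REGEX = re.compile(r"/user/mapr/maprdata/.*?/(\d+)/(\d+)/(\d+)/(\d+)/.*")
LT_OFFSET_SECONDS = 3600  # mirrors vix.input.1.lt.offset


def path_time_range(path):
    """Return (earliest, latest) for a path, or None if no time is extracted."""
    match = PATH_REGEX.search(path)
    if match is None:
        return None
    # Concatenate year, month, day, hour, e.g. "2015" "06" "07" "14" -> "2015060714".
    stamp = "".join(match.groups())
    # "%Y%m%d%H" is the Python counterpart of the yyyyMMddHH format.
    # Assumption: UTC; the real time zone depends on the deployment.
    earliest = datetime.strptime(stamp, "%Y%m%d%H").replace(tzinfo=timezone.utc)
    latest = earliest + timedelta(seconds=LT_OFFSET_SECONDS)
    return earliest, latest


def overlaps_search_window(path, search_earliest, search_latest):
    """True if the path's time range intersects the search window."""
    time_range = path_time_range(path)
    if time_range is None:
        return True  # no time extracted, so the path cannot be skipped
    earliest, latest = time_range
    return earliest < search_latest and latest > search_earliest
```

If the extracted range does not line up with the search window (for example because of a time zone mismatch or a regex that fails to match), directories either cannot be skipped or the wrong ones are read, which would be consistent with a slow search that returns an unexpected count.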

Contributor

Are you receiving the data as JSON generated by some application? If so, can you change the timestamp field in the JSON schema from string to long?
Your issue is probably like the following.
Your timestamp:
timestamp: "276257257257265"
Splunk expects:
timestamp: 276257257257265

Alternatively, you can handle this via a calculated field. For example, you could add this to props.conf:
EVAL-_time = strptime(timestamp, "%s")
However, "%s" expects a 10-digit epoch time string, so you would probably need to use substr/trim as well. If possible, it is best to get the timestamp type changed before the data reaches Splunk.
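
To illustrate the substr/trim idea outside of Splunk, here is a minimal Python sketch. It assumes the first 10 digits of the string are epoch seconds and any remaining digits are a fractional part; whether that holds depends on how the application writes the value. The function name is hypothetical.

```python
def epoch_seconds_from_string(raw):
    """Convert a digit string such as "276257257257265" to epoch seconds.

    Assumption: the first 10 digits are whole seconds and any remaining
    digits are a sub-second fraction.
    """
    digits = raw.strip().strip('"')  # the "trim" step
    if not digits.isdigit() or len(digits) < 10:
        raise ValueError(f"not an epoch-like digit string: {raw!r}")
    seconds = digits[:10]            # the "substr" step
    fraction = digits[10:] or "0"
    return float(f"{seconds}.{fraction}")
```

For example, epoch_seconds_from_string("1433685600") returns 1433685600.0, and a longer string keeps its extra digits as the fractional part.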
