How to set _time from a Hive field

toabhishek16
New Member

Dear All,

I am using Hive 0.14 and Hunk 6.2. I can process the data in Hive tables through Hunk, but Hunk does not extract the correct time into the _time field. My Hive table has a field, rt, that contains the time in epoch format. How can I assign the value of rt to _time so that I can also run time-based queries?

Please help me.

Thanks
Abhishek


hyan_splunk
Splunk Employee

A better way is to use index-time timestamp extraction and make your time field a required field.

So, in the vix stanza of indexes.conf:
[vix]
vix.input.1.required.fields = yourTimeField

Then use props.conf to configure index-time timestamp extraction:
[sourcetype]
TIME_PREFIX = "yourTimeField":
TIME_FORMAT = %s

This makes time-based partition pruning work because:
a) we assume that the files within a directory contain events whose times fall within the range extracted from the path
b) the value of _time is known to come from the events themselves (i.e. it is not modified), and because of (a) those times must fall within the time range of the path
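
Applied to the rt field from the original question, the two stanzas above might look like the following sketch. The virtual index name hive_vix and the sourcetype name hive_events are hypothetical placeholders. The TIME_PREFIX regex assumes Hunk emits the field as "rt": followed by the epoch value, optionally with whitespace around the colon, so check a raw event before relying on it.

```ini
# indexes.conf (hive_vix is a hypothetical virtual index name)
[hive_vix]
vix.input.1.required.fields = rt

# props.conf (hive_events is a hypothetical sourcetype name)
[hive_events]
# Regex matching the text immediately before the timestamp (assumed event layout)
TIME_PREFIX = "rt"\s*:\s*
# %s = seconds since the Unix epoch
TIME_FORMAT = %s
```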


hyan_splunk
Splunk Employee

Hunk converts all Hive data to JSON format.

The following line disables index-time timestamping:
DATETIME_CONFIG = NONE

To convert your epoch format time field, do this:
EVAL-_time = strptime(yourTimeField, "%s")

Here are the strptime formats supported by Splunk:
http://docs.splunk.com/Documentation/Splunk/latest/Data/Configuretimestamprecognition#Enhanced_strpt...
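
As a sketch, the two settings above could sit together in props.conf like this for the rt field from the question. The sourcetype name hive_events is a hypothetical placeholder, and KV_MODE = json is an assumption based on Hunk returning the Hive rows as JSON, so that rt is available as a search-time field for the EVAL.

```ini
# props.conf (hive_events is a hypothetical sourcetype name)
[hive_events]
# Skip index-time timestamp recognition
DATETIME_CONFIG = NONE
# Assumption: events are JSON, so extract fields (including rt) at search time
KV_MODE = json
# rt holds epoch seconds; %s parses seconds since the Unix epoch
EVAL-_time = strptime(rt, "%s")
```

If rt turns out to hold milliseconds rather than seconds, the value would need to be divided by 1000 before it is assigned to _time.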

elin
Splunk Employee

You can disable index-time timestamping for your Hive table and then extract _time as a search-time field.

Here's an example for data in JSON format: http://docs.splunk.com/Documentation/Hunk/latest/Hunk/Setupavirtualindex#Edit_props.conf_.28optional...

Or, if you prefer to use the UI, you can use the HDFS Explorer to adjust timestamps when you configure your HDFS source: http://docs.splunk.com/Documentation/Hunk/latest/Hunk/ConfigureHDFS


toabhishek16
New Member
  1. The data is not partitioned by time in Hive.
  2. Hunk returns the field names and values. For example,

if the table has the schema: field1 string, field2 bigint

then it returns
field1: value1
field2: value2

My question is: how can I tell Hunk to take the _time field from field2, which contains epoch time?


Ledion_Bitincka
Splunk Employee

Two questions first, though:

  1. Is the data in Hive partitioned by time in HDFS? If so, what does the partition scheme look like? An example would be great.
  2. What do the events returned by Hunk look like? Again, an anonymized example should be sufficient.