how to set _time from Hive field

toabhishek16 · ‎05-18-2015

Dear All,

I am using Hive 0.14 and Hunk 6.2. I can process the data in Hive tables through Hunk, but Hunk is not extracting the correct time into the _time field. My Hive table has a field, rt, which contains the time in epoch format. How can I assign the value of rt to the _time field, so that I can also run time-based queries?

Please help me.

Thanks
Abhishek

hyan_splunk · ‎05-20-2015

A better way is to use index-time extraction and force your time field to be a required field.

So in the vix stanza of indexes.conf:
[vix]
vix.input.1.required.fields = yourTimeField

and then use props.conf to configure index-time timestamp extraction:
[sourcetype]
TIME_PREFIX="yourTimeField":
TIME_FORMAT = %s

This makes time-based partition pruning work because:
a) we assume that the files within a directory contain times from the range extracted from the path
b) the value of _time is known to come from the events (i.e., not modified), which, because of (a), must fall within the time range of the path
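
As a rough sketch only, here are the same two stanzas with the original poster's rt field filled in. The stanza names [vix] and [sourcetype] are placeholders carried over from above and must match your own virtual index and sourcetype. The TIME_PREFIX value assumes the JSON events contain the key exactly as "rt": with no space before the colon, which has not been confirmed in this thread.

```
# indexes.conf
[vix]
# force rt to be a required field
vix.input.1.required.fields = rt

# props.conf
[sourcetype]
# regex matching the text right before the timestamp (assumed JSON key layout)
TIME_PREFIX = "rt":
# %s = epoch seconds
TIME_FORMAT = %s
```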

hyan_splunk · ‎05-18-2015

Hunk converts all Hive data to JSON format.

The following line disables index-time timestamping:
DATETIME_CONFIG = NONE

To convert your epoch format time field, do this:
EVAL-_time = strptime(yourTimeField, "%s")

Here are the various strptime formats supported by Splunk:
http://docs.splunk.com/Documentation/Splunk/latest/Data/Configuretimestamprecognition#Enhanced_strpt...
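
Put together, the two settings above might sit in one props.conf stanza like the sketch below. It assumes the Hive field is named rt, as in the original question, and [yourSourcetype] is a hypothetical stanza name to replace with the sourcetype (or source) your virtual index actually uses.

```
# props.conf
[yourSourcetype]
# disable index-time timestamping
DATETIME_CONFIG = NONE
# set _time at search time from the epoch value in rt
EVAL-_time = strptime(rt, "%s")
```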

elin · ‎05-18-2015

You can disable index-time timestamping for your Hive table and then extract _time as a search-time field.

Here's an example for data in JSON format: http://docs.splunk.com/Documentation/Hunk/latest/Hunk/Setupavirtualindex#Edit_props.conf_.28optional...

Or if you prefer to use the UI, you can also use the HDFS Explorer to adjust time stamps when you configure your HDFS source. http://docs.splunk.com/Documentation/Hunk/latest/Hunk/ConfigureHDFS

toabhishek16 · ‎05-18-2015

The data is not partitioned by time in Hive.
Hunk returns each field and its value. For example, if the

table has the schema: field1 string, field2 bigint

then it returns
field1: value1
field2: value2

My question is how I can tell Hunk to take the _time field from field2, which contains epoch time.

Ledion_Bitincka · ‎05-18-2015

Two questions first though:

Is the data in Hive partitioned by time in HDFS? If so, what does the partition scheme look like? An example would be great.
What do the events returned by Hunk look like? Again, an anonymized example should be sufficient.
