Getting Data In

Why is Hunk outputting space separated values from HDFS/Hive as JSON in search results?

ddrillic
Ultra Champion

We see the following:
[Screenshot: search results displayed as JSON]

On the HDFS file system, the values are space separated. How can we "fix" the loading process so the data is not shown as JSON?

0 Karma
1 Solution

splunkIT
Splunk Employee

char(n) should now be supported in the Splunk/Hunk 6.4.1 maintenance release:
https://answers.splunk.com/answers/379387/hunk-hive-and-decimalnn.html#answer-405335

0 Karma

ddrillic
Ultra Champion

That's interesting, as we just upgraded to 6.4.1.

0 Karma

rdagan_splunk
Splunk Employee

If you are using Hunk with Hive, what you see is the expected behavior.

Hunk will display the data in JSON format if you are using Hive metadata or the Parquet, Avro, SEQ, or CSV file formats.
If you are using Hunk without Hive (and without any of the above file formats), you will see the data in the normal Splunk log format.

0 Karma

JeffWiedemann
Engager

As Dan commented above, we got to the bottom of the issue: the Hive table serving the above data has been defined with a char(n) datatype for text rather than string. The workaround posted here: https://answers.splunk.com/answers/372130/hunk-630-doesnt-seem-to-work-with-hive-version-013.html allows char(n) tables to be read without error, but the data is still displayed oddly. At this time, it appears a Hunk update (the current version is 6.3.x) with better char(n) support will be required for char(n) to behave the same as string.

0 Karma

rdagan_splunk
Splunk Employee

Hi Dan,
Yes, JSON nesting is also the expected behavior. When there is nesting in the JSON, we visualize it using the A.B notation, and we use the plus sign to let users expand the nesting.
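
To make the A.B notation concrete, here is a minimal Python sketch of how nested JSON keys map to dotted field names. It only illustrates the naming convention and is not Hunk's actual implementation; the `flatten` helper and the sample event are made up for this example.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into one dict whose keys use dotted paths."""
    flat = {}
    for key, value in obj.items():
        # Build the dotted path, e.g. "A" + "B" -> "A.B"
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the path so far
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# Hypothetical event with one level of nesting under "A"
event = {"A": {"B": "x", "C": 1}, "D": "y"}
print(flatten(event))
```

Each extra level of nesting adds another dotted segment, so a field that is wrapped in its own object shows up under a longer field name.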

0 Karma

ddrillic
Ultra Champion

Fair enough, but why do we see one level of nesting for some tables and two levels of nesting for other tables, while the Hive and HDFS data look identical?

0 Karma

ddrillic
Ultra Champion

Raanan, working with Jeff, we realized that the extra JSON level appears when the Hive data definition uses char(n). When the Hive data definition uses string, we see the expected presentation.

Just to keep in mind, the upgrade of the hive libraries last week solved the immediate fatal error when issuing a query. Now we have a different related issue.
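
For anyone wanting to check which columns of a table are affected, here is a minimal Python sketch, assuming the table's DDL is available as text (for example from Hive's SHOW CREATE TABLE). The `char_columns` helper, the regular expression, and the table and column names are all hypothetical and only for illustration.

```python
import re

# Matches column definitions such as `breed char(10)` (case-insensitive).
# varchar(n) is not matched because "char" must directly follow whitespace.
CHAR_COLUMN = re.compile(r"`?(\w+)`?\s+char\s*\(\s*(\d+)\s*\)", re.IGNORECASE)


def char_columns(ddl):
    """Return (column_name, length) pairs for every char(n) column in a DDL string."""
    return [(name, int(length)) for name, length in CHAR_COLUMN.findall(ddl)]


# Made-up table definition mixing char(n), string, and varchar(n) columns
ddl = """
CREATE TABLE horses (
  id char(2),
  breed char(10),
  notes string,
  code varchar(8),
  born date
)
"""
print(char_columns(ddl))
```

Columns reported by this check are the ones defined as char(n), which is where we saw the extra level of nesting.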

ddrillic
Ultra Champion

Hi Raanan, David,

The thing is that for some tables we see one level of JSON nesting, and for others we see nesting for each field, with new fields created in the left pane.

So, that's the issue.

Regards,
Dan

0 Karma

Claw
Splunk Employee

Hi Dan

Are you saying that the records are not actually JSON or that we have extra spaces to deal with or that you don't want to view the records as JSON?

0 Karma

ddrillic
Ultra Champion

The input looks like -
01Arabian U2007-05-08TSUAS63 2016-01-062016-01-06

0 Karma

ddrillic
Ultra Champion

Hi David,

The records are plain text, space separated. We would prefer to view the data in a non-JSON format. What do you think?

Regards,
Dan

0 Karma

burwell
SplunkTrust

Dan: maybe a sample of the input would help us understand?

0 Karma