Hi Team,
I have set up Hunk with Apache Hadoop 2.26, and my data is stored in a Hive 0.13 table with ORC compression. The data size is around 2 TB.
When I execute any query through Hunk, it takes too much time. The equivalent query in Hive takes only 80 seconds. I am running both against the same Hive table.
Please help me improve Hunk's performance. How can I achieve fast data processing through Hunk?
Thanks
Abhishek
Try this combo:
index=idxtmgorc cs_username="anyname" | stats count(cs_username) as username
Make sure you run it in Smart Mode.
In addition, here is a link that shows the search mode options:
http://docs.splunk.com/Documentation/Splunk/6.2.4/SearchTutorial/Aboutsearchactionsandmodes
and a link that shows you the Search commands:
http://docs.splunk.com/Documentation/Splunk/6.2.4/SearchTutorial/Usethesearchlanguage
I am happy to see that you were able to improve performance when running Hadoop MapReduce jobs.
The rule is Hunk triggers an MR job if:
1. the search is not run in verbose mode AND
2. the search contains any filtering predicates in the first search command
OR
3. the search contains any reporting commands
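As a rough sketch only (this is not Hunk's actual implementation), the rule above can be written as a small Python check. It assumes the conditions group as "not verbose AND (filtering predicate OR reporting command)", treats any field=value term other than index=... in the first command as a filtering predicate, and uses a small illustrative subset of reporting commands. The function name and the naive splitting on "|" and whitespace are hypothetical simplifications.

```python
# Illustrative subset of Splunk reporting commands (assumption: not exhaustive).
REPORTING_COMMANDS = {"stats", "chart", "timechart", "top", "rare"}


def triggers_mapreduce(search: str, mode: str) -> bool:
    """Guess whether a search would trigger an MR job under the rule above.

    Assumed grouping: not verbose AND (filter in first command OR reporting command).
    """
    if mode.lower() == "verbose":
        return False

    # Naive pipeline split; does not handle "|" inside quoted strings.
    commands = [part.strip() for part in search.split("|")]
    first = commands[0]

    # A field=value term other than index=... counts as a filtering predicate here.
    has_filter = any(
        "=" in term and not term.lower().startswith("index=")
        for term in first.split()
    )

    # Any later command whose name is in the illustrative reporting set.
    has_reporting = any(
        cmd.split()[0].lower() in REPORTING_COMMANDS
        for cmd in commands[1:]
        if cmd
    )
    return has_filter or has_reporting


if __name__ == "__main__":
    print(triggers_mapreduce('index=idxtmgorc cs_username="anyname" | stats count(cs_username) as username', "smart"))
    print(triggers_mapreduce("index=idxtmg", "smart"))
```

Under these assumptions, the search with a cs_username filter and a stats command qualifies for an MR job in Smart Mode, while a bare index=idxtmg search does not.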
Thanks, rdagan_splunk, for clarifying the problem.
Hi rdagan_splunk,
I am in Smart Mode, as you suggested.
I tried the queries you suggested and observed that a query involving the stats command runs fast, but a generic query like index=idxtmg takes too much time to return results.
Please suggest how I can improve the system's performance.
Can you share the Hunk query? Also, please make sure you are in Smart search mode (not Verbose mode).
Hi rdagan_splunk,
I am using the queries below in Hunk and Hive respectively:
index=idxtmgorc cs_username="anyname" - takes too much time, more than 30 minutes.
select cs_username from tmg_orc_table where cs_username='anyname' - the Hive query takes only 80 seconds on the same data and the same cluster.
Give me some time; I will let you know about the mode.
Thanks & Regards
Abhishek Soni
Abhishek, did you ever solve the performance issue? I have the same issue.