I was trying out data model acceleration with Hunk (latest version). This is how my datamodels.conf looks:
cat etc/apps/search/local/datamodels.conf
[LVSMC]
acceleration = 1
acceleration.earliest_time = -1d
acceleration.hunk.compression_codec = snappy
acceleration.hunk.dfs_block_size = 134217728
acceleration.hunk.file_format = parquet
acceleration.manual_rebuilds = 0
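To rule out another config file overriding these values, I also checked the effective settings with btool (assuming a standard Splunk/Hunk install with $SPLUNK_HOME set; LVSMC is my stanza name):
$SPLUNK_HOME/bin/splunk btool datamodels list LVSMC --debug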
Acceleration starts and parquet-snappy files are written. But after collecting for around 10-20 minutes, the parquet files suddenly disappear. Maybe summary creation is dropping these newly created files.
$ date;/usr/bin/hadoop fs -du -h /abcd/SplunkMR/datamodel
Wed Jul 6 08:46:56 PDT 2016
0 0 /abcd/SplunkMR/datamodel/70F888CB-CA73-4A97-B54F-6B0ACA9A4E7E_DM_search_test
$ date;/usr/bin/hadoop fs -du -h /abcd/SplunkMR/datamodel
Wed Jul 6 09:04:40 PDT 2016
2.5 G 7.4 G /abcd/SplunkMR/datamodel/70F888CB-CA73-4A97-B54F-6B0ACA9A4E7E_DM_search_test
$ date;/usr/bin/hadoop fs -du -h /abcd/SplunkMR/datamodel
Wed Jul 6 09:05:47 PDT 2016
75.4 M 226.2 M /abcd/SplunkMR/datamodel/70F888CB-CA73-4A97-B54F-6B0ACA9A4E7E_DM_search_test
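To pin down exactly when the files vanish, a simple once-a-minute poll of the same directory can be used (plain shell loop around the same command as above; the path is from my environment):
while true; do
    date
    /usr/bin/hadoop fs -du -h /abcd/SplunkMR/datamodel
    sleep 60
done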
I also tried playing around with the options below; it did not help (a sample variation is sketched after the list):
acceleration.max_time
acceleration.backfill_time
acceleration.manual_rebuilds
acceleration.max_concurrent
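For example, one variation looked like this (the values are illustrative, not a recommendation; backfill_time is deliberately much shorter than earliest_time):
[LVSMC]
acceleration = 1
acceleration.earliest_time = -1d
acceleration.backfill_time = -4h
acceleration.max_time = 0
acceleration.manual_rebuilds = 1
acceleration.max_concurrent = 1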
Please note that our Hunk deployment needs around 8 hours to process a full day's data when no other queries are running, so I don't know how to make the acceleration catch up for one day of data. Is there a switch I can use to retain the parquet-snappy files? Adjusting earliest_time and backfill_time (with backfill_time much shorter than earliest_time) did not help either.
Please let me know where this could be going wrong.