Hunk min value multi column search question

mikejf12
New Member

I have installed Hunk 6.1.3 on a CentOS 6 Linux host and connected it to a CentOS 6 based CDH5 Hadoop cluster.
Hunk is installed under /usr/local/hunk, and my configuration files are set up to pull CSV data from HDFS.

[hadoop@hc2nn system]$ pwd
/usr/local/hunk/etc/system

[hadoop@hc2nn local]$ cat indexes.conf
[provider:cdh5]
vix.family = hadoop
vix.command.arg.3 = $SPLUNK_HOME/bin/jars/SplunkMR-s6.0-hy2.0.jar
vix.env.HADOOP_HOME = /usr/lib/hadoop
vix.env.JAVA_HOME = /usr/lib/jvm/jre-1.6.0-openjdk.x86_64
vix.fs.default.name = hdfs://hc2nn:8020
vix.splunk.home.hdfs = /user/hadoop/hunk/workdir
vix.mapreduce.framework.name = yarn
vix.yarn.resourcemanager.address = hc2nn:8032
vix.yarn.resourcemanager.scheduler.address = hc2nn:8030
vix.mapred.job.map.memory.mb = 1024
vix.yarn.app.mapreduce.am.staging-dir = /user
vix.splunk.search.recordreader.csv.regex = .csv$
[hadoop@hc2nn local]$ cat props.conf
[source::/data/hunk/rdbms/...]
REPORT-csvreport = extractcsv
[extractcsv]
DELIMS="\,"
FIELDS="year","manufacturer","model","class","engine size","cyclinders","transmission","Fuel Type","fuel_city_l_100km","fuel_hwy_l_100km","fuel_city_mpg","fuel_hwy_mpg","fuel_l_yr","c02_g_km"

Hunk is running on the host hc2nn, and I can log in at http://hc2nn:8000. I can run searches such as

index=cdh5_vindex

and I can select columns to display and create reports and dashboards. What I want to do, though, is create a report
from the search pane that shows the minimum CO2 emissions for each manufacturer and model. I would then like
to limit the output to the 20 lowest values.

Please excuse the mistakes, but I understand that I can do something like this:

index=cdh5_vindex manufacturer model c02_g_km | stats min(c02_g_km) as minco2 | table manufacturer model minco2

I know that this isn't the correct format, but I wondered whether someone could advise on the correct approach. I can create
single-column reports and dashboards, but I would like to create something a little more complicated.

Ledion_Bitincka
Splunk Employee

Would the following do what you're looking for?

index=cdh5_vindex | stats min(c02_g_km) AS minco2 BY manufacturer, model | sort 20 minco2
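
If it helps to see the logic spelled out, here is a rough Python sketch of what that search computes: group by manufacturer and model, keep the minimum c02_g_km per group, sort ascending, and keep the first 20 groups. The sample rows and the function name min_co2_by_model are made up for illustration; they are not real data from the index or part of any Splunk API, and skipping rows with missing values is an assumption of the sketch.

```python
# Hypothetical sample rows standing in for events from cdh5_vindex.
# Values are invented for illustration; fields arrive as strings, as in CSV.
rows = [
    {"manufacturer": "ACURA", "model": "ILX", "c02_g_km": "196"},
    {"manufacturer": "ACURA", "model": "ILX", "c02_g_km": "188"},
    {"manufacturer": "BMW", "model": "320i", "c02_g_km": "202"},
    {"manufacturer": "FORD", "model": "FOCUS", "c02_g_km": "184"},
    {"manufacturer": "FORD", "model": "FOCUS", "c02_g_km": "179"},
    {"manufacturer": "FORD", "model": "FOCUS"},  # no emissions value
]


def min_co2_by_model(rows, limit=20):
    """Roughly: stats min(c02_g_km) AS minco2 BY manufacturer, model | sort 20 minco2"""
    minima = {}
    for row in rows:
        try:
            key = (row["manufacturer"], row["model"])
            value = float(row["c02_g_km"])
        except (KeyError, ValueError):
            # Sketch assumption: skip rows lacking a group field or a numeric value.
            continue
        # Keep the smallest value seen so far for each (manufacturer, model) pair.
        if key not in minima or value < minima[key]:
            minima[key] = value
    # Ascending sort on the minimum, then keep only the first `limit` groups.
    ranked = sorted(minima.items(), key=lambda item: item[1])
    return [(mfr, model, minco2) for (mfr, model), minco2 in ranked[:limit]]


if __name__ == "__main__":
    for mfr, model, minco2 in min_co2_by_model(rows):
        print(mfr, model, minco2)
```

The BY clause is what keeps manufacturer and model in the results. Without it, stats collapses everything into a single minco2 value, which is why the table in the original attempt had nothing to show for those two columns.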

By the way, you don't need to define anything in props.conf or transforms.conf for CSV files in Hunk, as they're automatically recognized and converted into JSON (at runtime) for better visualizations.
