Getting Data In

How do you run a subquery for each row of a CSV file by passing the field in a search string?

known_user
Engager

I want to run a Splunk query once for each value in a CSV file, substituting that value into the search string. I've imported the file into Splunk as a lookup table, and I can view the fields with an inputlookup query. Now I want to pass each value into a set of subsearches that fetch the maximum count on a per-hour, per-day, per-week, and per-month basis.

The input file is ids.csv, which has around 800 rows and just one column, like below:

1234
2345
2346
4567

...

The query I'm using:

| inputlookup ids.csv | fields ids as id
| [ search index="abc" id "search string here" | bin _time span="1hour" | stats count as maxHour by _time | sort - count | head 1 ]
| appendcols [ search index="abc" id "search string here" | bin _time span="1day" | stats count as maxDay by _time | sort - count | head 1 ]
| appendcols [ search index="abc" id "search string here" | bin _time span="1week" | stats count as maxWeek by _time | sort - count | head 1 ]
| appendcols [ search index="abc" id "search string here" | bin _time span="1month" | stats count as maxMonth by _time | sort - count | head 1 ]

I'm not getting the expected results. I'm expecting a tabular output with one row per ID, showing the maximum count for each time range, with the ID field passed into each subsearch.

How can I solve this?

Thanks


jkat54
SplunkTrust

I've never tried more than one bin command, but maybe this works:

index=abc
| lookup ids ids AS id OUTPUT ids AS matched
| where isnotnull(matched)
| eval month=_time, week=_time, day=_time
| bin month span=1mon
| bin week span=1w
| bin day span=1d
| top 1 id by month week day
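
Note that top 1 id by month week day returns the most frequent id within each bucket combination, not one row per id. If the goal is each id's busiest bucket, a stats rollup could be swapped in instead; for example, for the daily maximum (a sketch, untested, assuming a lookup definition named ids with field ids):

index=abc
| lookup ids ids AS id OUTPUT ids AS matched
| where isnotnull(matched)
| bin _time span=1d
| stats count AS perDay BY id _time
| stats max(perDay) AS maxDay BY id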


woodcock
Esteemed Legend

Assuming that you are trying to limit the search results to only those id values that are in ids.csv, then like this:

index="abc" id "search string here" [ |inputlookup ids.csv | rename ids AS id | format ]
| timechart limit=0 useother=false span=1h count AS maxHour BY id
| untable _time id maxHour
| multireport

[ sort 0 - maxHour
| dedup id ]

[ timechart limit=0 useother=false span=1d sum(maxHour) AS maxDay BY id
| untable _time id maxDay
| sort 0 - maxDay
| dedup id ]

[ timechart limit=0 useother=false span=1mon sum(maxDay) AS maxMonth BY id
| untable _time id maxMonth
| sort 0 - maxMonth
| dedup id ]

This gives you each id's maxHour, maxDay, and maxMonth.
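
The question also asks for a per-week maximum. Since each multireport branch appears to receive the same hourly input, a weekly branch could be slotted in between the daily and monthly ones, following the same pattern (a sketch, untested):

    [ timechart limit=0 useother=false span=1w sum(maxHour) AS maxWeek BY id
      | untable _time id maxWeek
      | sort 0 - maxWeek
      | dedup id ]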


richgalloway
SplunkTrust

The multireport command looks interesting, but I can find no documentation for it.

---
If this reply helps you, Karma would be appreciated.

woodcock
Esteemed Legend

Yes, I only know it by using it. It needs to be documented because it is very useful.


richgalloway
SplunkTrust

Docs feedback, @woodcock

---
If this reply helps you, Karma would be appreciated.

richgalloway
SplunkTrust

First, unless you just left it out of the question, your CSV needs a header row so Splunk knows what the field name is. I will call it "id".
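
For example, with a header row naming the field "id", ids.csv would start like:

id
1234
2345
2346
4567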

The literal translation of your question would employ the map command. Something like

| inputlookup ids.csv
| map [ search index="abc" $id$ "search string here" | bin _time span="1hour" | stats count as maxHour by _time | sort - count | head 1
  | appendcols [ search index="abc" $id$ "search string here" | bin _time span="1day" | stats count as maxDay by _time | sort - count | head 1 ]
  | appendcols [ search index="abc" $id$ "search string here" | bin _time span="1week" | stats count as maxWeek by _time | sort - count | head 1 ]
  | appendcols [ search index="abc" $id$ "search string here" | bin _time span="1month" | stats count as maxMonth by _time | sort - count | head 1 ] ]

But this would be horrendously inefficient as the searches would be executed 800 times each.
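
Note, too, that map defaults to maxsearches=10, so with roughly 800 rows you would also have to raise that limit explicitly. A minimal sketch (the stats clause is just a placeholder):

| inputlookup ids.csv
| map maxsearches=800 [ search index="abc" $id$ "search string here" | stats count ]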

More efficient would be to make the lookup file a subsearch of each search.

search index="abc" [ | inputlookup ids.csv | fields id | format ] "search string here" |bin _time span="1hour" | stats count as maxHour by _time, id | sort - count | dedup id]  
 |appendcols[search
  index="abc" [ | inputlookup ids.csv | fields id | format ] "search string here" |bin _time span="1day" | stats count as maxDay by _time, id | sort - count | dedup id ]
 |appendcols[search
  index="abc" [ | inputlookup ids.csv | fields id | format ] "search string here" |bin _time span="1week" | stats count as maxWeek by _time, id | sort - count | dedup id ]
 |appendcols[search
  index="abc" [ | inputlookup ids.csv | fields id | format ] "search string here" |bin _time span="1month" | stats count as maxMonth by _time, id | sort - count | dedup id ]

Now you only make four passes through the data.
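
In principle this could even be reduced to a single pass: compute hourly counts once, roll them up with eventstats, and take the maxima per id. A sketch, untested, which assumes weeks starting on Sunday ("@w0"):

search index="abc" [ | inputlookup ids.csv | fields id | format ] "search string here"
| bin _time span=1h
| stats count AS hourly BY id _time
| eval day=relative_time(_time, "@d"), week=relative_time(_time, "@w0"), month=relative_time(_time, "@mon")
| eventstats sum(hourly) AS daily BY id day
| eventstats sum(hourly) AS weekly BY id week
| eventstats sum(hourly) AS monthly BY id month
| stats max(hourly) AS maxHour max(daily) AS maxDay max(weekly) AS maxWeek max(monthly) AS maxMonth BY id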

---
If this reply helps you, Karma would be appreciated.