
Is it possible to use appendcols without multiple ...


asherman (Path Finder), 08-14-2014 07:55 AM

Hi,

I am trying to build a table of results, and my query has the form:

```
search X
| dedup block
| stats count by block
| appendcols [search X | stats ... | eval ... | chart ...]
| appendcols [search X | stats ... | eval ... | chart ...]
| appendcols [search X | stats ... | eval ... | chart ...]
```

Each search is the same; only the subsequent operations differ. I can get the table I want, but as my data grows the query keeps getting slower. Is it possible to save the base search results so that I can reuse them and avoid running the same search repeatedly? Alternatively, is there a better command for this purpose than appendcols?

Thanks

Edit:

The full query is:

```
index=test_index3 base="*" corner=*100c*
| dedup dir corner path name
| stats count as paths by block
| appendcols [search index=test_index3 base="*" corner=*100c*
| stats count(eval(isNum(base))) as num_vals, count as num_paths by block
| eval percent=round(num_vals*100/num_paths,2)
| chart max(percent) as "num_vals" by block]
| appendcols [search index=test_index3 base="*" NOT base="*nan*" corner=*100c*
| stats count(eval((base-min)>0 AND (base-max)<0)) AS bounded, count AS num_vals by block
| eval percent=round(bounded*100/num_vals,2)
| chart max(percent) as "bounded paths" by block]
| appendcols [search index=test_index3 base="*" NOT base="*nan*" corner=*100c*
| stats count(eval((base-max)<0)) AS bounded, count AS num_vals by block
| eval percent=round(bounded*100/num_vals,2)
| chart max(percent) as "max bounded paths" by block]
| appendcols [search index=test_index3 base="*" NOT base="*nan*" corner=*100c*
| stats count(eval((base-min)>0)) AS bounded, count AS num_vals by block
| eval percent=round(bounded*100/num_vals,2)
| chart max(percent) as "min bounded paths" by block]
| fields block, paths, num_vals, "bounded paths", "max bounded paths", "min bounded paths"
| rename num_vals AS "% data"
| rename "bounded paths" AS "% of data bounded"
| rename "max bounded paths" AS "% of data bounded by max"
| rename "min bounded paths" AS "% of data bounded by min"
```

1 Solution


martin_mueller (SplunkTrust), 08-14-2014 09:50 AM

First off, `corner=*100c*` is usually quite inefficient because of the leading wildcard. If the ratio of events scanned to events matched is high, you may be able to speed everything up here. How depends on your data; look for a way to identify relevant events based on full tokens, or at least prefixes of tokens, i.e. without leading wildcards.
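As an illustration of the difference (whether the two are equivalent depends on what the `corner` values actually look like in your data):

```
index=test_index3 base="*" corner=*100c*
index=test_index3 base="*" corner=100c*
```

The first form has to scan every event, because the leading wildcard defeats the indexed-token lookup; the second can match `100c`-prefixed tokens against the index and skip non-matching events early. The finished search at the end of this thread does in fact switch to `corner=100c*`.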

Looking at the search itself, here's how I'd start by combining the first two parts.

```
(1) index=test_index3 base="*" corner=*100c* | dedup dir corner path name | stats count as paths by block
```

The `dedup` isn't present in the other searches, so let's avoid using it. This basically counts how many distinct quads of `(dir, corner, path, name)` you have for each value of `block`... so you can replace (1) with this:

```
(2) index=test_index3 base="*" corner=*100c* | eval dcpn = dir ."###". corner ."###". path ."###". name | stats dc(dcpn) as paths by block
```

Now let's look at the first subsearch:

```
(3) index=test_index3 base="*" corner=*100c* | stats count(eval(isNum(base))) as num_vals, count as num_paths by block | eval percent=round(num_vals*100/num_paths,2) | chart max(percent) as "num_vals" by block
```

That final `chart` looks useless to me. There is only one row for each block, so it basically does the same as `fields - num_paths | rename percent as num_vals`.

```
(4) index=test_index3 base="*" corner=*100c* | stats count(eval(isNum(base))) as num_vals, count as num_paths by block | eval num_vals=round(num_vals*100/num_paths,2) | fields - num_paths
```

When you compare (2) and (4), the searches basically do the same thing: load the same events and compute some statistics. So you can combine them like this:

```
(5) index=test_index3 base="*" corner=*100c* | eval dcpn = dir ."###". corner ."###". path ."###". name
| stats dc(dcpn) as paths, count(eval(isNum(base))) as num_vals, count as num_paths by block
| eval num_vals=round(num_vals*100/num_paths,2) | fields - num_paths
```

Without knowing your data: in (1) you dedup before counting the number of paths, in (4) you don't. Is that on purpose? From here it feels weird, but it might match your data and use case, so I left it as is.

Using this strategy you should be able to combine the other subsearches into the main search as well.
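For example, folding the counts from the remaining subsearches into the same single pass would look roughly like this. This is a sketch only: the original subsearches also filter `NOT base="*nan*"`, which this version does not, so the counts could differ if nan values are present.

```
index=test_index3 base="*" corner=*100c*
| eval dcpn = dir ."###". corner ."###". path ."###". name
| stats dc(dcpn) as paths,
        count(eval(isNum(base))) as num_vals,
        count(eval((base-min)>0 AND (base-max)<0)) as bounded,
        count(eval((base-max)<0)) as bounded_max,
        count(eval((base-min)>0)) as bounded_min,
        count as num_paths by block
| eval percent=round(num_vals*100/num_paths,2)
| eval bounded_perc=round(bounded*100/num_vals,2)
| eval bounded_max_perc=round(bounded_max*100/num_vals,2)
| eval bounded_min_perc=round(bounded_min*100/num_vals,2)
```

Note the percentages are all computed from the raw counts before any field is overwritten, so the order of the `eval`s doesn't corrupt later calculations.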



martin_mueller (SplunkTrust), 08-14-2014 03:01 PM

The dots concatenate strings, and without knowing your data I added a separator to make fairly sure that concatenating the four fields doesn't mess things up. Here's an example of how things would mess up with unlucky data:

```
d c p n
1 2 3 45
1 2 34 5
```

Concatenating those two quads and then `dedup`-ing or `dc()`-ing would treat them as duplicates even though they obviously aren't: both would end up as `12345`. Adding the separator avoids that with near certainty, even without knowing the data.
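You can see the effect with a throwaway search on dummy data (using `makeresults` and `streamstats` to fabricate the two unlucky rows from the table above):

```
| makeresults count=2
| streamstats count as row
| eval d="1", c="2", p=if(row=1,"3","34"), n=if(row=1,"45","5")
| eval no_sep=d.c.p.n, with_sep=d."###".c."###".p."###".n
| stats dc(no_sep) as distinct_without_sep, dc(with_sep) as distinct_with_sep
```

Without the separator both rows collapse to `12345`, so `distinct_without_sep` is 1 while `distinct_with_sep` is 2.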


asherman (Path Finder), 08-14-2014 02:54 PM

Sure.

By the way, what exactly is the `."###".` doing? I'm not entirely clear on why that works.


asherman (Path Finder), 08-14-2014 02:54 PM

Final result:

```
index=test_index3 base="*" corner=100c*
| dedup dir corner path name hs_or_lp
| eval dcpn = dir ."###". corner ."###". path ."###". name
| stats dc(dcpn) as paths,
count(eval(isNum(base))) as num_vals,
count(eval((base-min)>0 AND (base-max)<0)) AS bounded,
count(eval((base-max)<0)) AS bounded_max,
count(eval((base-min)>0)) AS bounded_min by block
| eval percent=round(num_vals*100/paths,2)
| eval bounded_perc=round(bounded*100/num_vals,2)
| eval bounded_max_perc=round(bounded_max*100/num_vals,2)
| eval bounded_min_perc=round(bounded_min*100/num_vals,2)
| fields block, percent, bounded_perc, bounded_min_perc, bounded_max_perc
| rename percent as "% of data"
| rename bounded_perc as "% of data bounded"
| rename bounded_max_perc as "% of data bounded by max"
| rename bounded_min_perc as "% of data bounded by min"
```


martin_mueller (SplunkTrust), 08-14-2014 11:18 AM

Ah, I see. Then it does indeed make sense to use `dedup` for the entire search.

Consider posting the finished search for others who may later find this question.


asherman (Path Finder), 08-14-2014 11:03 AM

Thank you for this - my query is now a lot neater and faster.

The purpose of the dedup was to remove duplicate entries for the same dir/corner/path/name (dcpn) set. It should have been in every search, but I don't think I had any duplicate (dcpn) entries yet. With a single query as you've described, I can apply the dedup once at the start to cover the entire query.

Thanks.


asherman (Path Finder), 08-14-2014 09:06 AM

Sure, done.


martin_mueller (SplunkTrust), 08-14-2014 08:56 AM

You should be able to drop the duplicate searches *and appendcols* entirely. Post the full search and we'll see.