Splunk Search

Multi value field search | Unexpected output

rahulkawadkar
Loves-to-Learn Lots

I need to report hosts that are configured to receive app.log details and also report the ones that are missing. For this, I use the query "index=application sourcetype="application:server:log" | stats values(host) as hostnames by customer_name". This retrieves the hostnames for each customer_name from the sourcetype.

 

I get a result as:

customer_name

host

customer1server1
customer2

server2

server3

 

Then, I join the result by customer_name field from the second part of the query "[| inputlookup server_info.csv | rename customer as customer_name | stats values(host) as hostnames_lookup by customer_name] which retrieves the hostnames for each customer_name from the server_info.csv lookup table."

 

Here I get result as:

customer_name host
customer_namehost
customer1

server1

server100

customer2

server2

server3

server101

Later, I expand both the multivalue fields and perform evaluation on both the fields to retrieve result as configured or not configured. The evaluation looks like this

| mvexpand hostnames
| mvexpand hostnames_lookup
| eval not_configured = if(hostnames == hostnames_lookup, hostnames, null())
| eval configured = if(hostnames != hostnames_lookup, hostnames, null())
| fields customer_name, hostnames, hostnames_lookup, configured, not_configured

 

My final query looks like this:
(index=application sourcetype="application:server:log)
| stats values(host) as hostnames by customer_name
| join customer_name
[| inputlookup server_info.csv
| rename customer as customer_name
| stats values(host) as hostnames_lookup by customer_name]
| mvexpand hostnames
| mvexpand hostnames_lookup
| eval not_configured = if(hostnames == hostnames_lookup, hostnames, null())
| eval configured = if(hostnames != hostnames_lookup, hostnames, null())
| fields customer_name, hostnames, hostnames_lookup, configured, not_configured



However, in the result when the evaluation is completed the results are not as expected, the matching logic doesn't work and the resultant output is incorrect. There are no values evaluated in the not_configured column and the configured column only returns the values in hostnames. However, I'd expect the configured field to show results of all the servers configured to receive app.log and not configured to have hostnames that are present in lookup but are still not configured to receive logs. 

Expected Output:

customer_namehostnameshostnames_lookupconfigurednot_configured
customer1server1

server1

server100

server1server100
customer2

server2

server3

server2

server3

server101

server2

server3

server101

 

Current Output:

customer_namehostnameshostnames_lookupconfigurednot_configured
customer1server1

server1

server100

server1 
customer2

server2

server3

server2

server3

server101

server2

server3

 

 

Essentially customer1 and customer2 should display server1 as configured and server100 not_configured and likewise for customer2 as mentioned in expected output table. Which will mean that server100 and 101 are part of the lookup but are not configured to receive app.log

How can I evaluate this differently, so that the comparison works as expected. Is it possible to compare the values in this fashion? Is there anything wrong with the current comparison logic? Should I not use mvexpand on the extracted fields so that they are compared expectedly?

 

 

 

 

Labels (1)
Tags (1)
0 Karma

rahulkawadkar
Loves-to-Learn Lots

So as I understand,

 

| append [
    | inputlookup server_info.csv
    | rename customer as customer_name
    | stats values(host) as hostnames by customer_name ]

is going to append the hostnames from lookup into the results received from the first query.

Finally,

| stats count by customer_name hostnames

 is going to count the value as 1 if it's present only in lookup, otherwise, if it's present in the first part of search and lookup then the count is going to be evaluated as 2? Is that correct?

However, in the result there are no values with count as 2. Which is unlikely as there are few hosts which are present in events and lookup as well.

Here, we try to fetch the events that contains hostnames(configured to receive application logs) and then compare them with the list of servers present in lookup(if found in lookup only then count=1). 

However it seems that the query still isn't performing the required search as there are no values with count as 2. Here I observer count is returned as 1 for the hosts that are received from the events in below

index=application sourcetype="application:server:log"
| stats values(host) as hostnames by customer_name

There is almost no trace of values in the lookup, so I'm not sure if they are even being compared. And this is what the issue earlier was.

 

How can these two lists be compared and listed?

Tags (1)
0 Karma

PickleRick
SplunkTrust
SplunkTrust

Close. The count will depend on how many of those hostname-customer_name pairs you have. If you have just one (supposedly only from the lookup), it will be 1. If you have two (from the lookup and from the data), it will be two.

At least it should work this way. Of course I don't know either your events nor your lookup contents so I'm only deducing the data format from your searches.

See this short run-anywhere example.

| makeresults 
| eval hosts=split("a,b,c,d:a,b,c",":")
| mvexpand hosts
| eval customers="aaa"
| eval hosts=split(hosts,",")

This prepares some mockup data. You have one row supposedly from your summarized values from events and one from the summarized lookup.

After you add

| stats count by customers hosts

You get three rows with count of 2 and one row with count of 1

 

0 Karma

PickleRick
SplunkTrust
SplunkTrust

Firstly, join is not a very friendly command, it has its quirks. In this case I'd rather use either append or inputlookup append=t

Another thing - if you do mvexpand on multiple multivalued fields you'll get a product of both sets of values. It can escalate quickly for bigger data sets. See the run-anywhere example

| makeresults 
| eval a=split("1,2,3,4,5",",")
| eval b=split("1,2,3,4,5",",")
| mvexpand a
| mvexpand b

Of course you can do some fancy set arithmetics on those multivalued fields but it's usually easier done another way - count and filter.

This part is OK, it lists which customers use which hosts

index=application sourcetype="application:server:log"
| stats values(host) as hostnames by customer_name

You're gonna get a list of hosts per customer. What is important here is that each host will be listed only once per customer.

So we expand our list by the servers defined for each customer

| append [
    | inputlookup server_info.csv
    | rename customer as customer_name
    | stats values(host) as hostnames by customer_name ]

So now for each customer you have a single value of hostnames field per customer per each host.

Nothing easier now than counting what we have

| stats count by customer_name hostnames

So for each pair of customer_name and hostnames values you will have a count value indicating whether it was only present in the lookup (value of 1) or both in the lookup and the indexed events (value of 2).

Now you can easily manipulate the data - filter, make a table, whatever.

All this assumes that you don't have any hosts in the event data which are not in the lookup. If you can have such situation, the search is getting a bit more complex because you need to add additional field with two different numerical values, depending on whether the data came from the events or from the lookup and do a sum() instead of count in your final stats so you can see where the data came from.

0 Karma
Get Updates on the Splunk Community!

What's New in Splunk Enterprise 9.4: Features to Power Your Digital Resilience

Hey Splunky People! We are excited to share the latest updates in Splunk Enterprise 9.4. In this release we ...

Take Your Breath Away with Splunk Risk-Based Alerting (RBA)

WATCH NOW!The Splunk Guide to Risk-Based Alerting is here to empower your SOC like never before. Join Haylee ...

SignalFlow: What? Why? How?

What is SignalFlow? Splunk Observability Cloud’s analytics engine, SignalFlow, opens up a world of in-depth ...