How to perform a lookup from an index search with dbxquery?
| index=vulnerability_index
| table ip_address, vulnerability, score
ip_address | vulnerability | score |
192.168.1.1 | SQL Injection | 9 |
192.168.1.1 | OpenSSL | 7 |
192.168.1.2 | Cross Site-Scripting | 8 |
192.168.1.2 | DNS | 5 |
| dbxquery query="select * from tableCompany"
ip_address | company | location |
192.168.1.1 | Comp-A | Loc-A |
192.168.1.2 | Comp-B | Loc-B |
192.168.1.5 | Comp-E | Loc-E |
After looking up the IP in dbxquery:
ip_address | company | location | vulnerability | score |
192.168.1.1 | Comp-A | Loc-A | SQL Injection | 9 |
192.168.1.1 | Comp-A | Loc-A | OpenSSL | 7 |
192.168.1.2 | Comp-B | Loc-B | Cross Site-Scripting | 8 |
192.168.1.2 | Comp-B | Loc-B | DNS | 5 |
Thank you so much
Hello,
Sorry, I wasn't clear.
dbxquery has a field "SubnetID-IP", not ip_address:
Company | Location | SubnetID-IP |
Comp-A | Loc-A | [{"subnet_id":"101","ip_address":"192.168.1.1"},{"subnet_id":"121","ip_address":"192.168.1.21"},{"subnet_id":"131","ip_address":"192.168.1.31"}] |
Comp-B | Loc-B | [{"subnet_id":"102","ip_address":"192.168.1.2"},{"subnet_id":"122","ip_address":"192.168.1.22"},{"subnet_id":"123","ip_address":"192.168.1.23"}] |
Comp-E | Loc-E | [{"subnet_id":"105","ip_address":"192.168.1.5"},{"subnet_id":"152","ip_address":"192.168.1.52"}] |
SubnetID-IP needs to be parsed using:
| spath input="SubnetID-IP" path={}
| mvexpand {}
| spath input={}
| table Company, Location, ip_address
The output after parsing is below:
company | location | ip_address |
Comp-A | Loc-A | 192.168.1.1 |
Comp-B | Loc-B | 192.168.1.2 |
Comp-E | Loc-E | 192.168.1.5 |
How do I combine dbxquery, spath, and the index search?
Thanks
Just add JSON handling inside the subsearch.
index=vulnerability_index
| table ip_address, vulnerability, score
| append
[| dbxquery query="select * from tableCompany" ```select Company,Location,'SubnetID-IP' from tableCompany```
| spath input="SubnetID-IP" path={}
| mvexpand {}
| spath input={} path=ip_address
| table Company, Location, ip_address ]
| stats values(*) as * by ip_address
This is my emulation
| makeresults
| eval _raw = "ip_address vulnerability score
192.168.1.1 SQL Injection 9
192.168.1.1 OpenSSL 7
192.168.1.2 Cross Site-Scripting 8
192.168.1.2 DNS 5"
| multikv forceheader=1
``` the above emulates index=vulnerability_index ```
| table ip_address, vulnerability, score
| append
[| makeresults
| eval _raw = "Company Location SubnetID-IP
Comp-A Loc-A [{\"subnet_id\":\"101\",\"ip_address\":\"192.168.1.1\"},{\"subnet_id\":\"121\",\"ip_address\":\"192.168.1.21\"},{\"subnet_id\":\"131\",\"ip_address\":\"192.168.1.31\"}]
Comp-B Loc-B [{\"subnet_id\":\"102\",\"ip_address\":\"192.168.1.2\"},{\"subnet_id\":\"122\",\"ip_address\":\"192.168.1.22\"},{\"subnet_id\":\"123\",\"ip_address\":\"192.168.1.23\"}]
Comp-E Loc-E [{\"subnet_id\":\"105\",\"ip_address\":\"192.168.1.5\"},{\"subnet_id\":\"152\",\"ip_address\":\"192.168.1.52\"}]"
| multikv forceheader=1
| rename *________ as *, *_______ as *
| fields - _* linecount
``` the above emulates | dbxquery query="select * from tableCompany" ```
| spath input="SubnetID_IP" path={} ``` extract changes field name ```
| mvexpand {}
| spath input={} path=ip_address
]
| stats values(Company) as Company values(Location) as Location values(vulnerability) as vulnerability values(score) as score by ip_address
The result is
ip_address | Company | Location | vulnerability | score |
192.168.1.1 | Comp-A | Loc-A | OpenSSL SQL Injection | 7 9 |
192.168.1.2 | Comp-B | Loc-B | Cross Site-Scripting DNS | 5 8 |
192.168.1.21 | Comp-A | Loc-A | ||
192.168.1.22 | Comp-B | Loc-B | ||
192.168.1.23 | Comp-B | Loc-B | ||
192.168.1.31 | Comp-A | Loc-A | ||
192.168.1.5 | Comp-E | Loc-E | ||
192.168.1.52 | Comp-E | Loc-E ||
Hello,
Thank you for your help. I tried both of your queries, and they worked exactly as shown in your result.
However, the result combined all the vulnerabilities into one row (OpenSSL and SQL Injection) and all the scores into one row (7 and 9). After I expanded them into separate rows, the scores no longer matched the vulnerabilities because they were mixed up when they were merged (OpenSSL should have score 7, not both 7 and 9).
Please suggest. Thank you
| search vulnerability=* ``` remove rows with blank vulnerability ```
| mvexpand vulnerability
| mvexpand score
ip_address | Company | Location | vulnerability | score |
192.168.1.1 | Comp-A | Loc-A | OpenSSL | 7 |
192.168.1.1 | Comp-A | Loc-A | SQL Injection | 9 |
192.168.1.1 | Comp-A | Loc-A | OpenSSL | 7 |
192.168.1.1 | Comp-A | Loc-A | SQL Injection | 9 |
192.168.1.2 | Comp-B | Loc-B | Cross Site-Scripting | 5 |
192.168.1.2 | Comp-B | Loc-B | DNS | 8 |
192.168.1.2 | Comp-B | Loc-B | Cross Site-Scripting | 5 |
192.168.1.2 | Comp-B | Loc-B | DNS | 8 |
To put each vulnerability and score pair in a single row, first pair them, then split them:
index=vulnerability_index
| table ip_address, vulnerability, score
| append
[| dbxquery query="select * from tableCompany" ```select Company,Location,'SubnetID-IP' from tableCompany```
| spath input="SubnetID-IP" path={}
| mvexpand {}
| spath input={} path=ip_address
| table Company, Location, ip_address ]
| stats values(Company) as Company values(Location) as Location values(eval(vulnerability. "<:>" . score)) as vulnerability by ip_address
| mvexpand vulnerability
| eval vulnerability = split(vulnerability, "<:>")
| eval score = mvindex(vulnerability, 1), vulnerability = mvindex(vulnerability, 0)
(The reason I'm using this relatively complex join-then-split pattern is that vulnerability values may contain common separator characters.)
Hello,
Thanks for your help.
Your suggestion worked fine on the emulated data, although it still needs to filter out empty vulnerabilities (| search vulnerability=*).
On real data, your query worked, with the following findings:
1) It is very slow.
2) The Location field also has "merged" data.
3) There are additional fields that need to be merged using "<:>" and then expanded, which may cause more slowness.
I will accept your solution here.
Do you have any other alternative solutions?
Could you perhaps give a simple example using an external lookup, implemented in Python code?
Thank you so much.
You can use the job inspector (in the Job menu) to gain some insight into time consumption. But performance tuning with Splunk, or with any program, can be complicated. A lot depends on data characteristics. The interaction between dbxquery and an external database makes improvement even trickier.
Without knowing the characteristics of dbxquery, I can think of the following:
1) Pack vulnerability, score, and any other fields that must stay paired into a single JSON object, so their association survives the merge without a custom separator.
2) Use a second stats instead of mvexpand to split the merged rows back out, as stats is usually cheaper.
Combining these two thoughts, you can try the following:
index=vulnerability_index
| eval vulnerability = json_object("vulnerability", vulnerability, "score", score)
``` table at this point is generally a waste ```
| append
[| dbxquery query="select * from tableCompany" ```select 'SubnetID-IP',company,location from tableCompany```
| spath input="SubnetID-IP" path={}
| mvexpand {}
| spath input={} path=ip_address ]
| stats values(Company) as Company values(Location) as Location values(vulnerability) as vulnerability by ip_address
| stats values(ip_address) as ip_address by Company Location vulnerability
``` this automatically eliminates rows without vulnerability ```
| eval score = json_extract(vulnerability, "score"), vulnerability = json_extract(vulnerability, "vulnerability")
| table ip_address Company Location vulnerability score
You can also reduce the number of rows by using a subsearch to restrict dbxquery to only the SubnetID-IP values that the vulnerability_index search returns. How to do this depends on the real index search. (I suspect that you are not returning the entire index in the real job.)
Hope this helps.
Hello,
Thanks for your help. I was able to run your JSON query just fine. It turns out that one ip_address can be used by multiple companies, so the merged data now appears in the "vulnerability", "company", and "location" fields.
1. Is it possible to use a lookup from the index into the dbxquery results?
When I used a lookup from the index into a CSV, it only returned merged data on the CSV side, not on the index side.
2. Why don't we use JOIN?
Before parsing:
company | location | SubnetID-IP |
Comp-A | Loc-A | [{"subnet_id":"101","ip_address":"192.168.1.1"}, |
Comp-Z | Loc-Z | [{"subnet_id":"101","ip_address":"192.168.1.1"}, |
Comp-B | Loc-B | [{"subnet_id":"102","ip_address":"192.168.1.2"}, |
Comp-Y | Loc-Y | [{"subnet_id":"102","ip_address":"192.168.1.2"}, |
Comp-E | Loc-E | [{"subnet_id":"105","ip_address":"192.168.1.5"}, |
After parsing:
ip_address | company | location |
192.168.1.1 | Comp-A | Loc-A |
192.168.1.1 | Comp-Z | Loc-Z |
192.168.1.2 | Comp-B | Loc-B |
192.168.1.2 | Comp-Y | Loc-Y |
192.168.1.5 | Comp-E | Loc-E |
After append: both vulnerability and company are merged
ip_address | vulnerability | score | company | location |
192.168.1.1 | OpenSSL | 7 | Comp-A | Loc-A |
192.168.1.2 | Cross Site-Scripting | 8 | Comp-B | Loc-B |
Expected result:
ip_address | vulnerability | score | company | location |
192.168.1.1 | OpenSSL | 7 | Comp-A | Loc-A |
192.168.1.1 | SQL Injection | 9 | Comp-Z | Loc-Z |
192.168.1.2 | Cross Site-Scripting | 8 | Comp-B | Loc-B |
192.168.1.2 | DNS | 5 | Comp-Y | Loc-Y |
Hello,
I ran your query, and it seems like it just appended the rows without matching the IPs.
This question is related to another post where you helped me, "Split pattern into multiple rows":
https://community.splunk.com/t5/Splunk-Search/Split-pattern-into-multiple-rows/m-p/650660
My goal is to replace table.csv, which was obtained by running
| dbxquery connection=visibility query="select Company, Location, SubnetID-IP from tableCompany"
then parsing the IP and subnet ID with spath, and performing mvexpand to split the result into single events.
Instead of running dbxquery and putting the data into table.csv to perform an IP lookup from the index, is it possible to run a single query that combines the index search and dbxquery to match/look up an IP address?
Thanks
First, there was a syntax error copied from the original post: there should not be a leading pipe before an index search.
index=vulnerability_index
| table ip_address, vulnerability, score
| append
[| dbxquery query="select * from tableCompany" ```select ip_address,company,location from tableCompany```]
| stats values(*) as * by ip_address
Second, can you confirm that both index=vulnerability_index and | dbxquery query="select * from tableCompany" return a field named ip_address, and that some of their values match? In fact, if there are no matching values, no output should exist; if the field ip_address only exists in one search, the output should contain only entries from that search. So the output you described is very puzzling.
Yes, the search suggested above is intended to run a query directly from Splunk and perform a match. When I suggested running an external query command as an external lookup, I meant creating an external lookup, not running a query and creating a static (CSV or KV store) lookup. (One such external lookup, dnslookup, ships with Splunk Enterprise.) The effect is still the same: a query is performed at the time of the search, but you don't have to perform a stats to make the association.
To help you diagnose, here are two emulations: one to emulate the index search and return the rows you posted, the other to emulate | dbxquery and return the rows you posted. The two are combined in the same fashion as the command above:
| makeresults
| eval _raw = "ip_address vulnerability score
192.168.1.1 SQL Injection 9
192.168.1.1 OpenSSL 7
192.168.1.2 Cross Site-Scripting 8
192.168.1.2 DNS 5"
| multikv forceheader=1
``` the above emulates index=vulnerability_index ```
| table ip_address, vulnerability, score
| append
[| makeresults
| eval _raw = "ip_address company location
192.168.1.1 Comp-A Loc-A
192.168.1.2 Comp-B Loc-B
192.168.1.5 Comp-E Loc-E"
| multikv forceheader=1
| fields - _* linecount
``` the above emulates | dbxquery query="select * from tableCompany" ```
]
| stats values(*) as * by ip_address
They give me the output that you desired:
ip_address | company | location | score | vulnerability |
192.168.1.1 | Comp-A | Loc-A | 7 9 | OpenSSL SQL Injection |
192.168.1.2 | Comp-B | Loc-B | 5 8 | Cross Site-Scripting DNS |
192.168.1.5 | Comp-E | Loc-E ||
You can play with the emulation and compare with real data.
dbxquery cannot be used as a lookup directly. You can do something like
index=vulnerability_index
| table ip_address, vulnerability, score
| append
[| dbxquery query="select * from tableCompany" ```select ip_address,company,location from tableCompany```]
| stats values(*) as * by ip_address
But a lookup is much more efficient. You should consider setting up an external lookup script that performs the query against your database. That way, you can use lookup as intended:
index=vulnerability_index
| table ip_address, vulnerability, score
| lookup lookupCompany ip_address ``` output company,location ```
where lookupCompany is the lookup you set up to perform the SQL query against that database.
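For reference, here is a minimal sketch of what such an external lookup script could look like. Everything in it is hypothetical rather than taken from this thread: the script name company_lookup.py, the connection details, and the assumption that your database can be queried one IP at a time (for example through a view, called companyByIp below, that flattens the SubnetID-IP JSON column). The fixed part is the protocol: Splunk sends the lookup fields to the script as CSV on stdin and expects the same CSV back on stdout with the output fields filled in.

#!/usr/bin/env python3
# company_lookup.py - hypothetical external lookup sketch.
# Splunk external lookups receive the fields listed in transforms.conf
# as CSV on stdin and must emit the same CSV, with the output fields
# filled in, on stdout. (The field names also arrive as argv, but the
# CSV header is enough here.)
import csv
import sys
import sqlite3  # stand-in driver; substitute pyodbc, psycopg2, etc.

def lookup_company(cur, ip_address):
    # Hypothetical query against a view that maps one IP per row,
    # since tableCompany stores IPs inside the SubnetID-IP JSON column.
    # Note: the "?" parameter style is sqlite3's; other drivers differ.
    cur.execute(
        "SELECT company, location FROM companyByIp WHERE ip_address = ?",
        (ip_address,),
    )
    return cur.fetchone()  # (company, location) or None

def main():
    conn = sqlite3.connect("company.db")  # hypothetical connection
    cur = conn.cursor()
    reader = csv.DictReader(sys.stdin)
    if reader.fieldnames is None:
        return  # no input rows
    writer = csv.DictWriter(sys.stdout, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        if row.get("ip_address"):
            match = lookup_company(cur, row["ip_address"])
            if match:
                row["company"], row["location"] = match
        writer.writerow(row)

if __name__ == "__main__":
    main()

The script would live in your app's bin directory and be wired up in transforms.conf with a stanza roughly like this (names again illustrative):
[lookupCompany]
external_cmd = company_lookup.py ip_address company location
fields_list = ip_address, company, location
python.version = python3
With that in place, the | lookup lookupCompany ip_address search above queries the database at search time, in batches of IPs, without any append or stats.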