Hi,
I have scan dataset. It has a field name TEXT. This field contains the data for test results. I am provided a data from SQL table which I am ingesting via DBConnect. This also contains TEXT field but along with Priorities to assign to. I am using data from SQL to create a saved search that outputs into a lookup.
My issue is that the data that is in the lookup has TEXT field with wildcard (%). I am not able to match what is in the raw data with what is in the lookup because of wildcard.
Sample:
In raw data:
TEXT= Cas is required to perform this test.
In lookup:
TEXT=
%Cas is required to perform this test.%
%Cas is required%
%Cas is required % this test.%
%
Test search:
index=sample sourcetype=sample_vuln TEXT IN ("*Cas is required*")
| stats count by TEXT
| lookup sample_rules TEXT AS TEXT OUTPUT TEXT AS TEXT_Test, Priority
How can I match the data in raw with the data in lookup containing same field values with wildcards in above combinations.
I did edited the lookup's definition and configured Match Type option as WILDCARD(TEXT) and this is not helping.
Thanks in advance!!!
If TEXT_updated has a value * then it must have come from the 4th row in your lookup.
I created this lookup file text.csv and made a lookup definition with WILDCARD(TEXT) , case insensitive and with no min/max matches.
This search
| makeresults
| eval TEXT=split("Cas is required to perform this test.:perform:is required:Cas is required:Cas is required bla bla bla this test.",":")
| mvexpand TEXT
| lookup text TEXT OUTPUT TEXT as TEXT_found
then results in
which is working as expected. As there is no max match, it will match all that apply and the * will match in all cases.
In Splunk, % is not a wildcard, so that will not work in the lookup definition - it would need a *
However, if the data in the lookup always has the start and end %, you can do
| eval lookup_TEXT="%".TEXT."%"
| lookup sample_rules TEXT AS lookup_TEXT OUTPUT TEXT AS TEXT_Test, Priority
so effectively just adding on the % chars for the lookup.
Alternatively you could clean the lookup if that's under your control, but from what you said, it sounds like you can create the lookup without %.
Hi @ bowesmana,
To test your "*" suggestion, I created another job that used below eval to create another field where % is replaced by * and output to a new lookup table which I then used to create new lookup definition.
| eval TEXT_Updated = replace (TEXT, "%", "*")
So all the Test value below became like this:
*Cas is required to perform this test.*
*Cas is required*
*Cas is required * this test.*
*
Match Type as WILDCARD(TEXT_Updated)
Then I adjusted and re-ran below query
index=sample sourcetype=sample_vuln TEXT IN ("*Cas is required*")
| stats count by TEXT
| lookup sample_rules_updated TEXT_Updated AS TEXT OUTPUT TEXT_Updated, Priority
Result was:
TEXT=Cas is required to perform this test.
TEXT_Updated=*
At least a value (*) appeared this time. So it did not match. Feedback pls?
If TEXT_updated has a value * then it must have come from the 4th row in your lookup.
I created this lookup file text.csv and made a lookup definition with WILDCARD(TEXT) , case insensitive and with no min/max matches.
This search
| makeresults
| eval TEXT=split("Cas is required to perform this test.:perform:is required:Cas is required:Cas is required bla bla bla this test.",":")
| mvexpand TEXT
| lookup text TEXT OUTPUT TEXT as TEXT_found
then results in
which is working as expected. As there is no max match, it will match all that apply and the * will match in all cases.