I am trying to take the results of one search, extract a field from those results (named "id"), dedup its values, and use them to filter the results of another search. Unfortunately, the second search doesn't have this field available directly in its sourcetype either, so it has to be extracted with rex.
I've been having issues with this, though. From what I've read, I need to use a subsearch to extract the ids for the outer search, but it's not working. Each search runs against a completely different data set, and the two have very little in common.
index=index1 source="/somefile.log" uri="/path/with/id/some_id/"
| rex field=uri "/path/with/id/(?<some_id>[^/]+)/*"
[ search index=index2 source="/another.log" "condition-i-want-to-find"
| rex field=_raw "some_id:(?<some_id>[^,]+),*"
| dedup some_id
| fields some_id
]
I've tried a bunch of variations of this with no luck, including renaming the field some_id to "search", as some have said that would help. I don't necessarily need the original uri="/path/with/id/some_id" in the outer search, but it would be nice to have it to limit those results.
The syntax problem that @PickleRick pointed out can be fixed by adding a pipe and a search command, like this:
index=index1 source="/somefile.log" uri="/path/with/id/some_id/"
| rex field=uri "/path/with/id/(?<some_id>[^/]+)/*"
| search
[ search index=index2 source="/another.log" "condition-i-want-to-find"
| rex field=_raw "some_id:(?<some_id>[^,]+),*"
| dedup some_id
| fields some_id
]
However, this method reduces the advantage of using a subsearch with your dataset: the outer search still has to retrieve every matching event from index1 and run rex on it before the ids from the subsearch are applied as a filter.
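To improve efficiency, renaming the field some_id to "search", as some have suggested, actually will help: the ids then go into the base search as plain search terms, so events are filtered as they are retrieved instead of afterwards. (This works in part because / is a hard separator in Splunk, so an id between slashes can be matched as a term on its own.) You just need to add a format command: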
index=index1 source="/somefile.log" uri="/path/with/id/some_id/"
[ search index=index2 source="/another.log" "condition-i-want-to-find"
| rex field=_raw "some_id:(?<search>[^,]+),*"
| dedup search
| fields search
| format
]
| rex field=uri "/path/with/id/(?<some_id>[^/]+)/*"
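To see what the subsearch hands to the outer search, you can run something like the following on its own. This is only a sketch: the two id values are made up for illustration, and it relies on the same makeresults format=csv syntax used in the emulation.

```
| makeresults format=csv data="search
hypothetical_id_1
hypothetical_id_2"
| format
```

The result should be a single row whose search field holds an OR expression over the bare values, without any some_id="..." pairs. That string is what gets substituted in place of the square brackets. The exact parenthesization may vary between Splunk versions, so check it against your own instance.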
Here is an emulation. Play with it and compare with your data.
index = _internal log/splunk
``` the above emulates
index=index1 source="/somefile.log" uri="/path/with/id/some_id/"
```
[makeresults format=csv data="search
supervisor.log
splunkd_ui_access.log"
``` the above emulates
[ search index=index2 source="/another.log" "condition-i-want-to-find"
| rex field=_raw "some_id:(?<search>[^,]+),*"
| dedup search
| fields search
| format
]
```
| format]
| rex field=series "log/splunk/(?<some_id>[^\"]+)" ``` emulates | rex field=uri "/path/with/id/(?<some_id>[^/]+)/*" ```
| stats count by some_id
On my laptop, it gives
some_id               | count
splunkd_ui_access.log | 59
supervisor.log        | 1045
As you can see, among all the logs, the output is limited to the two values in the subsearch.
A subsearch gets executed first. If it completes successfully (which might not happen: subsearches have limitations, and throwing heavy raw-data based searches into them is not a good idea), it returns a set of conditions or a search string which gets substituted into the main search.
So your search as it stands makes no sense syntactically, because the substituted conditions end up as extra arguments to the rex command, and rex doesn't take more arguments.
If anything, you'd need to do
<something>
| search [ your subsearch here ]
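As a rough illustration of that substitution: if the subsearch ends with | fields some_id and returns two values (the ids below are invented), the main search would effectively run as something like

```
<something>
| search ( ( some_id="hypothetical_id_1" ) OR ( some_id="hypothetical_id_2" ) )
```

Running the subsearch by itself with | format appended shows the string it would actually produce for your data.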