Splunk Search

Map command subsearch results losing initial search value from the output?

sureshtskumar
Explorer

Here is an example of SPL I am trying to run.

| makeresults
| eval ProxyUser="User1,User2,User3"
| makemv delim="," ProxyUser
| mvexpand ProxyUser
| map maxsearches=0 search="search index=edrlogs* SubjectName=*$ProxyUser$ earliest=-24h | eval ProxyUser1=$ProxyUser$"
| fillnull value="N/A"
| table _time SubjectName EndpointName IPAddress ProxyUser1

I am getting results, but the ProxyUser1 field is empty. Inside the map command, I eval the searched ProxyUser value into a new field named ProxyUser1. I have read other posts suggesting that an eval after the map search should do the trick, but I believe I am doing something wrong here.

Any leads would be much appreciated.

sureshtskumar
Explorer

This finally turned out to be a potential bug, or at least a limitation, in the map command. I ran some very generic searches in Splunk, using map to pass three types of data into the map subsearch: text, numbers, and alphanumeric strings with special characters. The test results show that any value consisting purely of text (no numbers, no special characters) that is passed to the map command cannot be recovered in the search results.

Here is the proof:

Query Used for test:

| makeresults
| eval field1="TextString", field2="12345", field3="user@12345mail.com"
| table field1, field2, field3
| map maxsearches=300 search="search index=_internal ($field1$ OR $field2$ OR $field3$) earliest=-2h | eval TextString=$field1$, Number=$field2$, Alphanum=$field3$"
| table _time index TextString Number Alphanum

Results:

Notice that the TextString field is empty, while the numeric value and the alphanumeric value with special characters are both retained in the output.

I will try to bring this to Splunk's attention, though I am not sure whether it will be fixed or left as is. Thanks to all who took the time to read through and help with suggestions.

Splunk_Map_Limitation.png

PickleRick
SplunkTrust

Firstly - ughhhh... You are using map, which should not be used unless it absolutely cannot be avoided, and it should not be run on big datasets.

If you run your search built properly, offloading as much processing as possible to indexers, you're splitting your work across your whole deployment. If you do "tricks" like subsearches and map, you spawn multiple searches, bounce the data back and forth between search heads and indexers and generally make it as inefficient as can be 😉

And additionally, in your original search you use a wildcard at the beginning of your match term. No wonder it's taking forever to finish, since it has to scan all events!

But secondly - in the search spawned by the map command, the token is effectively replaced "directly" with the given string. So you're getting something like

| eval resultfield=TextString

That is most probably not what you need, because you are setting resultfield to the value of a field called TextString rather than to the string "TextString" itself. Since no such field exists in the events, the result is empty.

So you need to do

| eval resultfield=\"$field1$\"

And this does work.
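
Applied to the test query from the accepted post, the quoting would look roughly like the sketch below. The escaped quotes are added around the text and alphanumeric tokens; leaving $field2$ unquoted is a choice made here so that Number stays numeric, not something stated above.

```spl
| makeresults
| eval field1="TextString", field2="12345", field3="user@12345mail.com"
| table field1, field2, field3
| map maxsearches=300 search="search index=_internal ($field1$ OR $field2$ OR $field3$) earliest=-2h | eval TextString=\"$field1$\", Number=$field2$, Alphanum=\"$field3$\""
| table _time index TextString Number Alphanum
```

After substitution, the eval in the spawned search should read eval TextString="TextString", Number=12345, Alphanum="user@12345mail.com", so TextString is assigned a string literal instead of a lookup of a field with that name.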

sureshtskumar
Explorer

Thanks @PickleRick for taking the time and effort to respond.

I am personally not a fan of the map command; I am using it because it appeared to be the best option for passing multiple parameters from the main search to multiple other indexes to get the final output. The solution to the issue I was facing with map wasn't documented anywhere, and it did appear to be a bug or limitation in handling certain types of data while other types were handled correctly. Hope that clarifies.

sureshtskumar
Explorer

Hi @gcusello, thank you so much for the response. Just to give you a little more insight: the ProxyUser values are obtained from the first search in a network index (I am only using makeresults to simulate those users). This list of users from the network index is then passed to an EDR index to get the hostnames these users use. So ideally, this should look like:

index=proxy domain="somebad.com" earliest=-24h
| stats values(ProxyUser) as ProxyUser 
| map maxsearches=0 search="search index=edrlogs SubjectName=*$ProxyUser$ earliest=-24h | eval ProxyUser1=$ProxyUser$"
| fillnull value="N/A"
| table _time SubjectName EndpointName IPAddress ProxyUser1

So the values _time, SubjectName, EndpointName and IPAddress all come from the EDR logs, and I want to retain the original username from the first (proxy) index as ProxyUser1. This is the field that comes back empty. Hope this clarifies.

gcusello
SplunkTrust

Hi @sureshtskumar,

if you need to extract values from two indexes by correlating events from both, you can use the transaction command, but I use that solution only when I have no other choice because it's very slow.

Please try this different approach using stats:

(index=proxy domain="somebad.com") OR (index=edrlogs) earliest=-24h
| rename ProxyUser AS SubjectName
| stats 
   earliest(_time) AS _time 
   values(EndpointName) AS EndpointName
   values(IPAddress) AS IPAddress
   dc(index) AS index_count
   BY SubjectName 
| where index_count>1

This way you get the fields only for the ProxyUsers that are in both indexes.

Ciao.

Giuseppe

sureshtskumar
Explorer

Hi @gcusello thanks again for your kind response.

That search query is very expensive, and my search times out because the EDR index is humongous. So it could not be tested in the actual environment.

I would rather have a search where the usernames are restricted to those who visited the somebad.com domain, as seen in the proxy logs, and are then piped to the EDR logs. The very reason I am using the map command is that I can add more values from the first search in the proxy logs, such as the earliest/latest time this traffic was seen. This would help in pinpointing the user, especially when a shared machine is being investigated.
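
One way to carry the proxy-side details through map, as a rough sketch only: first_seen and last_seen are hypothetical field names, maxsearches=100 is an arbitrary cap, and stats ... BY ProxyUser is used so that map receives one row (and so runs one search) per user. The text token is wrapped in escaped quotes so that eval treats it as a string literal; the epoch times are numeric and left unquoted.

```spl
index=proxy domain="somebad.com" earliest=-24h
| stats earliest(_time) AS first_seen latest(_time) AS last_seen BY ProxyUser
| map maxsearches=100 search="search index=edrlogs SubjectName=\"*$ProxyUser$\" earliest=-24h | eval ProxyUser1=\"$ProxyUser$\", first_seen=$first_seen$, last_seen=$last_seen$"
| table _time SubjectName EndpointName IPAddress ProxyUser1 first_seen last_seen
```

This still spawns one EDR search per user, each with a leading wildcard, so it is only reasonable when the proxy search returns a short list of users.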

gcusello
SplunkTrust

Hi @sureshtskumar,

add these additional conditions to the main search to limit results.

I'd like you to understand the approach and adapt it to your needs.

ciao.

Giuseppe

sureshtskumar
Explorer

Hi @gcusello, thanks for the reply. Unfortunately, that search doesn't return any hits. I will keep troubleshooting and let you know if I find where I am messing up.

(index=proxy domain="somebad.com") OR (index=edrlogs) earliest=-24h
| rename ProxyUser AS SubjectName
| stats
earliest(_time) AS _time
values(EndpointName) AS EndpointName
values(IPAddress) AS IPAddress
dc(index) AS index_count
BY SubjectName
| where index_count>1

gcusello
SplunkTrust

Hi @sureshtskumar,

you should analyze the correlation key to understand whether there are matching values in the two fields.

If not, you have to find a common part of those fields.

Ciao.

Giuseppe
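
A quick way to check the correlation key, sketched under assumptions: user_key is a hypothetical field name, the one-hour window is only there to keep the check cheap, and lower() assumes the two sources may differ in case. coalesce picks whichever of the two fields an event carries.

```spl
(index=proxy domain="somebad.com") OR (index=edrlogs) earliest=-1h
| eval user_key=lower(coalesce(ProxyUser, SubjectName))
| stats dc(index) AS index_count values(index) AS indexes BY user_key
| sort - index_count
```

If no user_key shows an index_count above 1, the two fields probably do not share values as-is (the original map search matches SubjectName=*$ProxyUser$, which suggests SubjectName may carry a prefix), and a common part has to be extracted first.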

sureshtskumar
Explorer

Hi @gcusello, I will get my head around the suggested query. Meanwhile, if you can find out what is breaking my original map command, especially why the value from the first search is not retained in the final output of the second search, it would be really helpful. I have seen others raise similar questions about the map command and so far haven't seen any good reason why this happens.

gcusello
SplunkTrust

Hi @sureshtskumar,

good for you, see you next time!

Ciao and happy splunking

Giuseppe

P.S.: Karma Points are appreciated 😉

gcusello
SplunkTrust

Hi @sureshtskumar,

why don't you use the first part in a subsearch?

something like this:

index=edrlogs* earliest=-24h [ | makeresults
| eval ProxyUser="User1,User2,User3" | makemv delim="," ProxyUser | mvexpand ProxyUser | eval SubjectName="*".ProxyUser | fields SubjectName ]
| fillnull value="N/A"
| table _time SubjectName EndpointName IPAddress ProxyUser1

see my approach and adapt it to your need.

 Ciao.

Giuseppe
