How to pass results of a search to a subsearch for correlation on VPN Logs (Security Use Case)?

enemymind
Explorer

I've been making some headway on this query, but I'm not totally there yet. I can't get it to return bytes in / bytes out alongside the session IDs in the results. The top search looks at one group of events for the username and session, and the subsearch tells the top search which sessions to look for, but I can't pass bytes_in/bytes_out over in a way that doesn't mess with the top query.

The basics are this:

  • Multiple sessions from the same user (this log entry has the username and session ID)
  • A second log entry has no username and shows the data transferred per session ID (fields bytes_in and bytes_out)

    sourcetype=apm_log index=vpn bytes_in>0 OR user!=n/a
    [ search sourcetype=apm_log index=vpn bytes_in>0
    | dedup session_id
    | fields session_id
    | mvexpand session_id
    | format]
    | stats dc(session_id) as count values(session_id) as SessionID by user
    | where count >2

So the problem is that I can't get it to add the bytes_in / bytes_out of the sessions it finds to the end result. Any help would be appreciated.

jeffland
SplunkTrust

I don't think you need a subsearch. You could try something like this:

sourcetype=F5VPN index=vpn user!=n/a bytes_in!=0 | dedup session_id | stats count values(session_id) as SessionID by user | where count>=2

This should directly give you the results in the form you were already looking at, minus those not interesting to you (i.e. those which have bytes_in=0).

enemymind
Explorer

Nah, apologies for not being clear. The log entries that contain the session ID and username do not include the bytes in or out; other log entries do. So I need to put the username and session IDs together, then take those session IDs and look at the other logs, which include only the session ID and bytes in/out.
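
The correlation described here (username from one event type, byte counts from another, joined on the session ID) can be sketched without a subsearch by grouping on the session first. This is only a sketch under assumptions: it assumes both event types live in index=vpn with sourcetype=apm_log, that session_id, user, bytes_in and bytes_out are already extracted under exactly those names, and that statistics events may carry user=n/a. The output names total_bytes_in and total_bytes_out are made up for illustration.

```spl
index=vpn sourcetype=apm_log
| eval user=if(user=="n/a", null(), user)
| stats values(user) as user sum(bytes_in) as bytes_in sum(bytes_out) as bytes_out by session_id
| where isnotnull(user) AND bytes_in>0
| stats dc(session_id) as count values(session_id) as SessionID sum(bytes_in) as total_bytes_in sum(bytes_out) as total_bytes_out by user
| where count>2
```

The first stats collapses every session into one row that carries both the username and the byte counts; the second stats then counts sessions per user as in the original query. sum assumes each statistics event reports a separate amount; if the counters turn out to be cumulative per session, max would be the safer choice.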

jeffland
SplunkTrust

Oh I see. Sorry, could've understood that from your question had I read more carefully.

I don't know how it would look exactly with your data, but I imagine you could do it like this:

source=whereByteInIsFound bytes_in!=0 [search sourcetype=F5VPN index=vpn user!=n/a bytes_in!=0 | dedup session_id | stats count values(session_id) as SessionID by user | where count>=2 | mvexpand SessionID | fields SessionID | format]

You can see what the subsearch does when you open a new search and run it alone. You should try that with and without the format at the end to see what happens there.
I have added an mvexpand to get all SessionIDs separately, I am not sure if you need to consider the user as well - this information would be lost with this method, but could be included in your results as well.
I don't know if this works right away as I don't know what your data looks like, but the basic idea is from this post on SO:
http://stackoverflow.com/questions/15163497/filtering-splunk-results-using-results-of-another-splunk...

enemymind
Explorer

I've just spent a good chunk of time trying to modify it a number of ways, and it doesn't appear to work in the direction we are heading.

jeffland
SplunkTrust

Hm. Have you tried searching the subsearch on its own, with and without format, to see what happens there?
I also just noticed that I copied the bytes_in!=0 from my first answer into the subsearch in my last comment; it obviously does not belong there.
If you could paste some log samples from each of your sources, I (or others) could maybe help you further.

enemymind
Explorer

I also modified it enough to pass the session IDs as a large OR list into the first query, but then it simply returns an error. It's as if the main search is not using what is passed to it to search again.

enemymind
Explorer

I believe it said " where " and then just the long list of session_ids that it was passed. However, I took that long OR list of session IDs and pasted it directly into the first part of the query, which just looks for anything with bytes_in>0, and I did get results. So the data is good, but how we are telling Splunk to pass it or pick it up is clearly not right.

jeffland
SplunkTrust

Ah, so we are making progress! 🙂
Ok, this is strange. If you paste the result of the subsearch as a string into your main search it works, but not when you have it as a subsearch? Phew. That's weird. I'm sorry, but I can't imagine why that could be. I can try to ask around, but for now I'm at my wits' end.

enemymind
Explorer

Yeah, definitely, it's very odd. But thanks for the help in getting this far.

jeffland
SplunkTrust

What kind of error do you get in that case?

enemymind
Explorer

Yes I had taken that out and tried a number of things. Here are some samples of the logs

`2015-03-26T04:03:34-07:00 xxxx13650]: Rule /Common/APM-Logging : 01490001: Session ID: 1d1a8ab9|ClientIP: 111.222.5555.35|Username: user| Policy Result: allow|Resources Assigned: /Common/CORP_Full|Authentication: xxxxxxx.COM|Client Platform: Win7|User Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.3)|Machine Cert Subject: /CN=.xxxxxxcom|Machine Cert Serial: 389EFC2Cxxxxxxxxx13EFB|Cert Verify: Successful |Process/File Check Result: Successful |AV Check Result: Successful |HD Encryption Result: Successful |Continent: EU|Country: United Kingdom| State: Greater London`

From the previous event we pull the username and the session ID.

`2015-03-26T04:06:17-07:00 mmm-xxxxx-001 tmm1[13900]: 01490521:5: 573fc180: Session statistics - bytes in: 1810, bytes out: 461`

From this event we need to pull bytes_in or out >0
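
If the fields were not already extracted under the same names for both event types, the two samples above could be normalized at search time roughly like this. It is a sketch only: the regular expressions are written against the two sample lines shown here and may not cover every variant of these messages, and the field names session_id, user, bytes_in and bytes_out are the ones used elsewhere in this thread.

```spl
index=vpn sourcetype=apm_log
| rex "Session ID: (?<session_id>[0-9a-f]+)\|"
| rex "Username: (?<user>[^|]+)\|"
| rex "(?<session_id>[0-9a-f]{8}): Session statistics - bytes in: (?<bytes_in>\d+), bytes out: (?<bytes_out>\d+)"
| table _time session_id user bytes_in bytes_out
```

Running this over a short time range and checking that session_id is populated for both kinds of event is a quick way to confirm the two sides can actually be matched on that field.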

I was debating whether a transaction command could join the events. Thanks for all the help.

jeffland
SplunkTrust

I also thought of transactions, but I have to admit my knowledge about those is slim.
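
For what it's worth, a minimal sketch of the transaction idea might look like the following. It assumes both event types share an extracted session_id field and that user, bytes_in and bytes_out are extracted as well; it has not been run against this data.

```spl
index=vpn sourcetype=apm_log
| transaction session_id
| search user=* bytes_in>0
| table user session_id bytes_in bytes_out duration eventcount
```

transaction merges all events with the same session_id into one combined event, so the username from the login event and the byte counts from the statistics event end up side by side; duration and eventcount are fields the command adds itself. On large data sets transaction tends to be expensive, so grouping with stats by session_id is often the cheaper way to get the same pairing.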

One thing I could imagine that could cause trouble is when your subsearch does not produce exactly the same field name you are looking for in your main search. For example, I don't see a field "session_id" in your first log (only a "Session ID"), but that is what we used for the subsearch so far. I'm assuming there is some rename or eval somewhere else in your search as you mentioned you were getting nice results with the subsearch alone.
Still, when you get a resulting list from your subsearch like ((SessionID = 1d1a8ab9) OR (SessionID = xyz)), a field with exactly this name ("SessionID") has to be present in your main search. In your second sample, I don't see any field name like that at all, so your subsearch will not match anything. It will only work if that data has exactly what is returned by the subsearch.
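
The field-name point above can be sketched as follows: the subsearch is made to return a field called session_id, so that format produces conditions the main search can match. This assumes the statistics events really do have a session_id field, uses the sourcetype from the original question, and drops the bytes_in filter from the subsearch since the login events do not carry it.

```spl
index=vpn sourcetype=apm_log bytes_in>0
    [ search index=vpn sourcetype=apm_log user!="n/a"
      | stats dc(session_id) as count values(session_id) as session_id by user
      | where count>2
      | mvexpand session_id
      | fields session_id
      | format ]
| table _time session_id bytes_in bytes_out
```

Running the bracketed part on its own should show a single search field of the form ( ( session_id="..." ) OR ( session_id="..." ) ). The username is lost in the outer results with this approach, and subsearches are subject to result-count and runtime limits, so a long list of sessions may be silently truncated.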

enemymind
Explorer

I made sure to normalize the field throughout the search. I also just rechecked that the fields are extracted the same way in both events; they are. I admit that in the first search that was likely one of the issues, since I had changed it.
