Activity Feed
- Got Karma for Re: Ingesting Azure Logs (IaaS, PaaS, OS, app, etc.). 05-27-2024 05:50 PM
- Got Karma for Re: Stdev against Null Time Bucket. 05-16-2022 07:18 AM
- Posted Heavy Forwarded Filtering Hosts on Getting Data In. 01-19-2021 09:23 AM
- Posted Re: Ingesting Azure Logs (IaaS, PaaS, OS, app, etc.) on All Apps and Add-ons. 08-06-2020 07:56 AM
- Posted Ingesting Azure Logs (IaaS, PaaS, OS, app, etc.) on All Apps and Add-ons. 07-13-2020 10:53 AM
- Karma Re: Adaptive Response & Notable Race Condition for kchamplin_splun. 06-05-2020 12:50 AM
- Karma Re: Why are bucket times expanding? for woodcock. 06-05-2020 12:50 AM
- Got Karma for Multiple Account Lockout Correlation. 06-05-2020 12:50 AM
- Got Karma for Why are bucket times expanding?. 06-05-2020 12:50 AM
- Karma Re: How to use "where" and "not in" and "like" in one query for HiroshiSatoh. 06-05-2020 12:49 AM
- Karma Re: split multi value fields for sundareshr. 06-05-2020 12:48 AM
- Karma Re: How do I create a table result with "stats count by ‘field’"? for Runals. 06-05-2020 12:47 AM
- Posted Monitor File Activity on SMB Share on Getting Data In. 05-22-2020 09:18 AM
- Tagged Monitor File Activity on SMB Share on Getting Data In. 05-22-2020 09:18 AM
- Posted Re: Adaptive Response Not Pulling Variables on Splunk Enterprise Security. 12-18-2019 09:43 AM
- Posted Adaptive Response Not Pulling Variables on Splunk Enterprise Security. 12-17-2019 01:23 PM
- Tagged Adaptive Response Not Pulling Variables on Splunk Enterprise Security. 12-17-2019 01:23 PM
- Posted Re: Stdev against Null Time Bucket on Deployment Architecture. 12-12-2019 11:09 AM
01-19-2021
09:23 AM
Hello, I've read a ton of forum posts regarding this but I still cannot get it to work, so I'm hoping someone could point out what I'm doing wrong. The scenario I have is that there are multiple hosts with the Splunk agent installed, and we're currently logging that data to our Splunk indexers + a syslog server. For a short period of time, I want to send a subset of logs only to syslog, but I can't seem to get that to work. Below is my current config on my heavy forwarders. I expect this to send all hosts matching server* to both Splunk and syslog, but endpoint* hosts only to syslog. Right now, no matter what I do, everything still goes to Splunk. I even fully commented out the routeSubset section and ran "splunk reload deploy-server", and I still got those logs in Splunk. Any thoughts would be greatly appreciated.
props.conf
[source::WinEventLog:Security]
TRUNCATE = 0
SEDCMD-win = s/(?mis)(Token\sElevation\sType\sindicates|This\sevent\sis\sgenerated).*$//g
TRANSFORMS-routing = routeSubset, routeSubset2
transforms.conf
[routeSubset]
SOURCE_KEY=MetaData:Host
REGEX=(?i)^server[0-9][0-9].*
DEST_KEY=_TCP_ROUTING
FORMAT=splunkssl
[routeSubset2]
SOURCE_KEY=MetaData:Host
REGEX=(?i)(.*endpoint[0-9][0-9].*|^server[0-9][0-9].*)
DEST_KEY=_SYSLOG_ROUTING
FORMAT=syslog_server
Labels: heavy forwarder
08-06-2020
07:56 AM
1 Karma
I ended up going down the Event Hub route. You can use the Azure Diagnostic agent to push all Linux/Windows logs to Event Hub and then use the Microsoft Azure Add-on for Splunk to ingest those logs into Splunk. This keeps the traffic off the VPN connection, so you just pay for the egress traffic outbound from Azure, and since it's SSL you still get encryption without the VPN, which saves some cost. You can then modify the transforms and props files to help divide the logs up into the appropriate indexes and whatnot.
07-13-2020
10:53 AM
I know there have been a lot of conversations around this topic, and technology is constantly changing to make things easier, so I was curious if anyone has had recent experience ingesting all Azure logs into an on-premise Splunk instance. We're looking at both keeping logs local to Azure and utilizing Sentinel, but we'd prefer to keep them in Splunk so we have one location for logs no matter where they are (i.e. Azure, AWS, on-premise, etc.).
Potential solutions:
- Install a forwarder on every VM and backhaul the traffic across the VPN to on-premise indexers. I don't want to pay the VPN costs for this or have that dependency, and it's also just not a very "cloud-ish" solution.
- Stand up indexers in Azure and replicate the clusters between Azure and on-premise. These VMs cost a lot of money.
- Send all logs to Event Hub and use Splunk to pull from there. This seems like a decent solution, but I'm not sure of the costs or parsing issues this may entail.
I would love to hear how others compared Sentinel to Splunk and justified sticking with Splunk in Azure when you had an on-premise Splunk architecture. Note that we want the infrastructure/platform logs but also have a hard requirement to get the OS and app logs (i.e. Windows security, RHEL /var/log/secure, Apache, Squid proxy, etc.). Thanks!
05-22-2020
09:18 AM
I need to monitor all file reads, writes, deletes, etc. on an SMB share from a Windows server. In the past, I've just turned on full file auditing on the folder in question and used the Splunk Universal Forwarder to grab those events, and it worked great. However, I'm not sure how to accomplish that with an SMB share.
I've looked at the forums and I see people referencing fschange, but that appears to have been deprecated, so I'd like to go the normal Windows logging route.
Questions
If I turn on file auditing on \\smbshareserver\share1 and mount it to server1.company.com, would all of the file access attempts be logged to the local Event Log even if another server mounts that share? I would think not, which would defeat the purpose.
Does the Isilon storage have its own log file that would write this somewhere else so I could grab it there?
Is there a better/easier solution to do this?
The whole goal is that I need to fully monitor this SMB share from one location even though lots of computers and users could access it.
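For reference, if this share were hosted on a plain Windows file server, the kind of search I'd expect to end up with runs against the server-side share-access audit events (EventCode 5145), roughly like the sketch below. The index and field names here are just my assumptions of the usual Windows TA extractions, so treat it as illustrative only:
index=wineventlog source="WinEventLog:Security" EventCode=5145 Share_Name="*share1*"
| stats count values(Accesses) as access_requested by Account_Name, Source_Address, Relative_Target_Name
My open question is whether anything equivalent exists when the share is actually backed by Isilon.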
12-18-2019
09:43 AM
I've tried that as well and it still doesn't appear to be working.
Code:
events = helper.get_events()
for event in events:
    print(event)
    risk_object = event.get("risk_object")
    helper.log_info("event.get(\"risk_object\")={}".format(risk_object))
    risk_object_type = event.get("risk_object_type")
    helper.log_info("event.get(\"risk_object_type\")={}".format(risk_object_type))
    risk_message = event.get("risk_message")
    helper.log_info("event.get(\"risk_message\")={}".format(risk_message))
Output:
signature="event.get("risk_object_type")=None"
12-17-2019
01:23 PM
I've been using AR rules within notables for about a year now and I've had quite a bit of success with it. Previously I always just used AR to pull variables from my notables via something like this:
host = helper.get_param("host")
And since host is a field in my notable, it pulls it fine. However, this does not work for risk_object or risk_object_type. Attached is just one example of a notable that I tripped, but it will not pull the risk_object or risk_object_type variable. The odd part is that it pulls the risk_message variable fine.
I've tested this with two correlation rules that I have and neither one will pull risk_object, but if I alias it to something else, it pulls it fine. Any idea why this is occurring?
Update:
It looks like the variable is just not being pulled out correctly and I'm not sure why. Below is the output from the AR log.
risk_object = $risk_object$ | table _time
risk_object_type = $risk_object$ spanning $sourceCount$ Risk Rules
12-12-2019
11:09 AM
1 Karma
I finally found a solution using a combination of timechart, which instantly adds 0 to empty buckets, and then untabling it so I can format it however I want.
| timechart span=5m limit=0 dc(dest_ip) as num_dest_ips by src_ip
| untable _time, src_ip, num_dest_ips
| eventstats avg(num_dest_ips) as avg stdev(num_dest_ips) as stdev by src_ip | eval avg = round(avg,2) | eval stdev = round(stdev,2)
12-11-2019
11:56 AM
I like the concept of potentially changing the logic if I can't make the other time bucket show num_dest_ips as 0. However, I could see there being issues where, if a host has the same number of IPs during both buckets (e.g. 20), then the stdev is going to be 0 there too.
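Just to illustrate that edge case with made-up numbers (two identical buckets of 20, nothing real):
| makeresults
| eval num_dest_ips=split("20,20",",")
| mvexpand num_dest_ips
| stats avg(num_dest_ips) as avg stdev(num_dest_ips) as stdev
avg comes back as 20 and stdev as 0, so an outlier check based on stdev alone would never flag that host.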
12-11-2019
08:04 AM
We utilize adaptive response rules quite a bit within Splunk and have had quite a bit of success manually running them after the notable event is created.
Recently we have had a few use cases where we want an adaptive response rule to automatically run once the notable event is tripped and then close out the notable. The issue I'm having is that there appears to be some race condition: if I create a correlation rule that has both the action of creating a notable and running my adaptive response rule, it's not working.
With my adaptive response action, I normally pull variables from the notable and then auto update it but I'm not sure how to do all of that after the notable is created.
Has anyone done something along these lines? Thanks.
12-11-2019
06:30 AM
Thanks for your response. I tried your query and the only thing I got was two rows for the 9:00 and 9:15 buckets, but no other data in any of the other columns.
Attaching a screenshot to my initial question of what I would like to see and then I'll only alert where outLier>0.
12-10-2019
01:45 PM
I've read quite a few forum posts about this but honestly didn't find a great solution for my use case (note that this is probably because I didn't fully understand some of the items going on).
I'm trying to create a rule to alert on internal IP scanning. Below is my current logic.
index="firewall*"
| bucket _time span=15m@m
| stats dc(dest_ip) as num_dest_ips values(dest_ip) as dest_ips dc(dest_port) as num_dest_ports values(dest_port) as dest_ports count by src_ip, _time
| eventstats avg(num_dest_ips) as avg stdev(num_dest_ips) as stdev by src_ip | eval avg = round(avg,2) | eval stdev = round(stdev,2)
| lookup dnslookup clientip AS src_ip OUTPUT clienthost as src_dns
| eval temp=split(src_dns,".") | eval src_dns=mvindex(temp,0)
| eval src_system = coalesce(src_dns, src_ip)
| search NOT src_system=dns*
| eval lower_bound = avg-(stdev*.3) | eval lower_bound = round(lower_bound,2)
| eval upper_bound = avg+(stdev*.3) | eval upper_bound = round(upper_bound,2)
| eval isOutlier = if(num_dest_ips>upper_bound OR num_dest_ips<lower_bound,1,0)
| eval difference = upper_bound-lower_bound | eval difference = round(difference,2)
| table _time, src_ip, src_system, dest_ips, num_dest_ips, dest_ports, num_dest_ports, count, avg, lower_bound, upper_bound, difference, stdev, isOutlier
This works very well but as I mentioned, if 192.168.1.1 has no entries during a 15 min time bucket, then no entries will show up and therefore the math won't occur.
I've seen some posts where they append data and make the bucket command work but this is a fairly large index and I'd prefer not to double up on the search. I've seen others that recommend using timechart but my confusion on that front is how I would get all of my various stats commands to work with it.
Overall, I just want something like the table below, with the 9:15 line populated with zero if there are truly no entries. Unless there is a better way to do this entire setup, which I'm definitely open to.
_time | src_ip | num_dest_ips | avg
---|---|---|---
9:00 | 192.168.1.1 | 30 | 15
9:15 | 192.168.1.1 | 0 | 15
Any help would be greatly appreciated.
Tags: splunk-enterprise
11-26-2019
06:44 AM
Thank you very much. Yesterday, before this post, I accidentally started testing with aligntime and it seemed to fix the issue, but I wasn't 100% sure why. I don't think I can use sliding windows because I'm mocking all of these rules up for ES correlation searches.
11-25-2019
12:12 PM
Thanks for the response. After digging around a little, I think I may have fixed the issue by adding the aligntime portion. However I'll take a look at your new query as well.
| bucket _time span=2h@h aligntime=-1h@h
11-25-2019
06:34 AM
Timechart may work for this one scenario, but I have others where I count by multiple fields and timechart only allows me to do one.
11-22-2019
01:27 PM
Thanks for the quick response. I like the concept of bins so I always know it's two items I'm comparing against vs. potentially three. I tried this on my query and the time now just says 2019-11-22 and doesn't have an hour or a delimiter. So basically, even though I said 2 bins, I'm only seeing one row per user ID.
11-22-2019
12:32 PM
1 Karma
I've read a few other forum posts with similar issues but never found a true solution for them. Overall, I'm trying to mock up some correlation rules within Enterprise Security where my time frame is going to be -5h@h to -1h@h. I want to split this into two buckets so I can compare two-hour time frames against one another.
I continually get 3 buckets even though there should only be two.
In my most recent test I did this search:
(index=os_windows* OR index=os_unix*) (source=WinEventLog:Security OR sourcetype=linux_secure OR tag=authentication) action=failure NOT Result_Code=0x17 NOT Account_Name="*$" earliest=-5h@h latest=-1h@h
| bucket _time span=2h
| stats values(user) AS affected_users, values(ComputerName) as dc, dc(user) AS num_users count BY src_ip _time
| where num_users > 3
When I look at the results, I see multiple src_ip's that have a 9:00, 11:00, and 13:00 row. It's currently 2:30 pm so the breakdown is:
2:30 = current time
1:30 = -1h@h
12:30 = -2h@h
11:30 = -3h@h
10:30 = -4h@h
09:30 = -5h@h
So I should have a 9:00 - 11:00 bucket and an 11:00 - 1:00 bucket. I have no idea why it's also showing me a 13:00 bucket in my search results. This is throwing off my math since that number is quite a bit different as I assume it's not the full hour and it's not snapping correctly or something.
11-18-2019
11:32 AM
I've read a few forum posts on using standard deviations or MLTK to alert on large increases in events but unfortunately the solutions vary greatly.
For the sake of this post, I'm using the general logic that is incorporated within various Security Essentials searches that look for significant increases and tailoring that to a password spraying scenario where one host is failing to login to multiple user accounts.
index=os_logs (source=WinEventLog:Security OR sourcetype=linux_secure) action=failure
| bucket _time span=1h
| stats dc(user) as count by _time src_ip
| eventstats max(_time) as maxtime
| stats count as num_data_samples max(eval(if(_time >= relative_time(maxtime, "-1h@h"), 'count',null))) as "count" avg(eval(if(_time<relative_time(maxtime,"-1h@h"),'count',null))) as avg stdev(eval(if(_time<relative_time(maxtime,"-1h@h"),'count',null))) as stdev by "src_ip"
| eval lowerBound=(avg-stdev*2), upperBound=(avg+stdev*2)
For this search, everything works the way I expect it to until I get to the eventstats maxtime line and then start doing the math. For every scenario I've seen via Security Essentials, they are either looking back 30 days and doing 1-day buckets, or in one scenario going back 7 days and doing a 70-minute bucket, which makes no sense to me.
I essentially just want to see the following:
2 hours ago, 192.168.1.10 failed logging into 3 different user accounts
1 hour ago, 192.168.1.10 failed logging into 10 different user accounts
Therefore my num_data_samples should be two since I'm only looking at a two-hour span, my count would be 13 I believe since that's the number of users (assuming it's 10 different users and not 3 overlapping), and then the lower and upper bound would be 3 and 10, and I could alert off of a stdev > X.
When I only try to do this type of logic for a 2 hr. time period I never get the correct stdev or upper/lower bounds. I'm not very familiar with the maxtime vs. the relative_time of -1h@h so I assume something is wrong here. In my mind I would think my time range would only be -2 hours and then I would bucket those into one hr. blocks to compare against.
The closest post I found was here and it has some similar logic but I'd like to base mine off of Security Essentials if that is truly the best way to do this.
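To put it another way, the bare-bones comparison I'm after would look something like the sketch below (same index and fields as the search above; this drops the Security Essentials maxtime pattern entirely, so I'm not claiming it's the right approach):
index=os_logs (source=WinEventLog:Security OR sourcetype=linux_secure) action=failure earliest=-2h@h latest=@h
| bucket _time span=1h
| stats dc(user) as num_users by _time, src_ip
| stats earliest(num_users) as prev_hour_users latest(num_users) as last_hour_users by src_ip
| where last_hour_users > prev_hour_users
That at least gives me the two one-hour samples side by side per src_ip, even if it loses the stdev math.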
10-18-2019
08:54 AM
Hello,
I utilize Adaptive Response quite a bit for automatically creating incident tickets and dumping all of the relevant data in there.
Overall this has worked out pretty well with helper.get_param and pulling variables from the notable event that I created. The issue I'm running into is that I can only pull specific variables that I define in the notable event. I have a few use cases where I'd like to dump rows from the notable event, or rows from a drill-down search, vs. just a basic variable.
Scenario One
How can I pull rows vs. just variables and make those rows a variable that I can put into a ticket? I know there is helper.get_events(), but it's in a dictionary format I believe, and I had some issues turning it into a string variable I can paste into a ticket, as well as only including X rows or rows matching Y criteria.
Scenario Two
I want to pull rows down from the drill-down search I have for the notable event. For this I assume I need to set up HEC or do some sort of Python search API call into Splunk, then pull that back into Adaptive Response and convert it to a string? Has anyone done that and could share their code?
I love the emails you get from a notable event that has the data in a column format. I want to be able to put that "pretty" data into my tickets.
Thanks in advance.
08-22-2019
12:41 PM
The notable events will be run in the background as some system account. If Bob is logged in and working on a notable event, I want the logged-in user variable to be Bob so I can auto-assign the external ticket I create to him by passing his username variable.
If John is logged in doing a notable event, it should be John that is the user variable.
08-22-2019
08:37 AM
We're using an adaptive response rule to create tickets for our notable events. One item that I need is the current logged in user variable that I can call and then pass to the ticketing system.
I would prefer not to modify all of my correlation rules to insert the logged-in user name there, and instead rely on an environment variable or some other mechanism. I've read a few articles and I know I can query the API via the command below to grab the information, but I hope there is an easier way.
| rest /services/authentication/current-context splunk_server=local | fields username
I've also read some forum posts stating that $env:user$ should work. All of the examples I've seen are in XML and Dashboards. When I try to call that within my adaptive response rule either via Python code or alert action parameters, it doesn't work. It just prints out $env:user$ instead of any variable.
Most of my variables today follow the $result.something$ format since they are all in the notable event, but as I mentioned above, I would prefer not to have to insert that in all of my events.
What is the easiest way to get the logged in user variable via adaptive response/Python code?
03-11-2019
09:06 AM
What ended up being the issue was that even though I told the add-on builder not to use a proxy, it was still using one from Splunk's server.conf and splunk-launch.conf. I added the IPs/hostnames to the no-proxy rule thinking that would work, but unfortunately it still didn't.
The final solution to my problem was adding the following lines in my Python code to 100% tell it not to use any proxy.
import os
os.environ['no_proxy'] = '*'
03-11-2019
06:04 AM
Under "Add-on Setup Parameters" I have the "proxy settings" section checked but I don't have anything configured for it (i.e. no text boxes which the information in there).
When I go to the Python code, I don't see any proxy references other than this, which is obviously commented out.
# response is a response object in python requests library
#response = helper.send_http_request("http://www.splunk.com", "GET", parameters=None,
# payload=None, headers=None, cookies=None, verify=True, cert=None, timeout=None, use_proxy=True)
When I log in as the splunk user, I don't see any proxy env variable set.
03-08-2019
12:07 PM
I created an adaptive response via the Splunk Add-on Builder in my dev environment and everything is working fine. When I export the TA to production, the adaptive response shows up correctly and all of my predefined items are in there, but the code is not working.
Below is a simplified version of the Python code.
import sys
import json
import requests
import getpass
import requests.packages.urllib3
from requests.packages.urllib3.exceptions import InsecureRequestWarning
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)
def process_event(helper, *args, **kwargs):
    helper.set_log_level(helper.log_level)
    auth_url = "https://ticketing_system.domain.com"
    login_data = { 'username': 'YYYYY', 'password': 'XXXXXXXX' }
    session = requests.session()
    try:
        r = session.post(auth_url, json=login_data, verify=False)
        web_token = r.text
        r.raise_for_status()
    except Exception as e:
        print(e)
        print(r.text)
        print(r.status_code)
        sys.exit()
    parsed_token = web_token.split('"')[3]
    headers = {'Authorization': 'Bearer ' + parsed_token, 'Content-Type': 'application/json'}
    helper.log_info("Alert action ticketing started.")
    resp = requests.post("https://ticketing_system.domain.com", headers=headers, data=json.dumps(sn_input), verify=False)
    if resp.text:
        print(resp.text)
    helper.addevent(resp.text, sourcetype="ticketing")
    helper.writeevents(index="_internal", host="localhost", source="ticketing")
    return 0
When I run this in prod, I get the following error.
signature="Unexpected error: local variable 'r' referenced before assignment."
So, for some reason, it's not respecting my try statement and the session.post call. Therefore I simplified that even more and did this.
r = session.post(auth_url, json=login_data, verify=False)
web_token = r.text
r.raise_for_status()
This then gave me a different error.
signature="Unexpected error: HTTPSConnectionPool(host='ticketing_system.domain.com', port=443): Max retries exceeded with url: / (Caused by ProxyError('Cannot connect to proxy.', error('Tunnel connection failed: 503 Service Unavailable',)))."
I have confirmed that the Python code outside of Splunk works fine. Firewall rules are opened for direct connections to the ticketing.domain.com site. I thought it may have been picking up on the proxy anyway but when I perform a tcpdump on the server running the adaptive response, I see no outbound connections even occurring.
I also can't seem to get the logging level set to debug that will provide me any additional information on what is truly going on.
Any help would be greatly appreciated as I have no idea why it works in dev but not prod. Note that on dev, I'm using the same box that the add-on builder is running on. I have thought about installing the add-on builder directly to our prod Enterprise Security box but I know that is frowned upon.
03-06-2019
09:24 AM
Putting an official answer on here for anyone else that is having issues with this. If I used dc on the signatures field and then modified the where clause to be total_signatures, it worked perfectly for me. I still have all of the variables that I need for adaptive response.
index=security*sep sourcetype IN (symantec:ep:proactive:file, symantec:ep:risk:file)
| stats count by dest, signature, file_name, file_path, file_hash
| stats dc(signature) AS total_signatures, values(file_name) as process, values(file_path) AS full_path, values(file_hash) AS sha256 count by dest
| where total_signatures > 1
03-06-2019
09:11 AM
Unfortunately that doesn't work. The count still counts whichever field has the most entries in it and the signature_count does something crazy and makes the number really large.
There is one with 4 risk_signatures and 10 full_paths, and 6 sha256s. The signature_count it gives is 36 for some reason. There is another one with even less and the signature count is 147.