All Posts

Facing the same issue, any solution?
No! Don't try to handle structured data with simple regexes. Unless you're very, very sure that the format is constant and always will be (which is typically not something you can rely on, since even the developers of the solutions that produce such events don't know the exact order of fields their program will send), handling JSON or XML with regex is asking for trouble.
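A minimal sketch of the safer approach in SPL, using spath instead of rex (the JSON payload and the field names here are made up purely for illustration):

  | makeresults
  | eval _raw="{\"user\": {\"name\": \"alice\", \"roles\": [\"admin\", \"dev\"]}}"
  | spath input=_raw path=user.name output=user_name
  | spath input=_raw path=user.roles{} output=user_roles

On real JSON events you would usually just run "... | spath"; for XML, spath also works, or xmlkv for simple key/value extraction.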
Yup. If the threat actor has control over the machine, they could - for example - completely delete the Splunk forwarder from the computer, so you cannot be sure of anything after such a situation has happened. (I've seen very sensitive setups where events - not necessarily Windows Event Logs, but the general idea is the same - were printed out onto a printer as a non-modifiable medium so that they couldn't be changed in any way after they had been created.) For normal situations where you expect network downtime from time to time (sites with unstable network connections, mobile appliances and so on), you can tweak your forwarder's buffer sizes so that it can hold the data back for the needed period and then send the queued data when it regains downstream connectivity; see the sketch below. Be aware though that such a setup will create a host of potential problems resulting from the significant lag between the time an event is produced and the time it is indexed. They can be handled, but it takes some preparation and tweaking of limits.
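As a rough illustration of the "buffer sizes" part, here is a hedged sketch of the forwarder-side outputs.conf settings involved; the group name, server addresses and sizes are placeholders, not recommendations:

  [tcpout:primary_indexers]
  server = idx1.example.com:9997, idx2.example.com:9997
  # in-memory output queue; raise it so the forwarder can hold data through an outage
  maxQueueSize = 512MB
  # indexer acknowledgement, so queued data is not discarded until it is safely indexed
  useACK = true

For longer outages, persistent queues (persistentQueueSize in inputs.conf, available for network and scripted inputs) spill the buffer to disk instead of keeping it in memory.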
As @richgalloway already pointed out - the format is wrong. You need the key=regex format, and you need to split it into separate whitelist entries (each entry can have multiple key=regex parameters). The trick here is that Account Name is not a field of the event itself but a field inside the event's Message field, so you need to match it with a regex within Message. You'd effectively end up with something like:

  whitelist1 = EventCode=%(4624|4634|4625)% Message=%Account Name:.*\.adm%
  whitelist2 = EventCode=%(4659|4663|5145)% Message=%Object Name:.*Test_share%
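For context, a minimal sketch of where those entries would live in inputs.conf; the stanza name assumes the standard Security event log input, so adjust it to your own:

  [WinEventLog://Security]
  disabled = 0
  whitelist1 = EventCode=%(4624|4634|4625)% Message=%Account Name:.*\.adm%
  whitelist2 = EventCode=%(4659|4663|5145)% Message=%Object Name:.*Test_share%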
It's also worth noting that a SOC (similarly to a NOC, support and similar groups) is typically organized hierarchically, regardless of the actual tools used (there might be ES or any other SIEM, and there could be a SOAR tool in place to simplify the process or automate some of its steps).

The 1st line operator's task is to check the actual alert (it might be a Notable in ES, an asset exceeding a risk score threshold, or anything else defined procedurally for a given SOC), verify it (typically according to a predefined playbook), react to it if it's something "standard" with a known reaction procedure, and pass it on to the 2nd line if the situation cannot be handled within the playbook's parameters. A 1st line operator will typically use a set of predefined dashboards if using ES, or might just follow playbooks defined in the SOAR solution and not have to touch the SIEM at all.

A 2nd line analyst usually has more knowledge about the company environment and access to more tools. Since it's this analyst's task to get more insight when an alert cannot be handled in a predefined way, they will typically use ES and Splunk in general, along with other tools, to find more context about the possible threat. Most cases end at the 2nd line.

If everything else fails, 3rd line experts are called in (many smaller companies, for cost reasons, don't even employ 3rd line analysts in-house but instead purchase a pool of man-hours as a subscription from an external service provider). They will use everything at their disposal, including of course digging through the data in Splunk and reaching out to the solutions that generated those events, and will generally try to do whatever is humanly possible to either stop the threat or - if the attack has already succeeded - limit its aftermath and restore the environment to a normal state.

In other words - the more advanced work you do in the incident handling process, the more you'll probably be dealing with ES and Splunk in general.
The overall idea is more or less correct, but the details are a bit more complicated than that.

1. The summary-building search is spawned according to its schedule and builds the summary data similarly to how indexed fields are written at ingestion time (in fact, accelerated summaries are stored in .tsidx files just as indexed fields are).
2. The accelerated summaries are stored in buckets corresponding to the buckets of raw data.
3. Old summary buckets are not removed by the summary-building process but - as far as I remember - by the housekeeper thread (the same one responsible for rolling event buckets). So it's not a straightforward FIFO process.

Also, the summary range is not a 100% precise setting. Because data is stored in buckets and managed as whole buckets, you might still have some parts of your summaries exceeding the defined summary range. Another thing worth noting (because I've seen such questions already) - no, you cannot have a longer acceleration range than the event data retention. When an event bucket is rolled to frozen, the corresponding datamodel summary bucket is deleted as well.
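For reference, a hedged sketch of the datamodels.conf settings that drive this behavior; the stanza name and the values are illustrative only:

  [Network_Traffic]
  acceleration = true
  # summary range: how far back summaries are kept (approximate, since data is managed per bucket)
  acceleration.earliest_time = -1mon
  # schedule of the summary-building search
  acceleration.cron_schedule = */5 * * * *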
Hi @gcusello, thank you very much for your reply. However, there is something I am still confused about.

1. Exact meaning of the data retention period: for example, if the data retention period is set to 1 year, does that mean the summarized data produced by the initial acceleration will be kept for 1 year?

2. Meaning of the data summary range: assuming one month of data is set as the summary range and the cron expression is set to */5 * * * *, then one month's worth of data is summarized every 5 minutes and the latest data keeps getting summarized. Once data falls outside that range, will it be deleted?

I would appreciate your reply. Thank you.
Thank you. And how do you read properties from it? I was looking in the documents you attached and could not find a reference to the self.service.confs object... Could you please attach an example of how to read a specific property from a specific stanza? Thank you.
Ok, let me add my three cents here. There are two separate things that often get confused - datamodels themselves and datamodel acceleration.

A datamodel on its own is just a layer of abstraction - it defines fields and constraints so that you can (if your data is properly onboarded!) search without getting into the gory technical details of each specific source and sourcetype. So instead of digging through all the separate firewall types across your enterprise, you can just search from the Network Traffic datamodel and look for a particular src_ip. That makes your life easier and makes maintaining your searches easier. It also makes it possible to build generalized searches and use cases for ES (of course DM usage is not limited to ES, but it's most obvious there) which contain the required logic but are independent of the technical details. It makes an admin's life harder, though, because you need to make sure your data is properly normalized to the applicable datamodels during onboarding. But that is a job you do once and use forever after. Performance-wise, however, a DM on its own is completely neutral - if you're not using acceleration, a search from a datamodel is silently translated into a "normal" search (you can see that in the job log).

Datamodel acceleration is another thing, though. Since datamodels define a fixed set of fields, the data can be indexed similarly to indexed fields. Splunk spawns a search on the schedule defined for building the acceleration summaries, and you can then query those summaries with tstats. That gives a huge performance boost, like any other use of tstats (as long as you're using summariesonly=t and the search doesn't "leak out" onto raw data). The downside, of course, is that the summary-building searches run every 5, 10 or 15 minutes (and they can eat up quite a lot of resources), but the upside is that searches using the accelerated summaries are lightning-fast compared to normal event searches.
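As a concrete (hedged) illustration of that last point, a search against an accelerated CIM datamodel looks something like this; the src_ip pattern is made up, and All_Traffic is the root dataset of the Network_Traffic model:

  | tstats summariesonly=t count from datamodel=Network_Traffic
      where All_Traffic.src_ip="10.10.1.*"
      by All_Traffic.src_ip, All_Traffic.dest_ip
  | sort - count

With summariesonly=t the search touches only the .tsidx summaries; without it, Splunk also falls back to raw events for time ranges that have not been summarized yet, which is exactly the "leaking out" mentioned above.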
Hi @sankar_1986, in addition to the checks hinted at by @deepakc, I suggest checking:

the CSV dimensions: this matters because a large file takes longer to ingest, so maybe you only have to wait for the upload to complete;
whether there are strange field names in the header.

Without viewing your data, I vote for the first one. Ciao. Giuseppe
Hi @DilipKMondal, good for you, see you next time! Ciao and happy splunking. Giuseppe. P.S.: Karma Points are appreciated by all the contributors.
Hi @munang, first of all, you can configure the retention you want for your Data Model, so if you want a longer retention time you can configure it; you only need more storage: the storage required for one year of Data Model summaries is around 3.4 times the average daily indexed data volume (so, roughly, 100 GB/day of average ingest would need about 340 GB of summary storage for the year). Accelerated Data Models are usually used for the searches that must be very fast; if you need to search older data, you can still use the data in your indexes or in summary indexes. As I said, the last 30 days usually cover more than 85% of the searches that need to be fast. Ciao. Giuseppe
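For reference, a hedged sketch of the setting this refers to (datamodels.conf on the search head; the model name is illustrative):

  [Network_Traffic]
  acceleration = true
  # keep one year of accelerated summaries
  acceleration.earliest_time = -1y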
Hello. I'm using Splunk DB Connect 3.13.0, which worked fine until I restarted the server. Since then the task server is not starting and I'm getting this error:

  message from "/opt/splunk/etc/apps/splunk_app_db_connect/bin/dbxquery.sh"
  com.splunk.modularinput.Event.writeTo(Event.java:65)
  com.splunk.modularinput.EventWriter.writeEvent(EventWriter.java:137)
  com.splunk.dbx.command.DbxQueryServerStart.streamEvents(DbxQueryServerStart.java:51)
  com.splunk.modularinput.Script.run(Script.java:66)
  com.splunk.modularinput.Script.run(Script.java:44)
  com.splunk.dbx.command.DbxQueryServerStart.main(DbxQueryServerStart.java:95)
  ERROR ExecProcessor [15275 ExecProcessorSchedulerThread] - message from "/opt/splunk/etc/apps/splunk_app_db_connect/bin/dbxquery.sh" action=dbxquery_server_start_failed error=java.security.GeneralSecurityException: Only salted password is supported
Hi @splunky_diamond, the normal activity flow for a SOC Analyst is the following:

there is a defined monitoring perimeter and a set of Correlation Searches that monitor that perimeter to find possible threats;
if one or more Correlation Searches trigger, a Notable is created. I think an email notification is useful only for night monitoring, because during the day the SOC Analysts should always be connected to ES;
when a Notable is triggered (a Notable is one or more events that match a condition to check, not a security incident!), a SOC Analyst takes ownership of the Notable and investigates it using the investigation panels and, eventually, their own searches. They could also use the other ES dashboards, even if I never saw this;
based on the investigation, the SOC Analyst decides whether it's a real security incident, a false positive, or whether the Notable requires escalation for a deeper check;
if it's a false positive, the SOC Analyst closes the case, possibly adding a suppression rule;
if the Notable requires escalation, the SOC Analyst passes the case on, following the indications of the related playbook;
if it's a real security incident, the SOC Analyst applies the predefined playbook actions or passes the activity to the colleagues enabled to intervene.

This is a general flow, and it depends on the internal processes of the SOC. One additional note: if (as is usual) your SOC has only a few SOC Analysts, it can be a good idea to associate a Correlation Search with a Risk Score addition rather than a Notable; this way the SOC Analyst learns about a threat with some delay, but the SOC has to manage fewer Notables. In other words, if there are three SOC Analysts and the SOC receives 10,000 Notables a day, they cannot check all of them. Ciao. Giuseppe
Hello, Splunkers! I hope there are some SOC analysts around who are using Splunk Enterprise and Splunk ES in their work. I've been learning Splunk for the past month; I have worked with Splunk ES a bit and tried configuring some correlation searches with automated notable generation along with email notification alerts. I now have to present some cases in my test lab, where an attacker performs some malicious activity that triggers some of the correlation searches I have configured, and then I need to demonstrate the full investigation process from a SOC analyst's point of view.

The problem is, I have almost zero knowledge of how a SOC operates. If they were to use Splunk Enterprise and the Enterprise Security app, what would they do exactly? Would they just go over all the new notables and look at the drill-down searches, trying to understand which notables are related to other notables? Would they try to correlate the events by time? Would they only work within Splunk ES, or would they also go to the dashboards and search for some data there? I would appreciate it if someone could explain how a SOC works with Splunk ES in the case of some simple, uncomplicated attacks that trigger 2-3 correlation searches max.

Also, a small question: since I have email notifications configured, who usually receives the email notifications about triggered correlation searches - a SOC director, an analyst, or someone else? Please let me know if more information is required; I would love to provide as many details as needed, as long as I get the best answer that would help me. Thanks in advance for taking the time to read and reply to my post!
Thank you very much! That pretty much explains everything!  
Hello. I'm a Splunk newbie, and I'm confused about setting up data model acceleration. According to the official documentation, if the data in your data model is out of date, Splunk will continuously delete it and keep the data in your data model up to date. So, for example, if you summarize a month's worth of data on a 0 12 * * * schedule:

1. Data from day -30 to day 0 is summarized.
2. A day passes.
3. Data from day -29 to day +1 is summarized.
4. The day -30 data is deleted.

Is this process correct? If so, why is it done this way? And is there a way to keep the information summarized through data model acceleration continuously, like a summary index, without it being deleted?
Hello tshah-splunk, Increasing the max_upload_size in web.conf worked in my case. Gave you a Karma point. Thanks 
The First Law of asking an answerable question states: present your dataset (anonymize as needed), illustrate the desired output from the illustrated dataset, and explain the logic between the illustrated dataset and the desired output. (Without SPL.) If attempted SPL does not give the desired output, also illustrate the actual output (anonymize as needed), then explain its difference from the desired results if it is not painfully clear.

"I am able to pull my AD users account information successfully except for their email addresses." Can you explain from which source you are pulling AD info? Your SPL only uses a lookup file. Do you mean that the lookup table AD_Obj_User contains email addresses but the illustrated SPL does not output them, or that your effort to populate AD_Obj_User fails to obtain email addresses from a legitimate AD source (as @deepakc speculated)?

If the former: what is the purpose of the SPL? What is the content of AD_Obj_User? What is the desired output, and what is the logic between the content and the desired output?

If the latter: what is the purpose of showing the SPL?
It's mainly about performance, time to value, and using all the ES features. You could be a large enterprise ingesting loads of data sources with a big SOC operation, and you might want to run many different correlation rules; that would not be practical on raw data, and it would take a long time to develop new rules when so many come out of the box. So this is where DMs come into play - faster and better all round.

For you, it sounds like you have just a few use cases and can run your own rules on raw data, and if you're happy with that, then that's fine. But you're not then exploiting what ES has to offer and all the use cases built around data models.
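To make the contrast concrete, here is a hedged sketch of the same simple detection written both ways; the index, sourcetype and field values are made up for illustration:

  Raw events (scans everything in the index):
    index=firewall sourcetype=vendor:fw action=blocked
    | stats count by src_ip, dest_ip

  Accelerated data model (reads only the tstats summaries, vendor-agnostic):
    | tstats summariesonly=t count from datamodel=Network_Traffic
        where All_Traffic.action="blocked"
        by All_Traffic.src_ip, All_Traffic.dest_ip

The second form is what lets the out-of-the-box ES correlation searches work across whatever firewall vendors you happen to have, without rewriting each rule.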