Hi @Karthikeya, False positive, false negative, etc. have the same definitions in Splunk that they have in statistics. I'm in the United States, and I find the NIST/SEMATECH e-Handbook of Statistical Methods, Chapter 6, "Process or Product Monitoring and Control," a useful day-to-day reference: https://www.itl.nist.gov/div898/handbook/index.htm.

In your example, you're counting events. For example, a basic search scheduled to run every minute:

index=web status=400 earliest=-15m@m latest=@m
| stats count
| where count>5

gives you the count of status=400 events over the prior 15 minutes. In this context, false positive and false negative could relate to the time the events were generated and the delay between that time and the index time. If a status=400 event occurred at 00:14:59 but was indexed by Splunk at 00:15:04, then a search that executes at 00:15:01 for the interval [00:00:00, 00:15:00) would not count the event because it has not been indexed by Splunk. This is a false negative. You can reduce the probability of false negatives by adding a backoff to your search, 1 minute in this example:

index=web status=400 earliest=-16m@m latest=-1m@m
| stats count
| where count>5

However, that will not eliminate all false negatives because there is still a non-zero probability that an event will be indexed outside your search time range.

False positives are more typically associated with measuring against a model. Let's say you've modeled your application's behavior and determined that more than 5 status=400 events over a 15 minute interval likely indicates a client-side code deployment issue as opposed to "normal" client behavior. "More than 5" is associated with a control limit, for example a deviation from a mean; however, the number of status=400 events is a random variable. A bad client-side code deployment may trigger only 4 status=400 events, which is a false negative, and a good client-side deployment may trigger 6 status=400 events, which is a false positive.

Several Splunk value-added products like Splunk Enterprise Security and Splunk IT Service Intelligence provide ready-to-run modeling and monitoring solutions, but in general, you would model your application's behavior using either traditional methods outside Splunk or statistical functions or an add-on like the Splunk Machine Learning Toolkit inside Splunk. You would then apply your model using custom Splunk searches.
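To make the control-limit idea concrete, here is a minimal sketch of a threshold learned from recent history rather than fixed at 5, assuming the same index=web data as above; the 24-hour baseline window and the 3-standard-deviation limit are arbitrary illustration choices, not recommendations:

index=web status=400 earliest=-24h@m latest=@m
| bin _time span=15m
| stats count by _time
| eventstats avg(count) as mean stdev(count) as sd
| eval ucl=mean+3*sd
| tail 1
| where count>ucl

eventstats computes the baseline mean and standard deviation over all 15-minute buckets, tail 1 keeps only the most recent bucket, and the where clause flags it when the count exceeds the upper control limit. Note the most recent bucket may still be filling, so the same backoff caveat applies here.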
Thank you @tscroggins 
It is not a malfunction of Splunk - false positives and negatives can arise if your monitoring solution is not robust enough for your requirements. For example, in your scenario, if you are monitoring every 15 minutes, let's say at 00, 15, 30 and 45 minutes past the hour, but you get 400 errors at 12, 13, 14, 15, 16, and 17 minutes past the hour, you have 6 errors, but 3 fall into the 00-14 time bucket and 3 fall into the 15-29 time bucket, so neither bucket exceeds your threshold of 5. Would you say this is a missed alert (false negative) or something you would tolerate? In another scenario, let's say you have errors occurring at 13, 14, 15, 16, 28 and 29 minutes past the hour, but the 13 and 14 errors arrive late, so they are picked up in the 15-29 time bucket and you raise an alert; this might be seen as a false positive, i.e. an alert that you didn't really want. It all comes down to what your requirements are and what tolerances you are prepared to accept in your monitoring environment.
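A quick way to see how often this bucket-boundary effect occurs in your own data is to count, per 15-minute bucket, how many events were indexed in a different bucket than the one they were generated in. A minimal sketch, assuming the index=web status=400 example from this thread; _indextime is when Splunk indexed the event:

index=web status=400 earliest=-4h@m latest=@m
| eval itime=_indextime
| bin _time span=15m
| bin itime span=15m
| stats count as total count(eval(_time!=itime)) as crossed_bucket by _time

A consistently non-zero crossed_bucket count tells you that late-arriving events routinely land in the "wrong" bucket, which is exactly where the false positives and negatives described above come from.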
Hi @NavS, Refer to https://docs.splunk.com/Documentation/SplunkCloud/latest/Service/SplunkCloudservice for supported data egress methods:

- Dynamic Data Self-Storage: export of aged data per index from Splunk Cloud Platform to Amazon S3 or Google Cloud Storage. No limit to the amount of data that can be exported from your indexes to your Amazon S3 or Google Cloud Storage account in the same region. Dynamic Data Self-Storage is designed to export 1 TB of data per hour.

- Search results via UI or REST API: recommended no more than 10% of ingested data. For optimal performance, no single query, or all queries in aggregate over the day from the UI or REST API, should return full results of more than 10% of ingested daily volume. To route data to multiple locations, consider solutions like Ingest Actions, Ingest Processor, or the Edge Processor solution.

- Search results to Splunk User Behavior Analytics (UBA): no limit. Data as a result of search queries to feed into Splunk User Behavior Analytics (UBA).

To stream events to both Splunk Cloud and another destination, an intermediate forwarding solution is required. You should contact your client's Splunk account team for confirmation, but your Splunk Cloud native options are likely limited to the list above.
Hi Splunk Community, I need advice on the best approach for streaming logs from Splunk Cloud Platform to an external platform. The logs are already being ingested into Splunk Cloud from various applications used by my client's organization. Now, the requirement is to forward or stream these logs to an external system for additional processing and analytics. #Splunk cloud Thank you, Nav
Thank you @ITWhisperer. So, on a daily basis in a Splunk environment, which of the above 4 cases will be the most frequent scenario, and how do we avoid it? You are saying false alerts will be triggered even though the condition is not met... how is that possible? What is the mechanism behind false positives? Example: an alert should be triggered when the count of status=400 events exceeds 5 in the last 15 minutes. We set the alert correctly, so why would the alert still be triggered? Is it a malfunction of Splunk? I didn't get that. Do false positives happen often? Can you please give more detail on this? Thanks once again.
A false positive is something that is reported as being true when it is actually false. A false negative is something that is reported as being false when it is actually true. In monitoring terms, this could be related to, for example, an alarm being raised when the condition / threshold has not been reached (false positive) or an alarm not being raised when the condition / threshold has been reached (false negative). Both of these situations should be avoided whenever possible, although for some environments this is not always achievable. If this perfect monitoring scenario cannot be reached, you have to decide at what point the number of false alarms is tolerable for your organisation.
Hello, let me explain my architecture. Multi-site cluster (3 sites); 2 indexers, 1 SH, and 2 syslog servers (UF installed); in each site 1 deployment server; 1 deployer overall; 2 cluster managers (1 standby).

As of now, network logs are configured to go to our syslog servers, and the UF forwards the data to the indexers. We will route logs with the help of the FQDN. For example, we have an application X whose events may or may not contain an FQDN. If an event contains an FQDN, it will go to that app's index; otherwise it will go to a different index. (These props and transforms are written on the cluster manager.) In the deployment server's inputs.conf we have just given the log path along with a different index (the one specified in the transforms on the cluster manager). So all the logs flow in, and the props and transforms we wrote on the cluster manager filter the data. Is there any other way to write these configurations?

Here are the props and transforms from the cluster manager:

cat props.conf

[f5_waf]
TIME_PREFIX = ^
MAX_TIMESTAMP_LOOKAHEAD = 25
TIME_FORMAT = %b %d %H:%M:%S
SEDCMD-newline_remove = s/\\r\\n/\n/g
LINE_BREAKER = ([\r\n]+)[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s
SHOULD_LINEMERGE = False
TRUNCATE = 10000
# Leaving PUNCT enabled can impact indexing performance. Customers can
# comment this line if they need to use PUNCT (e.g. security use cases)
ANNOTATE_PUNCT = false
TRANSFORMS-0_fix_hostname = syslog-host
TRANSFORMS-1_extract_fqdn = f5_waf-extract_fqdn
TRANSFORMS-2_fix_index = f5_waf-route_to_index

cat transforms.conf

# FIELD EXTRACTION USING A REGEX
[f5_waf-extract_fqdn]
SOURCE_KEY = _raw
REGEX = Host:\s(.+)\n
FORMAT = fqdn::$1
WRITE_META = true

# Routes the data to a different index -- this must be listed in a TRANSFORMS-<name> entry.
[f5_waf-route_to_index]
INGEST_EVAL = indexname=json_extract(lookup("fqdn_indexname_mapping.csv", json_object("fqdn", fqdn), json_array("indexname")), "indexname"), index=if(isnotnull(indexname), indexname, index), fqdn:=null(), indexname:=null()

cat fqdn_indexname_mapping.csv

fqdn,indexname
selenium.systems.us.fed,xxx_app_selenium1
v-testlab-service1.systems.us.fed,xxx_app_testlab_service1

I have gone through the documentation, but I am just asking if there are any better alternatives.
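For comparison, if the set of FQDNs were small and static, you could skip the lookup and INGEST_EVAL entirely and route with plain regex transforms. A minimal sketch, assuming one hypothetical stanza per FQDN and reusing the index names from your mapping file:

# transforms.conf - one routing stanza per FQDN
[f5_waf-route_selenium]
REGEX = Host:\s+selenium\.systems\.us\.fed
DEST_KEY = _MetaData:Index
FORMAT = xxx_app_selenium1

# props.conf - chain the routing stanza after the existing transforms
[f5_waf]
TRANSFORMS-2_fix_index = f5_waf-route_selenium

The trade-off: this needs one stanza per FQDN and a configuration push for every new application, so the lookup-driven INGEST_EVAL you already have scales better as the mapping grows.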
What exactly do false positives, false negatives, true positives, and true negatives mean? How do we identify them in Splunk, can we trigger on them, and how are they useful to us in monitoring Splunk? Please explain.
Hi @jaibalaraman, yes, this should be fine. See the compatibility matrix: https://docs.splunk.com/Documentation/Splunk/9.3.2/Installation/Systemrequirements#Supported_Operating_Systems
There is no need to install Splunk Enterprise and the Universal Forwarder on the same server. It can be done, but requires special effort with little gain. Splunk Enterprise is capable of everything the UF does.
1) Put the UF on the syslog server and Splunk Enterprise on a separate server.
2) The receiver address is that of Splunk Enterprise. It's the server that will receive data from the UF.
3) Which Microsoft add-on? There are several and most are not needed.
4) Configure syslog to save events to disk files. Configure the UF (in inputs.conf) to monitor those disk files; see the sketch below.
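For step 4, a minimal sketch of the UF side, assuming syslog writes its files under /var/log/remote/ and the Splunk Enterprise server receives on 10.0.0.5:9997; the path, index name, and address are hypothetical placeholders to substitute with your own:

# inputs.conf on the UF - monitor the files syslog writes to disk
[monitor:///var/log/remote/*.log]
index = syslog
sourcetype = syslog
disabled = false

# outputs.conf on the UF - forward to the Splunk Enterprise receiver
[tcpout]
defaultGroup = primary

[tcpout:primary]
server = 10.0.0.5:9997

On the Splunk Enterprise side, enable receiving on port 9997 (Settings > Forwarding and receiving) and make sure the target index exists.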
The scenario is that there are 100 endpoints sending logs to their internal in-house syslog server. We need to deploy Splunk here so that the admin will be able to monitor logs in Splunk Enterprise. The requirement is that both the Universal Forwarder and Splunk Enterprise be present on the same syslog server. I am here for the steps I need to follow for this deployment. I am mentioning below the steps I am thinking of taking:
1.) First, install Splunk Enterprise on the server and then install the Universal Forwarder.
2.) During the installation of the Universal Forwarder, choose local system rather than domain deployment; leave the deployment server blank, and for the receiving server put the syslog server's IP address and port number, which I can get by running ipconfig in cmd.
3.) Download the Microsoft add-on from Splunkbase on the same server.
4.) Extract the Splunkbase file, create a local folder under SplunkForwarder > etc, paste the inputs.conf file there, and make the required changes.
5.) Then I will be able to get all the syslog server's logs in Splunk Enterprise.
Please correct me, or add other steps which I need to follow.
I can't see any obvious issue with your code. What happens if you include debug log statements that output the report_id value, and then the resulting URL? Assuming logging mode is set to debug:

helper.log_debug(f"Report ID is: {report_id}")

url = f"https://example_url/{report_id}/download"
helper.log_debug(f"URL is: {url}")

headers = {
    "accept": "application/json",
    "Authorization": f"Bearer {jwt_token}",
}
Hi @sainag_splunk, could you also explain what you mean by index peers and the index cluster bundle? Please reply.
A streaming language generally does not use command branching. However, SPL has plenty of instruments to obtain the result you want. So, let me rephrase your requirement.

What I want to extract from events is a vector of three components: field1, field2, field3. The method of extraction is based on whether the event contains Dog or Cat.

To illustrate, given this dataset

_raw
a b c |i|j|k| Dog woofs
l m n |x|y|z| Cat meows
e f g |o|p|q| What does fox say?

I want the following results

_raw                               field1  field2  field3
a b c |i|j|k| Dog woofs            a       b       c
l m n |x|y|z| Cat meows            x       y       z
e f g |o|p|q| What does fox say?

(This is based on reverse engineering your regex. As I do not know your real data, I have to make the format more rigid to make the illustration simpler.)

Let me demonstrate a conventional method to achieve this in SPL.

| rex "(?<field1_dog>\S+)\s(?<field2_dog>\S+)\s(?<field3_dog>\S+)\s"
| rex "\|(?<field1_cat>[^\|]+)\|(?<field2_cat>[^|]+)\|(?<field3_cat>[^|]+)\|"
| foreach field1 field2 field3 [eval <<FIELD>> = case(searchmatch("Dog"), <<FIELD>>_dog, searchmatch("Cat"), <<FIELD>>_cat)]
| fields - *_dog *_cat

As you can see, the idea is to apply both regexes, then use the case function to selectively populate the final vector. This idea can be implemented in many ways. Here is the emulation that generates my mock data. Play with it and compare with real data.

| makeresults format=csv data="_raw
a b c |i|j|k| Dog woofs
l m n |x|y|z| Cat meows
e f g |o|p|q| What does fox say?"

In many traditional languages, the requirement can also be expressed as conditional evaluation. While this is less conventional, you can also do this in SPL, usually with more cumbersome code; a sketch of that approach follows.
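To make that last point concrete, here is a minimal sketch of the conditional-evaluation style, assuming the same rigid mock format as above; split and mvindex stand in for the regexes, and the positions are tied to this exact layout:

| eval p=split(_raw, if(searchmatch("Dog"), " ", "|"))
| eval field1=if(searchmatch("Dog"), mvindex(p, 0), if(searchmatch("Cat"), mvindex(p, 1), null()))
| eval field2=if(searchmatch("Dog"), mvindex(p, 1), if(searchmatch("Cat"), mvindex(p, 2), null()))
| eval field3=if(searchmatch("Dog"), mvindex(p, 2), if(searchmatch("Cat"), mvindex(p, 3), null()))
| fields - p

Each field needs its own nested if to choose a delimiter-dependent position, which is exactly the cumbersomeness mentioned above; the two-rex-plus-case version is usually easier to maintain.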
Can i know what were the changes down on values file? Otel chart I was able to get in the Github project
I'm a little bit lost on your architecture, to be honest. But if I understand your later comments correctly, you want to restrict the teams responsible for sending the data from sending to unauthorized indexes, right? It can be tricky depending on the overall ingestion process. "Normal" S2S has no restrictions on the sent data, so as long as you accept data from a forwarder, you're accepting it into whatever index it's meant for; the newer Splunk versions, however, let you limit S2S-over-HTTP connections to only the index(es) authorized for a particular HEC token (a sketch follows). If we're talking syslog here, then you should handle it at the syslog daemon level.
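For the HEC token route, the restriction lives on the token definition itself. A minimal sketch of the receiving side's inputs.conf, with a hypothetical token name, GUID, and index names; the indexes setting is the allow-list that rejects events addressed to any other index:

# inputs.conf - HEC token limited to specific indexes
[http://team_a_token]
token = 11111111-2222-3333-4444-555555555555
# default index used when an event does not specify one
index = team_a_main
# indexes this token is allowed to write to
indexes = team_a_main, team_a_metrics
disabled = 0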
What I want to say is:

if _raw contains the word "Dog" then
rex "(?<field1>([^\s]+))\s(?<field2>([^\s]+))\s(?<field3>([^\s]+))\s"

if _raw contains the word "Cat" then
rex "(?<field1>([^\|]+))\|(?<field2>([^\|]+))\|(?<field3>([^\|]+))\|"

because if the line contains Dog, fields are delimited by spaces, but if it contains Cat, fields are delimited by the pipe symbol. I want the same field names; I just need to use a different rex based on the delimiter. I can't formulate one rex that handles both delimiters.
Yeah, I am getting a syntax error (Invalid Argument) on rex.
"The segregated data should be written to separate files to be monitored by separate inputs.conf stanzas" ----> where do I need to put this inputs.conf? On the deployment server? Because on the UF only deploymentclient.conf will be given, right? And how do I finally get data for a particular FQDN from the syslog server to the indexer? What conf should be given, and where do I declare the index for that? Please be specific.