All Posts


I'm afraid I still don't understand what you are trying to do.  It makes absolutely no sense to join 50000 raw events.  In fact, it generally makes zero sense to join raw events to begin with.  It is best to describe what the end goal is with zero SPL.  I have posted my four commandments of asking an answerable question many times.  Here they are again:
1. Illustrate the data input (in raw text, anonymized as needed), whether it is raw events or output from a search that volunteers here do not have to look at.
2. Illustrate the desired output from the illustrated data.
3. Explain the logic between the illustrated data and the desired output without SPL.
4. If you also illustrate attempted SPL, illustrate its actual output, compare it with the desired output, and explain why they look different to you if that is not painfully obvious.
In your case, simply avoid illustrating SPL.  Just illustrate what your data is, its characteristics, etc., what your result should look like, and why the illustrated data should lead to the illustrated result, all without SPL.  There is a chance that some volunteers can understand if you do NOT show SPL.
If you have lots of events, performing the lookup after stats will be more efficient.
index=* | chart count by X | lookup my-lookup.csv Y AS X OUTPUT X_description
This will add an extra field.  If you don't want to see X, just remove it with the fields command.
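For example, to keep only the counts and descriptions (a minimal sketch built on the search above):
index=* | chart count by X | lookup my-lookup.csv Y AS X OUTPUT X_description | fields - X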
With the same log, I would expect a single duration.  Perhaps the maxspan option to the transaction command will help.
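As a rough sketch (the field name session_id is just a placeholder for whatever ties your events together):
index=<yourindex> | transaction session_id maxspan=30m | table session_id duration
transaction adds the duration field automatically, so with maxspan capping how far apart events may be, you should end up with a single duration per session.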
Yes, it is best practice to have consistent index configurations and definitions throughout the cluster. Thanks @gcusello @PickleRick for the good points. To further elaborate on this topic and provide more details, I'd like to add the following best practices:
1. Separate index definition from storage definition:
- It's typically best practice to keep these configurations separate.
- In a production environment, use a legitimate app for your main indexes.conf file, not the system/local directory.
- This ensures better manageability, version control, and consistency across your Splunk deployment.
2. Use separate apps for configurations (see the sketch at the end of this reply):
- Implement a base config methodology with different apps for different aspects.
- Create apps like:
a) org_all_indexes: for consistent index definitions across the deployment.
b) org_idxer_volume_indexes: for indexer-specific configurations.
c) org_srch_volume_indexes: for search head-specific configurations.
3. Flexibility and scalability:
- This approach allows different storage tiers for indexers and search heads as needed.
- It maintains a consistent view of available indexes across the deployment while allowing for component-specific optimizations.
These practices will help create a more robust, manageable, and scalable Splunk infrastructure.
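As a rough sketch of how that split might look (index and volume names are just examples borrowed from this thread), org_all_indexes would carry only the index definitions:
# org_all_indexes/local/indexes.conf
[my_index]
homePath = volume:hot11/my_index/db
coldPath = volume:cold11/my_index/colddb
thawedPath = $SPLUNK_DB/my_index/thaweddb
while org_idxer_volume_indexes would carry the indexer-specific volume/storage definitions:
# org_idxer_volume_indexes/local/indexes.conf
[volume:hot11]
path = D:\Splunk-Hot-Warm
maxVolumeDataSizeMB = 1000000
[volume:cold11]
path = E:\Splunk-Cold
maxVolumeDataSizeMB = 12000000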
We are looking to deploy Edge Processors (EP) in a high availability configuration - with 2 EP systems per site and multiple sites. We need to use Edge Processors (or Heavy Forwarders, I guess?) to ingest and filter/transform the event logs before they leave our environment and go to our MSSP Splunk Cloud. Ideally, I want the Universal Forwarders (UF) to use the local site EPs. However, in the case that those are unavailable, I would like the UFs to fail over to the EPs at another site. I do not want the UFs to use the EPs at another site by default, as this will increase WAN costs, so I can't simply list all the servers in the defaultGroup. For example:
[tcpout]
defaultGroup=site_one_ingest

[tcpout:site_one_ingest]
disabled=false
server=10.1.0.1:9997,10.1.0.2:9997

[tcpout:site_two_ingest]
disabled=true
server=10.2.0.1:9997,10.2.0.2:9997
Is there any way to configure the UFs to prefer the local Edge Processors (site_one_ingest), but then fail over to the second site (site_two_ingest) if those systems are not available? Is it also possible for the configuration to support automated failback/recovery?
I am currently taking Udemy classes as well, and my Splunk Enterprise health is in the red. Who can help me with this?
Still a total newb here, so please be gentle. On Microsoft Windows 2019 servers we have an index cluster, and here's how the hot and cold volumes are defined on it:
C:\Program Files\Splunk\etc\system\local\indexes.conf
[default]

[volume:cold11]
path = E:\Splunk-Cold
maxVolumeDataSizeMB = 12000000

[volume:hot11]
path = D:\Splunk-Hot-Warm
maxVolumeDataSizeMB = 1000000

That I can live with, but on our search heads here's how we point to the volumes, and this doesn't look right to me:
C:\Program Files\Splunk\etc\apps\_1-LDC_COMMON\local\indexes.conf
[volume:cold11]
path = $SPLUNK_DB

[volume:hot11]
path = $SPLUNK_DB

Should the stanzas on the search heads match the ones on our indexers?
Thanks for the quick response! You saved me a lot of time.
Does Splunk for Cisco Identity Services (ISE) support data containing IPv6 addresses?  
Event type cannot "merge" multiple events. As simple as that. So either process your data prior to ingesting so that you have a whole login event containing all interesting fields or do summary indexing and create synthetic events after ingesting original events.
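For the summary indexing route, a scheduled search along these lines could do it (the index, sourcetype, and field names here are only placeholders):
index=auth sourcetype=my_logins | stats earliest(_time) AS _time values(user) AS user values(src_ip) AS src_ip BY session_id | collect index=login_summary
Each row written by collect then becomes a single synthetic event carrying all the interesting fields, which you can build your event type or searches on.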
The general answer is no - you have no indication in the main search whatsoever that your subsearches (regardless of whether this is a "straight" subsearch, append or join) have been finalized before full completion due to hitting limits. They are simply silently finalized, the results yielded so far are returned, and that's it. This is why using subsearches is tricky and they're best avoided unless you can make strong assumptions about their execution time and the size of their result set. Maybe, just maybe (I haven't checked it) you could retroactively find that information in _internal, but to be honest, I doubt it. The search itself doesn't return such metadata.
Ok. In order to reliably split a "variable format" time string you must have some strong assumptions you can make about it. For example, the order of the fields must be constant, the time specifier must be in a relatively well-defined format and so on. Otherwise you wouldn't be able to tell whether "10 23" means 10:23 AM or 23rd of October. Or maybe 10 minutes past some hour in 23rd day of some month. You must have something to anchor your extraction to.
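For example, if you can assume the string is always hour-then-minute but the separator varies, you could try the known formats in order and take whichever parses (a sketch; time_str is a placeholder field name):
| eval t=coalesce(strptime(time_str, "%H:%M"), strptime(time_str, "%H %M"))
Without an assumption like that, no amount of eval gymnastics will resolve the ambiguity.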
Unfortunately Dashboard Studio does not have an option for this. Are you able to make the visualization wider to make room for the numbers? Or you could convert the numbers into different units to reduce the number of digits.
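For instance, if the underlying field holds bytes (the field name is hypothetical), converting it to GB before charting cuts the digit count considerably:
| eval size_gb=round(bytes/1024/1024/1024, 2)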
How did you install TA-pfsense on the Search heads? 
No. In a SimpleXML dashboard you can include any custom JS which could potentially make your browser play a media file (which might not necessarily be the best idea). Dashboard Studio doesn't let you do this level of customization.
It looks like service downtime. Especially considering a sudden spike in throughput after a drop - the forwarders were pushing the queued data. Check your splunkd.log immediately before and after that outage.
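If that forwarder also ships its internal logs, you can check them from the search head directly; something along these lines (adjust the host and the time range around the outage):
index=_internal source=*splunkd.log* host=<your_forwarder> (log_level=ERROR OR log_level=WARN)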
It's also worth explaining why the where command is usually way slower than adding another condition to the original search (or adding another search command in the pipeline).

Firstly, Splunk is relatively smart, and when it sees
search condition1 | search condition2
it internally optimizes this and treats it as
search condition1 AND condition2
But that's a minor point here.

The major point (and it's really very important in understanding why some things work faster with Splunk than others) is _how_ Splunk searches the indexes for data. Your typical "other solution" (like an RDBMS or some object database which indexes documents) splits the data into discrete fields on ingestion and holds each of those fields in a separate "compartment" (we can call them columns in a database table, we can call them object properties, it doesn't matter here). So when it has to look for a key=value pair, the solution looks into the "drawer" called "key" and looks for "value".

Splunk (mostly; the exception being indexed fields) works the other way around. It stores the "values" in the form of tokens into which it splits the input data. During searching, if you search for a key=value condition, it searches for all events containing the "value" token and parses all of them to see if the value is in the proper place within the event to match the defined extraction for key. Of course, the more values you're looking for (because you have separate conditions for many fields containing separate values, like key1=value1 AND key2=value2 AND key3=value3 and so on), the lower the count of events containing all those values at the same time, and the fewer events Splunk has to actually parse to see if those field definitions match what you're searching for. So by adding more conditions to your search with AND, you're telling Splunk to consider fewer and fewer events.

But where does not work like that. Where works only as a streaming command and has to process all the events that come from the preceding command(s). So for example, if you have in your index 100,000 events of which 10,000 contain "value1" and 10,000 contain "value2" (1,000 of them overlap and contain both of those values), and you're searching for
index=myindex key1=value1 key2=value2
Splunk only has to parse the 1,000 events which contain both values at the same time to check whether they contain them in places corresponding to key1 and key2 respectively. But if you do
index=myindex key1=value1 | where key2="value2"
Splunk has to parse all 10,000 events containing value1 to see if they match key1. From the resulting set of that search it then needs to match all events where key2="value2". Even worse, if you just did
index=myindex | where key1="value1" AND key2="value2"
Splunk would have to read all 100k events from your index and parse those two fields out of them to later compare their values with the given condition.

To show you what difference that can make, an example from my home lab box:
index=winevents EventCode=4799 EventRecordID=461117
I ran this search over the last 30 days.
This search has completed and has returned 1 results by scanning 1 events in 0.278 seconds
EventRecordID is a pretty unique identifier, so Splunk already had only a single record to check.
If we move this condition to the where part:
index=winevents EventCode=4799 | where EventRecordID=461117
We get
This search has completed and has returned 1 results by scanning 9,768 events in 1.045 seconds
As you can see, Splunk had to do much more work, because I had 9,768 events which matched the value 4799 (and from the further job inspection, which I'm not pasting here, I can see that all of them were in the EventCode field), and all those events had to be processed further by the where command. It's still relatively fast, because 10k events is not that much, but it's about 4 times slower (the difference on bigger sets would be more noticeable - here a big part of the time used is just spawning the search).

If we move both conditions to the where part:
index=winevents | where EventCode=4799 AND EventRecordID=461117
We still get the same 1 result, which is not surprising, but...
This search has completed and has returned 1 results by scanning 63,740 events in 6.017 seconds
I have exactly 63,740 events in the winevents index, and they all had to be parsed and processed further down the pipeline by the where command. And it's no wonder that, since there are about 6 times more events to process than in the previous variant, it took about 6 times as much time.

So yes, where is a fairly sophisticated and flexible command letting you do many things that the ordinary search command won't, but the tighter you can "squeeze" your indexes with the initial search, the better the overall performance.
The exact search to produce a visualization would depend on which fields are extracted for your logs. Assuming they are normalized such that e.g. the field "user" and the field "status" are the same between the Windows and RHEL logs, then you could find the 5 users with the most failed logins for the past week with:
index=<yourwindowslogindex> OR index=<yourlinuxlogindex> earliest=-7d status="failed" | top limit=5 user
If the fields are not normalized, then you may need to extract them. In this case could you post some sanitized samples of the successful and failed login events? They should be retrievable by searching something like:
index=<yourindex> (EventCode=4624 OR EventCode=4625 OR "Login")
YES! That's what I'm looking for. I have both Windows and RHEL machines. I'm using the Cisco network app to track logins to the network on there, if that makes sense. I'd like to have it show logins over the course of 7 days with the top 5 users, like you were saying. That just makes sense. I'm learning a bunch of stuff.