> I have tried append like this and it has not worked:

When you have a requirement like this, the method @Richfez presented should be your first choice, rather than thinking in terms of append or appendcols.  Meanwhile, there shouldn't be any reason why append would not work if you combine the two rows with another stats:

index=index ("ProducerClass" AND "*Sending message:*") NOT "*REFRESH*" ``` ^^^ do not separate filter from index search ```
| stats count as actual_count
| append
    [search index=index "OpportunityClass" AND "Processing file: "
    | rex field=_raw "Processing file: file_name with (?<record_count>[^\s]+) records"
    | stats sum(record_count) as expected_count]
| stats values(*) as *
| eval percent = expected_count/actual_count * 100

Note in the first line: do not use a pipe to add a search filter if the filter applies directly to the index search.  Adding all applicable filters to the index search will greatly improve performance.

Similarly, there is no reason why appendcols will not work to your expectation:

index=index ("ProducerClass" AND "*Sending message:*") NOT "*REFRESH*" ``` ^^^ do not separate filter from index search ```
| stats count as actual_count
| appendcols
    [search index=index "OpportunityClass" AND "Processing file: "
    | rex field=_raw "Processing file: file_name with (?<record_count>[^\s]+) records"
    | stats sum(record_count) as expected_count]
| eval percent = expected_count/actual_count * 100

(If you think appendcols didn't work, you should post your search and give sample (anonymized) results, then explain why the results are wrong.)

Here are three simulations to illustrate how they give the same results.

1. append

index=_internal
| stats count as actual_count
| append
    [search index=_audit
    | stats count as expected_count]
| stats values(*) as *
| eval percent = round(expected_count/actual_count * 100, 2)

actual_count  expected_count  percent
237531        4686            1.97

2. appendcols

index=_internal
| stats count as actual_count
| appendcols
    [search index=_audit
    | stats count as expected_count]
| eval percent = round(expected_count/actual_count * 100, 2)

actual_count  expected_count  percent
236458        4660            1.97

3. Rich's method

index = _internal OR index = _audit
| eval is_actual = if(searchmatch("index = _internal"), "true", null())
| eval is_expected = if(searchmatch("index = _audit"), "true", null())
| stats count(is_actual) as actual_count count(is_expected) as expected_count
| eval percent = round(expected_count/actual_count * 100, 2)

actual_count  expected_count  percent
245978        4861            1.98

(Any difference in the illustrated numbers is due to data change between searches.)
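The append-then-stats merge above can also be sketched outside of SPL. Here is a minimal Python analog (using the toy numbers from simulation 1, not live data) showing why `stats values(*) as *` is needed: append produces two one-column rows, and the merge collapses them into a single row before the eval runs.

```python
# Two single-row "search results": the main search contributes actual_count,
# the append subsearch contributes expected_count -- on separate rows.
rows = [{"actual_count": 237531}, {"expected_count": 4686}]

# `stats values(*) as *` collapses the rows into one, keeping every field.
merged = {}
for row in rows:
    merged.update(row)

# `eval percent = round(expected_count/actual_count * 100, 2)`
merged["percent"] = round(merged["expected_count"] / merged["actual_count"] * 100, 2)
print(merged)  # {'actual_count': 237531, 'expected_count': 4686, 'percent': 1.97}
```

Without the merge step, the eval would run on each row separately and one of the two fields would always be null, which is exactly the symptom of a "broken" append.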
I have the following sample events coming from source="/project/admin/git/ys/es/perf/de/pure/abc0*/logs/*/results.csv"

Event 1, with no timestamp (this type of data is in files which are 2 days old):

abc|pxyz|0.1054|ops|0|null|null

Event 2, with a timestamp (these are new files from the same location, and going forward the data will look like this):

2024-02-23T00:48:17|AID|read|454482.351348|PS|0|null|null

I want to send the data that has a timestamp to Splunk and send the rest to the null queue, i.e. not ingest it.  First I tried MAX_DAYS_AGO=2, which did not work; then I tried the following props and transforms, but that did not work either.

transforms.conf:

[filter]
REGEX = ^^\D*
DEST_KEY = queue
FORMAT = nullQueue

props.conf:

CHARSET=AUTO
SHOULD_LINEMERGE=false
category=Custom
disabled=false
pulldown_type=true
TRANSFORMS-null=filter

Thanks in advance
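The core of this filtering problem is the regex: the attempted `REGEX = ^^\D*` has a doubled anchor, and `\D*` can match zero characters, so it matches every event. One way to express "does not begin with an ISO-8601 timestamp" is a negative lookahead. The sketch below tests that pattern in Python against the two sample events from the question; whether the same pattern works in transforms.conf depends on the rest of the configuration, so treat it as an illustration of the regex logic only.

```python
import re

# The two sample events from the question: one without a leading timestamp,
# one with.
no_ts = "abc|pxyz|0.1054|ops|0|null|null"
with_ts = "2024-02-23T00:48:17|AID|read|454482.351348|PS|0|null|null"

# Matches events that do NOT begin with an ISO-8601 timestamp -- these are
# the candidates for routing to nullQueue.
drop_pattern = re.compile(r"^(?!\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})")

print(bool(drop_pattern.match(no_ts)))    # True  -> would go to nullQueue
print(bool(drop_pattern.match(with_ts)))  # False -> would be kept
```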
Ok, from the start. The source. Syslog is meant as a simple, low-overhead protocol for sending data "locally". The source typically can send events to one destination. Some sources can send the same event to multiple destinations at the same time (sometimes in different formats). And that's it. You can't do any load balancing at the source level; at least I've never seen a source capable of something like that, and I've seen quite a few. Oh, and if you're sending UDP syslog, you can't verify reachability. So any syslog solution you come up with will have a single receiving point (for a given source). You might try an HA setup with multiple receivers in an active-passive arrangement, but that's it. I don't think anyone has bothered to implement a network-level load balancer for syslog (especially since syslog is usually meant to be sent only across your local network segment; it's generally not good practice to route syslog events across big networks). Beyond this point, you can do load balancing to downstream components.
As @gcusello said, the default retention for the _internal index is not 60 days. You must extend it. Check the other internal indexes too, to avoid unpleasant surprises.
A guess! I see nowhere where you specify the date or time of the contents of the files, so it's being "automatically figured out" and that process often goes wrong. Try searching index=admin over *all time* for any fairly unique string that is in one of the files.  I'll bet you'll find them 6 days in the future, or somehow ingested as if they were from 2015. If that's not the problem then it might be helpful to have a snippet of the first bits of one of those files. Happy Splunking! -Rich
I found this - https://community.snowflake.com/s/article/Integrating-Snowflake-and-Splunk-with-DBConnect It looks like it walks you through exactly how to do that. If that helps, karma's always appreciated! Happy Splunking Snowflake data! -Rich
I think you have to make all IP values CIDR (1.2.3.4/32, for example).
There's a few different techniques for combining things like this.  The one I think you might find most useful could be... OK, an example off some silly data I have.  Once I work through that and explain, I'll make an attempt at doing your searches too.  Anyway - Amazon Glacier uploads for my little server; every night it tries to push up new files.  I think it's similar enough to your data that the example may work, though forgive me for it being stupidly contrived in so many ways.  The idea is there are two messages.  One contains "uploaded part" and the other contains "created an upload_id".  I don't have a real good "number" to rex out, but I have a PID I can steal the first two digits of to pretend I have numbers.

index="glacier" ("uploaded part" OR "created an upload_id")
| eval is_actual = if(searchmatch("*created an upload_id*"), 1, 0)
| rex "PID\s+(?<dumb_counter>\d\d)"
| eval is_expected = if(searchmatch("*uploaded part*"), dumb_counter, 0)
| stats sum(is_expected) as is_expected, sum(is_actual) as is_actual

So looking at that, the first line gets all the data, both types. The second line uses an eval to create "is_actual"; when the event matches 'created an upload_id', is_actual will be set to 1, otherwise 0. The third line is a rex, just like yours only more dumb.  It creates a field "dumb_counter" which will either be a two-digit number, or will be null if it didn't match.  (Unfortunately, ALL my lines have a PID, so... this is broken, but I fix it in the next line using logic just like in line 2.) Line 4 then is the fix, where I eval 'is_expected' to either be the dumb_counter I wrote IF the line matches what I need it to match, or 0 if it doesn't.  (I don't think you'll need this extra logic, but I do and it was easy enough to explain!) Then the last line just adds up the two independently.  Afterwards you can easily do a new eval for percent or whatever.  We'll do this when we try YOUR search. And it's time for that now. 
We'll use the same technique, only it'll be messier because you have more conditions to work with.

index=index (("ProducerClass" AND "*Sending message:*") NOT "*REFRESH*") OR ("OpportunityClass" AND "Processing file: file_name")
| eval is_actual = if(searchmatch("*ProducerClass*") AND searchmatch("*Sending message:*"), 1, 0)
| rex field=_raw "Processing file: file_name with (?<record_count>[^\s]+) records"
| eval is_expected = if(searchmatch("*OpportunityClass*") AND searchmatch("*Processing file: *"), record_count, 0)
| stats sum(is_expected) as is_expected, sum(is_actual) as is_actual
| eval percent = (is_expected / is_actual) * 100

Again, line 1 pulls all the data in.  (Special note: you use NOT..., which means those records won't be there and we can ignore them in the eval, you'll see!) Line 2 creates our is_actual.  This line could be left here or moved to after the rex - it won't really matter. Line 3 is our rex to get the record count... which in line 4 we convert into a new field 'is_expected' ONLY if the event is the right event - this is very, very likely to not be needed; you could extract the field in line 3 with the name 'is_expected', remove this line, and it probably should all work the same.  But we're being careful here.  Then we just sum those in line 5, and do some math in line 6. So of special note! If "file_name" actually stands in for the filename, which changes, we'll have to work around that with a wildcard or something.  OR if you can drop in a line from each event type (appropriately obfuscated, of course) then we can just work it using one of the other methods. For instance, we may be able to ignore "file_name" in the base search, then just edit the rex a wee bit to work around it later, too.   Anyhow, give those a shot, and if it works for you (or is easily "fixed", because I'm sure there's some typos in it), then great!  Otherwise, let us know what's happening and we can help more.   Happy Splunking, Rich  
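The conditional count-and-sum technique above can be sketched in plain Python to make the mechanics concrete. The events below are hypothetical stand-ins for the two message types (not real log lines); each loop body mirrors one line of the SPL.

```python
import re

# Hypothetical raw events of both types, as the combined base search
# would return them.
events = [
    "ProducerClass ... Sending message: payload-1",
    "ProducerClass ... Sending message: payload-2",
    "OpportunityClass Processing file: file_name with 5 records",
    "OpportunityClass Processing file: file_name with 7 records",
]

actual_count = 0    # plays the role of stats sum(is_actual)
expected_count = 0  # plays the role of stats sum(is_expected)
for raw in events:
    # eval is_actual = if(searchmatch(...), 1, 0)
    if "ProducerClass" in raw and "Sending message:" in raw:
        actual_count += 1
    # rex + eval is_expected: take record_count only from the right events
    m = re.search(r"Processing file: file_name with (\S+) records", raw)
    if m and "OpportunityClass" in raw:
        expected_count += int(m.group(1))

percent = expected_count / actual_count * 100
print(actual_count, expected_count, percent)  # 2 12 600.0
```

The key point, same as in the SPL: one pass over all events, with per-event conditionals deciding which aggregate each event feeds, so no append or subsearch is needed.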
Hi @Rao_KGY , probably the issue is the one indicated by @ITWhisperer: are you sure you are using the same time frame in both panels? In addition, in your search you don't need the table command, and you could use the timechart command (https://docs.splunk.com/Documentation/SCS/current/SearchReference/TimechartCommandUsage):

index=app_pl "com.thehartford.pl.model.exception.CscServiceException: null at com.thehartford.pl.rest.UserProfileController.buildUserProfile"
| timechart count AS Failure span=1h

Ciao. Giuseppe
Hi @scout29, let me understand: you want the number of events by host in the last hour, and the hourly average over the last seven days, is that correct? Please try this:

| tstats count WHERE index=* BY host _time span=1h
| stats avg(count) AS Average values(eval(if(_time>=now()-3600,count,0))) AS "Last hour" BY host

Ciao. Giuseppe
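The two-stage pattern above (bucket counts by hour, then average the buckets per host) can be sketched in Python with hypothetical timestamps; `NOW` and the event tuples are made-up values, not real data.

```python
from collections import Counter

NOW = 1_700_000_000  # fixed "now" for the sketch (epoch seconds)

# (host, epoch-seconds) event tuples -- hypothetical data
events = [("web1", NOW - 100), ("web1", NOW - 200),
          ("web1", NOW - 7200), ("web1", NOW - 10800)]

# Stage 1, like `tstats count BY host _time span=1h`:
# count events per (host, hour bucket).
hourly = Counter((host, ts // 3600) for host, ts in events)

# Stage 2, like `stats avg(count) ... BY host`, plus the last hour's count.
for host in sorted({h for h, _ in hourly}):
    counts = [c for (h, _), c in hourly.items() if h == host]
    average = sum(counts) / len(counts)
    last_hour = sum(1 for h, ts in events if h == host and ts >= NOW - 3600)
    print(host, round(average, 2), last_hour)  # web1 1.33 2
```

The point of the two stages is that the average is taken over hourly buckets, not over raw events; averaging raw events directly would not give an hourly rate.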
Hi @rbakeredfi , I usually enable wineventlog, and possibly some scripted inputs. Ciao. Giuseppe
Hi @karthi2809, don't use join, Splunk isn't a DB; use stats or something similar to this:

index="xxx" applicationName="api" (environment=$env$ timestamp correlationId trace message ("Ondemand Started*" OR "Expense Process started") OR (trace=ERROR) OR (message="*Before Calling flow archive-Concur*"))
| rename sourceFileName as SourceFileName content.JobName as JobName
| eval "FileName/JobName"= coalesce(SourceFileName,JobName)
| rename timestamp as Timestamp correlationId as CorrelationId tracePoint as Tracepoint message as Message
| eval JobType=case(like('Message',"%Ondemand Started%"), "OnDemand", like('Message',"Expense Process started%"), "Scheduled", true(), "Unknown")
| eval Message=trim(Message,"\"")
| rename correlationId as CorrelationId trace as TracePoint message as StatusMessage
| rename correlationId AS CorrelationId content.loggerPayload.archiveFileName AS ArchivedFileName
| stats earliest(Timestamp) AS Timestamp values(Tracepoint) AS Tracepoint values(JobType) AS JobType values("FileName/JobName") AS "FileName/JobName" values(Message) AS Message values(StatusMessage) AS StatusMessage values(ArchivedFileName) AS ArchivedFileName BY CorrelationId

In other words: put all the searches in OR in the main search, apply all the renames and evals, and finally correlate the results using the join key in a stats command. If you want some additional field, add it to the stats command. Ciao. Giuseppe
Hi @karthi2809, as I said in the previous answer: don't use join, Splunk isn't a DB; use stats or something similar to this:

index="xxx" applicationName="api" (environment=$env$ timestamp correlationId trace message ("Ondemand Started*" OR "Expense Process started") OR (trace=ERROR) OR (message="*Before Calling flow archive-Concur*"))
| rename sourceFileName as SourceFileName content.JobName as JobName
| eval "FileName/JobName"= coalesce(SourceFileName,JobName)
| rename timestamp as Timestamp correlationId as CorrelationId tracePoint as Tracepoint message as Message
| eval JobType=case(like('Message',"%Ondemand Started%"), "OnDemand", like('Message',"Expense Process started%"), "Scheduled", true(), "Unknown")
| eval Message=trim(Message,"\"")
| rename correlationId as CorrelationId trace as TracePoint message as StatusMessage
| rename correlationId AS CorrelationId content.loggerPayload.archiveFileName AS ArchivedFileName
| stats earliest(Timestamp) AS Timestamp values(Tracepoint) AS Tracepoint values(JobType) AS JobType values("FileName/JobName") AS "FileName/JobName" values(Message) AS Message values(StatusMessage) AS StatusMessage values(ArchivedFileName) AS ArchivedFileName BY CorrelationId

In other words: put all the searches in OR in the main search, apply all the renames and evals, and finally correlate the results using the join key in a stats command. If you want some additional field, add it to the stats command. Ciao. Giuseppe
Taking "Splunk Add-on for Microsoft Windows (https://splunkbase.splunk.com/app/742)" for example, all of those inputs are disabled by default. The root of my question is which inputs should remain disabled to avoid causing issues with a Heavy Forwarder. For example, if Splunk logging/forwarding logs were logged to the local wineventlogs:application, then would enabling wineventlogs:application cause an infinite loop of logging that it is logging?
How should I format the lookup definition so that it matches both CIDR entries and individual IPs? What I mean is: if I go to advanced settings and change the match criteria to CIDR(IP), it's not matching the entries where the IP is a single IP and not a CIDR.
Hi, I have two separate searches that work independently (expected count, actual count).  I want to combine the searches to get a percentage of actual count to expected count; however append, appendcols, and other ways to add the searches together have so far not worked for me.  Curious if there's a better way to use stats, eval, or transaction commands to achieve the combination of these searches. The end goal is to provide a visualization to understand whether there's an issue when the actual count does not match the expected count - so I'm open to suggestions on better ways to achieve that goal.

Search 1 (counting all records sent through the producer class that are not part of the refresh process):

index=index
| search ("ProducerClass" AND "*Sending message:*") NOT "*REFRESH*"
| stats count as actual_count

Search 2 (sum of record counts on files processed through the opportunity class):

index=index
| search "OpportunityClass" AND "Processing file: file_name"
| rex field=_raw "Processing file: file_name with (?<record_count>[^\s]+) records"
| stats sum(record_count) as expected_count

I have tried append like this and it has not worked:

index=index
| search ("ProducerClass" AND "*Sending message:*") NOT "*REFRESH*"
| stats count as actual_count
| append
    [search index=index "OpportunityClass" AND "Processing file: "
    | rex field=_raw "Processing file: file_name with (?<record_count>[^\s]+) records"
    | stats sum(record_count) as expected_count]
| eval percent = expected_count/actual_count * 100

appendcols similarly did not work ("Aborting Long Running Search").  I assume I am misunderstanding how to combine these searches, and that is causing issues when using append-type commands.  Using an OR on the searches works, but I'm unsure how to use other commands to group the results properly afterwards:

index=index
| search (("ProducerClass" AND "*Sending message:*") NOT "*REFRESH*") OR ("OpportunityClass" AND "Processing file: ")
| ...
Unless the lookup is defined with the CIDR, WILDCARD, or case-insensitive option, it will do exact string matching.  Even with those options, there is no way to get "1.2.3.4" to match "['1.2.3.4', '2.3.5.0/24']".  The multiple IP addresses in each line should be put on separate lines:

IP          Name
1.2.3.4     name1
2.3.5.0/24  name1
1.2.3.4     name2
6.7.8.9/31  name2
4.5.6.7     name2
1.1.1.1     name2
3.3.3.3/31  name3
4.4.4.4     name3

Note that the IP field is ambiguous because both name1 and name2 share an IP address.  The lookup command will return only the first matching IP.
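The CIDR-aware, first-match behavior described above can be sketched with Python's `ipaddress` module. The rows below are a hypothetical subset of the lookup (after splitting the multi-valued cells onto separate lines); the key trick is that a bare address like 1.2.3.4 behaves like the one-address network 1.2.3.4/32, which is why mixed exact-IP and CIDR rows can coexist in one column.

```python
import ipaddress

# Hypothetical lookup rows: (IP-or-CIDR, Name), one value per row.
lookup = [("1.2.3.4", "name1"), ("2.3.5.0/24", "name1"),
          ("6.7.8.9/31", "name2"), ("4.5.6.7", "name2")]

def match(ip):
    """Return the Name of the first row whose IP column contains `ip`.

    ip_network() treats a bare address as a /32 network, so exact-IP rows
    and CIDR rows are handled uniformly.
    """
    addr = ipaddress.ip_address(ip)
    for cidr, name in lookup:
        if addr in ipaddress.ip_network(cidr, strict=False):
            return name
    return None

print(match("2.3.5.77"))  # name1 (falls inside 2.3.5.0/24)
print(match("1.2.3.4"))   # name1 (first matching row wins)
print(match("9.9.9.9"))   # None  (no row matches)
```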
1. Check out the Workload Management feature.  https://docs.splunk.com/Documentation/SplunkCloud/9.0.2305/Admin/WorkloadManagement 2. That's about as much art as it is science.  The Search Manual has a chapter on it that should get you started.  https://docs.splunk.com/Documentation/Splunk/latest/Search/Aboutoptimization
Your timeframes could be different?
No, not confusing the two.. I'm well aware of the differences. My scenario is that I have about a hundred devices, all sending syslog data between two receivers in two sites currently, which is then picked up by two UFs and forwarded to Splunk Cloud. When those two sites get rolled up together into a single colo, I'll need to combine them (for lack of better words). Hence the load balancing.  HA would be fine, if I could determine whether both the UF and rsyslog can operate in a high-availability setting. Pretty sure the former is possible.. the latter, though, is not, from what I understand.