Hello,
So I am pulling together a checklist of things to ensure initial and ongoing log data quality. This is obviously a pretty broad and universal topic, and one I'm certain a PhD or two has been written on. My plan was to first pull together a list of things to check for (the checklist) and then work out how I might apply some automated checking within Splunk. I started some research and plenty of thinking, and it occurred to me that there are probably plenty of people in this community who have already done this and have most likely come up with something much better than I will. So I put it to you (the Splunk>answers community):
What do you look for, and how, to ensure initial log data quality? And
How do you ensure that quality is maintained over time?
So you don't feel I am entirely shirking my own cerebral responsibilities, this is what I have:
What to look for
1 - All expected logs are being indexed
2 - All logs are being indexed as expected (e.g. they are complete, there is no truncation and/or concatenation)
3 - _time appropriately matches log generation time-stamp
4 - I have appropriately applied field extractions (e.g. they appear on every event where expected and the extracted values are complete, with no truncation or concatenation)
How do I ensure initial log data quality?
1 - I perform a manual check on the log source to see what has been generated and then run a query on the Splunk Search Head to confirm I can see all the different types of logs (see the sketch for point 1 after this list)
2 - I manually perform ad hoc queries across the log source and all relevant source types to see if there is any obvious truncation or concatenation (sketch for point 2 below). Following this, and provided I am confident in the quality of my regex, I use field extractions which I expect to appear on every log (e.g. Active Directory Event ID) and check to see whether they appear on 100% of my events.
3 - I manually check each source and sourcetype to ensure Splunk is interpreting the correct time-stamp, especially if the log is not being ingested directly from the log source, e.g. via a syslog repository where the log data may be prepended with the syslog time-stamp (sketch for point 3 below)
4 - I individually test every field extraction over "All time" to ensure it appears in 100% of my events (if expected to do so) and then visually check the values by rendering them in a table, after applying a dedup of course (sketch for point 4 below).
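Sketch for point 1 - a tstats search to confirm which hosts and source types are actually reporting, for comparison against your inventory of expected log sources (the index name is just a placeholder for your own):
| tstats count latest(_time) as lastTime where index=your_index by host, sourcetype
| convert ctime(lastTime)
| table host, sourcetype, count, lastTime
Anything on your expected-sources list that doesn't show up here hasn't been indexed at all.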
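Sketch for point 2 - hunting for truncation or badly broken events; the 10000-byte figure assumes the default TRUNCATE value in props.conf, and the index/sourcetype are placeholders:
index=your_index sourcetype=your_sourcetype
| eval raw_length=len(_raw)
| stats count max(raw_length) as max_length avg(linecount) as avg_linecount max(linecount) as max_linecount by sourcetype
| eval possible_truncation=if(max_length>=10000, "check TRUNCATE in props.conf", "looks ok")
Events sitting right at the truncation limit, or an unexpectedly large linecount, are usually the first sign of truncation or concatenation.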
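Sketch for point 3 - comparing the parsed time-stamp (_time) against the time Splunk actually indexed the event (_indextime); a large or negative gap usually points to a time-stamp or timezone problem (the index name is a placeholder and the 24-hour window is arbitrary):
index=your_index earliest=-24h
| eval lag_seconds=_indextime-_time
| stats count avg(lag_seconds) as avg_lag min(lag_seconds) as min_lag max(lag_seconds) as max_lag by sourcetype, host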
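Sketch for point 4 - measuring field coverage rather than eyeballing it; EventCode here is just an example of a field that should be present on every event of the sourcetype:
index=your_index sourcetype=your_sourcetype
| stats count as total_events count(EventCode) as events_with_EventCode
| eval coverage_pct=round(100*events_with_EventCode/total_events, 2)
The fieldsummary command is also handy here, as it returns a count and distinct_count for every field in one pass.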
How do I ensure the data quality is maintained over time?
1 - I have scheduled a daily query which identifies any sourcetype that has not had anything indexed in over 25 hours:
| metadata index=* type=sourcetypes | eval age = now()-lastTime | where age > 90000
| sort age d | convert ctime(lastTime) | fields age,sourcetype,lastTime
2 - No ongoing automated process for this one yet, though there is a possible approach sketched after this list
3 - No ongoing automated process for this one either (sketched after this list)
4 - No ongoing automated process for this one (also sketched after this list)
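Sketch for point 2 - Splunk writes its own parsing complaints to _internal, so a scheduled alert on the line-breaking components catches truncation and event-breaking problems as they start (the component names are the ones I understand splunkd uses for these warnings, so worth verifying against your own _internal data first):
index=_internal source=*splunkd.log* (log_level=WARN OR log_level=ERROR) (component=LineBreakingProcessor OR component=AggregatorMiniQueue)
| stats count by host, component, log_level
Schedule it hourly or daily and trigger the alert when the number of results is greater than zero.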
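Sketch for point 3 - the same _indextime comparison as in the initial checks, scheduled and reduced to a threshold; the one-hour threshold and four-hour window are arbitrary starting points:
index=* earliest=-4h
| eval lag_seconds=abs(_indextime-_time)
| stats max(lag_seconds) as max_lag by index, sourcetype
| where max_lag > 3600
DateParserVerbose warnings in _internal are also worth alerting on, as they flag events where Splunk struggled to parse the time-stamp at all.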
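Sketch for point 4 - a scheduled version of the coverage check, alerting when a field that should always be present drops below 100% coverage (EventCode, the index and the sourcetype are placeholders):
index=your_index sourcetype=your_sourcetype earliest=-24h
| stats count as total_events count(EventCode) as events_with_EventCode by sourcetype
| eval coverage_pct=round(100*events_with_EventCode/total_events, 2)
| where coverage_pct < 100
Run it daily and alert when it returns any rows.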
In terms of how I identify data quality issues, I feel my approach is too reliant on me visually detecting an anomaly. I appreciate that in some circumstances this is the best I can expect, but in others I'm sure there's a better way.
Please, please, I invite you to critique my approach and to pass on what you do, as I think this will provide significant value to many more people than just myself.
Many thanks,
P