Thank you for your response. To achieve this, we will run the query every 7 days in a loop from the end time back to the earliest start time of the data, and write the results to the intermediate index.
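For reference, one iteration of that backfill could look something like the sketch below, run once per 7-day window (the index names, source type, and aggregation here are placeholders, not part of the original setup; the earliest/latest bounds would step back by 7 days on each iteration):

``` one 7-day window of the backfill; adjust the window bounds per iteration ```
index=your_index sourcetype=your_sourcetype earliest=-7d@d latest=@d
| stats count by some_field
| collect index=your_intermediate_index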
Thanks guys!

index=sky sourcetype=sky_trade_murex_timestamp OR sourcetype=mx_to_sky
``` Parse sky_trade_murex_timestamp events (note that trade_id is put directly into the NB field) ```
| rex field=_raw "trade_id=\"(?<NB>\d+)\""
| rex field=_raw "mx_status=\"(?<mx_status>[^\"]+)\""
| rex field=_raw "sky_id=\"(?<sky_id>\d+)\""
| rex field=_raw "event_id=\"(?<event_id>\d+)\""
| rex field=_raw "operation=\"(?<operation>[^\"]+)\""
| rex field=_raw "action=\"(?<action>[^\"]+)\""
| rex field=_raw "tradebooking_sgp=\"(?<tradebooking_sgp>[^\"]+)\""
| rex field=_raw "portfolio_name=\"(?<portfolio_name>[^\"]+)\""
| rex field=_raw "portfolio_entity=\"(?<portfolio_entity>[^\"]+)\""
| rex field=_raw "trade_type=\"(?<trade_type>[^\"]+)\""
``` Parse mx_to_sky events ```
| rex field=_raw "(?<NB>[^;]+);(?<TRN_STATUS>[^;]+);(?<NOMINAL>[^;]+);(?<CURRENCY>[^;]+);(?<TRN_FMLY>[^;]+);(?<TRN_GRP>[^;]+);(?<TRN_TYPE>[^;]*);(?<BPFOLIO>[^;]*);(?<SPFOLIO>[^;]*)"
``` Reduce to just the fields of interest ```
| fields sky_id, NB, event_id, mx_status, operation, action, tradebooking_sgp, portfolio_name, portfolio_entity, trade_type, TRN_STATUS, NOMINAL, CURRENCY, TRN_FMLY, TRN_GRP, TRN_TYPE, BPFOLIO, SPFOLIO
``` "Join" events by NB using stats ```
| head 1000
| stats values(*) as * by NB
| fillnull
| head 1
| transpose 0

Now it's this: I get about 1000 events for an hour range (estimated), and results as shown:

column            row 1
NB                0
action            0
event_id          0
mx_status         live
operation         sydeod
portfolio_entity  SG KOREA USA ...
portfolio_name    AUD APT ...
sky_id            673821 ...
trade_type        VanillaSwap
tradebooking_sgp  2024/12/26 00:06:34.3572 ...
Just ask the "Men with the Black Fez"     
I worked it out just now too with appendpipe. Thanks a lot for your detailed response. I should have marked your response as the solution. Thanks again!
Fixed it myself with appendpipe
If your base search includes N1, N2, N3, ... Nn, you can add additional logic to generate the N-value dynamically:

| appendpipe
    [| search Category=N*
     | stats count sum(*) as *
     | eval Category="N".tostring(count+1)
     | fields - count]

Or if you want to subtotal all Category values:

| appendpipe
    [| rex field=Category "(?<Category>[^\d]+)"
     | stats count sum(*) as * by Category
     | eval Category=Category.tostring(count+1)
     | fields - count]

=>

Category  A  B  C  D  E  F
N1        1  2  4  2  4  1
N2        0  5  4  3  5  7
M1        1  0  1  0  4  3
M2        1  1  3  5  0  1
U1        0  4  6  5  4  3
M3        2  1  4  5  4  4
N3        1  7  8  5  9  8
U2        0  4  6  5  4  3

But note that you'll need to add custom sorting logic if you want something other than the default sort order.
Ah, that detail wasn't clear from the original message. You can use the appendpipe command to stream the base search results through a subsearch and then append the subsearch results to the base search results:

| makeresults format=csv data="Category,A,B,C,D,E,F
N1,1,2,4,2,4,1
N2,0,5,4,3,5,7
M1,1,0,1,0,4,3
M2,1,1,3,5,0,1
U1,0,4,6,5,4,3"
| table Category *
| appendpipe
    [| search Category=N*
     | stats sum(*) as *
     | eval Category="N3"]

=>

Category  A  B  C  D  E  F
N1        1  2  4  2  4  1
N2        0  5  4  3  5  7
M1        1  0  1  0  4  3
M2        1  1  3  5  0  1
U1        0  4  6  5  4  3
N3        1  7  8  5  9  8
Sorry, I'm still not quite getting it. This is what I would like to achieve:

Category  A  B  C  D  E  F
N1        1  2  4  2  4  1
N2        0  5  4  3  5  7
M1        1  0  1  0  4  3
M2        1  1  3  5  0  1
U1        0  4  6  5  4  3

I would like to create an additional row that is the sum of N1 and N2 and append it to the table above:

N3        1  7  8  5  9  8
You can put whatever you want in the label argument. It sets the value of the field specified by labelfield in the totals row:

| addcoltotals labelfield=Category label="Total"
| addcoltotals labelfield=Category label="SUM[n1..nN]"
| addcoltotals labelfield=Category label="Life, the Universe and Everything"
Hi, thanks for the reply. I have more rows than just N1 and N2.
Hi @MachaMilkshake,

The addcoltotals command should do exactly what you need:

| addcoltotals labelfield=Category label=N3
Hi @ranjith4,

What is the aggregate throughput of all sources? If you're unsure, what is the peak daily ingest of all sources?

Splunk Universal Forwarder uses very conservative default queue sizes and a throughput limit of 256 KBps. As a starting point, you can disable the throughput limit in $SPLUNK_HOME/etc/system/local/limits.conf:

[thruput]
maxKBps = 0

If the forwarder is still not delivering data as quickly as it arrives, we can adjust output queue sizes based on your throughput (see Little's Law).

As @PickleRick noted, the forwarder may be switching to an effectively single-threaded batch mode when reading files larger than 20 MB. Increase the min_batch_size_bytes setting in limits.conf to a value larger than your largest daily file, or some other arbitrarily large value:

[default]
# 1 GB
min_batch_size_bytes = 1073741824

If throughput is still an issue, you can enable additional parallel processing with the server.conf parallelIngestionPipelines setting, but I wouldn't do that until after tuning the other settings.
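For completeness, if you do eventually get to that last step, the setting lives in server.conf on the forwarder. A minimal sketch, assuming two pipelines (the right number depends on spare CPU and disk I/O on the host, so treat the value as an example):

# $SPLUNK_HOME/etc/system/local/server.conf
[general]
parallelIngestionPipelines = 2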
I have created a table with a series of outer joins. I now have a column 'Category' and another 6 columns, call them A to F. In Category, I have values N1 and N2, and I would like to create a new row with Category=N3 and values for A to F equal to the sum of those for N1 and N2. I've tried every possible thing I could find but couldn't get it to work; any help is appreciated, thanks!

Current code looks something like this:

... queries & joins ...
| table "Category" "A" ... "F"
You can use the regex command to filter by a regular expression, but it's slower and more cumbersome than just combining TERM() functions in a search predicate. As alternatives, you can extract and normalize a mac field at index time with a combination of transforms, or you can create a single-field data model that acts as a secondary time series index.

For the latter, create a search-time field extraction using a transform with MV_ADD = true to capture strings that look like MAC addresses matching your 48-bit patterns (xx-xx-xx-xx-xx-xx, xx:xx:xx:xx:xx:xx, and xxxx.xxxx.xxxx). For example, using source type mac_addr:

# props.conf
[mac_addr]
REPORT-raw_mac = raw_mac

# transforms.conf
[raw_mac]
CLEAN_KEYS = 0
MV_ADD = 1
REGEX = (?<raw_mac>(?<![-.:])\b(?:[0-9A-Fa-f]{2}(?:(?(2)(?:\2)|([-:]?))[0-9A-Fa-f]{2}){5}|[0-9A-Fa-f]{4}(?:(\.)[0-9A-Fa-f]{4}){2})\b(?!\2|\3))

Create a subsequent calculated (eval) field that removes separators:

# props.conf
[mac_addr]
REPORT-raw_mac = raw_mac
EVAL-mac = mvdedup(mvmap(raw_mac, replace(raw_mac, "[-.:]", "")))

Then, define and accelerate a data model with a single dataset and field:

# datamodels.conf
[my_mac_datamodel]
acceleration = true
# 1 month, for example
acceleration.earliest_time = -1mon
acceleration.hunk.dfs_block_size = 0

# data/models/my_mac_datamodel.json
{
  "modelName": "my_mac_datamodel",
  "displayName": "my_mac_datamodel",
  "description": "",
  "objectSummary": {
    "Event-Based": 0,
    "Transaction-Based": 0,
    "Search-Based": 1
  },
  "objects": [
    {
      "objectName": "my_mac_dataset",
      "displayName": "my_mac_dataset",
      "parentName": "BaseSearch",
      "comment": "",
      "fields": [
        {
          "fieldName": "mac",
          "owner": "my_mac_dataset",
          "type": "string",
          "fieldSearch": "mac=*",
          "required": true,
          "multivalue": false,
          "hidden": false,
          "editable": true,
          "displayName": "mac",
          "comment": ""
        }
      ],
      "calculations": [],
      "constraints": [],
      "lineage": "my_mac_dataset",
      "baseSearch": "index=main sourcetype=mac_addr"
    }
  ],
  "objectNameList": [
    "my_mac_dataset"
  ]
}

All of the above can be added to a search head using Splunk Web settings in the following order:

1. Define the shared field transformation.
2. Define the shared field extraction.
3. Define the shared calculated field.
4. Define the shared data model.

Finally, use the datamodel command to optimize the search:

| datamodel summariesonly=t my_mac_datamodel my_mac_dataset flat
| search mac=12EA5F7211AB

Note that some undocumented conditions (source type renaming?) may force Splunk to disable the optimizations used by the datamodel command when distributing the search, in which case it will be no faster than a regular search of the extracted mac field. If it's working correctly, the search log should include an optimized search with a READ_SUMMARY directive as well as various ReadSummaryDirective log entries. The datamodel command with the flat argument will return the raw events and the undecorated mac field values, but no other extractions will be performed.
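For comparison, the TERM() predicate approach mentioned at the top might look roughly like the sketch below for a single address in its three raw forms (the index, source type, and address are placeholders):

index=main sourcetype=mac_addr (TERM(12-ea-5f-72-11-ab) OR TERM(12:ea:5f:72:11:ab) OR TERM(12ea.5f72.11ab))

The idea is that the separators are minor breakers, so each TERM() can match the whole address as a single indexed term rather than its individual hex pairs.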
Good catch. I remembered that dedup was needlessly used in both "compound searches" but didn't notice that it was transferred to the "composite search". Indeed, it only leaves us with one of the two or more "joinable" events.
Be aware though that it's not _searching_ for a particular MAC address - it's extraction. So if you want to find a specific MAC, you'll have to first extract it with rex _from every event_ and then compare the extracted value with what you're looking for. It's not very efficient performance-wise.
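Concretely, that extract-then-filter pattern would look something like this sketch (the index, source type, simplified pattern, and target address are placeholders):

index=main sourcetype=mac_addr
| rex max_match=0 "(?<raw_mac>(?:[0-9A-Fa-f]{2}[-:]){5}[0-9A-Fa-f]{2}|[0-9A-Fa-f]{4}(?:\.[0-9A-Fa-f]{4}){2})"
| eval mac=mvmap(raw_mac, replace(raw_mac, "[-.:]", ""))
| search mac=12EA5F7211AB

Every event has to pass through the rex and eval before the final filter, which is exactly the cost being pointed out.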
The overall architecture is OK. There might be some issues with the configuration. If the delay is consistent and constant, it might be a problem with timestamps. If the data is being read in batches, you're probably ingesting from already-rotated files.
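One quick way to tell the two cases apart, as a sketch (substitute your own index and source type): a flat, constant lag points at timestamp or timezone parsing, while a sawtooth pattern points at batch reads of rotated files.

index=your_index sourcetype=your_sourcetype
| eval lag=_indextime-_time
| timechart span=5m avg(lag) max(lag)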
Remove the dedup NB! It is reducing your events to one event per NB, which is why you are only getting half your data!
That is very strange, because it suggests that you're creating many more fields in this search. Just for testing, replace the last stats command with:

| head 1000
| stats values(*) as * by NB
| fillnull
| head 1
| transpose 0

Oh, and since you're doing stats values anyway, the dedup command is not needed. It can in fact give you a performance penalty, because dedup is a centralized command while all preceding ones are distributed streaming, and stats can be distributed to some extent.
Thank you! While not the solution I was hoping for, this'll get the job done easily enough. I'd actually already considered using the rex command, but wasn't able to get my regex to look neat enough for me to be happy with it.