Hello All,
I have a large dataset "audit.cost_records" wherein I am trying to locate a correlation based on a large number of fields. These fields are large in number (over 1000 total) and many can be grouped (for my current purposes). Some groups may consist of 10+ fields while others may be only 1.
Some example field names are:
ab, ab-HE, ab-SC, ab-LS, rs, rs-SH, rz, xr, xr-FL, xr-SH, xr-SS
in this example, all of the ab items should be grouped, as with rs, and xr. Unfortunately, I am new to splunk and my understanding of the splunk language is elementary at best. I do have somewhat advanced or at least jorneyman knowledge in SQL and basic in a few programming languages (Java and the like). Unfortunately that doesn't seem to be helping me here.
Based on several hours of searching this community and trial and error I have arrived at the below. I was trying to use wildcards to group by similar field names. I've just read somewhere that Splunk may segment the field based on the '-' character, which makes my wildcard not work as I intend.
| from datamodel:"AUDIT.COST_RECORDS"
| eval Group1=if(match(fieldName,"ab*"), "ABGroup", Group1)
| eval Group1=if(match(fieldnName,"rs*"),"RSGroup",Group1)
| timechart span=30d sum(cost) as Cost by Group1
Does anyone have any recommendations on how to solve this search? My overall intent is to have a year-to-date line chart (spanned monthly) showing cost over time for each "Group".
Firstly, if these are all constant sets of fields, you could add calculated field(s) to your datamodel.
Secondly, match() function matches using regex. So, for example, "ab*" means letter a followed by any number of letters b (possibly zero).
But most importantly, it's not exactly clear how your dataset looks. Do you indeed get a field called "fieldName" with a value of ab, ab-HE and so on? Or do you have fields named ab, ab-HE and so on with values in them?
Because it looks as if you were trying to iterate over field names and it doesn't work that way (at least not if written like that)
Hello Mr.Rick and thank you for the prompt response. To clarify (and thank you for pointing this out).
I have a field called fieldName, that contains values {ab, ab-HE, ab-XO...} etc. and I am looking to group the cost of these like records by similar fieldName's.
And correct, I am not trying to find a letter a followed by 0-infinity b's. I am looking for the letters 'ab' followed by 0-inifinity anythings (most likely a dash followed by 2 letters XD, but also potentially not followed by anything at all).
That being said, I did come accross a substr function that I was trying to get to work as well,but I'm only getting NULL. The intent would be to only consider the first 2 characters of the fieldName field and group by that, which should get me the results I am looking for.
| from datamodel:"AUDIT.COST_RECORDS"
| eval temp=substr(fieldName,1,2)
| timechart span=30d sum(cost) as Cost by temp
That seems a reasonable way to do it, as far as I understand your data 🙂
And about the clarification - it's typical that something that seems to be obvious for us, since it's our data and we know it like the back of our hand might not be that easily understandable for others. So it's good to be precise about what we have and what we're trying to achieve. substr() seems indeed a good candidate for your groupping. If you had more complicated groupping rules, you might look into a lookup which could produce a desired group value from input provided by your field. It's a nice flexible way of creating static 1 to 1 or many to 1 mappings.