I have millions of values indexed that look like this
,A}MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_no_event_id_total_value_season_percent_stars_33097521 A}MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_no_event_id_total_value_season_percent_20709664 A}MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_586_user_by_outfit_32587030 A}MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_592_impression_33141624 ,A}MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_no_event_id_total_value_season_percent_stars_33952008 A}MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_no_event_id_total_value_new_33208512 A}MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_no_event_id_total_value_new_stars_32270501 ,A}MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_no_event_id_total_value_season_percent_stars_32635194 ,A}MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_no_event_id_total_value_season_percent_stars_32635194 A}USER_V2_ID_TO_CAREER_TIER_1417875 A}US_MC_USER_EVENTS_BY_UID_U_802735 A}USER_OUTFIT_LOOK_RATING_STARS_SAVED_KEY3_17481979 A}USEROUTFIT_32305379
There is many more variations. I need to create a field that grabs all the alpha characters and excludes all numbers to be able to see top patters/values IE
MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_no_event_id_total_value_season_percent_stars_
MCTEST1_SI_EVENTS_TEST1_SI_EVENTS__impression_
MCTEST1_SI_EVENTS_TEST1_SI_EVENTS__user_by_outfit_
USER_V2_ID_TO_CAREER_TIER_
US_MC_USER_EVENTS_BY_UID_U_
You can easily do it in inline in a search with rex
;
...| rex field=your_fieldname mode=sed "s/\d//g"
If you don't specify a field name, the sed script will run aginst the whole event (the _raw
field).
Then you can do your stats/top/chart etc on the field.
http://docs.splunk.com/Documentation/Splunk/6.0.1/SearchReference/Rex
/K
alternatively, if you want to keep some numbers, like in TEST1
or V2
, but not those sub-parts that are ONLY numbers, like _2342_
you can just alter the sed script slightly;
...| rex field=your_fieldname mode=sed "s/_\d+/_/g"
If these are in a field called myField
then you would just tack this on the end of your search:
| eval myField=replace(myField,"_\d+$","")
and that will effectively clip the _33097521 off the end of all the values.
Therefore if you have this:
| eval myField=replace(myField,"_\d+$","") | top 100 myField
you'll get the top 100 values, considering only the part up to the big integers.
You can also use rex instead of eval. The following will do the same thing as the eval syntax above:
| rex field="myField" "(?<myField>.+)_\d+"`
Oops, answering a little late. Hmm.. Neither the rex
nor the replace
will handle numbers in the middle of the strings.
Is that an example of one event, or is each line an event?