I have a dataset that uses a character that is not a segmentation breaker (~) to separate meaningful and commonly used search terms. Sample events:
123,SVCA,ABC123,DEF~AP~SOME_SVC123~1.0,10.0.1.2 ,67e15429-e44c-4c27-bc9a-f3462ae67125,,2023-02-10-12:00:28.578,14,ER40011,"Unauthorized"
123,SVCB,DEF456,DEF~LG~Login~1.0,10.0.1.2,cd63b821-a96c-11ed-8a7c-00000a070dc2,cd63b820-a96c-11ed-8a7c-00000a070dc2,2023-02-10-12:00:28.578,10,0,"OK"
123,SVCC,ZHY789,123~XD-ABC~OtherSvc~2.0,10.0.1.2 ,67e15429-e44c-4c27-bc9a-f3462ae67125,,2023-02-10-12:00:28.566,321,ER00000,"Success"
456,ABC1,,DEFAULT~ENTL~ASvc~1.0,10.0.1.2 ,b70a2c11-286f-44da-9013-854acb1599cd,,2023-02-10-11:59:44.830,14,ER00000,"Success"
456,DEF2,,456~LG~Login~v1.0.0,10.0.0.1,27bee310-a843-11ed-a629-db0c7ca6c807,,2023-02-10-11:59:44.666,300,1,"FAIL"
456,ZHY3,ZHY45678,DEF~AB~ANOTHER_SVC121~1.0,10.0.0.1 ,19b79e9b-e2e2-4ba2-a7cf-e65ba8da5e7b,,2023-02-10-11:58:58.813,,27,ER40011,"Unauthorized"
Users will often search for the individual items separated by the ~ character, e.g., index=myindex sourcetype=the_above_sourcetype *LG*
My goal is to reduce the need for leading wildcards in most searches here, as this is a high-volume dataset, by adding '~' as a minor segmentation character at index time.
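To make the intent concrete, here is a rough Python sketch of the tokenization I am hoping for. It is only a simplified model and not Splunk's actual segmenter: the breaker sets are abbreviated assumptions, the function name `tokens` is made up for illustration, and it ignores details of how Splunk really builds minor segments.

```python
import re

# Assumed breaker sets for illustration only; the real defaults are defined
# in Splunk's segmenters.conf and contain more characters than shown here.
MAJOR = ", \t\r\n\""
DEFAULT_MINOR = "/:=@.-$#%\\_"
CUSTOM_MINOR = DEFAULT_MINOR + "~"


def tokens(event, minor):
    """Return a simplified set of searchable terms for one event."""
    major_pattern = "[" + re.escape(MAJOR) + "]+"
    minor_pattern = "[" + re.escape(minor) + "]+"
    terms = set()
    for segment in re.split(major_pattern, event):
        if not segment:
            continue
        # Keep the whole major segment as a term...
        terms.add(segment)
        # ...plus each piece produced by splitting on minor breakers.
        terms.update(p for p in re.split(minor_pattern, segment) if p)
    return terms


event = (
    '456,DEF2,,456~LG~Login~v1.0.0,10.0.0.1,'
    '27bee310-a843-11ed-a629-db0c7ca6c807,,'
    '2023-02-10-11:59:44.666,300,1,"FAIL"'
)

print("LG" in tokens(event, DEFAULT_MINOR))  # False: LG is buried inside 456~LG~Login~v1
print("LG" in tokens(event, CUSTOM_MINOR))   # True: ~ now splits LG into its own term
```

In this model, LG only becomes a standalone term once ~ is treated as a minor breaker, which is the behavior I want from the index so that a plain LG search works without wildcards.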
I've tried the following props.conf and segmenters.conf without success. Could anyone provide any insight?
<indexer> SPLUNK_HOME/etc/apps/myapp/local/props.conf
[the_above_sourcetype]
SHOULD_LINEMERGE=false
LINE_BREAKER=([\r\n]+)
TIME_PREFIX = ^([^,]*,){7}
TIME_FORMAT = %Y-%m-%d-%H:%M:%S.%3Q
TRUNCATE = 10000
MAX_TIMESTAMP_LOOKAHEAD=50
SEGMENTATION = my-custom-segmenter
SPLUNK_HOME/etc/apps/myapp/local/segmenters.conf
[my-custom-segmenter]
MINOR = / : = @ . - $ # % \\ _ ~ %7E
I added those and bounced my test instance, but a search for index=myindex sourcetype=the_above_sourcetype LG still does not return events such as the one below, whereas searching for *LG* does return it.
456,DEF2,,456~LG~Login~v1.0.0,10.0.0.1,27bee310-a843-11ed-a629-db0c7ca6c807,,2023-02-10-11:59:44.666,300,1,"FAIL"