Part 1 - As far as I know, Splunk can match some of the users below with a single pattern such as "John%", but all 6 entries are the same user. What is the best way to match all of them?
John Dave Peterson - Hit
John Peterson - Hit
John Dave - Hit
J D Peterson - Miss
JDPeterson - Miss
JDP - Miss
How do I frame a generic query for all usernames, since I cannot use one specific username pattern for every customer name? Can a customer field "custname" be matched against a pattern built from the value of that same field (like the syntax below)?
"| where like(custname, "custname%")"
Part 2 - Can Splunk compute a percentage match between 2 string fields?
For example, the same address may appear in abbreviated or mistyped forms. I only want the first 2 values below to match, since they are the same address in long and short form.
Address 1 = St Peter's Street - Hit
Address 2 = St Peter's St - Hit
Address 3 = St Peter's Complex - Miss
Can Splunk match Address 1 and 2 above and report, say, a 90% match?
For the user names: I don't think you can solve that with 'intelligent text pattern matching'. This calls for a lookup that holds all possible usernames of a person and maps them to a standardized identity.
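To illustrate the lookup idea, here is a minimal sketch in Python rather than SPL. The table contents, the name ALIAS_TO_IDENTITY and the function canonical_identity are all hypothetical; in Splunk the same mapping would presumably live in a lookup file with one row per known alias.

```python
from typing import Optional

# Hypothetical alias table: every known spelling of a person maps to one
# standardized identity. Keys are stored lower-cased with single spaces.
ALIAS_TO_IDENTITY = {
    "john dave peterson": "John Dave Peterson",
    "john peterson": "John Dave Peterson",
    "john dave": "John Dave Peterson",
    "j d peterson": "John Dave Peterson",
    "jdpeterson": "John Dave Peterson",
    "jdp": "John Dave Peterson",
}


def canonical_identity(custname: str) -> Optional[str]:
    """Return the standardized identity for a raw username, or None if unknown."""
    # Normalize case and collapse repeated whitespace before the lookup.
    key = " ".join(custname.lower().split())
    return ALIAS_TO_IDENTITY.get(key)


if __name__ == "__main__":
    for raw in ["John Peterson", "JDP", "J D  Peterson", "Jane Doe"]:
        print(raw, "->", canonical_identity(raw))
```

The point of the sketch is that the aliases are enumerated explicitly: nothing in "JDP" can be derived from "John%" by a pattern, so the mapping has to be maintained as data.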
For calculating how closely 2 strings match, you could perhaps use the Levenshtein (edit distance) algorithm. I know the URL Toolbox add-on provides that. There may be better approaches for recognizing addresses, though, so a quick web search is worthwhile, since this is not a Splunk-specific challenge.
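As a rough illustration of how an edit distance can be turned into a percentage, here is a small Python sketch. It is not the URL Toolbox implementation; the function names levenshtein and similarity_percent are made up for this example, and dividing by the longer string's length is just one common way to normalize, assumed here.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character inserts,
    deletes and substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def similarity_percent(a: str, b: str) -> float:
    """Assumed normalization: 100 * (1 - distance / length of longer string)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty strings are identical
    return 100.0 * (1 - levenshtein(a, b) / longest)


if __name__ == "__main__":
    base = "St Peter's Street"
    for other in ["St Peter's St", "St Peter's Complex"]:
        print(f"{other!r}: {similarity_percent(base, other):.1f}%")
```

With this normalization, "St Peter's Street" against "St Peter's St" scores about 76% (distance 4 over 17 characters), while "St Peter's Complex" scores about 67% (distance 6 over 18 characters). The first pair ranks higher, but the gap is modest and the score is well short of 90%, so any threshold would need tuning on real data, which fits the suggestion that address-specific approaches may work better.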