I've written a rex command to pull the emails. However, as written it only pulls emails that have one dot; I need it to also pull emails with more than one dot before the domain name. Here is my rex cmd. Please help me with this. Can we add an OR condition here? Would that work in rex?
It appears that you are unaware of Google's implementation of infinite email aliases. For Gmail (and possibly other email systems), the username cannot contain periods or plus signs when you create your Gmail ID. The reason for this rule is that those characters are how email aliases are implemented: Gmail strips out all periods and anything after a plus sign. See this blog for details:
So for Gmail, you need to normalize usernames the same way, like this:
... | rex field=email "^(?<username>[^@]*)@(?<domain>.*)$" | eval username=if(domain="gmail.com", replace(replace(username, "\.", ""), "\+.*", ""), username)
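If it helps to check the normalization logic outside Splunk, here is a minimal Python sketch of the same idea. The function name normalize_email is made up for illustration, and the lowercasing step is my own assumption (the search above does not do it).

```python
def normalize_email(email: str) -> str:
    """Collapse Gmail-style aliases to a single canonical address."""
    # Assumption: treat addresses case-insensitively (not part of the SPL above).
    username, _, domain = email.strip().lower().rpartition("@")
    if domain == "gmail.com":
        # Drop any "+tag" suffix, then remove the dots from the username.
        username = username.split("+", 1)[0].replace(".", "")
    return f"{username}@{domain}"


print(normalize_email("h.au.517@gmail.com"))
print(normalize_email("first.last@example.com"))
```

Addresses on other domains are returned unchanged apart from the lowercasing, since only Gmail is known here to ignore dots.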
This one is okay, but I don't need it, as I've already written the rex command to pull the emails. However, as written my rex command only pulls emails that have one dot; I need it to also pull emails with more than one dot before the domain name. Here is my rex cmd. Please help me with this. Can we add an OR condition here? Would that work in rex?
What about the following snippet?
<your_search> | rex field=<email_field> "^(?P<email_user>.*)@(?P<email_domain>.*)$" | rex field=email_user mode=sed "s/\.//g" | rex field=email_user mode=sed "s/\+.*$//g" | eval sanitized_email=email_user . "@" . email_domain | table <email_field>, sanitized_email
This should compute a sanitized version of the email address in a new field named sanitized_email.
Thanks! It's cool, but it returns the sanitized email derived from the dot trend emails.
Actual emails with dot trends:
h.au.517@gmail[.]com h.au5.17@gmail[.]com h.au51.7@gmail[.]com ha.u.517@gmail[.]com ha.u51.7@gmail[.]com ha.u517@gmail[.]com ha.u5.17@gmail[.]com
Actually, I need the dot trend emails themselves, not the sanitized ones.
Anyway thanks for this one!
First of all, those Gmail addresses you posted? They are all the same mailbox (see https://support.google.com/mail/answer/10313?hl=en).
You could split the email up as follows (including the domain, as sundareshr pointed out):
| rex field=emailfield "(?<namePartA>.*?)\.(?<namePartB>.*?)@(?<domain>.*)"
You will then have three new fields: namePartA (before the first dot), namePartB, and domain. You can arrange these as you wish to capture the patterns (e.g. | stats values(namePartB) by namePartA), but I'm not sure how helpful that is going to be.
Instead, I would have a play with the cluster command (http://docs.splunk.com/Documentation/Splunk/6.3.2/SearchReference/Cluster). Try something like:
... | cluster field=emailfield showcount=t | table cluster_count emailfield _raw | sort -cluster_count
You may have to play around with the value of the cluster threshold (read the docs!). The beauty of this is that you won't have to worry about regex issues, and you can easily see the most common matches.
Finally, I think you have a bigger problem if you're trying to validate your customer accounts with Splunk. I am sure there are lots of good shopping cart security libraries out there. I would tell your developers to fix their application first!
One option would be to extract the email "name", remove the dots, and then dedupe or dc(name). Something like this:
.... | rex field=emailfield "(?<name>[^@]+)@" | eval name=replace(name, "\.", "") | stats dc(name)
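To make the dc(name) idea concrete, here is a small Python sketch of the same counting logic. The helper name distinct_names and the sample addresses are only illustrative; treat it as a way to reason about the approach, not as what Splunk does internally.

```python
def distinct_names(emails):
    """Return the set of email names (text before the @) with dots removed."""
    names = set()
    for email in emails:
        local, sep, _domain = email.partition("@")
        if sep:  # skip values that do not look like an email address
            names.add(local.replace(".", ""))
    return names


# Hypothetical sample: two dot variants of one name, plus one other name.
sample = ["h.au.517@gmail.com", "ha.u517@gmail.com", "j.doe@gmail.com"]
print(len(sample), "addresses,", len(distinct_names(sample)), "distinct names")
```

A large gap between the number of addresses and the number of distinct names is a hint that many addresses are dot variants of each other.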
Let me explain the dot trend pattern.
For example, suppose a retail store (website) has an offer where new registrations get discount coupons. Users then create new accounts with multiple email IDs. Those IDs have to be unique, so they insert dots at different positions in their email IDs, using scripts or something similar, to create many of them.
In this way they create IDs and get the discount coupons to use on purchases.
Please let me know if you need more info.
Yes, I agree. Please clarify, showing examples where possible. The phrase "Email IDs with dot trend patterns" doesn't really show anything if I google it. (Well, it does now. It shows this Splunk answer. You have now learnt a valuable lesson about SEO.)
Nothing like that; I'm looking for the dot trend patterns, and not only for Gmail but for all email IDs that show a dot trend. Getting the emails is not an issue, but finding the ones with a dot trend is. Many users are creating IDs with scripts using the dot trend, so monitoring those alone is very difficult. If there is any way to do that, it would be very useful.
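One possible way to surface only the dot trend addresses, for any domain, is to group the raw addresses by their dot-stripped form and keep the groups that contain more than one variant. Below is a rough Python sketch of that grouping; the function name dot_variant_groups and the min_variants parameter are hypothetical, and the sample uses the addresses posted earlier in this thread (written with a plain gmail.com) plus one made-up address.

```python
from collections import defaultdict


def dot_variant_groups(emails, min_variants=2):
    """Group raw addresses that differ only by dots in the part before the @."""
    groups = defaultdict(set)
    for email in emails:
        local, sep, domain = email.lower().partition("@")
        if not sep:
            continue  # not an email-like value
        # Key on the dot-stripped name, but keep the original dotted address.
        groups[(local.replace(".", ""), domain)].add(email.lower())
    # Keep only names that appear with at least min_variants different dottings.
    return {key: sorted(v) for key, v in groups.items() if len(v) >= min_variants}


sample = [
    "h.au.517@gmail.com", "h.au5.17@gmail.com", "h.au51.7@gmail.com",
    "ha.u.517@gmail.com", "ha.u51.7@gmail.com", "ha.u517@gmail.com",
    "ha.u5.17@gmail.com", "solo.user@example.com",  # last one is made up
]
for (name, domain), variants in dot_variant_groups(sample).items():
    print(name, domain, len(variants), variants)
```

The result lists the actual dotted addresses rather than the sanitized one, and an address that merely contains a dot but has no sibling variants is left out.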