Hi Experts,
I'm trying to determine whether a user is new or existing by using a summary index. The userLogin field is a combination of the username, the userId, and a uniqueId generated for each of the user's logins. I want to extract just the username and userId from the userLogin field so that I keep a single record per user, and to count each user's userLogin events within a specific time interval (i.e., the past week).
Here's the query I've written. Any suggestions would be most welcome. Thanks in advance.
index=user_login_details
| rex field=userLogin "(?<userName>\s+\d{5}).*"
| eval time=strftime(_time,"%Y-%m-%dT%H:%M:%S")
| stats count, earliest(time) as FirstTime by userName
| join type=left userName
[search index=user_login_details sourcetype=existing_login_users latest=-7d
| eval Time=strptime(FirstTime ,"%Y-%m-%dT%H:%M:%S")
| stats count as ExistingUser by Time userName ]
| fillnull ExistingUser value=0
| search ExistingUser=0
| fields - ExistingUser
| collect index=user_login_details sourcetype=existing_login_users
The query searches the same data twice. The main search does not specify a sourcetype, so it reads every sourcetype in the user_login_details index, including existing_login_users; the subsearch then reads the existing_login_users sourcetype again, this time explicitly.
Avoid joins when possible. append followed by stats values(*) by userName may be more performant.
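As a rough, untested sketch of the append-plus-stats pattern, the query from the question might be reshaped along these lines. The rex pattern is copied unchanged from the question; the sourcetype!=existing_login_users filter, the LoginCount field name, the earliest=-7d window on the main search, and the use of min(_time) for FirstTime are assumptions made for illustration.

```
index=user_login_details sourcetype!=existing_login_users earliest=-7d
| rex field=userLogin "(?<userName>\s+\d{5}).*"
| stats count as LoginCount, min(_time) as FirstTime by userName
| append
    [ search index=user_login_details sourcetype=existing_login_users latest=-7d
    | stats count as ExistingUser by userName ]
| stats values(*) as * by userName
| where isnull(ExistingUser) AND isnotnull(LoginCount)
| eval FirstTime=strftime(FirstTime, "%Y-%m-%dT%H:%M:%S")
| collect index=user_login_details sourcetype=existing_login_users
```

The where clause keeps only users who logged in during the past week and have no row from the summary sourcetype, so only new users are written back by collect.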
Using the collect command with a sourcetype other than "stash" will consume some of your ingest license.
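One possible way to stay on the default sourcetype, sketched here on the assumption that a distinct source value is enough to tell the summary events apart, is to tag them with the source option of collect instead. The source value below is only an illustrative name.

```
| collect index=user_login_details source="existing_login_users"
```

The subsearch would then read the summary events with index=user_login_details sourcetype=stash source="existing_login_users" rather than sourcetype=existing_login_users.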