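This will do it:
yoursearchhere
| stats min(_time) as FirstSeenOn by clientUuid
| eval FirstSeenOn=strftime(FirstSeenOn, "%Y-%m-%d")
| stats count as NewClients by FirstSeenOn
The base search (yoursearchhere) just needs to identify your data, perhaps sourcetype=xyz or whatever. min(_time) gives each client's earliest event, and the strftime step groups those first sightings by day.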
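As a variant, here is a minimal sketch that lets timechart do the daily bucketing, so days with no new clients should show up as 0. It assumes the same placeholder base search and that clientUuid is already extracted as a field; I have not run it against your data.

```spl
yoursearchhere
| stats min(_time) as _time by clientUuid
| timechart span=1d count as NewClients
```

Renaming the earliest timestamp back to _time is what allows timechart to treat each client's first sighting as an event.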
This search needs to run over "All Time" to be accurate. As @lukejademec points out, this isn't very efficient. It would be great if there were a database or table that contained the IDs of existing clients...
Step 1: Create a new index with the word 'summary' in its name (so you know it's a summary index), like this: "ClientID-summary".
Step 2: Create a saved search that collects all ClientIDs and any other event information you think might be useful. This is important - you mention now that you want only ClientIDs, but will that always be the case? For the current use case you could use a search similar to this: index=yourindex source=yoursource clientUuid=* | table _time clientUuid
Step 3: Schedule the search to run at a time when you are certain all applicable events have been indexed - consider latency and system downtime issues. Something like earliest=-2h15s@s latest=-1h@s. This may result in some duplicates, but that should not be a problem in this use case, and duplicates can be filtered out if necessary.
Step 4: While scheduling the search, select the summary index that you created in Step 1.
Step 5: Use the backfill script to mine the history of your index/source. See this link (read the whole thing): http://docs.splunk.com/Documentation/Splunk/6.0.3/Knowledge/Managesummaryindexgapsandoverlaps
Step 6: Once your new summary index has been populated, run an all-time search on it. Something like this: index=yoursummaryindex | dedup clientUuid sortby +_time | table _time clientUuid | where _time>=relative_time(now(),"-1d@d") AND _time<relative_time(now(),"@d")
I've not tried this....
Now that I think about it, you could also run this same search on your regular index/sourcetype. If it takes too much time, then you should use a summary index.
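Here is a sketch of that report using stats instead of dedup. It assumes the index holds one event per client sighting with clientUuid and the original _time (as the Step 2 search would produce). yoursummaryindex is a placeholder, and like the rest of this I have not tested it. It counts the clients whose earliest event falls within yesterday:

```spl
index=yoursummaryindex
| stats min(_time) as FirstSeen by clientUuid
| where FirstSeen>=relative_time(now(), "-1d@d") AND FirstSeen<relative_time(now(), "@d")
| stats count as NewClientsYesterday
```

This still has to run over All Time so that each client's earlier events are taken into account. The summary index just makes that all-time scan much cheaper.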
The summary index? To create that, use the Settings or Manager GUI, depending on what version you're on. When you create it, I recommend naming it "Summary-something".
See the Create and Edit Indexes section of this doc:
http://docs.splunk.com/Documentation/Splunk/6.0.3/Indexer/Setupmultipleindexes
Set it to your version in the upper right of the page.
Could you tell us how to create the index?
There are more than a million clients here. Could you help me with the query string to use for the index? I am new to Splunk and would really appreciate the help.
How many clients?
If there are fewer than, say, 30,000, you can use a lookup to store unique clients. If there are millions, then you should use a summary index to store unique clients.
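For the lookup route, here is a rough sketch of a scheduled daily search that maintains the list of known clients. The lookup file name known_clients.csv and the field name FirstSeen are made up for illustration. The file is assumed to already exist with clientUuid and FirstSeen columns (for example, seeded once by an all-time search ending in outputlookup).

```spl
index=yourindex clientUuid=* earliest=-1d@d latest=@d
| stats min(_time) as FirstSeen by clientUuid
| inputlookup append=true known_clients.csv
| stats min(FirstSeen) as FirstSeen by clientUuid
| outputlookup known_clients.csv
```

Each run merges yesterday's clients into the lookup and keeps the earliest FirstSeen per client. New clients per day could then be read from the lookup alone, without searching All Time, with something like: | inputlookup known_clients.csv | eval Day=strftime(FirstSeen, "%Y-%m-%d") | stats count as NewClients by Day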
Log line looks like
May 2 11:31:21 server1 stdout: : [09:27:55.750] 09:27:55.750 [AsyncQueue[]-26] INFO MSIPL - logRequest done, Request{reqId=624c76f7-ed8e-4196-b8da-124dde458615, affiliateId=0, clientUuid=7b09f2afcb673f72, locale='en_US'}
Here clientUuid is unique to every client. I want to check how many new clients (the number of clients making a request today for the first time ever) have been added per day.
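If clientUuid is not extracted automatically (Splunk usually picks up key=value pairs on its own), a rex along these lines should pull it out of a line like the sample above. The character class is a guess based on that single sample value, and yoursearchhere is a placeholder for whatever identifies the data:

```spl
yoursearchhere
| rex "clientUuid=(?<clientUuid>[0-9a-fA-F]+)"
| stats min(_time) as _time by clientUuid
| timechart span=1d count as NewClients
```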
Need more details. Can you provide some sample events? Also, how do we know that the clientid is new? Or do you simply want a count of all in a 24 hour period?