My Splunk is 5.0.5. I constructed a rex to extract user from free-form logs. In some logs, user is null, which skews my results, so I run a search to filter out those null users. After this, I combine user with another field, service, to form a unique identifier UserService, then I timechart by UserService. I know a particular user in a particular service has a specific pattern, so I am looking for a standout condition in this timechart. However, I can never get a chart with all user-service combinations. For all services, I can get either users whose name does not contain an underscore (_) or users whose name does. For a single service, I can get users both with and without _.
Here are tested combinations and outcomes:
To make this even more mysterious, those results described as not containing users with _ actually contain users whose names consist of capital letters and _. So the actual outcome matrix is like:
1. Returns only users with no _, plus all-caps users with _:
<preliminary search> | rex "something (?<user>\w+)" |search user="*" |eval UserService=user + "." + service |timechart count by UserService
2. Returns only users with no _, plus all-caps users with _:
<preliminary search> | rex "something (?<user>\w+)" |search user="*" OR user="*_*" |eval UserService=user + "." + service |timechart count by UserService
3. Returns only users with _, including all-caps users with _:
<preliminary search> | rex "something (?<user>\w+)" |search user="*_*" |eval UserService=user + "." + service |timechart count by UserService
4. Returns users with and without _, but only for a single service:
<preliminary search> | rex "something (?<user>\w+)" |search service="OneService" |eval UserService=user + "." + service |timechart count by UserService
How to get all combinations?
Should've done this earlier. This turns out to be a limitation of timechart's graphic real estate. The null user is just a red herring. (It may be due to some other issue in user extraction, unrelated to greediness. The same problem occurs even with no null users.) After comparing with stats, I now notice this message at the bottom of the timechart: "Search generated too much data for the current display configuration, results have been truncated." Silly me: fewer than 1000 data points can be plotted, so lower case, underscore, and the rest are all red herrings. They just reflect the way timechart sorts things.
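If the truncation comes from the number of plotted points rather than from the search itself, one thing to try is a coarser time bucket, so that each UserService series contributes fewer points. A rough sketch only: span=1d is an assumed value, not something from my original search, and whether the result fits under the display limit depends on the time range and the number of series.

```
<preliminary search> | rex "something (?<user>\w+)" | eval UserService=user + "." + service | timechart span=1d count by UserService
```

If the truncation notice disappears with a larger span, that points to the display limit rather than the user extraction.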
I'm with martin... I don't see exactly why you are extracting the users in a way that doesn't get you the whole thing, which my previous comment offers a suggestion to solve. However, it's best to first check that you get your data before you worry about whether timechart is unable to display too many points. Replace timechart with stats to test.
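A minimal sketch of that check, reusing the search from the question with stats in place of timechart. The trailing sort - count is an addition here, only to put the largest UserService values first.

```
<preliminary search> | rex "something (?<user>\w+)" | search user="*" | eval UserService=user + "." + service | stats count by UserService | sort - count
```

If every expected UserService value shows up in this table, the extraction and the filter are probably fine and the gap is on the charting side.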
I'm not sure if part of your problem is the extraction of the user field, which you have as one or more word characters (\w+), but you don't account for a missing underscore, which I presume looks like Joan Jett rather than Joan_Jett. If you want all users regardless of case, space, underscore or number of words, it would be good if you could anchor on something that comes after, like "something\s+(?P
I'm looking at timechart output. There could be some weirdness with timechart, so I set limit=1000 even though the result has fewer than 100 UserService values. The distinct pattern that I'm looking at is that one_user.OneService gives a count hundreds of times bigger than all the others combined. So it is very easy to tell from the timechart whether one_user results are included.
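For reference, the limit option on timechart caps the number of series (distinct UserService values), not the number of points drawn. A sketch of making the series cap explicit, assuming limit=0 (no series filtering) and useother=f (no OTHER series) behave in 5.0.5 as the timechart documentation describes:

```
<preliminary search> | rex "something (?<user>\w+)" | eval UserService=user + "." + service | timechart limit=0 useother=f count by UserService
```

With fewer than 100 UserService values, either limit=1000 or limit=0 should keep every series in the search results, so a series missing from the chart would have to be dropped somewhere after the search.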