Splunk Search

Sorting help

changwoo
Communicator

i am searching like this

sourcetype=user
|fields user_id, user_gender, user_age,user_occup,user_zipcode 
|rename user_id as rate_user_id
|join rate_user_id [search sourcetype=rate |rename rate_movie_id as movie_id | 
join movie_id [search sourcetype = movie ] ]
| stats avg(rate_rating) as avg_rating by movie_id,movie_name 
| stats max(avg_rating) as ttt by movie_name 
|sort -ttt | head 100

What i am asking is can i get the other columns??? instead of only movie_name columns?
result looks like
movie_name | avg_rating
i want something like this
movie_name | avg_rating | movie_id .. . . .| ..... | . .. . . .

Tags (1)
0 Karma
1 Solution

lguinn2
Legend

First, you seem to be using Splunk like a relational database. What you are doing may work, but you aren't leveraging Splunk's capabilities. In Splunk, the user data and the movie data would probably be better as lookup tables, and only the ratings would be Splunk events. If you did it that way, it would be much easier to get what you want. (And BTW, there is no need for unique field names like rate_movie_id - if the field is the id of the movie, just use `movie_id".)

However, the following will work and will be more efficient as well:

[edit] fixed example

sourcetype=rate |  rename rate_movie_id as movie_id
| stats avg(rating) AS avg_rating count as number_of_ratings by movie_id
| join movie_id [ search sourcetype = movie ]
| table movie_name avg_rating number_of_ratings movie_id
| sort -avg_rating | head 100

Since you aren't using any of the user information, I ignored it. This lists the 100 movies with the highest average rating.

View solution in original post

lguinn2
Legend

First, you seem to be using Splunk like a relational database. What you are doing may work, but you aren't leveraging Splunk's capabilities. In Splunk, the user data and the movie data would probably be better as lookup tables, and only the ratings would be Splunk events. If you did it that way, it would be much easier to get what you want. (And BTW, there is no need for unique field names like rate_movie_id - if the field is the id of the movie, just use `movie_id".)

However, the following will work and will be more efficient as well:

[edit] fixed example

sourcetype=rate |  rename rate_movie_id as movie_id
| stats avg(rating) AS avg_rating count as number_of_ratings by movie_id
| join movie_id [ search sourcetype = movie ]
| table movie_name avg_rating number_of_ratings movie_id
| sort -avg_rating | head 100

Since you aren't using any of the user information, I ignored it. This lists the 100 movies with the highest average rating.

changwoo
Communicator

Thank you!

0 Karma

lguinn2
Legend

Thanks @yannK

0 Karma

yannK
Splunk Employee
Splunk Employee

missing "AS" to the search, I just edited the search.

yannK
Splunk Employee
Splunk Employee
0 Karma

changwoo
Communicator

avg_rating is is showing invalid argument is it right ?;

and can you give me an example with using lookup?

0 Karma
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...