Splunk Search

How to create a report that reads all the data from a csv file?

szrobag
Explorer

Hello Splunkers,

Help me please. I need a search to generate a daily report looking for users' traffic in internal logs. I have a csv file, generated daily by an external system, which contains a username and a start-end time period, like this:

report.csv
user,start_time,end_time
user1,8,16
user2,8,20

I have to insert these three fields per user into my search. I am using inputlookup to catch the "user" field this way:

[base search] 
| search user=*[|inputlookup "report.csv" |fields user ]*
| table x,y,z,user

It works, showing only the user-related logs; there could be one or more users in the csv. The problem I cannot handle yet is the additional fields. I had an idea to add an extra field with the "eval" command, but it doesn't work. So how can I read the rest of the data from an external csv file?

thanks 

1 Solution

gcusello
SplunkTrust

Hi @szrobag,

do you want to filter results using the usernames in the lookup, or add the start_time and end_time information to the results?

If the first, you have to use a subsearch like the following:

[base search] [ | inputlookup "report.csv" | fields user ]
| table x,y,z,user

paying attention that the field "user" in the subsearch is the same as in the main search (it's case sensitive).

If the second, you have to use the lookup command, which works something like an inner join; see something like this:

[base search] 
| lookup "report.csv" user OUTPUT start_time end_time
| table x y z user start_time end_time
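The looked-up fields can then also be used to keep only the events inside each user's window. A sketch, assuming start_time and end_time are whole hours of the day, as in the sample csv (the field name event_hour is hypothetical):

```spl
[base search]
| lookup "report.csv" user OUTPUT start_time end_time
| eval event_hour = tonumber(strftime(_time, "%H"))   ``` hour of day of the event ```
| where event_hour >= start_time AND event_hour < end_time
| table x y z user start_time end_time
```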

Ciao.

Giuseppe


PickleRick
SplunkTrust

If you simply want to use an externally sourced lookup to generate a set of conditions for your search, you can just use a subsearch. Remember that the subsearch is executed _before_ the main search and as a result renders a string which is substituted into the main search. So your search

|inputlookup "report.csv" |fields user 

will return a set of results with just the user field, which in turn will get rendered into

user="user1" OR user="user2" OR user="user3" ...

So by using clever renames you can reformat your search to include earliest/latest conditions (probably using some evals in the middle to render proper timestamps from your lookup fields). If you make your subsearch return three fields - user, earliest, latest - it will get rendered as:

(user="user1" earliest=earliest1 latest=latest1) OR (user="user2" earliest=earliest2 latest=latest2) ...
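Such a subsearch might look like the sketch below, assuming the lookup's start_time/end_time are hours of the current day (relative_time(now(), "@d") gives today's midnight as an epoch value; earliest/latest accept epoch numbers):

```spl
[base search]
    [ | inputlookup "report.csv"
      | eval midnight = relative_time(now(), "@d")
      | eval earliest = midnight + start_time * 3600
      | eval latest   = midnight + end_time * 3600
      | fields user earliest latest ]
| table x y z user
```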

So you can just insert your subsearch into the main search - don't use the "search | search [ subsearch ]" construction. Splunk might be able to optimize it anyway, but generally what you're asking it to do is to search for all events and then filter them out, which might be much less efficient than selecting just a subset from the indexes.

Having said that - generating too many conditions might also be detrimental to the performance of the search so it _might_ (but doesn't have to) be better to consider other ways to filter those events.



szrobag
Explorer

Hello Giuseppe,

Thank you for reply. 
My index contains logs about a lot of users, so my idea was to filter it first to the users added in the csv file.
So, I chose inputlookup to filter to 3-4-5 (usually up to ten) users first. The "user" field is a common field in the csv and the indexed data, so it is like an inner join. User filtering is okay (as in your first query), but I couldn't work with the two added start_time and end_time fields, even with them added in the OUTPUT part. They are needed to filter the users' logs to the specified time range.

Using inputlookup to filter users first, and then using your query

| lookup "report.csv" user OUTPUT start_time end_time

to extract data from the two fields, seems to be working now.
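Put together, the whole search might look like this sketch (assuming, as above, that start_time and end_time are numeric hours of the day, and that event_hour is a hypothetical helper field):

```spl
[base search] [ | inputlookup "report.csv" | fields user ]
| lookup "report.csv" user OUTPUT start_time end_time
| eval event_hour = tonumber(strftime(_time, "%H"))
| where event_hour >= start_time AND event_hour < end_time
| table x y z user start_time end_time
```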

 


gcusello
SplunkTrust

Hi @szrobag,

good for you, see you next time!

Ciao and happy splunking

Giuseppe

P.S.: Karma Points are appreciated by all the contributors 😉
