Splunk Search

Adding lookup from external csv to search

Path Finder

I would like to do this permanently but let us try it on the command line first.

command:

source="C:\\..." | rex "^(?<client_ip>[0-9\.]+) (?<user>[0-9\-]*) (?<profile>[0-9\-]*) (?<timestamp>\[[^\]]+\]) (?<url>\"[^\"]+\") (?<http_status>[0-9\-]+) (?<bytes>[0-9\-]+) (\"[^\"]+\") (?<user_agent>\"[^\"]+\") (?<processing_time>[0-9\-]+) (?<registrant>\"[^\"]+\") (?<forward_for>[0-9\.\-]+)"  | lookup registrant.csv registrant

input, apache custom log

1.1.1.1 - - [24/Apr/2013:16:10:08 +0200] "GET /someurl HTTP/1.1" 200 278356 "-" "some useragent" 123 "9999" 111.222.333.444

Lookup file:

registrant,fullname
9999,John Doe
7777,Jane Doe
  1. where do I save registrant.csv? I put it in \etc\apps\search\local and \etc\system\lookups
  2. where would I see John Doe in the output if the above worked?

UPDATE

So I tried the suggestion given.

In etc\system\lookups\registrants.csv I have

applicationid,username,email
0,Anonymous,
1234,John Doe,johndoe@xxx.com
4321,Jane Doe,janedoe@xxx.com

in \etc\system\local\props.conf I have

[registrants]
EXTRACT-fields=^(?<client_ip>[0-9\.]+) (?<user>[0-9\-]*) (?<profile>[0-9\-]*) (?<timestamp>\[[^\]]+\]) (?<url>\"[^\"]+\") (?<http_status>[0-9\-]+) (?<bytes>[0-9\-]+) (\"[^\"]+\") (?<user_agent>\"[^\"]+\") (?<processing_time>[0-9\-]+) (?<applicationid>\"[^\"]+\") (?<forward_for>[0-9\.\-]+)
LOOKUP-applicationid=registrants username

in \etc\system\local\transforms.conf I have

[registrants]
filename = registrants.csv

I am now getting error on all searches: The lookup table 'registrant' does not exist. It is referenced by configuration 'combined_wcookie'. which I have no idea how to get rid of :(((

UPDATE

The documentation is not clear. It uses the same names for the source file fields and the csv.

I am now a bit further - one issue is that the csv is forced by splunk to be numeric values since the field I am testing happens to be a number. In the log file it is a string so I have to extract it without quotes.

I now extract ONE (1) record out of many. It is not encouraging...

Tags (2)
0 Karma

Path Finder
0 Karma

Path Finder

PS: On windows command > file.txt 2>&1 works in normal circumstances

0 Karma

Path Finder

Does not matter where I put the redirect on my win7 box. I get a popup, a fast scrolling set of data and the popup closes - Why does splunk make me feel like I haven't been around computers since 1986!?

0 Karma

Legend

btool sends its output to stdout so you should be able to capture it. That said, your stderr>stdout redirect comes too late even if btool would be sending to stderr - it should be before you redirect everything into splunk.txt

0 Karma

Path Finder

thanks again. Just tried this. I cannot seem to capture the output no matter what I do $SPLUNK_HOME/bin/splunk cmd btool props list --debug > splunk.txt 2>&1 or whatever

0 Karma

Legend

For instance, this should prove helpful in your case:

$SPLUNK_HOME/bin/splunk cmd btool props list --debug

This will show you all props settings, and info on where they came from (that's what the --debug flag does).

0 Karma

Legend

Ah OK, my apologies - didn't mean to state the obvious, just figured you didn't know about the error 🙂

You might find the btool utility to be very useful when checking for stuff like this. Runs from the command line and shows you various combined configurations after Splunk has merged them from all possible places, so you don't have to look in 20 different props files for instance...you just need to check the btool output.

0 Karma

Path Finder

Yes, thanks - I realise that. the problem is finding the "somewhere" since it is not in any file so it must be in some cached setting.

0 Karma

Legend

You've defined your lookup as "registrants" but somewhere you're referencing it as "registrant" (lacking the s).

0 Karma

Path Finder

Thanks @chris - I'll check it after my vacation. The command line thing is very interesting seeing how hard it seems to get debug information - and the whole issue here is that the fields are only recognised if I do a | rex
"^my regex"

0 Karma

Motivator

To refresh props & transforms search time configs you can use the following command in splunk "| extract reload=True" (or restart Splunk). If you do a search are the fields that you defined in EXTRACT recognized? This has to work for the automated lookup to work. To check if the lookup configuration is ok you can use the following splunk comman "| inputlookup registrants".

0 Karma

Motivator

There are 2 things that i can see here the stanza in props.conf should match the sourcetype of your events you are searching it is not related to the name in props or the lookup table if this is on purpose then you are ok. The 2nd thind is the automatic lookup definition it should be LOOKUP-=registrants applicationid. The error you get is probably because there is a lookup called registrant that is referenced in one of the props files somewhere. You can use $SPLUNK_HOME\bin\splunk btool props list --debug to see if registrant is referenced somewhere.

0 Karma

Motivator

If you put your registrant.csv file in $SPLUNK_HOME/etc/system/lookups your command should work and you should see "John Doe" in the fields sidebar in a field called fullname.

You will probably get an info message that looks similar to this: Assuming implicit lookup table with filename 'registrant.csv'.

The 'proper/complete' way to do this is described in Configure field lookups in the Knowledge Manager Manual. You will have to create a transforms.conf file and optionally a props.conf file for automated lookups. Those files can also be used to do the field extraction that you're doing in the command for you.

Both files can be created in $SPLUNK_HOME/etc/system/local/

To make things easier it makes sense to use sourcetypes in your searches and config files (all your inputs with the same log format should have the same sourcetype) your log looks almost like the predefined access_combined sourcetype but I do not think that the registrant field which is important for you is standard. So I suggest you set it to a custom value e.g. access_custom

transforms.conf

[registrant]
filename = registrant.csv

props.conf

[access_custom]
EXTRACT-fields=^(?<client_ip>[0-9\.]+) (?<user>[0-9\-]*) (?<profile>[0-9\-]*) (?<timestamp>\[[^\]]+\]) (?<url>\"[^\"]+\") (?<http_status>[0-9\-]+) (?<bytes>[0-9\-]+) (\"[^\"]+\") (?<user_agent>\"[^\"]+\") (?<processing_time>[0-9\-]+) (?<registrant>\"[^\"]+\") (?<forward_for>[0-9\.\-]+)
LOOKUP-registrant=registrant registrant

Path Finder

Please see update

0 Karma

Path Finder

That looks very promising I'll look tomorrow

0 Karma

Legend

If you haven't already dived into the whole of docs.splunk.com I highly recommend it. There's loads of info on subjects like this. For instance http://docs.splunk.com/Documentation/Splunk/5.0.2/Knowledge/Addfieldsfromexternaldatasources and http://docs.splunk.com/Documentation/Splunk/5.0.2/Tutorial/Usefieldlookups and http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Lookup

Short answers:
1. Lookup files go in a suitable lookups directory, for instance etc/apps/search/lookups
2. In the field "fullname".

Legend

I guess I've been doing this for too long to think of how things else would work - it's pretty straightforward as I see it (except for the idiotic aliasing syntax that's completely the opposite of how one would expect it should be). You need a lookup somewhere and you need to somehow tell Splunk that this lookup should be applied. That's not that complicated, is it? 🙂

As I've said previously the lookup will work as long as there are fields for it to work on, which is why you need a lookup command after an inline rex command.

0 Karma

Path Finder

Thank. I did read that page from end to end and that is how I uploaded the lookup. What I fail to grasp is the next step. Since my automatic extraction didn't work, the automatic use of lookup also did not work, now I am trying to test the thing with your suggested Rex and that finally started to work, but the lookup seems so unnecessary complicated - as does many things in splunk. Powerful but opaque. I will try again tomorrow. Thanks for the persistent assistance

0 Karma