Splunk Search

open .csv file

Orange_girl
Loves-to-Learn Everything

Hello, 

I have a really basic question 🙂 I have a .csv file saved in SPLUNK, which I believe is indexed. It is not the output of a search but a file fed into SPLUNK from another source. I want to be able to open the file in SPLUNK search. Can you please advise what command I should use in SPLUNK search to see the content of the .csv?

thank you. 

0 Karma

Orange_girl
Loves-to-Learn Everything

Hi PickleRick, thank you for your answer. I know where the file is and I can open it from SPLUNK explorer. This indexed file is used in one of my searches, but unfortunately the search has recently stopped providing correct information. When investigating the issue, I discovered that the data pulled from the indexed file is missing values in some (crucial) columns, and therefore the search results are incorrect. When I open the .csv file directly from its location, the values in all columns are correct. I wanted to open the .csv file in SPLUNK search to see what it looks like there, but if this is not possible, I will have to find another way of working this out.

0 Karma

PickleRick
SplunkTrust

I have no idea what "Splunk explorer" you're talking about. Honestly.

But it doesn't work the way you think it does. If it's ingested into an index, it's split into separate events and indexed into Splunk's own data format. There is no "csv file" of the data anymore on Splunk's side.

Assuming you're indeed talking about indexed data, not the lookups.

0 Karma

Orange_girl
Loves-to-Learn Everything

Thanks, this is helping! I can see now that there are indeed separate events indexed into Splunk's own data format. Now... how can I ensure that the specific information within the events is used in a SPLUNK search? For example, one of the pieces of information within the event is the name of a parent group. How can I ensure that when I run a search, it will look into these events and match my results with the corresponding parent group? Thank you for your patience and please bear with me while I try to work this out!

0 Karma

PickleRick
SplunkTrust

Well... it's kinda complicated because we're talking about CSV 🙂

Normally most of the data is just split into separate events (usually one event per line), some metadata is added, and the fields are extracted at search time. But in the case of CSV the fields can be split right at the moment of ingestion and indexed, which makes them immutable after ingestion (so-called indexed extractions). So it depends heavily on your configuration.
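
One rough way to tell which case applies is tstats, which only works on indexed fields. A minimal sketch, assuming the hypothetical names my_index, my_sourcetype and my_field (none of them come from this thread; replace them with your own):

```spl
| tstats count where index="my_index" sourcetype="my_sourcetype" by my_field
```

If this returns counts per value, the field is most likely indexed (for example through INDEXED_EXTRACTIONS = csv in props.conf). If it returns nothing while a normal search over the same events shows the field, the field is probably extracted at search time.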

0 Karma

Orange_girl
Loves-to-Learn Everything

hahah and I was just thinking we are getting there...

So I went through events and can confirm that I have one event per .csv line. I don't see any additional information injected in the events, other than "," in the fields that are empty (in the .csv file).

My search first pulls in some data and does filtering and then:

| rename Letter as Y_Field
| table A_Field, B_Field, Y_Field
| join type=left Y_Field
[ search earliest=-24h host="AAA" index="BBB" sourcetype="CCC"
| eval dateFile=strftime(now(), "%Y-%m-%d")
| where like(source,"%".dateFile."%XXX.csv")
| rename "Target Number" as Y_Field
| eval Y_Field=lower(Y_Field)
| fields Y_Field, Field, "Field 2", "Field 3"]
| table A_Field, B_Field, Y_Field, Field, "Field 2", "Field 3"


I also wonder if the issue might be with the field shared by the search and the event data, as I have to rename it to match. I tested renaming the field in the search, renaming the field in the data pulled from the index, and renaming both to something different, but no luck.

As I mentioned earlier, the data in the index is ingested daily, so the search looks for the latest csv.

When I run this search, I get results for A_Field, B_Field, and Y_Field, but Field, "Field 2" and "Field 3" are empty. 

0 Karma

PickleRick
SplunkTrust

This might be a completely different issue. You're not talking about searching the raw data directly, but about the final result of a more complicated search with some fancy operations, so wrong results don't necessarily mean that the ingested data is bad.

Try running the subsearch as a separate search and see if it returns (any/proper) results. Also take note of how long the search takes and how many results it returns.

Since you're using join with a subsearch, it's quite probable that this might be the culprit here - join is usually best avoided. Especially if used with a search for indexed events. Especially if it's run over a relatively long period.
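
For illustration only, a join like this can often be replaced by searching both data sets at once and combining them with stats. A rough sketch, assuming the outer part can be written as a single base search (shown as the placeholder <your base search>, which is not from this thread) and that its key field is still called Letter at that point:

```spl
(<your base search>) OR (host="AAA" index="BBB" sourcetype="CCC" earliest=-24h)
| eval Y_Field=lower(coalesce(Letter, 'Target Number'))
| stats values(A_Field) as A_Field, values(B_Field) as B_Field, values(Field) as Field, values("Field 2") as "Field 2", values("Field 3") as "Field 3" by Y_Field
```

This is not a drop-in replacement: it lowercases the key on both sides, leaves out the date filter on source, and collapses all rows sharing a Y_Field into one result, so whether it matches the intent of the original join depends on the data.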

0 Karma

Orange_girl
Loves-to-Learn Everything

I have to admit, I did suspect that the issue might be with the 'join'. 

So... I have to go back to my original question. How do I run the subsearch mentioned earlier to see the data from the indexed csv? Running the below gives me nothing. Am I missing some obvious characters/words needed to run it on its own and not as a subsearch?

search earliest=-24h host="AAA" index="BBB" sourcetype="CCC"
| eval dateFile=strftime(now(), "%Y-%m-%d")
| where like(source,"%".dateFile."%XXX.csv")
| rename "Target Number" as Y_Field
| eval Y_Field=lower(Y_Field)
| fields Y_Field, Field, "Field 2", "Field 3"

 

0 Karma

PickleRick
SplunkTrust

OK. Start by cutting the search to the initial search and see if the results are what you expect them to be.

In other words - check if

earliest=-24h host="AAA" index="BBB" sourcetype="CCC"

returns any results at all.

If not - it means you have problems on the ingestion end: you have no events at all to search from (or maybe you're looking for the wrong data).

And then add one step after another until the results stop being in line with what you expect them to be. This will be the step that is wrong.
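
As a sketch of that step-by-step approach, using the names from the search posted above: when typed into the search bar, the search command is already implied at the start, so a leading "search" keyword (needed inside the subsearch brackets) would be treated as a literal term to look for. The first step could list which source files actually have events in the time range:

```spl
host="AAA" index="BBB" sourcetype="CCC" earliest=-24h
| stats count by source
```

If today's file shows up there, the next step might add the date filter and display the fields the join relies on:

```spl
host="AAA" index="BBB" sourcetype="CCC" earliest=-24h
| eval dateFile=strftime(now(), "%Y-%m-%d")
| where like(source,"%".dateFile."%XXX.csv")
| table source, "Target Number", Field, "Field 2", "Field 3"
```

Empty columns at this stage would point to ingestion or field extraction rather than to the join.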

0 Karma

Orange_girl
Loves-to-Learn Everything

It looks like the file has not been indexed for a few days and in addition I found the below warnings:

05-01-2024 02:18:07.646 -0400 WARN TailReader [4549 tailreader0] - Insufficient permissions to read file='/xxx/xxx/xxx/xxx/xxx.csv' (hint: No such file or directory , UID: 0, GID: 0).

How can I go about checking the permissions? Thank you.

0 Karma

deepakc
Builder

There are two main commands for lookups:

| inputlookup my_lookup 

| lookup my_lookup  - (This is mainly used for enrichment)

So start with the | inputlookup my_lookup command. If you can't see the lookup, it's most likely due to permissions, or the lookup definition has not been set. A lookup is a knowledge object and requires permissions, so it could be private or shared, or you may have to go to the app it's running under. Check under Settings > Lookups in the Splunk GUI: look in Lookup table files for the file, and then under Lookup definitions. Once you have the definition or CSV name, try that in the | inputlookup command.

 


This assumes you have created the lookup file and it has permissions 
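
As a small sketch of the two commands, assuming a lookup named my_lookup that contains columns called Y_Field and Field (the column names are borrowed from the search posted earlier in the thread and may not exist in your lookup). To view the first rows of the lookup:

```spl
| inputlookup my_lookup
| head 10
```

To enrich search results with a column from the lookup, where <your base search> is a placeholder for your own search:

```spl
<your base search>
| lookup my_lookup Y_Field OUTPUT Field
```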

0 Karma

PickleRick
SplunkTrust

There is no such thing as "a .csv file saved in SPLUNK, which I believe is indexed ".

A CSV can be used as a lookup, or its contents might have been ingested and indexed, but then you need to know how and where it was indexed so that you can look for data from it.
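
If the contents were indexed and you only know the file name, one way to find where they ended up is to ask tstats which index and sourcetype hold events from that source. A minimal sketch, assuming the file name ends in XXX.csv as in the search posted earlier, and that a 7-day window (an arbitrary choice) is enough:

```spl
| tstats count where index=* source="*XXX.csv" earliest=-7d by index, sourcetype, source
```

This only covers indexes your role is allowed to search, so an empty result does not prove the data was never ingested.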

0 Karma