Hi,
I have data in HDFS and I am creating virtual indexes to access it. However, I need to get the whole file content as a single event. For that, I have already created a sourcetype that reads the whole file. How can I apply that sourcetype to a virtual index?
This technique doesn't seem very well documented, and it looks like Splunk prefers you to do this in a props.conf file.
This answer shows an option for doing it from the Virtual Index UI for a Hadoop provider.
In the UI, select Settings -> Virtual Indexes.
Ensure you have a working data provider configured.
Then, within the Virtual Indexes menu, create a new virtual index.
This example is going to use the following folders:
/data/auditlogs/RHEL_syslog
/data/auditlogs/WindowEvents
Within the UI, an admin should enter the following HDFS path setting:
/data/auditlogs/${sourcetype}
The admin could apply a whitelist if only one of the folders needs to be searched.
By applying the ${sourcetype} variable in the UI, this setting will be written to the configuration files for you.
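For reference, the virtual index definition that the UI writes ends up as vix.* settings. A minimal sketch of what that looks like (the index and provider names here are made up; only the path mirrors the example above, so treat this as an illustration rather than a copy-paste config):

```
[hadoop-audit-logs]
vix.provider = my_hadoop_provider
# ${sourcetype} in the path tells Hunk to derive the sourcetype
# from the matching directory name, e.g. RHEL_syslog or WindowEvents
vix.input.1.path = /data/auditlogs/${sourcetype}/...
```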
Whenever a search is performed across this virtual index, two sourcetypes should appear.
For comparison, the props.conf route (here, the Twitter data example from the Hunk tutorial) looks like this:

[source::/home/somepath/twitter/...]
priority = 100
sourcetype = twitter-hadoop
SHOULD_LINEMERGE = false
DATETIME_CONFIG = NONE

[twitter-hadoop]
KV_MODE = json
EVAL-_time = strptime(postedTime, "%Y-%m-%dT%H:%M:%S.%lZ")
Hi, I am looking for a solution that does not depend on props.conf. I have already created the sourcetype, but how can I apply it to a virtual index? That is the question.
Splunk documentation to the rescue. See this:
http://docs.splunk.com/Documentation/Hunk/6.2.5/Hunktutorial/SearchbySourcetype
Hi, this documentation, which mentions props.conf, applies the sourcetype to every index. I want it to apply only to a specific index.
IMO, the sourcetype is applied to a data input or data source, not to an index. props.conf will allow you to set the sourcetype for a source, and those sources are what get stored in the virtual index.
Hi,
I might not have explained my situation properly.
Let's say I have three types of data:
1) JSON
2) CSV
3) XML
I need to get the whole file for JSON and XML, and I need the data split into events when reading CSV. CSV data goes to one index and XML data goes to another.
In this case, can the data be handled according to those requirements? i.e., whole-file events for XML and JSON, and split events for CSV.
Can we do that with props.conf?
Give this a try:
[source::.../*.xml]
sourcetype = your_xml_sourcetype
priority = 100

[source::.../*.csv]
sourcetype = your_csv_sourcetype
priority = 100

[source::.../*.json]
# use the correct extension of your files
sourcetype = your_json_sourcetype
priority = 100

[your_csv_sourcetype]
# define properties per your requirements

[your_xml_sourcetype]
# define properties per your requirements

[your_json_sourcetype]
# define properties per your requirements
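As a starting point for those placeholder stanzas, one common core-Splunk pattern is to force whole-file events by breaking only before a pattern that never occurs in the data, and to keep CSV as one event per line. This is a sketch using the sourcetype names from above; it is untested against Hunk virtual indexes, so verify the behavior in your environment:

```
# Whole-file events: merge all lines and never break them apart,
# using a break pattern that should not appear in real data.
[your_xml_sourcetype]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = ^this_pattern_should_never_match$
TRUNCATE = 0
MAX_EVENTS = 100000

[your_json_sourcetype]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = ^this_pattern_should_never_match$
TRUNCATE = 0
MAX_EVENTS = 100000
KV_MODE = json

# One event per line for CSV.
[your_csv_sourcetype]
SHOULD_LINEMERGE = false
```

MAX_EVENTS caps the number of lines merged into one event, so raise it if your files are longer than that.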
Thank you somesoni, I will give it a try and let you know. Are you a Splunker? If so, is there a way to reach you by email or so?
You can identify Splunk employees by the [Splunk] after their username - therefore @somesoni2 is not a Splunker, but he once was 😉
For the most part, JSON does not need a sourcetype; Hunk understands that format without any additional work from you. CSV with a header also does not require any additional work.
So that means only your XML, and CSV without headers, will require some additional manipulation in the props.conf files.
In your case, are these 3 data types stored in the exact same HDFS directory (/user/data/alldata), or do you have /user/data/jsondata, /user/data/xmldata, and /user/data/csvdata?
Hi rdaga,
Yes, there is a chance that they might be in the same directory.
Is there also a solution if they reside in different directories?
Same Hadoop directory:

[source::/user/data/alldata/*.xml]
priority = 100
sourcetype = xml-hadoop

[source::/user/data/alldata/*.csv]
priority = 101
sourcetype = csv-hadoop

Different Hadoop directories:

[source::/user/data/xmldata/...]
priority = 100
sourcetype = xml-hadoop

[source::/user/data/csvdata/...]
priority = 101
sourcetype = csv-hadoop