How to specify a source type for virtual indexes?

sdaruna
Explorer

Hi,

I have data in HDFS and I am creating virtual indexes to access it. However, I need to get the whole content of each file as a single event. For that, I have already created a source type that reads the whole file. How can I apply that source type to virtual indexes?

0 Karma

lloydd518
Path Finder

This technique doesn't seem to be very well documented, and it looks like Splunk prefers that you do this in a props.conf file.
This answer shows how to do it from the virtual index UI for a Hadoop provider.

In the UI, select Settings > Virtual indexes.
Ensure you have a working data provider configured.
Then, within the Virtual indexes menu, create a new virtual index.

This example uses the following folders:
/data/auditlogs/RHEL_syslog
/data/auditlogs/WindowEvents

Within the UI, an admin should enter the following HDFS path setting:
/data/auditlogs/${sourcetype}

The admin could apply a whitelist if only one of the folders needs to be searched.

Applying the ${sourcetype} variable in the UI writes the setting to a props.conf file.
Whenever a search is run across this virtual index, two sourcetypes should appear, one per folder.
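
For anyone who prefers configuration files over the UI, here is a rough sketch of what an equivalent virtual index stanza might look like. It assumes the virtual index is defined in indexes.conf through vix.* attributes. The stanza name and provider name are hypothetical, and the whitelist line is optional.

```ini
# indexes.conf (sketch; stanza and provider names are hypothetical)
[auditlogs_vix]
vix.provider = my_hadoop_provider
# ${sourcetype} takes its value from the directory name at this position,
# e.g. RHEL_syslog or WindowEvents
vix.input.1.path = /data/auditlogs/${sourcetype}
# Optional whitelist: only paths matching this regex are searched
vix.input.1.accept = /RHEL_syslog/
```

Check the result with a quick search against the virtual index and confirm the sourcetype field shows the folder names.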

0 Karma

rdagan_splunk
Splunk Employee

[source::/home/somepath/twitter/...]
priority = 100
sourcetype = twitter-hadoop
SHOULD_LINEMERGE = false
DATETIME_CONFIG = NONE

[twitter-hadoop]
KV_MODE = json
EVAL-_time = strptime(postedTime, "%Y-%m-%dT%H:%M:%S.%lZ")

0 Karma

sdaruna
Explorer

Hi, I am looking for a solution that does not depend on props.conf. I have already created the source type, but how can I apply it to a virtual index? That is the question.

0 Karma

somesoni2
Revered Legend
0 Karma

sdaruna
Explorer

Hi, the documentation mentions props.conf, but that applies the source type to every index. I want it applied only to a specific index.

0 Karma

somesoni2
Revered Legend

IMO, the sourcetype is applied to a data input or data source, not to an index. Props.conf lets you set the sourcetype for a source whose data is stored in a virtual index.
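
In practice this means a props.conf rule can be limited to one virtual index by matching only the HDFS path that index reads from. A minimal sketch, assuming a hypothetical directory /data/index_a and a hypothetical sourcetype name whole_file:

```ini
# props.conf (sketch; path and sourcetype name are hypothetical)
# Only files under this HDFS directory get the sourcetype, so virtual
# indexes that point at other directories are not affected.
[source::/data/index_a/...]
sourcetype = whole_file
priority = 100
```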

0 Karma

sdaruna
Explorer

Hi,

I might not have explained my situation properly.
Let's say I have three types of data:

1) JSON
2) CSV
3) XML

I need to get the whole file as one event for JSON and XML, and I need the data split into events when reading CSV. CSV data goes to one index and XML data goes to another.

In this case, can each be shown according to its requirements, i.e. whole-file data for XML and JSON and split data for CSV?

Can we do that with props.conf?

0 Karma

somesoni2
Revered Legend

Give this a try:

[source::.../*.xml]
sourcetype = your_xml_sourcetype
priority = 100

[source::.../*.csv]
sourcetype = your_csv_sourcetype
priority = 100

# Use the correct file extension here
[source::.../*.json]
sourcetype = your_json_sourcetype
priority = 100

[your_csv_sourcetype]
# define properties per your requirements

[your_xml_sourcetype]
# define properties per your requirements

[your_json_sourcetype]
# define properties per your requirements
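
As a rough idea of what the per-sourcetype properties could look like for the "whole file as one event" case versus the "one event per line" case, here is a sketch. The settings are standard props.conf attributes, but whether they give the desired result on a virtual index is an assumption that should be tested against your own data.

```ini
# props.conf (sketch; verify the behavior against your own files)
[your_xml_sourcetype]
SHOULD_LINEMERGE = false
# This regex can never match, so no line break is found
# and the file is kept together as a single event
LINE_BREAKER = ((?!))
# 0 disables truncation of long events
TRUNCATE = 0
KV_MODE = xml

[your_csv_sourcetype]
SHOULD_LINEMERGE = false
# Break on newlines: one event per CSV row
LINE_BREAKER = ([\r\n]+)
```

The JSON sourcetype could follow the same pattern as the XML one, with KV_MODE = json.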
0 Karma

sdaruna
Explorer

Thank you, somesoni2. I will give it a try and let you know. Are you a Splunker? If so, is there a way to reach you by email?

0 Karma

MuS
Legend

You can identify Splunk employees by the [Splunk] after their username - therefore @somesoni2 is no splunkr, but he once was 😉

0 Karma

rdagan_splunk
Splunk Employee

For the most part, JSON does not need a source type. Hunk understands that format without any additional work from you. CSV with a header also does not require any additional work.

So only your XML and your CSV without headers will require additional configuration in props.conf.

In your case, are these 3 data types stored in the exact same HDFS directory (/user/data/alldata), or do you have separate directories such as /user/data/jsondata, /user/data/xmldata, and /user/data/csvdata?

0 Karma

sdaruna
Explorer

Hi rdagan,

Yes, there is a chance that they are in the same directory.
Is there a solution if they reside in different directories?

0 Karma

rdagan_splunk
Splunk Employee

Same Hadoop directory:
[source::/user/data/alldata/*.xml]
priority = 100
sourcetype = xml-hadoop

[source::/user/data/alldata/*.csv]
priority = 101
sourcetype = csv-hadoop

Different Hadoop directory:
[source::/user/data/xmldata/...]
priority = 100
sourcetype = xml-hadoop

[source::/user/data/csvdata/...]
priority = 101
sourcetype = csv-hadoop

0 Karma