Your daily limit is set by the size of the license you buy. You determine the size of your license by adding up the number of GB you plan to send to Splunk each day. That number may be available in your current SIEM; otherwise, you'll have to do some research to find out where the SIEM is getting its data and how big that data is.
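As a rough illustration of the "add up the GB per day" step, here is a minimal Python sketch. The source names, the volumes, and the `headroom` parameter are all made-up placeholders, not figures from this thread or anything Splunk provides; substitute whatever your own research turns up.

```python
def estimate_daily_gb(sources_gb_per_day, headroom=0.0):
    """Sum per-source daily volumes (in GB) and pad by a fractional headroom.

    sources_gb_per_day: dict mapping a source name to its estimated GB/day.
    headroom: extra fraction to add on top (0.25 means +25%); hypothetical knob.
    """
    if headroom < 0:
        raise ValueError("headroom must be non-negative")
    total = sum(sources_gb_per_day.values())
    return total * (1.0 + headroom)


# Hypothetical per-source estimates in GB/day (placeholders only).
sources = {
    "firewall": 400.0,
    "windows_events": 350.0,
    "proxy": 250.0,
}

baseline = estimate_daily_gb(sources)
padded = estimate_daily_gb(sources, headroom=0.25)

print(f"Baseline estimate: {baseline:.0f} GB/day")
print(f"With 25% headroom: {padded:.0f} GB/day")
```

The headroom is only there to leave room for data the current SIEM never saw; how much to add is a judgment call for your environment.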
Can I look at the size of each index file in SPLUNK_DB and add them up? Is it the indexed data that counts against my daily limit, or the ingested data?
We estimated that our current SIEM handles about 1 TB a day, but since it only ingests security events and throws away any event it can't parse, I am expecting Splunk to ingest a much larger amount on a daily basis.
The amount of data ingested is what counts against your daily license limit. The indexes are compressed and have metadata added, so their size on disk will not give you an accurate figure. Go to Settings->Licensing and click the "Usage report" button to see how much of your license you've used.
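The TB/day estimate is a good start. Splunk can also discard events before they are indexed (for example, by routing them to the nullQueue using props.conf and transforms.conf), and events discarded that way do not count against your license.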