MLTKC: How should I check a Jupyter notebook function in Splunk?
ryanaa
Explorer
02-04-2025
01:55 AM
I want to use an autoencoder model in Splunk for anomaly detection. I have already built my own model, and I did not use a scaler during the process. However, I still encountered the following error.
Here is my code:
I want to check the fields returned by my function in the search bar. What syntax can I use to verify this?
This is my Python code:
import numpy as np
import pandas as pd
from tensorflow.keras.models import Model  # assumed import path; only needed for return_hidden

def apply(model, df, param):
    """Apply the model for anomaly detection (no scaling)."""
    X = df[param['feature_variables']].copy()
    # 1. Type conversion
    X = X.replace({True: 1, False: 0})
    # 2. Handle special characters / missing values
    X = X.apply(pd.to_numeric, errors='coerce')  # values that cannot be converted become NaN
    X = X.fillna(0)
    # 3. Unify types
    X = X.astype('float32').values
    # X = df[param['feature_variables']].values

    # Reconstruct the input
    X_reconstructed = model.predict(X)
    # Compute the reconstruction error per row
    reconstruction_errors = np.mean(np.square(X - X_reconstructed), axis=1)

    # Set the anomaly threshold (cast in case the option arrives as a string)
    params = param.get('options', {}).get('params', {})
    threshold_percentile = float(params.get('threshold_percentile', 95))
    threshold = np.percentile(reconstruction_errors, threshold_percentile)

    # Build the result
    df_result = df.copy()
    df_result['reconstruction_error'] = reconstruction_errors
    filtered_errors_1 = df_result.loc[df_result['is_work'] == 1, 'reconstruction_error']
    filtered_errors_0 = df_result.loc[df_result['is_work'] == 0, 'reconstruction_error']
    threshold_1 = np.percentile(filtered_errors_1, threshold_percentile) if not filtered_errors_1.empty else np.nan
    threshold_0 = np.percentile(filtered_errors_0, threshold_percentile) if not filtered_errors_0.empty else np.nan
    # Per-row threshold, chosen by the is_work group
    df_result['threshold'] = np.where(df_result['is_work'] == 1, threshold_1, threshold_0)
    # Note: is_anomaly compares against the global threshold, not the per-group 'threshold' column
    df_result['is_anomaly'] = (reconstruction_errors > threshold).astype(int)

    # Optional hidden-layer features
    if params.get('return_hidden', False):
        intermediate_model = Model(inputs=model.inputs, outputs=model.layers[1].output)
        hidden = intermediate_model.predict(X)
        # Reuse df_result's index so concat aligns rows even if the index is not 0..n-1
        hidden_df = pd.DataFrame(hidden, index=df_result.index,
                                 columns=[f"hidden_{i}" for i in range(hidden.shape[1])])
        df_result = pd.concat([df_result, hidden_df], axis=1)
    return df_result
I used apply to call this model, and I want to see the threshold field returned in df_result.
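One way to narrow this down is to confirm, inside the notebook, which columns the thresholding step produces before going through Splunk at all. The following is only a minimal sketch under stated assumptions: `StubAutoencoder` is a hypothetical stand-in for the trained model, `add_thresholds` is a hypothetical helper that mirrors the per-`is_work` threshold logic, and the sample values are made up. Unlike the code above, this variant compares each row against its own group's threshold.

```python
import numpy as np
import pandas as pd


class StubAutoencoder:
    """Hypothetical stand-in for the trained autoencoder (not a real Keras model)."""

    def predict(self, X):
        # Pretend reconstruction: scale every input by 0.9.
        return X * 0.9


def add_thresholds(df, errors, percentile=95.0):
    """Attach reconstruction errors and a per-is_work percentile threshold."""
    out = df.copy()
    out["reconstruction_error"] = errors
    # Percentile of the error within each is_work group, broadcast back to the rows.
    out["threshold"] = out.groupby("is_work")["reconstruction_error"].transform(
        lambda s: np.percentile(s, percentile)
    )
    # Assumption: flag a row when its error exceeds its own group's threshold.
    out["is_anomaly"] = (out["reconstruction_error"] > out["threshold"]).astype(int)
    return out


# Made-up sample data with the same column roles as in the post.
df = pd.DataFrame(
    {
        "f1": [1.0, 2.0, 3.0, 40.0],
        "f2": [0.5, 0.1, 0.2, 9.0],
        "is_work": [1, 1, 0, 0],
    }
)
X = df[["f1", "f2"]].astype("float32").values
errors = np.mean(np.square(X - StubAutoencoder().predict(X)), axis=1)

result = add_thresholds(df, errors)
print(result.columns.tolist())
print(result[["is_work", "reconstruction_error", "threshold", "is_anomaly"]])
```

If `threshold` shows up in the printed column list here but not in the Splunk search results, the difference is probably in how the fields are passed back to the search rather than in the function itself. I have not verified that part.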
