MLTKC how should i check jupyter notebook func in ...

ryanaa · ‎02-04-2025

I want to use an autoencoder model in Splunk for anomaly detection. I have already built my own model, and I did not use a scaler during the process. However, I still encountered the following error.

Here is my code:

I want to check the fields returned by my func in the search bar. What syntax can I use to verify this?

This is my Python code:

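import numpy as np
import pandas as pd
from tensorflow.keras.models import Model  # assumption: the autoencoder is a Keras model

def apply(model, df, param):
    """
    Apply the model for anomaly detection (no scaling).
    """
    X = df[param['feature_variables']].copy()
    # 1. Type conversion: map booleans to 0/1
    X = X.replace({True: 1, False: 0})

    # 2. Handle special characters / missing values
    X = X.apply(pd.to_numeric, errors='coerce')  # values that cannot be converted become NaN
    X = X.fillna(0)

    # 3. Cast everything to a single dtype
    X = X.astype('float32').values
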
    # X = df[param['feature_variables']].values
    
    # Reconstruct the input
    X_reconstructed = model.predict(X)

    # Compute the reconstruction error (mean squared error per row)
    reconstruction_errors = np.mean(np.square(X - X_reconstructed), axis=1)

    # Anomaly threshold (global, over all rows)

    threshold_percentile = param.get('options', {}).get('params', {}).get('threshold_percentile', 95)
    threshold = np.percentile(reconstruction_errors, threshold_percentile)

    # Build the result
    df_result = df.copy()

    df_result['reconstruction_error'] = reconstruction_errors
    filtered_errors_1 = df_result.loc[df_result['is_work'] == 1, 'reconstruction_error']
    filtered_errors_0 = df_result.loc[df_result['is_work'] == 0, 'reconstruction_error']

    threshold_1 = np.percentile(filtered_errors_1, threshold_percentile) if not filtered_errors_1.empty else np.nan
    threshold_0 = np.percentile(filtered_errors_0, threshold_percentile) if not filtered_errors_0.empty else np.nan
    df_result['threshold'] = np.where(df_result['is_work'] == 1, threshold_1, threshold_0)

    
    df_result['is_anomaly'] = (reconstruction_errors > threshold).astype(int)
    
    # Optionally return hidden-layer features
    if param.get('options', {}).get('params', {}).get('return_hidden', False):
        intermediate_model = Model(inputs=model.inputs, outputs=model.layers[1].output)
        hidden = intermediate_model.predict(X)
        # Reuse df_result's index so concat aligns rows even if the index is not 0..n-1
        hidden_df = pd.DataFrame(hidden, columns=[f"hidden_{i}" for i in range(hidden.shape[1])], index=df_result.index)
        df_result = pd.concat([df_result, hidden_df], axis=1)
    
    return df_result

I call this model with apply, but I want to see the threshold field that is returned in df_result.
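
For reference, one way to check which fields the thresholding step produces, before going through Splunk at all, is to run the same logic in the notebook on a tiny DataFrame. This is only a sketch: `StubAutoencoder`, `per_group_thresholds`, and the `cpu`/`mem` columns are hypothetical stand-ins, not part of my real model or of the container API.

```python
import numpy as np
import pandas as pd


class StubAutoencoder:
    """Hypothetical stand-in for the trained model: returns a damped copy of the input."""

    def predict(self, X):
        return X * 0.5


def per_group_thresholds(df, errors, percentile=95):
    """Mirror the thresholding step of apply(): one percentile threshold per is_work value."""
    out = df.copy()
    out["reconstruction_error"] = errors
    work = out.loc[out["is_work"] == 1, "reconstruction_error"]
    idle = out.loc[out["is_work"] == 0, "reconstruction_error"]
    t1 = np.percentile(work, percentile) if not work.empty else np.nan
    t0 = np.percentile(idle, percentile) if not idle.empty else np.nan
    out["threshold"] = np.where(out["is_work"] == 1, t1, t0)
    return out


# Made-up sample rows; the feature names are assumptions for illustration only.
df = pd.DataFrame({
    "cpu": [1.0, 2.0, 3.0, 40.0],
    "mem": [1.0, 1.0, 2.0, 2.0],
    "is_work": [1, 1, 0, 0],
})
X = df[["cpu", "mem"]].astype("float32").values
errors = np.mean(np.square(X - StubAutoencoder().predict(X)), axis=1)
result = per_group_thresholds(df, errors)

# List every column the function would hand back, then look at the ones of interest.
print(result.columns.tolist())
print(result[["is_work", "reconstruction_error", "threshold"]])
```

If `threshold` shows up here, then on the Splunk side I would expect that appending something like `| table reconstruction_error threshold is_anomaly` to the search that calls the model lists those fields, assuming the container passes every column of the returned DataFrame back to the search.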
