Create a random forest model.

Parameters:

See dedicated page for more information.
The R_RandomForest2 action button creates a Random Forest classification model based on a given dataset. It automatically evaluates predictor importance, computes performance metrics, and generates validation visuals such as confusion matrices and ROC curves. This action supports saving the model, making predictions, and outputting all key visual artifacts.
Required Input File: Structured CSV containing categorical and numerical predictors with a binary or multiclass target.
Columns Used in Example (Demographic_Purchase_Data_Expanded.csv):
age, income
gender, job_type
purchase
Minimum Requirement:
At least 50 rows for reliable model training and cross-validation.
For test runs with smaller datasets (e.g., 10 rows), reduce training complexity (see Troubleshooting).
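The input requirements above can be checked before the file reaches the pipeline. The following is a minimal sketch, not part of the action itself: the function name `check_input` is hypothetical, the column names are the ones listed for the example file, and a comma delimiter is assumed (adjust it if your CSV uses a different separator).

```python
import csv
import io

# Column names taken from the example file; the 50-row threshold is the
# minimum stated in the requirements. The function name is hypothetical.
REQUIRED_COLUMNS = ["age", "income", "gender", "job_type", "purchase"]
MIN_ROWS = 50


def check_input(csv_file, required=REQUIRED_COLUMNS, min_rows=MIN_ROWS, delimiter=","):
    """Return a list of readable problems found in the CSV (empty if none)."""
    reader = csv.DictReader(csv_file, delimiter=delimiter)
    header = reader.fieldnames or []
    problems = [f"missing column: {name}" for name in required if name not in header]
    # Count the data rows that follow the header line.
    n_rows = sum(1 for _ in reader)
    if n_rows < min_rows:
        problems.append(f"only {n_rows} rows, at least {min_rows} recommended")
    return problems


# Usage with an in-memory sample; a real run would pass open(path, newline="").
sample = io.StringIO("age,income,gender,job_type,purchase\n34,52000,F,engineer,1\n")
print(check_input(sample))
```

An empty list means the file has the expected columns and enough rows. A non-empty list names each problem so it can be fixed before training.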
The action generates the following outputs:
Model file (.rModel): saved if OUT: save model = ON


Prepare Input File
Load Demographic_Purchase_Data_Expanded.csv using the readCSV action.
Add and Connect R_RandomForest2
Connect the CSV node to the R_RandomForest2 action. Optionally connect a RunToFinishLine.

Configure Parameters
Define predictor columns, target column, task type, and output folder.
⚠ Avoid using spaces in OUT: time stamp to prevent image export errors (see Troubleshooting).

Run the Pipeline
Execute the pipeline. View logs and artifacts in the Records tab.
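The warning about spaces in the output folder name can be handled before the run. This sketch shows one way to derive a space-free folder name and make sure the folder exists. The helper names `safe_folder_name` and `prepare_output_folder` are hypothetical, and replacing whitespace with underscores is only one possible convention.

```python
import os
import re
import tempfile


def safe_folder_name(name):
    """Replace runs of whitespace with underscores so the name has no spaces."""
    return re.sub(r"\s+", "_", name.strip())


def prepare_output_folder(base_dir, name):
    """Create base_dir/<safe name> if it does not exist and return the path."""
    path = os.path.join(base_dir, safe_folder_name(name))
    os.makedirs(path, exist_ok=True)
    return path


# Usage: a temporary directory stands in for the real output location.
with tempfile.TemporaryDirectory() as base:
    out = prepare_output_folder(base, "Random Forest Results")
    print(os.path.basename(out))  # Random_Forest_Results
```

Only the folder name is sanitized here. If the parent directory itself contains spaces, it would need to be moved or renamed separately.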

Not enough data to run a model
Cause: Dataset too small for current settings (e.g., 10 rows with 5-fold CV and 500 trees).
Fix:
idxXVal → 2
idNTree → 50
Max terminal node size → 5
Learning sample proportion → 0.6

ggsave(): Cannot find directory
Cause: R cannot save images if the folder path (OUT: time stamp) contains spaces or does not exist.
Fix:
Set OUT: time stamp to a simple name like RandomForestResults.

In Custom Partitioning Mode, ensure your input table is pre-sorted.
Requires the ggplot2, randomForest, and caret libraries.

0: Scored table
1: Coefficients
2: Variable Importance
3: Variable names and recoded values
4: File info
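Returning to the "Not enough data to run a model" fix in Troubleshooting: the reduced values can be kept together and applied whenever the dataset falls below the 50-row minimum. This is an illustrative sketch only. The function `settings_for` is hypothetical, the dictionary keys simply mirror the parameter labels, and the action itself is configured through its own parameter panel rather than through Python.

```python
# Reduced values copied from the Troubleshooting fix. The keys mirror the
# parameter labels; the function below is a hypothetical helper.
REDUCED_SETTINGS = {
    "idxXVal": 2,
    "idNTree": 50,
    "Max terminal node size": 5,
    "Learning sample proportion": 0.6,
}
MIN_ROWS = 50  # minimum row count stated in the input requirements


def settings_for(n_rows, current):
    """Return a copy of the settings, using reduced values for small datasets."""
    if n_rows >= MIN_ROWS:
        return dict(current)
    # Reduced values override whatever is currently configured.
    return {**current, **REDUCED_SETTINGS}


# Usage: the failing case from Troubleshooting (10 rows, 5-fold CV, 500 trees).
current = {"idxXVal": 5, "idNTree": 500}
print(settings_for(10, current))
```

Datasets at or above the threshold keep their configured values unchanged, so the same helper can run on every dataset.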