--- title: "Pomodoro_Vignette" author: "Seyma Kalay" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Pomodoro_Vignette} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(pomodoro) ``` This package was set for the Credit Access studies. But it can be used for the binary and multiple factor variables. First thing let's see the str of the *sample_data* with `str(sample_data)`. Since the dataset is huge, let's take the first 500 rows and set the study on it. The following example run the `multinominal logistic model` in `yvar`. The function simplifies the 80/20 train test set using 10cv after scaled and center it. ```{r RF, echo=FALSE} sample_data <- sample_data[c(1:750),] yvar <- c("Loan.Type") xvar <- c("sex", "married", "age", "havejob", "educ", "political.afl", "rural", "region", "fin.intermdiaries", "fin.knowldge", "income") BchMk.RF <- RF_Model(sample_data, c(xvar, "networth"), yvar ) BchMk.RF$Roc$auc ``` ### Estimate_Models Estimate_Models function considers `exog` and `xadd` variables and set multiple models based on the selected `exog` and `xadd`. On the one hand `exog` is subtract the selected vector from the dataset and run the model for all the dataset and for the splits of the `exog`. On the other hand `xadd` add the selected vectors and run the model. Where the `dnames` are the unique values in `exog` this is to save the model estimates by their name. ```{r EstModel, echo=T} sample_data <- sample_data[c(1:750),] yvar <- c("Loan.Type") xvar <- c("sex", "married", "age", "havejob", "educ", "political.afl", "rural", "region", "fin.intermdiaries", "fin.knowldge", "income") CCP.RF <- Estimate_Models(sample_data, yvar, xvec = xvar, exog = "political.afl", xadd = c("networth", "networth_homequity", "liquid.assets"), type = "RF", dnames = c("0","1")) ``` ### Combined_Performance Estimate_Models gives the results based on the splits of the `exog`. Combined_Performance prints out the total performance of these splits. ```{r ComPerf, echo=T} Sub.CCP.RF <- list(Mdl.1 = CCP.RF$EstMdl$`D.1+networth`, Mdl.0 = CCP.RF$EstMdl$`D.0+networth`) CCP.NoCCP.RF <- Combined_Performance (Sub.CCP.RF) ```