Classification and Modes

Practical route to nonparametric classification and conditional-mode workflows in np.

Keywords

npconmode, classification, conditional mode, mode estimation, birthwt

The np package can also be used for nonparametric classification: estimate the conditional density (or conditional probability) of the outcome, then classify each observation by its conditional mode. This route is useful when a simple parametric classification model is too restrictive for the data.

If you want a minimal downloadable script for the classification route, start with np_classification_quickstart.R.

The basic idea

For binary, multinomial, or ordered outcomes, one practical route is to estimate a nonparametric conditional density and then classify using the conditional mode. In np, the function most directly associated with this workflow is npconmode.

This is especially useful when the covariates include a mixture of continuous and categorical variables.

In statistical terms, the fitted class is the outcome value with the largest estimated conditional probability at the conditioning point. When the response has discrete support, this is the conditional mode. The bandwidths and any local-polynomial degree search are inherited from the conditional-density bandwidth route used by npcdensbw.
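
As a minimal sketch of that rule in base R: given a matrix of estimated class probabilities (one row per conditioning point, one column per outcome level), the conditional mode is the column with the largest entry in each row. The matrix `p_hat` and its values are made up for illustration and do not come from np.

```r
## Hypothetical estimated probabilities: rows are conditioning points,
## columns are the outcome levels "0" and "1" (values are made up).
p_hat <- matrix(c(0.80, 0.20,
                  0.35, 0.65,
                  0.55, 0.45),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("0", "1")))

## Conditional mode: the level with the largest probability in each row.
## ties.method = "first" makes tie-breaking deterministic.
mode_index <- max.col(p_hat, ties.method = "first")
cond_mode <- factor(colnames(p_hat)[mode_index], levels = colnames(p_hat))

cond_mode
```

Returning a factor with the full set of levels keeps the predictions comparable with the observed outcome even when one level is never predicted.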

A simple binary-outcome example

The older birthwt example remains a good illustration because it contains several categorical covariates that must be converted to factors before fitting.

## Class the categorical variables correctly, then fit parametric and np models
library(np)
data(birthwt, package = "MASS")

birthwt$low <- factor(birthwt$low)
birthwt$smoke <- factor(birthwt$smoke)
birthwt$race <- factor(birthwt$race)
birthwt$ht <- factor(birthwt$ht)
birthwt$ui <- factor(birthwt$ui)
birthwt$ftv <- factor(birthwt$ftv)

model_logit <- glm(low ~ smoke + race + ht + ui + ftv + age + lwt,
  family = binomial(link = logit),
  data = birthwt)

model_np <- npconmode(low ~ smoke + race + ht + ui + ftv + age + lwt,
  data = birthwt)

summary(model_np)

The default route uses the usual local-constant conditional-density fit. You can also allow the bandwidth search to choose a local-polynomial degree when there is at least one continuous conditioning variable:

model_np_nomad <- npconmode(low ~ smoke + race + ht + ui + ftv + age + lwt,
  data = birthwt,
  ## nomad = "auto" uses exhaustive search for p = 1 and NOMAD otherwise.
  nomad = "auto")

summary(model_np_nomad)

For non-local-constant routes, npconmode works with the full set of fitted probabilities over the discrete response support before choosing the modal outcome. The reported modal class is therefore based on probabilities that are non-negative and sum to one across the observed support.
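
To illustrate why the full probability set matters: a higher-order local fit can return raw estimates that are slightly negative or that do not sum to one. One simple repair, shown here purely as an assumption for illustration and not as a description of np's internals, is to truncate at zero and renormalise each row before taking the mode. The values in `raw` are made up.

```r
## Hypothetical raw estimates over a three-level support (made-up values);
## the first row has a small negative entry and the rows do not sum to one.
raw <- matrix(c(-0.02, 0.60, 0.45,
                 0.30, 0.30, 0.35),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("a", "b", "c")))

## Truncate at zero, then rescale each row so it sums to one.
clipped <- pmax(raw, 0)
probs <- clipped / rowSums(clipped)

rowSums(probs)                                          # each row sums to one
colnames(probs)[max.col(probs, ties.method = "first")]  # modal level per row
```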

The returned object also retains the conditional-density bandwidth object and mirrors the selected regression type, degree, NOMAD shortcut metadata, and search-engine information used to build the classifier. If you want to inspect the fitted class-probability matrix directly, set probabilities = TRUE.

Comparing with a parametric classifier

One natural comparison is a confusion matrix against a parametric logit model.

## Compare the parametric confusion matrix with the np conditional-mode result
pred_logit <- factor(ifelse(fitted(model_logit) > 0.5, "1", "0"),
  levels = levels(birthwt$low))

## Rows are the observed classes, columns are the logit predictions
cm_logit <- table(actual = birthwt$low, predicted = pred_logit)

cm_logit
model_np$confusion.matrix
model_np_nomad$confusion.matrix

mean(pred_logit == birthwt$low)
model_np$CCR.overall
model_np_nomad$CCR.overall

The exact comparison will of course depend on the data and the tuning, but this is the right general workflow when you want to compare a flexible nonparametric classifier to a familiar parametric benchmark.
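
The summary rates used in this comparison can be recomputed from any confusion matrix. The sketch below uses made-up counts, not results from birthwt, to derive the overall correct classification ratio, the ratio by outcome, and the rate obtained by always predicting the most frequent class, which is a useful floor for either classifier. The helper name `ccr_summary` is hypothetical, and it assumes that rows hold the actual classes and columns hold the predicted classes in the same level order.

```r
## Hypothetical helper: summary rates from a square confusion matrix
## with actual classes in rows and predicted classes in columns.
ccr_summary <- function(cm) {
  stopifnot(nrow(cm) == ncol(cm))
  list(
    overall    = sum(diag(cm)) / sum(cm),     # share classified correctly
    by_outcome = diag(cm) / rowSums(cm),      # share correct within each actual class
    baseline   = max(rowSums(cm)) / sum(cm)   # always predict the largest class
  )
}

## Made-up counts for illustration only.
cm <- matrix(c(40, 10,
               15, 35),
  nrow = 2, byrow = TRUE,
  dimnames = list(actual = c("0", "1"), predicted = c("0", "1")))

ccr_summary(cm)
```

A classifier whose overall ratio barely exceeds the baseline is adding little, however flexible it is.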

Why this route is useful

It handles mixed data naturally.
It avoids forcing a specific link and linear index structure from the outset.
It can uncover classification structure that a simple parametric model misses.

Practical notes

Make sure categorical variables are properly classed before fitting.
Start with a modest problem size first.
Do not casually override search tolerances just because an older example did so for convenience.
Treat the confusion matrix as one useful summary, not the whole story.
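
The first note can be checked mechanically. As a sketch, the following inspects the column classes of birthwt before and after conversion; the vector name `cat_vars` is just a label used here. If a covariate is naturally ordered, `ordered()` can be used in place of `factor()` so that it is treated as an ordered categorical variable.

```r
data(birthwt, package = "MASS")

## Columns that should be categorical in this example.
cat_vars <- c("low", "smoke", "race", "ht", "ui", "ftv")

## Before conversion these columns are stored as numbers.
sapply(birthwt[cat_vars], class)

## Convert them all to unordered factors in one step.
birthwt[cat_vars] <- lapply(birthwt[cat_vars], factor)

## After conversion every listed column should report "factor".
sapply(birthwt[cat_vars], class)
```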

What npconmode is really doing

Conceptually, the model is building a nonparametric representation of the conditional distribution and then using the most likely outcome category at each conditioning point. That is why the function is useful for classification and modal regression problems.
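
To make that concrete, here is a deliberately simplified base-R sketch with one continuous covariate: the class probabilities at a point are kernel-weighted class frequencies, and the fitted class is the most probable level. The function name `kernel_mode`, the Gaussian kernel, the fixed bandwidth `h`, and the toy data are all assumptions made for illustration; np itself selects bandwidths by data-driven search and handles mixed covariate types.

```r
## Hypothetical helper: kernel-weighted class probabilities at x0, then the mode.
kernel_mode <- function(x0, x, y, h) {
  w <- dnorm((x - x0) / h)             # Gaussian kernel weights around x0
  probs <- tapply(w, y, sum) / sum(w)  # weighted share of each outcome level
  list(probs = probs, mode = names(probs)[which.max(probs)])
}

## Toy data: outcome "0" sits at low x, outcome "1" at high x.
x <- c(seq(-3, -1, length.out = 20), seq(1, 3, length.out = 20))
y <- factor(rep(c("0", "1"), each = 20))

kernel_mode(-2, x, y, h = 0.5)$mode
kernel_mode( 2, x, y, h = 0.5)$mode
kernel_mode( 0, x, y, h = 0.5)$probs   # near the boundary the classes are close
```

The bandwidth controls how local the weighted frequencies are: a small `h` follows nearby observations closely, while a large `h` pulls every point toward the overall class shares.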

The underlying conditional-density machinery is the same mixed-data kernel framework used elsewhere in np. The package help pages carry the argument details; this Gallery page is meant to show the classification workflow and where the fitted summaries come from.