Here, we present an example of a regression problem. We are concerned with the fat content of meat samples as a scalar response. We will use the absorbance curves and their derivatives (functional covariates) along with the water content (scalar covariate) as our predictors.
First, we’ll read in the data and load some libraries.
# loading data
tecator = FNN::tecator
## Warning: replacing previous import 'caret::train' by 'tensorflow::train' when
## loading 'FNN'
# libraries
library(fda)
## Loading required package: Matrix
##
## Attaching package: 'fda'
## The following object is masked from 'package:graphics':
##
## matplot
library(FNN)
Before doing anything else, we’re going to do some pre-processing to get our functional observations. Remember, we can let the fnn.fit() function do this for us (as seen in the classification example), but for this example, let’s do it ourselves.
# define the time points on which the functional predictor is observed
timepts = tecator$absorp.fdata$argvals

# define the fourier basis
nbasis = 29
spline_basis = create.fourier.basis(tecator$absorp.fdata$rangeval, nbasis)

# convert the functional predictor into an fda object and get the derivatives
tecator_fd = Data2fd(timepts, t(tecator$absorp.fdata$data), spline_basis)
tecator_deriv = deriv.fd(tecator_fd)
tecator_deriv2 = deriv.fd(tecator_deriv)
In the chunk of code above, we create functional observations using a 29-term Fourier basis expansion. These functions are available in the fda package. The Data2fd() function converts the raw data into the functional data objects that we need. As alluded to earlier, we are going to use multiple functional covariates, namely the derivatives of the absorbance curves; we can easily acquire these derivatives with the deriv.fd() function.
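As a quick sanity check (not part of the original workflow), we can evaluate the smoothed curves back on the observation grid with eval.fd() and compare against the raw absorbance values to confirm the basis representation is adequate:

```r
# Evaluate the smoothed curves on the original wavelength grid
absorp_hat = eval.fd(timepts, tecator_fd)

# Maximum reconstruction error for the first absorbance curve;
# a small value indicates the 29-term expansion fits the data well
max(abs(absorp_hat[, 1] - tecator$absorp.fdata$data[1, ]))
```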
Let’s now get our scalar covariate:
# Non-functional covariate
tecator_scalar = data.frame(water = tecator$y$Water)
And our response:
# Response
tecator_resp = tecator$y$Fat
We now need to create a tensor containing the functional covariates (as defined by their coefficients) so that it can be passed into the main model function:
# Getting data into the right format
tecator_data = array(dim = c(nbasis, length(tecator_resp), 3))
tecator_data[,,1] = tecator_fd$coefs
tecator_data[,,2] = tecator_deriv$coefs
tecator_data[,,3] = tecator_deriv2$coefs
As the last step before building our model, we create a train/test split. This takes quite a few lines of code, but all of them simply split each of the functional and scalar covariates, as well as the response. In this case, we will use the first 165 curves as the training set and the final 50 as the test set.
# Splitting into test and train
ind = 1:165
tec_data_train = tecator_data[, ind, ]
tec_data_test = tecator_data[, -ind, ]
tecResp_train = tecator_resp[ind]
tecResp_test = tecator_resp[-ind]
scalar_train = data.frame(tecator_scalar[ind, 1])
scalar_test = data.frame(tecator_scalar[-ind, 1])
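It never hurts to confirm the shapes after a split like this. The tecator data set contains 215 samples in total, so with 165 in training, the test arrays should hold the remaining 50 curves (this check is an addition, assuming the objects defined above):

```r
# Coefficients x curves x (function + two derivatives)
dim(tec_data_train)   # 29 165 3
dim(tec_data_test)    # 29 50 3
length(tecResp_test)  # 50
```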
We now build the model:
# Setting up network
tecator_fnn = fnn.fit(resp = tecResp_train,
                      func_cov = tec_data_train,
                      scalar_cov = scalar_train,
                      basis_choice = c("fourier", "fourier", "fourier"),
                      num_basis = c(5, 5, 7),
                      hidden_layers = 4,
                      neurons_per_layer = c(64, 64, 64, 64),
                      activations_in_layers = c("relu", "relu", "relu", "linear"),
                      domain_range = list(c(850, 1050), c(850, 1050), c(850, 1050)),
                      epochs = 300,
                      learn_rate = 0.002)
## Model
## ________________________________________________________________________________
## Layer (type) Output Shape Param #
## ================================================================================
## dense (Dense) (None, 64) 1216
## ________________________________________________________________________________
## dense_1 (Dense) (None, 64) 4160
## ________________________________________________________________________________
## dense_2 (Dense) (None, 64) 4160
## ________________________________________________________________________________
## dense_3 (Dense) (None, 64) 4160
## ________________________________________________________________________________
## dense_4 (Dense) (None, 1) 65
## ================================================================================
## Total params: 13,761
## Trainable params: 13,761
## Non-trainable params: 0
## ________________________________________________________________________________
##
## Trained on 132 samples (batch_size=32, epochs=52)
## Final epoch (plot to see history):
## loss: 2.869
## mean_squared_error: 2.869
## val_loss: 5.15
## val_mean_squared_error: 5.15
In this example, we build a 4-layer network, each layer containing 64 neurons. We allow up to 300 training iterations (the fit above stopped after 52 epochs) and define the 3 functional weights (one for each functional covariate) using 5, 5, and 7 basis functions, respectively.
Let’s now get some predictions!
# Predicting
pred_tec = fnn.predict(tecator_fnn,
                       tec_data_test,
                       scalar_cov = scalar_test,
                       basis_choice = c("fourier", "fourier", "fourier"),
                       num_basis = c(5, 5, 7),
                       domain_range = list(c(850, 1050), c(850, 1050), c(850, 1050)))
Using this output, we can compare the predictions with the held-out responses in whatever way we would like!
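For example, one common summary is the mean squared prediction error (MSPE) on the held-out set, along with a predicted-vs-observed plot. This is a sketch assuming the pred_tec and tecResp_test objects created above:

```r
# Mean squared prediction error on the held-out curves
MSPE = mean((c(pred_tec) - tecResp_test)^2)
MSPE

# Predicted vs. observed fat content; points near the line y = x
# indicate accurate predictions
plot(tecResp_test, c(pred_tec),
     xlab = "Observed fat content", ylab = "Predicted fat content")
abline(0, 1)
```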
And that is pretty much it for this example.