This is the main function in the FNN package. It fits models of the form f(z, b(x)),
where z are the scalar covariates and b(x) are the functional covariates.
Here f() is a neural network with a generalized input space, so it can take
functional inputs alongside scalar ones.
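
To make the model form concrete, the following base R sketch computes the output of a single first-layer neuron. It assumes the functional covariate enters through an integral against a functional weight that is itself a basis expansion, which is how the arguments below describe the "functional weight expansion". All names and numbers (`coefs`, `w_z`, `bias`, `trapz`) are hypothetical and chosen only for illustration; this is not the package's internal code.

```r
# Grid on [0, 1] where the functional covariate is observed.
t_grid <- seq(0, 1, length.out = 201)
x_t <- sin(2 * pi * t_grid)   # one functional covariate x(t)
z   <- 0.5                    # one scalar covariate

# A three-term Fourier basis evaluated on the grid (201 x 3 matrix).
Phi <- cbind(1, sin(2 * pi * t_grid), cos(2 * pi * t_grid))

# Hypothetical parameters of one neuron (these would be learned in training).
coefs <- c(0.2, 1.0, -0.4)    # basis coefficients of the functional weight
w_z   <- 0.3                  # weight on the scalar covariate
bias  <- -0.1

# Functional weight beta(t) as a linear combination of basis functions.
beta_t <- as.vector(Phi %*% coefs)

# Trapezoidal rule for the integral of y over the grid t.
trapz <- function(t, y) sum(diff(t) * (y[-1] + y[-length(y)]) / 2)

# Integral of beta(t) * x(t), plus the scalar part, then a sigmoid activation.
pre_activation <- trapz(t_grid, beta_t * x_t) + w_z * z + bias  # about 0.55
neuron_output  <- 1 / (1 + exp(-pre_activation))
```

Only the basis coefficients, the scalar weights, and the bias are free parameters here, so the functional input contributes a small, fixed number of weights regardless of how densely the curve was observed.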

    fnn.fit(
      resp,
      func_cov,
      scalar_cov = NULL,
      basis_choice = c("fourier"),
      num_basis = c(7),
      hidden_layers = 2,
      neurons_per_layer = c(64, 64),
      activations_in_layers = c("sigmoid", "linear"),
      domain_range = list(c(0, 1)),
      epochs = 100,
      loss_choice = "mse",
      metric_choice = list("mean_squared_error"),
      val_split = 0.2,
      learn_rate = 0.001,
      patience_param = 15,
      early_stopping = T,
      print_info = T,
      batch_size = 32,
      decay_rate = 0,
      func_resp_method = 1,
      covariate_scaling = T,
      raw_data = F
    )

Argument | Description |
---|---|
resp | For scalar responses, this is a vector of the observed dependent variable. For functional responses, this is a matrix where each row contains the basis coefficients defining the functional response (for each observation). |
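func_cov | The form of this depends on whether raw_data is True. In the examples on this page, raw_data = T is paired with a list of k matrices (one per functional covariate, one row per observation), and raw_data = F with an array of basis coefficients of dimension (number of basis functions) x (number of observations) x k. |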
scalar_cov | A matrix containing the multivariate information associated with the data set. This is all of your non-longitudinal data. |
basis_choice | A vector of size k (the number of functional covariates) with either "fourier" or "bspline" as the inputs. This is the choice for the basis functions used for the functional weight expansion. If you only specify one, with k > 1, then the argument will repeat that choice for all k functional covariates. |
num_basis | A vector of size k defining the number of basis functions to be used in the basis expansion. Must be odd for a "fourier" basis choice. |
hidden_layers | The number of hidden layers to be used in the neural network. |
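neurons_per_layer | Vector of size = hidden_layers, giving the number of neurons in each hidden layer. |
activations_in_layers | Vector of size = hidden_layers, giving the activation function used in each hidden layer. |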
domain_range | List of size k. Each element of the list is a vector of length 2 containing the lower and upper bounds of the domain of the k-th functional weight, for example c(0, 1). |
epochs | The number of training iterations. |
loss_choice | This parameter defines the loss function used in the learning process. |
metric_choice | This parameter defines the error metric printed during training. |
val_split | The proportion of the input data set held out for validation during training (0.2 by default). |
learn_rate | Hyperparameter that defines how quickly you move in the direction of the gradient. |
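patience_param | A keras parameter that decides how many additional epochs without improvement in error are allowed before training is stopped. Only relevant when early_stopping is True. |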
early_stopping | If True, then learning process will be halted early if error improvement isn't seen. |
print_info | If True, function will output information about the model as it is trained. |
batch_size | Size of the batch for stochastic gradient descent. |
decay_rate | A modification to the learning rate that decreases the learning rate as more and more learning iterations are completed. |
func_resp_method | Set to 1 by default. In the future, this will be set to 2 for an alternative functional response approach. |
covariate_scaling | If True, then data will be internally scaled before model development. |
raw_data | If True, then user does not need to create functional observations beforehand. The function will internally take care of that pre-processing. |
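
Several of the arguments above have to agree in length. The sketch below collects those relationships in one hypothetical helper, `check_fnn_args()`, which is not part of the package. It assumes that a length-one `basis_choice`, `num_basis`, or `domain_range` is repeated for all k functional covariates; the table states this only for `basis_choice`, and the rest is inferred from the first example, which passes two functional covariates with length-one defaults.

```r
# Hypothetical helper (not part of the package): checks argument lengths
# against the relationships suggested by the argument table and examples.
check_fnn_args <- function(k, basis_choice, num_basis, hidden_layers,
                           neurons_per_layer, activations_in_layers,
                           domain_range) {
  # Assumption: a single value is repeated for all k functional covariates.
  recycle_to_k <- function(x) if (length(x) == 1) rep(x, k) else x
  basis_choice <- recycle_to_k(basis_choice)
  num_basis    <- recycle_to_k(num_basis)
  domain_range <- recycle_to_k(domain_range)

  stopifnot(
    length(basis_choice) == k,
    length(num_basis) == k,
    all(basis_choice %in% c("fourier", "bspline")),
    # A Fourier basis needs an odd number of basis functions.
    all(num_basis[basis_choice == "fourier"] %% 2 == 1),
    # One entry per hidden layer.
    length(neurons_per_layer) == hidden_layers,
    length(activations_in_layers) == hidden_layers,
    # One (lower, upper) pair per functional covariate.
    is.list(domain_range),
    length(domain_range) == k,
    all(vapply(domain_range,
               function(r) length(r) == 2 && r[1] < r[2],
               logical(1)))
  )
  list(basis_choice = basis_choice,
       num_basis = num_basis,
       domain_range = domain_range)
}

# Settings taken from the pre-processed tecator example on this page.
args_ok <- check_fnn_args(
  k = 3,
  basis_choice = "fourier",
  num_basis = c(5, 5, 7),
  hidden_layers = 4,
  neurons_per_layer = c(64, 64, 64, 64),
  activations_in_layers = c("relu", "relu", "relu", "linear"),
  domain_range = list(c(850, 1050), c(850, 1050), c(850, 1050))
)
```

Running a check like this before fitting can surface a mismatch between `hidden_layers` and the per-layer vectors before any model is built.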
The following are returned:
model
-- Full keras model that can be used with any functions that act on keras models.
data
-- Adjusted data set after scaling and appending of scalar covariates.
fnc_basis_num
-- A return of the original input; describes the number of functions used in each of the k basis expansions.
fnc_type
-- A return of the original input; describes the basis expansion used to make the functional weights.
parameter_info
-- Information associated with hyperparameter choices in the model.
per_iter_info
-- Change in error over training iterations
func_obs
-- In the case when raw_data is True, the user may want to see the internally developed functional observations. This returns those functions.
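
The relationship between patience_param, early_stopping, and the per-iteration errors can be illustrated with a small base R sketch. It applies the usual patience rule: training stops once the error has failed to improve on its best value for `patience` consecutive epochs. The function `stopping_epoch()` and the loss values are hypothetical, and since the layout of per_iter_info is not described here, a plain numeric vector of validation errors stands in for it.

```r
# Hypothetical helper: returns the epoch at which training would stop under
# a patience rule, or the last epoch if the rule never triggers.
stopping_epoch <- function(val_loss, patience) {
  best <- Inf   # best (lowest) error seen so far
  wait <- 0     # consecutive epochs without improvement
  for (epoch in seq_along(val_loss)) {
    if (val_loss[epoch] < best) {
      best <- val_loss[epoch]
      wait <- 0
    } else {
      wait <- wait + 1
      if (wait >= patience) return(epoch)
    }
  }
  length(val_loss)
}

# Made-up validation errors, one per epoch.
val_loss <- c(1.00, 0.80, 0.90, 0.85, 0.70)

# The best value (0.80) is reached at epoch 2; epochs 3 and 4 do not improve
# on it, so with patience = 2 training stops at epoch 4.
stop_at <- stopping_epoch(val_loss, patience = 2)
```

With a larger patience, such as the default of 15, the same sequence would run to its final epoch.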
Updates coming soon.

    # First, an easy example with raw_data = T

    # libraries
    library(fda)  # provides the daily data set

    # Loading in data
    data("daily")

    # Functional covariates
    temp = t(daily$tempav)
    precip = t(daily$precav)
    longitudinal_dat = list(temp, precip)

    # Scalar Response
    total_prec = apply(daily$precav, 2, mean)

    # Running model
    fit1 = fnn.fit(resp = total_prec,
                   func_cov = longitudinal_dat,
                   scalar_cov = NULL,
                   learn_rate = 0.0001,
                   raw_data = T)

    # Classification Example with raw_data = T

    # Loading data
    tecator = FNN::tecator

    # Making classification bins
    tecator_resp = as.factor(ifelse(tecator$y$Fat > 25, 1, 0))

    # Non functional covariate
    tecator_scalar = data.frame(water = tecator$y$Water)

    # Splitting data
    ind = sample(1:length(tecator_resp), round(0.75*length(tecator_resp)))
    train_y = tecator_resp[ind]
    test_y = tecator_resp[-ind]
    train_x = tecator$absorp.fdata$data[ind,]
    test_x = tecator$absorp.fdata$data[-ind,]
    scalar_train = data.frame(tecator_scalar[ind,1])
    scalar_test = data.frame(tecator_scalar[-ind,1])

    # Making list element to pass in
    func_covs_train = list(train_x)
    func_covs_test = list(test_x)

    # Now running model
    fit_class = fnn.fit(resp = train_y,
                        func_cov = func_covs_train,
                        scalar_cov = scalar_train,
                        hidden_layers = 6,
                        neurons_per_layer = c(24, 24, 24, 24, 24, 58),
                        activations_in_layers = c("relu", "relu", "relu", "relu", "relu", "linear"),
                        domain_range = list(c(850, 1050)),
                        learn_rate = 0.001,
                        epochs = 100,
                        raw_data = T,
                        early_stopping = T)

    # Running prediction, gets probabilities
    predict_class = fnn.predict(fit_class,
                                func_cov = func_covs_test,
                                scalar_cov = scalar_test,
                                domain_range = list(c(850, 1050)),
                                raw_data = T)

    # Example with Pre-Processing (raw_data = F)

    # loading data
    tecator = FNN::tecator

    # libraries
    library(fda)

    # define the time points on which the functional predictor is observed.
    timepts = tecator$absorp.fdata$argvals

    # define the fourier basis
    nbasis = 29
    spline_basis = create.fourier.basis(tecator$absorp.fdata$rangeval, nbasis)

    # convert the functional predictor into a fda object and getting deriv
    tecator_fd = Data2fd(timepts, t(tecator$absorp.fdata$data), spline_basis)
    tecator_deriv = deriv.fd(tecator_fd)
    tecator_deriv2 = deriv.fd(tecator_deriv)

    # Non functional covariate
    tecator_scalar = data.frame(water = tecator$y$Water)

    # Response
    tecator_resp = tecator$y$Fat

    # Getting data into right format
    tecator_data = array(dim = c(nbasis, length(tecator_resp), 3))
    tecator_data[,,1] = tecator_fd$coefs
    tecator_data[,,2] = tecator_deriv$coefs
    tecator_data[,,3] = tecator_deriv2$coefs

    # Splitting into test and train for third FNN
    ind = 1:165
    tec_data_train <- array(dim = c(nbasis, length(ind), 3))
    tec_data_test <- array(dim = c(nbasis, nrow(tecator$absorp.fdata$data) - length(ind), 3))
    tec_data_train = tecator_data[, ind, ]
    tec_data_test = tecator_data[, -ind, ]
    tecResp_train = tecator_resp[ind]
    tecResp_test = tecator_resp[-ind]
    scalar_train = data.frame(tecator_scalar[ind,1])
    scalar_test = data.frame(tecator_scalar[-ind,1])

    # Setting up network
    tecator_fnn = fnn.fit(resp = tecResp_train,
                          func_cov = tec_data_train,
                          scalar_cov = scalar_train,
                          basis_choice = c("fourier", "fourier", "fourier"),
                          num_basis = c(5, 5, 7),
                          hidden_layers = 4,
                          neurons_per_layer = c(64, 64, 64, 64),
                          activations_in_layers = c("relu", "relu", "relu", "linear"),
                          domain_range = list(c(850, 1050), c(850, 1050), c(850, 1050)),
                          epochs = 300,
                          learn_rate = 0.002)

    # Prediction example can be seen with ?fnn.predict

    # Functional Response Example:

    # libraries
    library(fda)

    # Loading data
    data("daily")

    # Creating functional data
    temp_data = array(dim = c(65, 35, 1))
    tempbasis65 = create.fourier.basis(c(0,365), 65)
    tempbasis7 = create.bspline.basis(c(0,365), 7, norder = 4)
    timepts = seq(1, 365, 1)
    temp_fd = Data2fd(timepts, daily$tempav, tempbasis65)
    prec_fd = Data2fd(timepts, daily$precav, tempbasis7)
    prec_fd$coefs = scale(prec_fd$coefs)

    # Data set up
    temp_data[,,1] = temp_fd$coefs
    resp_mat = prec_fd$coefs

    # Non functional covariate
    weather_scalar = data.frame(total_prec = apply(daily$precav, 2, sum))

    # Getting data into proper format
    ind = 1:30
    nbasis = 65
    weather_data_train <- array(dim = c(nbasis, ncol(temp_data), 1))
    weather_data_train[,,1] = temp_data
    scalar_train = data.frame(weather_scalar[,1])
    resp_train = t(resp_mat)

    # Running model
    weather_func_fnn <- fnn.fit(resp = resp_train,
                                func_cov = weather_data_train,
                                scalar_cov = scalar_train,
                                basis_choice = c("bspline"),
                                num_basis = c(7),
                                hidden_layers = 2,
                                neurons_per_layer = c(1024, 1024),
                                activations_in_layers = c("sigmoid", "linear"),
                                domain_range = list(c(1, 365)),
                                epochs = 300,
                                learn_rate = 0.01,
                                func_resp_method = 1)
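
The pre-processing example above relies on `fda::Data2fd` to turn raw curves into basis coefficients. The base R sketch below shows the same idea with an ordinary least-squares projection onto a hand-built Fourier basis, ending in the (basis functions) x (observations) x (covariates) array layout that example uses. The helper `fourier_basis()` and the simulated curves are hypothetical. The basis is left unnormalized, so the coefficients are not numerically comparable to those returned by fda, and this is not a description of what raw_data = T does internally.

```r
# Hypothetical helper: Fourier basis (constant, then sine/cosine pairs)
# evaluated at the points t on the interval [lo, hi].
fourier_basis <- function(t, num_basis, lo, hi) {
  stopifnot(num_basis %% 2 == 1)
  omega <- 2 * pi / (hi - lo)
  B <- matrix(1, nrow = length(t), ncol = num_basis)
  for (j in seq_len((num_basis - 1) / 2)) {
    B[, 2 * j]     <- sin(j * omega * (t - lo))
    B[, 2 * j + 1] <- cos(j * omega * (t - lo))
  }
  B
}

set.seed(42)
n <- 6            # number of observations (curves)
p <- 100          # number of points observed per curve
num_basis <- 5
t_grid <- seq(850, 1050, length.out = p)

# Simulated raw curves, one row per observation (n x p): curve i has level 2
# and amplitude i on the first sine term, plus a little noise.
raw_curves <- t(sapply(seq_len(n), function(i) {
  2 + i * sin(2 * pi * (t_grid - 850) / 200) + rnorm(p, sd = 0.01)
}))

# Least-squares projection of every curve onto the basis.
Phi   <- fourier_basis(t_grid, num_basis, 850, 1050)  # p x num_basis
coefs <- qr.solve(Phi, t(raw_curves))                 # num_basis x n

# Coefficient array for a single functional covariate (k = 1).
func_cov <- array(coefs, dim = c(num_basis, n, 1))
```

Each column of `coefs` summarizes one curve, so the first row is close to 2 for every observation and the second row is close to 1, 2, ..., 6.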