Classification Accuracy as a Function of Cell Population

The script below generates the classification accuracy, using the LibSVM classifier, as a function of the number of neurons included in the analysis. It steps from the minimum to the maximum number of neurons present in the given file and, using the Parallel Computing Toolbox in MATLAB, runs the analysis with 25 resample runs. Readers should be familiar with decoding using the Max Correlation Coefficient classifier before attempting to use this script.

Note 1

This section assumes that the reader is familiar with the basic workings of the Neural Decoding Toolbox (NDT). For further information, please see the official website. Additionally, please refer to the NDT Components.

Note 2: Incomplete/Not Working Script

The script below is a work in progress, as it does not perform as expected; please see the figure below for a comparison of this classifier with both the Bayes and Max Correlation Coefficient classifiers.

 

Classifiers

Please click on the Classifier below

  1. Bayes
  2. LibSVM

The content below is under construction.

Adding to Path

Before the script can be run, the toolbox must be added to the MATLAB path.

% add the path to the NDT so add_ndt_paths_and_init_rand_generator can be called
toolbox_basedir_name = '/Users/UserName/Matlab/NDT_Toolbox/';
addpath(toolbox_basedir_name);
 
% add the NDT paths using add_ndt_paths_and_init_rand_generator
add_ndt_paths_and_init_rand_generator

Initialization

Some directory paths are set so that the script knows where to locate and store files. The ‘iFILENAME’ variable is passed in from a master script that collects the names of the files in a directory.

homeDir = pwd;
dataToClassifyDir = '/Volumes/DATA_21/DEC_MAT_RES_TOCLASSIFY_FINAL';
CLASSIFIED = '/Volumes/DATA_21/DEC_MAT_RES_BASIC_DS_CLASSIFIED_PopSize_MaxN';
 
% --- LOAD FILES
cd (CLASSIFIED)
mkdir (iFILENAME)
cd (dataToClassifyDir);
load(iFILENAME)

Some Preprocessing

The script first counts, for each k from 1 to 65, the number of sites in which every stimulus condition was repeated at least k times, as this is useful for determining the number of cross validation splits that may be used. Ideally, the number of cross validation (CV) splits should be large, but a smaller number of splits may be used if computation time is critical.

for k = 1:65
    inds_of_sites_with_at_least_k_repeats = find_sites_with_k_label_repetitions(binned_labels.value_highlow, k);
    num_sites_with_k_repeats(k) = length(inds_of_sites_with_at_least_k_repeats);
end
clear inds_of_sites_with_at_least_k_repeats
clear k
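
As a rough illustration of how these counts can guide the choice of CV splits, the sketch below picks the largest k for which a required number of sites is still available. The counts and the `min_sites_needed` threshold are made-up values for illustration and are not taken from the data set.

```matlab
% Hypothetical counts: entry k is the number of sites with at least k repeats
num_sites_with_k_repeats = [40 40 40 38 35 30 22 10];

% Assumed requirement: the largest population size that will be decoded
min_sites_needed = 30;

% Largest k for which at least min_sites_needed sites remain available
max_usable_k = find(num_sites_with_k_repeats >= min_sites_needed, 1, 'last');

fprintf('Up to %d CV splits can be used with %d sites\n', ...
    max_usable_k, min_sites_needed);
```

A smaller value than `max_usable_k` can still be chosen for `num_cv_splits` when run time matters.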

If the program terminates prematurely, the function ‘findRemaining’ allows the analysis to be resumed on the remaining cell population.

cd(homeDir)
toDecode = findRemaining(CLASSIFIED,iFILENAME,length(binned_data));
cd(homeDir)
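
The source of ‘findRemaining’ is not shown on this page. The following is only a minimal sketch of what such a helper might do, under the assumption that results are stored as `<CLASSIFIED>/<iFILENAME>/<iFILENAME>_<N>_RESULTS.mat`. The function name `findRemainingSketch` and its argument names are hypothetical.

```matlab
function toDecode = findRemainingSketch(classifiedDir, fileName, maxPopSize)
% Hypothetical stand-in for findRemaining (not the original implementation).
% Returns the population sizes in 1..maxPopSize that have no saved results yet.
% Assumed file layout: <classifiedDir>/<fileName>/<fileName>_<N>_RESULTS.mat
toDecode = [];
for n = 1:maxPopSize
    resultFile = fullfile(classifiedDir, fileName, ...
        [fileName '_' int2str(n) '_RESULTS.mat']);
    if ~exist(resultFile, 'file')
        toDecode(end+1) = n; %#ok<AGROW>  % this size still needs to be decoded
    end
end
end
```

Under this assumption the helper returns a list of population sizes, so a loop over it would use `toDecode(I)` as the size for iteration `I`; the original helper may behave differently.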

Specify the label name to look for in our data set file.

specific_label_name_to_use = 'value_highlow';

Choose a suitable CV split size based on the ‘k’ value obtained earlier.

num_cv_splits = 20;

Parallel Computing

We now run the script in MATLAB using the parfor command. The maximum number of workers that may be used simultaneously depends on MATLAB licensing as well as the number of processors available.
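
Before entering the parfor loop, a worker pool can be opened explicitly. This is a small sketch, assuming the Parallel Computing Toolbox is installed and the default cluster profile is suitable; it is not part of the original script.

```matlab
% Open a pool only if none is running yet (requires the Parallel Computing Toolbox)
if isempty(gcp('nocreate'))
    parpool;   % uses the default cluster profile and its default worker count
end
```

If no pool is open, parfor typically starts one automatically, so this step mainly makes the start-up cost and the worker count visible.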

parfor I = 1:length(toDecode) %Cycle through the different population Sizes each time!

Create a Basic Datasource

The datasource created below is used to generate k leave-one-fold-out cross validation splits of the data; k = 20 was specified above. In addition, the ‘num_resample_sites’ parameter limits the number of sites that are randomly selected for the decoding analysis.

cd (dataToClassifyDir)
ds = basic_DS(iFILENAME, specific_label_name_to_use, num_cv_splits);
ds.num_resample_sites  = I; %Creates a random pseudopopulation with the number of given sites

Feature Pre-processor

Preprocessing can be applied using any of the three available feature preprocessors. In this example script, zscore_normalize is applied to account for neurons having different ranges of firing rates. See Feature Preprocessors at the readout.info website for more information.

fp = {}; 
fp{1}= zscore_normalize_FP;
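
For reference, z-score normalization of a single site can be written as below. The symbols are our own notation: $x_i$ is the firing rate of site $i$ on a given trial, and $\mu_i$ and $\sigma_i$ are the mean and standard deviation of that site's firing rate. To our understanding, the NDT estimates $\mu_i$ and $\sigma_i$ from the training data only and then applies them to both the training and the test data.

```latex
z_i = \frac{x_i - \mu_i}{\sigma_i}
```

After this transformation every site has zero mean and unit variance on the training set, so sites with high firing rates do not dominate the classifier.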

Choosing a Classifier

The script as listed uses the max correlation coefficient classifier, even though this page targets the LibSVM classifier; this part is still under construction. For a detailed description of the classifiers see Classifiers.

the_classifier = max_correlation_coefficient_CL;
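
Since this page is about LibSVM while the listing above instantiates the max correlation coefficient classifier, the swap might look like the sketch below. It assumes the NDT's LibSVM wrapper `libsvm_CL` is on the path with LibSVM compiled for the platform. The property names `kernel` and `C` and the values shown are assumptions to be checked against the toolbox documentation, not settings taken from the original script.

```matlab
% Assumed swap-in: the NDT LibSVM wrapper instead of max_correlation_coefficient_CL
the_classifier = libsvm_CL;

% Assumed property names and values; verify against the NDT documentation
the_classifier.kernel = 'linear';   % kernel type
the_classifier.C = 1;               % regularization (error penalty) constant
```

The rest of the script can stay unchanged, because the cross validator only requires an object that follows the NDT classifier interface.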

Cross Validator

The cross validator runs a cross validation decoding analysis by training and testing the classifier on data obtained from the datasource, after optional processing by any feature preprocessors. In our case, we used the zscore_normalize feature preprocessor. The resampling procedure is run a total of 25 times.

the_cross_validator = standard_resample_CV(ds, the_classifier, fp);
the_cross_validator.num_resample_runs = 25;
the_cross_validator.confusion_matrix_params.create_confusion_matrix = 1;

Run the Cross Validator

DECODING_RESULTS = the_cross_validator.run_cv_decoding;

Save Results to File

The results are saved by calling the function ‘iSaveX’, which is needed because save cannot be called directly inside the body of a parfor loop.
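
The source of ‘iSaveX’ is not shown on this page. A minimal sketch of such a wrapper is given below, assuming results go to `<CLASSIFIED>/<iFILENAME>/`. The name `iSaveSketch` and its shorter argument list are hypothetical and differ from the original helper.

```matlab
function iSaveSketch(saveFileName, DECODING_RESULTS, classifiedDir, fileName)
% Hypothetical stand-in for iSaveX (not the original implementation).
% Wrapping save in a function avoids the restriction on calling save
% directly inside a parfor body.
% Assumed output location: <classifiedDir>/<fileName>/<saveFileName>.mat
outFile = fullfile(classifiedDir, fileName, saveFileName);
save(outFile, 'DECODING_RESULTS');   % save appends the .mat extension
end
```

Building the full path with fullfile means the workers do not need to change directory before saving.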

save_file_name = strcat(iFILENAME,'_',int2str(I),'_RESULTS');
cd (homeDir)
iSaveX(save_file_name,DECODING_RESULTS,homeDir,CLASSIFIED,iFILENAME);
end
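
Once all population sizes have been decoded, the accuracy curve can be assembled from the saved files. The sketch below rests on several assumptions: that each file is stored as `<CLASSIFIED>/<iFILENAME>/<iFILENAME>_<N>_RESULTS.mat`, that it contains a variable named `DECODING_RESULTS`, and that the data has a single time bin, so that the first entry of `mean_decoding_results` is the accuracy of interest.

```matlab
% Collect decoding accuracy for each population size (assumed file layout)
maxPopSize = length(binned_data);
accuracy = nan(1, maxPopSize);      % NaN marks population sizes without results

for n = 1:maxPopSize
    resultFile = fullfile(CLASSIFIED, iFILENAME, ...
        [iFILENAME '_' int2str(n) '_RESULTS.mat']);
    if exist(resultFile, 'file')
        S = load(resultFile, 'DECODING_RESULTS');   % assumed variable name
        acc = S.DECODING_RESULTS.ZERO_ONE_LOSS_RESULTS.mean_decoding_results;
        accuracy(n) = acc(1);       % assumes a single time bin
    end
end

plot(1:maxPopSize, accuracy, 'o-');
xlabel('Number of neurons');
ylabel('Classification accuracy');
```

With more than one time bin, `acc(1)` would be replaced by the entry for the bin of interest.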