MATLAB classifier model
Develop a web app for predicting phenotypic and environmental characteristics of the gram-negative bacterium Escherichia coli. The dataset contains 4502 features: the first 6 correspond to the gene ID, strain, medium, environmental perturbation, genetic perturbation, and growth rate; the remaining entries correspond to the expression of all genes in the bacterium.
In this project we use a set of 223 transcriptional profiling samples from the gram-negative bacterium Escherichia coli, a well-studied organism of great importance to human health and biotechnology. We created a predictor of the bacterial growth rate that uses only the gene expression values as attributes and is fitted with a regularized regression technique, the lasso. The program reports a confidence interval for each prediction using the bootstrap method. We created four separate SVM classifiers to categorize the strain type, medium type, environmental perturbation, and gene perturbation, given all the gene transcriptional profiles, and one composite SVM classifier to predict the medium and environmental perturbation simultaneously. The composite classifier performs worse than the two individual classifiers together on these predictions. Finally, the program performs Principal Component Analysis (PCA), keeping only 3 principal components as features for the SVM classifier.
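The bootstrap confidence interval for a prediction can be illustrated with a minimal sketch. It is written in Python with the standard library only, not taken from the project's MATLAB code, and it makes several simplifying assumptions: a one-feature least-squares fit stands in for the lasso model, the data are synthetic, and the names fit_line and bootstrap_prediction_interval are hypothetical. The idea is to resample the training samples with replacement, refit the model on each resample, predict for the new sample, and take percentiles of the resulting predictions.

```python
import random


def fit_line(xs, ys):
    """One-feature ordinary least squares; a simple stand-in for the lasso fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        # Degenerate resample (all x identical): fall back to the mean of y.
        return 0.0, mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def bootstrap_prediction_interval(xs, ys, x_new, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the prediction at x_new."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_boot):
        # Resample the training samples with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        slope, intercept = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        preds.append(slope * x_new + intercept)
    preds.sort()
    lower = preds[int((alpha / 2) * (n_boot - 1))]
    upper = preds[int((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper


# Synthetic data (an assumption for illustration): y = 0.5 * x + 0.2 + noise.
noise = random.Random(1)
xs = [i / 10 for i in range(30)]
ys = [0.5 * x + 0.2 + noise.gauss(0, 0.1) for x in xs]

lower, upper = bootstrap_prediction_interval(xs, ys, x_new=1.5)
print(f"95% bootstrap interval for the prediction at x = 1.5: [{lower:.3f}, {upper:.3f}]")
```

With the real data, each resample would draw from the 223 expression profiles and refit the lasso model, so the interval would also reflect how the selected genes vary between resamples.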