getExpressionStructure

Loads a representation of an experiment from an Excel file (see comments further down).

  fileName            an Excel representation of an experiment

  experiment          an experiment structure
    data              matrix with expression values
    orfs              the corresponding ORFs
    experiments       the titles of the experiments
    boundNames        reaction names for the bounds
    upperBoundaries   matrix with the upper bound values
    fitNames          reaction names for the measured fluxes
    fitTo             matrix with the measured fluxes

A very common data set when working with genome-scale metabolic models consists of measured fermentation data, gene expression data, and a set of 'bounds' (for example different carbon sources or knocked-out genes) for a number of conditions. This function reads an Excel representation of such an experiment. The Excel file must contain three sheets: 'EXPRESSION', 'BOUNDS', and 'FITTING'. The examples below show how they should be formatted.

-EXPRESSION
  ORF           dsm_paa        wisc_paa
  Pc00e00030    79.80942723    78.14755338

Shows the expression of the gene Pc00e00030 under two different conditions (in this case a DSM strain and a Wisconsin strain of P. chrysogenum with PSS in the media).

-BOUNDS
  Fixed Upper   dsm_paa        wisc_paa
  paaIN         0.1            0.2

The upper bound for the reaction paaIN should be 0.1 for the first condition and 0.2 for the second.

-FITTING
  Fit to        dsm_paa        wisc_paa
  co2OUT        2.85           3.05
  glcIN         1.2            0.9

The measured fluxes for CO2 production and glucose uptake for the two conditions. The model(s) can later be fitted to match these values as closely as possible.

Usage: experiment=getExpressionStructure(fileName)
function experiment=getExpressionStructure(fileName)
% getExpressionStructure
%   Loads a representation of an experiment from an Excel file (see
%   comments further down)
%
%   fileName            an Excel representation of an experiment
%
%   experiment          an experiment structure
%     data              matrix with expression values
%     orfs              the corresponding ORFs
%     experiments       the titles of the experiments
%     boundNames        reaction names for the bounds
%     upperBoundaries   matrix with the upper bound values
%     fitNames          reaction names for the measured fluxes
%     fitTo             matrix with the measured fluxes
%
%   A very common data set when working with genome-scale metabolic models
%   consists of measured fermentation data, gene expression data, and a
%   set of 'bounds' (for example different carbon sources or knocked-out
%   genes) for a number of conditions. This function reads an Excel
%   representation of such an experiment.
%   The Excel file must contain three sheets: 'EXPRESSION', 'BOUNDS', and
%   'FITTING'. The examples below show how they should be formatted:
%
%   -EXPRESSION
%   ORF           dsm_paa        wisc_paa
%   Pc00e00030    79.80942723    78.14755338
%   Shows the expression of the gene Pc00e00030 under two different
%   conditions (in this case a DSM strain and a Wisconsin strain of P.
%   chrysogenum with PSS in the media)
%
%   -BOUNDS
%   Fixed Upper   dsm_paa        wisc_paa
%   paaIN         0.1            0.2
%   The upper bound for the reaction paaIN should be 0.1 for the first
%   condition and 0.2 for the second
%
%   -FITTING
%   Fit to        dsm_paa        wisc_paa
%   co2OUT        2.85           3.05
%   glcIN         1.2            0.9
%   The measured fluxes for CO2 production and glucose uptake for the two
%   conditions. The model(s) can later be fitted to match these values as
%   closely as possible.
%
%   Usage: experiment=getExpressionStructure(fileName)

[type, sheets]=xlsfinfo(fileName);

%Check if the file is a Microsoft Excel Spreadsheet
if ~strcmp(type,'Microsoft Excel Spreadsheet')
    EM='The file is not a Microsoft Excel Spreadsheet';
    dispEM(EM);
end

%Check that all sheets are present and save the index of each
exprSheet=find(strcmp('EXPRESSION', sheets));
boundSheet=find(strcmp('BOUNDS', sheets));
fitSheet=find(strcmp('FITTING', sheets));

if length(exprSheet)~=1 || length(boundSheet)~=1 || length(fitSheet)~=1
    EM='Not all required spreadsheets are present in the file';
    dispEM(EM);
end

%Load the expression data. xlsread returns the numeric cells first
%(stored in "discard" despite its name) and the text cells second
[discard,dataSheet]=xlsread(fileName,exprSheet);
experiment.data=discard;
experiment.orfs=dataSheet(2:size(dataSheet,1),1);
experiment.experiments=dataSheet(1,2:size(dataSheet,2));

%Load the upper bounds
[discard,dataSheet]=xlsread(fileName,boundSheet);
experiment.boundNames=dataSheet(2:size(dataSheet,1),1);
experiment.upperBoundaries=discard;

%Load the experimental data to fit to
[discard,dataSheet]=xlsread(fileName,fitSheet);
experiment.fitNames=dataSheet(2:size(dataSheet,1),1);
experiment.fitTo=discard;

%Check that the dimensions are correct
if length(experiment.orfs)~=size(experiment.data,1) || (length(experiment.experiments)~=size(experiment.data,2) && ~isempty(experiment.data))
    EM='The expression data does not seem to be formatted in the expected manner';
    dispEM(EM);
end
end
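The following is a minimal usage sketch, not part of the original listing. It writes the three example sheets from the help text to a temporary workbook and then loads them with getExpressionStructure. It assumes that getExpressionStructure and the dispEM helper it calls are on the MATLAB path, that writecell is available (R2019a or later), and that xlsfinfo reports the file type as 'Microsoft Excel Spreadsheet' on the platform in use. The temporary file name is made up for illustration.

```matlab
% Hypothetical temporary workbook; any writable .xlsx path would do
fileName = [tempname '.xlsx'];

% One cell array per required sheet, mirroring the examples in the help text
exprCells  = {'ORF',         'dsm_paa',   'wisc_paa'; ...
              'Pc00e00030',  79.80942723, 78.14755338};
boundCells = {'Fixed Upper', 'dsm_paa',   'wisc_paa'; ...
              'paaIN',       0.1,         0.2};
fitCells   = {'Fit to',      'dsm_paa',   'wisc_paa'; ...
              'co2OUT',      2.85,        3.05; ...
              'glcIN',       1.2,         0.9};

% Sheet names must match exactly what getExpressionStructure looks for
writecell(exprCells,  fileName, 'Sheet', 'EXPRESSION');
writecell(boundCells, fileName, 'Sheet', 'BOUNDS');
writecell(fitCells,   fileName, 'Sheet', 'FITTING');

% Load the experiment structure and inspect a few fields
experiment = getExpressionStructure(fileName);
disp(experiment.experiments);   % condition titles from the EXPRESSION header
disp(experiment.boundNames);    % reactions with upper bounds
disp(experiment.fitTo);         % measured fluxes, one column per condition

% Remove the temporary workbook
delete(fileName);
```

Each row below the header becomes one row in the corresponding numeric matrix, and each condition column becomes one column, so experiment.data, experiment.upperBoundaries and experiment.fitTo should all have as many columns as there are conditions.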