Home > INIT > getINITModel.m

getINITModel

PURPOSE ^

getINITModel_legacy

SYNOPSIS ^

function [model, metProduction, essentialRxnsForTasks, addedRxnsForTasks, deletedDeadEndRxns, deletedRxnsInINIT, taskReport]=getINITModel(refModel, tissue, celltype, hpaData, arrayData, metabolomicsData, taskFile, useScoresForTasks, printReport, taskStructure, params, paramsFT)

DESCRIPTION ^

 getINITModel_legacy
   Generates a model using the INIT algorithm, based on proteomics and/or
   transcriptomics and/or metabolomics and/or metabolic tasks. This is the original 
   implementation of tINIT, which is replaced by ftINIT.

   Input:
   refModel            a model structure. The model should be in the
                       closed form (no exchange reactions open). Import
                       using import(filename,false). If the model is not
                       loaded using importModel, it might be that there
                       is no "unconstrained" field. In that case,
                       manually add the field like:
                       model.unconstrained=false(numel(model.mets),1);
   tissue              tissue to score for. Should exist in either
                       hpaData.tissues or arrayData.tissues
   celltype            cell type to score for. Should exist in either
                       hpaData.celltypes or arrayData.celltypes for this
                       tissue (optional, default is to use the best values
                       among all the cell types for the tissue. Use [] if
                       you want to supply more arguments)
   hpaData             HPA data structure from parseHPA (optional if arrayData is
                       supplied, default [])
   arrayData           gene expression data structure (optional if hpaData is
                       supplied, default [])
       genes           cell array with the unique gene names
       tissues         cell array with the tissue names. The list may not be
                       unique, as there can be multiple cell types per tissue
       celltypes       cell array with the cell type names for each tissue
       levels          GENESxTISSUES array with the expression level for
                       each gene in each tissue/celltype. NaN should be
                       used when no measurement was performed
       threshold       a single value or a vector of gene expression 
                       thresholds, above which genes are considered to be
                       "expressed". (optional, by default, the mean expression
                       levels of each gene across all tissues in arrayData
                       will be used as the threshold values)
       singleCells     binary value selecting whether to use the
                       single-cell algorithm to identify expressed genes.
                       If used, specify cell subpopulations in CELLTYPES
                       (optional, default [])
       plotResults     true if single cell probability distributions
                       should be plotted (optional, default = False)
   metabolomicsData    cell array with metabolite names that the model
                       should produce (optional, default [])
   taskFile            a task list in Excel format. See parseTaskList for
                       details (optional, default [])
   useScoresForTasks   true if the calculated reaction scored should be used as
                       weights in the fitting to tasks (optional, default true)
   printReport         true if a report should be printed to the screen
                       (optional, default true)
   taskStructure       task structure as from parseTaskList. Can be used
                       as an alternative way to define tasks when Excel
                       sheets are not suitable. Overrides taskFile (optional,
                       default [])
   params              parameter structure as used by getMILPParams. This is
                       for the INIT algorithm. For the the MILP problems
                       solved to fit tasks, see paramsFT (optional, default [])
   paramsFT            parameter structure as used by getMILPParams. This is
                       for the fitTasks step. For the INIT algorithm, see
                       params (optional, default [])


   Output:
   model                   the resulting model structure
   metProduction           array that indicates which of the
                           metabolites in metabolomicsData that could be
                           produced. Note that this is before the
                           gap-filling process to enable defined tasks. To
                           see which metabolites that can be produced in
                           the final model, use canProduce.
                           -2: metabolite name not found in model
                           -1: metabolite found, but it could not be produced
                           1: metabolite could be produced
   essentialRxnsForTasks   cell array of the reactions which were
                           essential to perform the tasks
   addedRxnsForTasks       cell array of the reactions which were added in
                           order to perform the tasks
   deletedDeadEndRxns      cell array of reactions deleted because they
                           could not carry flux (INIT requires a
                           functional input model)
   deletedRxnsInINIT       cell array of the reactions which were deleted by
                           the INIT algorithm
   taskReport              structure with the results for each task
       id                  cell array with the id of the task
       description         cell array with the description of the task
       ok                  boolean array with true if the task was successful
       essential           cell array with cell arrays of essential
                           reactions for the task
       gapfill             cell array of cell arrays of reactions included
                           in the gap-filling for the task

   This is the main function for automatic reconstruction of models based
   on the (t)INIT algorithm (PLoS Comput Biol. 2012;8(5):e1002518, 
   Mol Syst Biol. 2014;10:721). Not all settings are possible using this
   function, and you may want to call the functions scoreModel, runINIT
   and fitTasks individually instead.

   NOTE: Exchange metabolites should normally not be removed from the model
   when using this approach, since checkTasks/fitTasks rely on putting specific
   constraints for each task. The INIT algorithm will remove exchange metabolites
   if any are present. Use importModel(file,false) to import a model with
   exchange metabolites remaining.

 Usage: [model, metProduction, essentialRxnsForTasks, addedRxnsForTasks,...
               deletedDeadEndRxns, deletedRxnsInINIT, taskReport]=...
               getINITModel(refModel, tissue, celltype, hpaData, arrayData,...
               metabolomicsData, taskFile, useScoresForTasks, printReport,...
               taskStructure, params, paramsFT)

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SUBFUNCTIONS ^

SOURCE CODE ^

0001 function [model, metProduction, essentialRxnsForTasks, addedRxnsForTasks, deletedDeadEndRxns, deletedRxnsInINIT, taskReport]=getINITModel(refModel, tissue, celltype, hpaData, arrayData, metabolomicsData, taskFile, useScoresForTasks, printReport, taskStructure, params, paramsFT)
0002 % getINITModel_legacy
0003 %   Generates a model using the INIT algorithm, based on proteomics and/or
0004 %   transcriptomics and/or metabolomics and/or metabolic tasks. This is the original
0005 %   implementation of tINIT, which is replaced by ftINIT.
0006 %
0007 %   Input:
0008 %   refModel            a model structure. The model should be in the
0009 %                       closed form (no exchange reactions open). Import
0010 %                       using import(filename,false). If the model is not
0011 %                       loaded using importModel, it might be that there
0012 %                       is no "unconstrained" field. In that case,
0013 %                       manually add the field like:
0014 %                       model.unconstrained=false(numel(model.mets),1);
0015 %   tissue              tissue to score for. Should exist in either
0016 %                       hpaData.tissues or arrayData.tissues
0017 %   celltype            cell type to score for. Should exist in either
0018 %                       hpaData.celltypes or arrayData.celltypes for this
0019 %                       tissue (optional, default is to use the best values
0020 %                       among all the cell types for the tissue. Use [] if
0021 %                       you want to supply more arguments)
0022 %   hpaData             HPA data structure from parseHPA (optional if arrayData is
0023 %                       supplied, default [])
0024 %   arrayData           gene expression data structure (optional if hpaData is
0025 %                       supplied, default [])
0026 %       genes           cell array with the unique gene names
0027 %       tissues         cell array with the tissue names. The list may not be
0028 %                       unique, as there can be multiple cell types per tissue
0029 %       celltypes       cell array with the cell type names for each tissue
0030 %       levels          GENESxTISSUES array with the expression level for
0031 %                       each gene in each tissue/celltype. NaN should be
0032 %                       used when no measurement was performed
0033 %       threshold       a single value or a vector of gene expression
0034 %                       thresholds, above which genes are considered to be
0035 %                       "expressed". (optional, by default, the mean expression
0036 %                       levels of each gene across all tissues in arrayData
0037 %                       will be used as the threshold values)
0038 %       singleCells     binary value selecting whether to use the
0039 %                       single-cell algorithm to identify expressed genes.
0040 %                       If used, specify cell subpopulations in CELLTYPES
0041 %                       (optional, default [])
0042 %       plotResults     true if single cell probability distributions
0043 %                       should be plotted (optional, default = False)
0044 %   metabolomicsData    cell array with metabolite names that the model
0045 %                       should produce (optional, default [])
0046 %   taskFile            a task list in Excel format. See parseTaskList for
0047 %                       details (optional, default [])
0048 %   useScoresForTasks   true if the calculated reaction scored should be used as
0049 %                       weights in the fitting to tasks (optional, default true)
0050 %   printReport         true if a report should be printed to the screen
0051 %                       (optional, default true)
0052 %   taskStructure       task structure as from parseTaskList. Can be used
0053 %                       as an alternative way to define tasks when Excel
0054 %                       sheets are not suitable. Overrides taskFile (optional,
0055 %                       default [])
0056 %   params              parameter structure as used by getMILPParams. This is
0057 %                       for the INIT algorithm. For the the MILP problems
0058 %                       solved to fit tasks, see paramsFT (optional, default [])
0059 %   paramsFT            parameter structure as used by getMILPParams. This is
0060 %                       for the fitTasks step. For the INIT algorithm, see
0061 %                       params (optional, default [])
0062 %
0063 %
0064 %   Output:
0065 %   model                   the resulting model structure
0066 %   metProduction           array that indicates which of the
0067 %                           metabolites in metabolomicsData that could be
0068 %                           produced. Note that this is before the
0069 %                           gap-filling process to enable defined tasks. To
0070 %                           see which metabolites that can be produced in
0071 %                           the final model, use canProduce.
0072 %                           -2: metabolite name not found in model
0073 %                           -1: metabolite found, but it could not be produced
0074 %                           1: metabolite could be produced
0075 %   essentialRxnsForTasks   cell array of the reactions which were
0076 %                           essential to perform the tasks
0077 %   addedRxnsForTasks       cell array of the reactions which were added in
0078 %                           order to perform the tasks
0079 %   deletedDeadEndRxns      cell array of reactions deleted because they
0080 %                           could not carry flux (INIT requires a
0081 %                           functional input model)
0082 %   deletedRxnsInINIT       cell array of the reactions which were deleted by
0083 %                           the INIT algorithm
0084 %   taskReport              structure with the results for each task
0085 %       id                  cell array with the id of the task
0086 %       description         cell array with the description of the task
0087 %       ok                  boolean array with true if the task was successful
0088 %       essential           cell array with cell arrays of essential
0089 %                           reactions for the task
0090 %       gapfill             cell array of cell arrays of reactions included
0091 %                           in the gap-filling for the task
0092 %
0093 %   This is the main function for automatic reconstruction of models based
0094 %   on the (t)INIT algorithm (PLoS Comput Biol. 2012;8(5):e1002518,
0095 %   Mol Syst Biol. 2014;10:721). Not all settings are possible using this
0096 %   function, and you may want to call the functions scoreModel, runINIT
0097 %   and fitTasks individually instead.
0098 %
0099 %   NOTE: Exchange metabolites should normally not be removed from the model
0100 %   when using this approach, since checkTasks/fitTasks rely on putting specific
0101 %   constraints for each task. The INIT algorithm will remove exchange metabolites
0102 %   if any are present. Use importModel(file,false) to import a model with
0103 %   exchange metabolites remaining.
0104 %
0105 % Usage: [model, metProduction, essentialRxnsForTasks, addedRxnsForTasks,...
0106 %               deletedDeadEndRxns, deletedRxnsInINIT, taskReport]=...
0107 %               getINITModel(refModel, tissue, celltype, hpaData, arrayData,...
0108 %               metabolomicsData, taskFile, useScoresForTasks, printReport,...
0109 %               taskStructure, params, paramsFT)
0110 
0111 if nargin<3
0112     celltype=[];
0113 else
0114     celltype=char(celltype);
0115 end
0116 if nargin<4
0117     hpaData=[];
0118 end
0119 if nargin<5
0120     arrayData=[];
0121 end
0122 if nargin<6
0123     metabolomicsData=[];
0124 end
0125 if nargin<7
0126     taskFile=[];
0127 else
0128     taskFile=char(taskFile);
0129 end
0130 if nargin<8 || isempty(useScoresForTasks)
0131     useScoresForTasks=true;
0132 end
0133 if nargin<9 || isempty(printReport)
0134     printReport=true;
0135 end
0136 if nargin<10
0137     taskStructure=[];
0138 end
0139 if nargin<11
0140     params=[];
0141 end
0142 if nargin<12
0143     paramsFT=[];
0144 end
0145 
0146 %Check that the model is in the closed form
0147 if ~isfield(refModel,'unconstrained')
0148     EM='Exchange metabolites should normally not be removed from the model when using getINITModel. Use importModel(file,false) to import a model with exchange metabolites remaining (see the documentation for details)';
0149     dispEM(EM);
0150 end
0151 
0152 %Create the task structure if not supplied
0153 if any(taskFile) && isempty(taskStructure)
0154     taskStructure=parseTaskList(taskFile);
0155 end
0156 
0157 
0158 % sc-tINIT to identify confidence levels of gene expression
0159 if ~isempty(arrayData) && isfield(arrayData,'singleCells')
0160     if arrayData.singleCells == 1
0161         % Check to ensure cell type is defined
0162         if ~isfield(arrayData,'celltypes')
0163             dispEM('arrayData must contain cell type information if sc-tINIT is to be used','false');   
0164         end
0165         if ~ismember(upper(celltype),upper(arrayData.celltypes))
0166             dispEM('The cell type name does not match');   
0167         end
0168         
0169         % Analyze only cell type of interest
0170         J= strcmpi(arrayData.celltypes,celltype);
0171         
0172         % Analyze only genes included in the reference model
0173         I=ismember(arrayData.genes,refModel.genes);
0174         
0175         % Convert expression data to population fractions
0176         binary_levels = arrayData.levels(I,J)~=0; % Classify each gene as detected (1) or not (0) in each cell
0177         cell_count_levels = sum(binary_levels,2); % Number of cells expressing each transcript
0178         cell_frac_levels = cell_count_levels/size(binary_levels,2); % Number of cells expressing each transcript
0179 
0180         % Bin cell_frac_counts manually
0181         x = 0:.01:1;
0182         for(i = 1:length(x))
0183             cell_frac_count(i) = sum(round(cell_frac_levels,2)==x(i));
0184         end
0185         
0186         % Fit four beta distributions
0187         cell_frac_count(cell_frac_count==0) = NaN; % Remove zeros from optimization
0188         cell_frac_count(1) = NaN; % Remove non-expressed genes from optimization
0189         x_lim = 1; % Somewhat arbitrary parameter to fit left tail of distr.
0190         myfun = @(par) nansum((cell_frac_count(1:find(x>=x_lim,1)) - ...
0191             abs(par(1))*betapdf(x(1:find(x>=x_lim,1)),abs(par(2)),abs(par(3))) - ...
0192             abs(par(4))*betapdf(x(1:find(x>=x_lim,1)),abs(par(5)),abs(par(6))) - ...
0193             abs(par(7))*betapdf(x(1:find(x>=x_lim,1)),abs(par(8)),abs(par(9))) - ...
0194             abs(par(10))*betapdf(x(1:find(x>=x_lim,1)),abs(par(11)),abs(par(12)))).^2);
0195         
0196         par0 = [4,2,100,7,2,30,7,5,20,5,15,20];
0197         opts = optimset('Display','off');
0198         [par,f_val] = fminsearch(myfun,par0,opts);
0199         par = abs(par);
0200         
0201         % Plot results
0202         if (isfield(arrayData,'plotResults'))
0203             if arrayData.plotResults == true
0204                 figure(); hold on; plot(x,cell_frac_count,'ko','MarkerSize',5);
0205                 plot(x,abs(par(1))*betapdf(x,abs(par(2)),abs(par(3))),'b-','LineWidth',1)
0206                 plot(x,abs(par(4))*betapdf(x,abs(par(5)),abs(par(6))),'b-','LineWidth',1)
0207                 plot(x,abs(par(7))*betapdf(x,abs(par(8)),abs(par(9))),'b-','LineWidth',1)
0208                 plot(x,abs(par(10))*betapdf(x,abs(par(11)),abs(par(12))),'b-','LineWidth',1)
0209                 plot(x,abs(par(1))*betapdf(x,abs(par(2)),abs(par(3))) + ...
0210                     abs(par(4))*betapdf(x,abs(par(5)),abs(par(6))) + ...
0211                     abs(par(7))*betapdf(x,abs(par(8)),abs(par(9))) + ...
0212                     abs(par(10))*betapdf(x,abs(par(11)),abs(par(12))),'-','Color',[.5 .5 .5],'LineWidth',2)
0213                 xlabel('Expression Probability');ylabel('# of genes');set(gca,'FontSize',14,'LineWidth',1.25);
0214                 title('Expression prediction','FontSize',18,'FontWeight','bold')
0215             end
0216         end
0217         
0218         % Score genes based on population expression (p = .05)
0219         exprs_cutoff_1 = find(cdf('beta',x,par(2),par(3)) >.95,1)-1; % Find index of no confidence genes
0220         exprs_cutoff_2 = find(cdf('beta',x,par(5),par(6)) >.95,1)-1; % Find index of low confidence genes
0221         exprs_cutoff_3 = find(cdf('beta',x,par(8),par(9)) >.95,1)-1; % Find index of low confidence genes
0222         exprs_cutoffs = sort([exprs_cutoff_1,exprs_cutoff_2,exprs_cutoff_3]);
0223         gene_scores = cell_frac_levels*0;
0224         gene_scores(cell_frac_levels <= x(exprs_cutoffs(1))) = 4; % Not detected
0225         gene_scores(logical((cell_frac_levels >= x(exprs_cutoffs(1))).*(cell_frac_levels < x(exprs_cutoffs(2))))) = 3; % Low detection
0226         gene_scores(logical((cell_frac_levels >= x(exprs_cutoffs(2))).*(cell_frac_levels < x(exprs_cutoffs(3))))) = 2; % Medium detection
0227         gene_scores(cell_frac_levels > x(exprs_cutoffs(3))) = 1; % High detection
0228 
0229         % Replace hpaData with singleCellData
0230         if printReport==true
0231             dispEM('Single cell data is not currently compatible with HPA data. \n         Replacing hpaData with single cell-based scoring.',false);
0232         end
0233         hpaData.genes = arrayData.genes;
0234         hpaData.tissues = arrayData.tissues;
0235         hpaData.celltypes = arrayData.celltypes;
0236         hpaData.levels = [{'High'},{'Medium'},{'Low'},{'None'}];
0237         hpaData.gene2Level = zeros(length(arrayData.genes),length(arrayData.celltypes));
0238         for i = 1:length(find(J))
0239             find_var = find(J,i);
0240             hpaData.gene2Level(I,find_var(end)) = gene_scores;
0241         end
0242         
0243         % Remove arrayData from the analysis (Might be a bad idea)
0244         clear arrayData
0245         arrayData=[];
0246     end
0247 end
0248 
0249 
0250 if printReport==true
0251     if any(celltype)
0252         fprintf(['***Generating model for: ' tissue ' - ' celltype '\n']);
0253     else
0254         fprintf(['***Generating model for: ' tissue '\n']);
0255     end
0256     if ~isempty(hpaData)
0257         fprintf('-Using HPA data\n');
0258     end
0259     if ~isempty(arrayData)
0260         fprintf('-Using array data\n');
0261     end
0262     if ~isempty(metabolomicsData)
0263         fprintf('-Using metabolomics data\n');
0264     end
0265     if ~isempty(taskFile) || ~isempty(taskStructure)
0266         fprintf('-Using metabolic tasks\n');
0267     end
0268     fprintf('\n');
0269     
0270     printScores(refModel,'Reference model statistics',hpaData,arrayData,tissue,celltype);
0271 end
0272 
0273 %Remove dead-end reactions to speed up the optimization and to
0274 %differentiate between reactions removed by INIT and those that are
0275 %dead-end
0276 [~, deletedDeadEndRxns]=simplifyModel(refModel,true,false,true,true,true);
0277 cModel=removeReactions(refModel,deletedDeadEndRxns,false,true);
0278 
0279 %Store the connected model like this to keep track of stuff
0280 if printReport==true
0281     printScores(cModel,'Pruned model statistics',hpaData,arrayData,tissue,celltype);
0282 end
0283 
0284 %If tasks have been defined, then go through them and get essential
0285 %reactions
0286 if ~isempty(taskStructure)
0287     [taskReport, essentialRxnMat]=checkTasks(cModel,[],printReport,true,true,taskStructure);
0288     
0289     essentialRxnsForTasks=cModel.rxns(any(essentialRxnMat,2));
0290     
0291     %Remove tasks that cannot be performed
0292     taskStructure(taskReport.ok==false)=[];
0293     if printReport==true
0294         printScores(removeReactions(cModel,setdiff(cModel.rxns,essentialRxnsForTasks),true,true),'Reactions essential for tasks',hpaData,arrayData,tissue,celltype);
0295     end
0296 else
0297     essentialRxnsForTasks={};
0298 end
0299 
0300 %Score the connected model
0301 [rxnScores, geneScores]=scoreModel(cModel,hpaData,arrayData,tissue,celltype);
0302 
0303 %Run the INIT algorithm. The exchange reactions that are used in the final
0304 %reactions will be open, which doesn't fit with the last step. Therefore
0305 %delete reactions from the original model instead of taking the output. The
0306 %default implementation does not constrain reversible reactions to only
0307 %carry flux in one direction. Runs without the constraints on reversibility
0308 %and with all output allowed. This is to reduce the complexity of the
0309 %problem.
0310 [~, deletedRxnsInINIT, metProduction]=runINIT(simplifyModel(cModel),rxnScores,metabolomicsData,essentialRxnsForTasks,0,true,false,params);
0311 initModel=removeReactions(cModel,deletedRxnsInINIT,true,true);
0312 if printReport==true
0313     printScores(initModel,'INIT model statistics',hpaData,arrayData,tissue,celltype);
0314     printScores(removeReactions(cModel,setdiff(cModel.rxns,deletedRxnsInINIT),true,true),'Reactions deleted by INIT',hpaData,arrayData,tissue,celltype);
0315 end
0316 
0317 %The full model has exchange reactions in it. fitTasks calls on fillGaps,
0318 %which automatically removes exchange metabolites (because it assumes that
0319 %the reactions are constrained when appropriate). In this case the
0320 %uptakes/outputs are retrieved from the task sheet instead. To prevent
0321 %exchange reactions being used to fill gaps, they are delete from the
0322 %reference model here.
0323 initModel.id='INITModel';
0324 
0325 %If gaps in the model should be filled using a task list
0326 if ~isempty(taskStructure)
0327     %Remove exchange reactions and reactions already included in the INIT
0328     %model
0329     refModelNoExc=removeReactions(refModel,union(initModel.rxns,getExchangeRxns(refModel)),true,true);
0330     
0331     %At this stage the model is fully connected and most of the genes with
0332     %good scores should have been included. The final gap-filling should
0333     %take the scores of the genes into account, so that "rather bad"
0334     %reactions are preferred to "very bad" reactions. However, reactions
0335     %with positive scores will be included even if they are not connected
0336     %in the current formulation. Therefore, such reactions will have to be
0337     %assigned a small negative score instead.
0338     if useScoresForTasks==true
0339         refRxnScores=scoreModel(refModelNoExc,hpaData,arrayData,tissue,celltype);
0340         [outModel, addedRxnMat]=fitTasks(initModel,refModelNoExc,[],true,min(refRxnScores,-0.1),taskStructure,paramsFT);
0341     else
0342         [outModel, addedRxnMat]=fitTasks(initModel,refModelNoExc,[],true,[],taskStructure,paramsFT);
0343     end
0344     if printReport==true
0345         printScores(outModel,'Functional model statistics',hpaData,arrayData,tissue,celltype);
0346         printScores(removeReactions(outModel,intersect(outModel.rxns,initModel.rxns),true,true),'Reactions added to perform the tasks',hpaData,arrayData,tissue,celltype);
0347     end
0348     
0349     addedRxnsForTasks=refModelNoExc.rxns(any(addedRxnMat,2));
0350 else
0351     outModel=initModel;
0352     addedRxnMat=[];
0353     addedRxnsForTasks={};
0354 end
0355 
0356 %The model can now perform all the tasks defined in the task list. The
0357 %algorithm cannot deal with gene-complexes at the moment. It is therefore
0358 %ok to remove bad genes from a reaction (as long as at least one gene is
0359 %kept)
0360 model=outModel;
0361 
0362 [~, I]=ismember(model.genes,cModel.genes); %All should be found
0363 %This is a little weird way to make sure that only one bad gene is included
0364 %if there are no good ones (since all -Inf==max(-Inf))
0365 geneScores(isinf(geneScores))=-1000+rand(sum(isinf(geneScores)),1);
0366 
0367 model.grRules(:)={''};
0368 for i=1:numel(model.rxns)
0369     ids=find(model.rxnGeneMat(i,:));
0370     if numel(ids)>1
0371         scores=geneScores(I(ids));
0372         %Only keep the positive ones if possible
0373         model.rxnGeneMat(i,ids(~(scores>0 | scores==max(scores))))=0;
0374     end
0375     %Rewrite the grRules to be only OR
0376     if isfield(model,'grRules')
0377         J=find(model.rxnGeneMat(i,:));
0378         for j=1:numel(J)
0379             model.grRules{i}=[model.grRules{i} '(' model.genes{J(j)} ')'];
0380             if j<numel(J)
0381                 model.grRules{i}=[model.grRules{i} ' or '];
0382             end
0383         end
0384     end
0385 end
0386 
0387 %Find all genes that are not used and delete them
0388 I=sum(model.rxnGeneMat)==0;
0389 model.genes(I)=[];
0390 model.rxnGeneMat(:,I)=[];
0391 if isfield(model,'geneShortNames')
0392     model.geneShortNames(I)=[];
0393 end
0394 if isfield(model,'proteins')
0395     model.proteins(I)=[];
0396 end
0397 if isfield(model,'geneMiriams')
0398     model.geneMiriams(I)=[];
0399 end
0400 if isfield(model,'geneFrom')
0401     model.geneFrom(I)=[];
0402 end
0403 if isfield(model,'geneComps')
0404     model.geneComps(I)=[];
0405 end
0406 
0407 %At this stage the model will contain some exchange reactions but probably
0408 %not all (and maybe zero). This can be inconvenient, so all exchange
0409 %reactions from the reference model are added, except for those which
0410 %involve metabolites that are not in the model.
0411 
0412 %First delete and included exchange reactions in order to prevent the order
0413 %from changing
0414 model=removeReactions(model,getExchangeRxns(model));
0415 
0416 %Create a model with only the exchange reactions in refModel
0417 excModel=removeReactions(refModel,setdiff(refModel.rxns,getExchangeRxns(refModel)),true,true);
0418 
0419 %Find the metabolites there which are not exchange metabolites and which do
0420 %not exist in the output model
0421 I=~ismember(excModel.mets,model.mets) & excModel.unconstrained==0;
0422 
0423 %Then find those reactions and delete them
0424 [~, J]=find(excModel.S(I,:));
0425 excModel=removeReactions(excModel,J,true,true);
0426 
0427 %Merge with the output model
0428 model=mergeModels({model;excModel},'metNames');
0429 model.id='INITModel';
0430 model.name=['Automatically generated model for ' tissue];
0431 if any(celltype)
0432     model.name=[model.name ' - ' celltype];
0433 end
0434 
0435 if printReport==true
0436     printScores(model,'Final model statistics',hpaData,arrayData,tissue,celltype);
0437 end
0438 
0439 %Add information about essential reactions and reactions included for
0440 %gap-filling and return a taskReport
0441 if ~isempty(taskStructure)
0442     I=find(taskReport.ok); %Ignore failed tasks
0443     for i=1:numel(I)
0444         taskReport.essential{I(i),1}=cModel.rxns(essentialRxnMat(:,I(i)));
0445         taskReport.gapfill{I(i),1}=refModelNoExc.rxns(addedRxnMat(:,i));
0446     end
0447 else
0448     taskReport=[];
0449 end
0450 
0451 %Fix grRules and reconstruct rxnGeneMat
0452 [grRules,rxnGeneMat] = standardizeGrRules(model,true);
0453 model.grRules = grRules;
0454 model.rxnGeneMat = rxnGeneMat;
0455 end
0456 
0457 %This is for printing a summary of a model
0458 function [rxnS, geneS]=printScores(model,name,hpaData,arrayData,tissue,celltype)
0459 [a, b]=scoreModel(model,hpaData,arrayData,tissue,celltype);
0460 rxnS=mean(a);
0461 geneS=mean(b(~isinf(b)));
0462 fprintf([name ':\n']);
0463 fprintf(['\t' num2str(numel(model.rxns)) ' reactions, ' num2str(numel(model.genes)) ' genes\n']);
0464 fprintf(['\tMean reaction score: ' num2str(rxnS) '\n']);
0465 fprintf(['\tMean gene score: ' num2str(geneS) '\n']);
0466 fprintf(['\tReactions with positive scores: ' num2str(100*sum(a>0)/numel(a)) '%%\n\n']);
0467 end

Generated by m2html © 2005