getStandardKcat Calculate an standard kcat and standard molecular weight (MW) that can be used to apply enzyme constraints to reactions without any associated enzymes. Such reactions have either an empty model.grRules field, or they have no match in model.ec.rxns (which can be the case if the genes in the model.grRules field could not be mapped to enzymes). This is done by adding those reactions to model.ec, assign a "standard" pseudoenzyme with the standard MW (median of all proteins in the organism) and standard kcat (median from all kcat, or subsystem specific kcat). A reaction is assigned a subSystem specific kcat values if the model has a subSystems field and the reaction is annotated with a subSystem. Only the first subSystem will be considered if multiple are annotated to the same reaction. Exchange, transport and pseudoreactions are filtered out, plus any reaction identifiers specified in /data/pseudoRxns.tsv in the model adapter folder. In addition, reactions that are annotated with an enzyme (and therefore already in model.ec), but not assigned any reaction-specific kcat value (their model.ec.kcat entry is either 0 or NaN), can be assigned standard kcat values by a similar approach. However, those reactions will not be linked to the "standard" pseudoenzyme, but will use the enzyme that they had already been associated with. Any pre-existing standard kcat assignments (identified by 'standard' entires in model.ec.source) are removed when applying this function. Input: model an ecModel in GECKO 3 format (with ecModel.ec structure) modelAdapter a loaded model adapter (Optional, will otherwise use the default model adapter). threshold a threshold to determine when use a kcat value based on the mean kcat of the reactions in the same subSystem or based on the median value of all the kcat in the model. Second option is used when the number of reactions in a determined subSystem is < threshold. (Optional, default = 10) fillZeroKcat logical whether zero kcat values should be replaced with standard kcat values. (Optional, default = true). Output: model ecModel where model.ec is expanded with a standard protein with standard kcat and standard MW, assigned to reactions without gene associations. rxnsMissingGPR a list of updated rxns identifiers with a standard value standardMW the standard MW value calculated standardKcat the standard Kcat value calculated rxnsNoKcat a list of rxns identifiers whose zero kcat has been replaced While model.ec.kcat is populated, applyKcatConstraints would still need to be run to apply the new constraints to the S-matrix. Usage: [model, rxnsMissingGPR, standardMW, standardKcat, rxnsNoKcat] = getStandardKcat(model, modelAdapter, threshold, fillZeroKcat);
0001 function [model, rxnsMissingGPR, standardMW, standardKcat, rxnsNoKcat] = getStandardKcat(model, modelAdapter, threshold, fillZeroKcat) 0002 % getStandardKcat 0003 % Calculate an standard kcat and standard molecular weight (MW) that can 0004 % be used to apply enzyme constraints to reactions without any associated 0005 % enzymes. Such reactions have either an empty model.grRules field, or 0006 % they have no match in model.ec.rxns (which can be the case if the genes 0007 % in the model.grRules field could not be mapped to enzymes). This is 0008 % done by adding those reactions to model.ec, assign a "standard" 0009 % pseudoenzyme with the standard MW (median of all proteins in the 0010 % organism) and standard kcat (median from all kcat, or subsystem 0011 % specific kcat). 0012 % 0013 % A reaction is assigned a subSystem specific kcat values if the model 0014 % has a subSystems field and the reaction is annotated with a subSystem. 0015 % Only the first subSystem will be considered if multiple are annotated 0016 % to the same reaction. 0017 % 0018 % Exchange, transport and pseudoreactions are filtered out, plus any 0019 % reaction identifiers specified in /data/pseudoRxns.tsv in the model 0020 % adapter folder. 0021 % 0022 % In addition, reactions that are annotated with an enzyme (and therefore 0023 % already in model.ec), but not assigned any reaction-specific kcat value 0024 % (their model.ec.kcat entry is either 0 or NaN), can be assigned 0025 % standard kcat values by a similar approach. However, those reactions 0026 % will not be linked to the "standard" pseudoenzyme, but will use the 0027 % enzyme that they had already been associated with. 0028 % 0029 % Any pre-existing standard kcat assignments (identified by 'standard' 0030 % entires in model.ec.source) are removed when applying this function. 0031 % 0032 % Input: 0033 % model an ecModel in GECKO 3 format (with ecModel.ec structure) 0034 % modelAdapter a loaded model adapter (Optional, will otherwise use 0035 % the default model adapter). 0036 % threshold a threshold to determine when use a kcat value based on 0037 % the mean kcat of the reactions in the same subSystem or 0038 % based on the median value of all the kcat in the model. 0039 % Second option is used when the number of reactions in a 0040 % determined subSystem is < threshold. (Optional, default 0041 % = 10) 0042 % fillZeroKcat logical whether zero kcat values should be replaced 0043 % with standard kcat values. (Optional, default = true). 0044 % 0045 % Output: 0046 % model ecModel where model.ec is expanded with a standard 0047 % protein with standard kcat and standard MW, assigned to 0048 % reactions without gene associations. 0049 % rxnsMissingGPR a list of updated rxns identifiers with a standard value 0050 % standardMW the standard MW value calculated 0051 % standardKcat the standard Kcat value calculated 0052 % rxnsNoKcat a list of rxns identifiers whose zero kcat has been replaced 0053 % 0054 % While model.ec.kcat is populated, applyKcatConstraints would still need 0055 % to be run to apply the new constraints to the S-matrix. 0056 % 0057 % Usage: 0058 % [model, rxnsMissingGPR, standardMW, standardKcat, rxnsNoKcat] = getStandardKcat(model, modelAdapter, threshold, fillZeroKcat); 0059 0060 if nargin < 2 || isempty(modelAdapter) 0061 modelAdapter = ModelAdapterManager.getDefault(); 0062 if isempty(modelAdapter) 0063 error('Either send in a modelAdapter or set the default model adapter in the ModelAdapterManager.') 0064 end 0065 end 0066 params = modelAdapter.getParameters(); 0067 0068 if nargin < 3 || isempty(threshold) 0069 threshold = 10; 0070 end 0071 0072 if nargin < 4 || isempty(fillZeroKcat) 0073 fillZeroKcat = true; 0074 end 0075 0076 databases = loadDatabases('uniprot', modelAdapter); 0077 0078 % An standard MW is defined for all the rxns which does not have a GPR 0079 % rule defined. This is based in all the proteins reported for the specific 0080 % organism in uniprot 0081 standardMW = median(databases.uniprot.MW, 'omitnan'); 0082 0083 % An standard Kcat is defined for all the rxns which does not have a GPR 0084 % rule defined. In this case, the kcat value for a particular reaction is 0085 % defined as the mean of the kcat values of the reactions involved in the 0086 % same subsystem in which the given reaction is involved. Nevertheless, if 0087 % a subSystem have a number of reactions lower than a treshold, the kcat 0088 % value will be the median of the kcat in all the reactions of the model. 0089 0090 % Remove from the list those with kcat zero 0091 rxnsKcatZero = model.ec.kcat > 0; 0092 0093 % Get the kcat value based on all the kcats in the model 0094 standardKcat = median(model.ec.kcat(rxnsKcatZero), 'omitnan'); 0095 0096 % If the model have subSystems assigned calculate kcat based on subSystem 0097 if isfield(model,'subSystems') && ~all(cellfun(@isempty, model.subSystems)) 0098 standard = false; 0099 if model.ec.geckoLight 0100 modRxns = extractAfter(model.ec.rxns,4); 0101 else 0102 modRxns = model.ec.rxns; 0103 end 0104 % Map ec-rxns to model.rxns 0105 [~,rxnIdx] = ismember(modRxns,model.rxns); 0106 % Choose first subSystem 0107 enzSubSystems = flattenCell(model.subSystems(rxnIdx)); 0108 enzSubSystems = enzSubSystems(:,1); 0109 if ~all(cellfun(@isempty, enzSubSystems)) 0110 0111 % Make list of unique subsystems, and which rxns are linked to them 0112 [enzSubSystem_names, ~, rxnToSub] = unique(enzSubSystems); 0113 % Make matrix of ec-rxns vs. unique subsystem index 0114 ind = sub2ind([numel(enzSubSystem_names) numel(enzSubSystems)],rxnToSub',1:numel(rxnToSub)); 0115 kcatSubSystem = false([numel(enzSubSystem_names) numel(enzSubSystems)]); 0116 kcatSubSystem(ind) = true; 0117 % Number of kcats per subSystem 0118 kcatsPerSubSystem = sum(kcatSubSystem,2); 0119 % Calculate average kcat values per subSystem 0120 kcatSubSystem = (kcatSubSystem*model.ec.kcat)./kcatsPerSubSystem; 0121 kcatSubSystem(kcatsPerSubSystem < threshold) = standardKcat; 0122 else 0123 standard = true; 0124 printOrange('WARNING: No subSystem-specific kcat values can be calculated'); 0125 end 0126 else 0127 standard = true; 0128 printOrange('WARNING: No subSystem-specific kcat values can be calculated'); 0129 end 0130 0131 % Find reactions without GPR 0132 rxnsMissingGPR = find(cellfun(@isempty, model.grRules)); 0133 0134 % Find reactions with GPR but without model.ec entry (for instance due to 0135 % no protein matching) 0136 rxnsMissingEnzyme = find(~cellfun(@isempty, model.grRules)); 0137 if model.ec.geckoLight 0138 ecRxnsList = unique(extractAfter(model.ec.rxns,4)); 0139 else 0140 ecRxnsList = model.ec.rxns; 0141 end 0142 rxnsMissingEnzyme = find(and(~ismember(model.rxns(rxnsMissingEnzyme),ecRxnsList), ~contains(model.rxns(rxnsMissingEnzyme),'usage_prot_'))); 0143 rxnsMissingGPR = [rxnsMissingGPR;rxnsMissingEnzyme]; 0144 0145 % Get custom list of reaction IDs to ignore, if existing. First column 0146 % contains reaction IDs, second column contains reaction names for 0147 % reference only. 0148 if exist(fullfile(params.path,'data','pseudoRxns.tsv'),'file') 0149 fID = fopen(fullfile(params.path,'data','pseudoRxns.tsv')); 0150 fileData = textscan(fID,'%s %s','delimiter','\t'); 0151 fclose(fID); 0152 customRxns = fileData{1}; 0153 customRxns = find(ismember(model.rxns,customRxns)); 0154 else 0155 customRxns = []; 0156 end 0157 % Get and remove exchange, transport, spontaneous and pseudo reactions 0158 [~, exchangeRxns] = getExchangeRxns(model); 0159 transportRxns = getTransportRxns(model); 0160 [spontaneousRxns, ~] = modelAdapter.getSpontaneousReactions(model); 0161 pseudoRxns = contains(model.rxnNames,'pseudoreaction'); 0162 slimeRxns = contains(model.rxnNames,'SLIME rxn'); 0163 rxnsToIgnore = [customRxns; exchangeRxns; find(transportRxns); ... 0164 find(spontaneousRxns); find(pseudoRxns); find(slimeRxns)]; 0165 rxnsMissingGPR(ismember(rxnsMissingGPR, rxnsToIgnore)) = []; 0166 0167 % Only add if not geckoLight & getStandardKcat was not run earlier 0168 if ~any(strcmp(model.mets,'prot_standard')) 0169 % Add a new gene to be consistent with ec field named standard 0170 proteinStdGenes.genes = 'standard'; 0171 if isfield(model,'geneShortNames') 0172 proteinStdGenes.geneShortNames = 'std'; 0173 end 0174 model = addGenesRaven(model, proteinStdGenes); 0175 0176 if ~model.ec.geckoLight 0177 % Add a new metabolite named prot_standard 0178 proteinStdMets.mets = 'prot_standard'; 0179 proteinStdMets.metNames = proteinStdMets.mets; 0180 proteinStdMets.compartments = 'c'; 0181 if isfield(model,'metNotes') 0182 proteinStdMets.metNotes = 'Standard enzyme-usage pseudometabolite'; 0183 end 0184 model = addMets(model, proteinStdMets); 0185 0186 % Add a protein usage reaction if not a light version 0187 proteinStdUsageRxn.rxns = {'usage_prot_standard'}; 0188 proteinStdUsageRxn.rxnNames = proteinStdUsageRxn.rxns; 0189 proteinStdUsageRxn.mets = {proteinStdMets.mets, 'prot_pool'}; 0190 proteinStdUsageRxn.stoichCoeffs = [-1, 1]; 0191 proteinStdUsageRxn.lb = -1000; 0192 proteinStdUsageRxn.ub = 0; 0193 proteinStdUsageRxn.grRules = proteinStdGenes.genes; 0194 0195 model = addRxns(model, proteinStdUsageRxn); 0196 end 0197 % Update .ec structure in model 0198 model.ec.genes(end+1) = {'standard'}; 0199 model.ec.enzymes(end+1) = {'standard'}; 0200 model.ec.mw(end+1) = standardMW; 0201 model.ec.sequence(end+1) = {''}; 0202 % Additional info 0203 if isfield(model.ec,'concs') 0204 model.ec.concs(end+1) = nan(); 0205 end 0206 0207 % Expand the enzyme rxns matrix 0208 model.ec.rxnEnzMat = [model.ec.rxnEnzMat, zeros(length(model.ec.rxns), 1)]; % 1 new enzyme 0209 model.ec.rxnEnzMat = [model.ec.rxnEnzMat; zeros(length(rxnsMissingGPR), length(model.ec.enzymes))]; % new rxns 0210 end 0211 stdMetIdx = find(strcmpi(model.ec.enzymes, 'standard')); 0212 0213 % Remove previous standard kcat assignment 0214 oldStandardEnz = find(strcmp(model.ec.source,'standard')); 0215 if ~isempty(oldStandardEnz) 0216 oldStandardProt = logical(model.ec.rxnEnzMat(oldStandardEnz,stdMetIdx)); 0217 % If annotated with real enzyme => set kcat to zero 0218 model.ec.kcat(oldStandardEnz(~oldStandardProt)) = 0; 0219 model.ec.source(oldStandardEnz(~oldStandardProt)) = {''}; 0220 % If annotated with standard protein => remove entry 0221 model.ec.rxns(oldStandardEnz(oldStandardProt)) = []; 0222 model.ec.kcat(oldStandardEnz(oldStandardProt)) = []; 0223 model.ec.source(oldStandardEnz(oldStandardProt)) = []; 0224 model.ec.notes(oldStandardEnz(oldStandardProt)) = []; 0225 model.ec.eccodes(oldStandardEnz(oldStandardProt)) = []; 0226 model.ec.rxnEnzMat(oldStandardEnz(oldStandardProt),:) = []; 0227 end 0228 0229 numRxns = length(model.ec.rxns); 0230 for i = 1:numel(rxnsMissingGPR) 0231 rxnIdx = rxnsMissingGPR(i); 0232 0233 % Update .ec structure in model 0234 if ~model.ec.geckoLight 0235 model.ec.rxns(end+1) = model.rxns(rxnIdx); 0236 % Add prefix in case is light version 0237 else 0238 model.ec.rxns{end+1} = ['001_' model.rxns{rxnIdx}]; 0239 end 0240 0241 if ~standard 0242 kcatSubSystemIdx = strcmpi(enzSubSystem_names, model.subSystems{rxnIdx}(1)); 0243 if all(kcatSubSystemIdx) 0244 model.ec.kcat(end+1) = kcatSubSystem(kcatSubSystemIdx); 0245 else 0246 model.ec.kcat(end+1) = standardKcat; 0247 end 0248 else 0249 model.ec.kcat(end+1) = standardKcat; 0250 end 0251 0252 model.ec.source(end+1) = {'standard'}; 0253 model.ec.notes(end+1) = {''}; 0254 model.ec.eccodes(end+1) = {''}; 0255 0256 % Update the enzyme rxns matrix 0257 model.ec.rxnEnzMat(numRxns+i, stdMetIdx) = 1; 0258 end 0259 % Get the rxns identifiers of the updated rxns 0260 rxnsMissingGPR = model.rxns(rxnsMissingGPR); 0261 0262 if fillZeroKcat 0263 zeroKcat = model.ec.kcat == 0 | isnan(model.ec.kcat); 0264 model.ec.kcat(zeroKcat) = standardKcat; 0265 model.ec.source(zeroKcat) = {'standard'}; 0266 rxnsNoKcat = model.ec.rxns(zeroKcat); 0267 else 0268 rxnsNoKcat = []; 0269 end 0270 end 0271 0272 function Cflat = flattenCell(C,strFlag) 0273 %FLATTENCELL Flatten a nested column cell array into a matrix cell array. 0274 % 0275 % CFLAT = FLATTENCELL(C) takes a column cell array in which one or more 0276 % entries is a nested cell array, and flattens it into a 2D matrix cell 0277 % array, where the nested entries are spread across new columns. 0278 % 0279 % CFLAT = FLATTENCELL(C,STRFLAG) if STRFLAG is TRUE, empty entries in the 0280 % resulting CFLAT will be replaced with empty strings {''}. Default = FALSE 0281 if nargin < 2 0282 strFlag = false; 0283 end 0284 0285 % determine which entries are cells 0286 cells = cellfun(@iscell,C); 0287 0288 % determine number of elements in each nested cell 0289 cellsizes = cellfun(@numel,C); 0290 cellsizes(~cells) = 1; % ignore non-cell entries 0291 0292 % flatten single-entry cells 0293 Cflat = C; 0294 Cflat(cells & (cellsizes == 1)) = cellfun(@(x) x{1},Cflat(cells & (cellsizes == 1)),'UniformOutput',false); 0295 0296 % iterate through multi-entry cells 0297 multiCells = find(cellsizes > 1); 0298 for i = 1:length(multiCells) 0299 cellContents = Cflat{multiCells(i)}; 0300 Cflat(multiCells(i),1:length(cellContents)) = cellContents; 0301 end 0302 0303 % change empty elements to strings, if specified 0304 if ( strFlag ) 0305 Cflat(cellfun(@isempty,Cflat)) = {''}; 0306 end 0307 end