


getModelFromMetaCyc
  Retrieves information stored in MetaCyc flat files and generates a super model

  Input:
  metacycPath         If left empty (default ''), a super model of MetaCyc
                      is generated directly from the Matlab files
                      (metaCycRxns, metaCycMets and metaCycEnzymes) in the
                      RAVEN\external\metacyc directory. Alternatively, set
                      this parameter to the path of a local dump of MetaCyc
                      data files (e.g. reactions.dat, proteins.dat,
                      compounds.dat), from which the function attempts to
                      re-generate the Matlab files
  keepTransportRxns   include transport reactions, which often have
                      identical reactants and products and therefore
                      become all-zero columns in the S matrix
                      (optional, default false)
  keepUnbalanced      include reactions that cannot be balanced, usually
                      because they are polymeric reactions or because of
                      a specific difficulty in balancing class structures
                      (optional, default false)
  keepUndetermined    include reactions whose substrates lack chemical
                      structures or have non-numerical coefficients
                      (e.g. n+1) (optional, default false)

  Output:
  metaCycModel        a model structure generated from the MetaCyc
                      database, including all reactions, metabolites and
                      enzymes in MetaCyc

  NOTE: This function allows users to update the MetaCyc Matlab files from
  a local dump of data files, which can be obtained by subscribing to the
  database (https://metacyc.org/download.shtml).

  Usage: metaCycModel=getModelFromMetaCyc(metacycPath,keepTransportRxns,keepUnbalanced,keepUndetermined)
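
The call below is a minimal usage sketch, not taken from the RAVEN documentation. It assumes the RAVEN Toolbox is on the MATLAB path and that the pre-built Matlab files are present in the default directory; the `exist` guard and the summary printout are illustrative additions. The default path is built the same way the source listing does when no argument is given.

```matlab
% Sketch only: requires the RAVEN Toolbox on the MATLAB path.
keepTransportRxns = true;   % also keep transport reactions
keepUnbalanced    = false;  % drop reactions that cannot be balanced
keepUndetermined  = false;  % drop reactions with undetermined substrates

if exist('getModelFromMetaCyc','file') == 2
    % Same location the function resolves when called without arguments
    metacycPath = fullfile(findRAVENroot(),'external','metacyc');
    model = getModelFromMetaCyc(metacycPath,keepTransportRxns, ...
        keepUnbalanced,keepUndetermined);
    % The genes field holds the MetaCyc enzymes (see the source listing)
    fprintf('%d reactions, %d metabolites, %d enzymes\n', ...
        numel(model.rxns),numel(model.mets),numel(model.genes));
else
    disp('RAVEN not found on the path; skipping the call.');
end
```

Calling `getModelFromMetaCyc()` with no arguments uses the default directory and sets all three flags to false.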

function metaCycModel=getModelFromMetaCyc(metacycPath,keepTransportRxns,keepUnbalanced,keepUndetermined)
% getModelFromMetaCyc
%   Retrieves information stored in MetaCyc flat files and generates a super model
%
%   Input:
%   metacycPath         If left empty (default ''), a super model of MetaCyc
%                       is generated directly from the Matlab files
%                       (metaCycRxns, metaCycMets and metaCycEnzymes) in the
%                       RAVEN\external\metacyc directory. Alternatively, set
%                       this parameter to the path of a local dump of MetaCyc
%                       data files (e.g. reactions.dat, proteins.dat,
%                       compounds.dat), from which the function attempts to
%                       re-generate the Matlab files
%   keepTransportRxns   include transport reactions, which often have
%                       identical reactants and products and therefore
%                       become all-zero columns in the S matrix
%                       (optional, default false)
%   keepUnbalanced      include reactions that cannot be balanced, usually
%                       because they are polymeric reactions or because of
%                       a specific difficulty in balancing class structures
%                       (optional, default false)
%   keepUndetermined    include reactions whose substrates lack chemical
%                       structures or have non-numerical coefficients
%                       (e.g. n+1) (optional, default false)
%
%   Output:
%   metaCycModel        a model structure generated from the MetaCyc
%                       database, including all reactions, metabolites and
%                       enzymes in MetaCyc
%
%   NOTE: This function allows users to update the MetaCyc Matlab files from
%   a local dump of data files, which can be obtained by subscribing to the
%   database (https://metacyc.org/download.shtml).
%
%   Usage: metaCycModel=getModelFromMetaCyc(metacycPath,keepTransportRxns,keepUnbalanced,keepUndetermined)

if nargin<1
    ravenPath=findRAVENroot();
    metacycPath=fullfile(ravenPath,'external','metacyc');
else
    metacycPath=char(metacycPath);
end
if nargin<2
    keepTransportRxns=false;
end
if nargin<3
    keepUnbalanced=false;
end
if nargin<4
    keepUndetermined=false;
end

%First get all reactions
metaCycModel=getRxnsFromMetaCyc(metacycPath,keepTransportRxns,keepUnbalanced,keepUndetermined);

%Get reaction and enzyme association
metaCycEnzymes=getEnzymesFromMetaCyc(metacycPath);

%Replace rxnNames with those from metaCycEnzymes
[a, b]=ismember(metaCycModel.rxns,metaCycEnzymes.rxns);
a=find(a);
b=b(a);
metaCycModel.rxnNames(a)=metaCycEnzymes.rxnNames(b);

fprintf('Reorganizing reaction-enzyme associations... ')
%Create the rxnGeneMat for the reactions, by getting all enzymes and
%corresponding subunits
rxnNum=numel(metaCycModel.rxns);
metaCycModel.genes=metaCycEnzymes.enzymes;
metaCycModel.rxnGeneMat=sparse(rxnNum,numel(metaCycEnzymes.enzymes));
metaCycModel.grRules=cell(rxnNum,1);

%Loop through all reactions to generate the rxnGeneMat matrix and grRules.
%This step also cross-links reactions to their catalyzing enzymes
for i=1:rxnNum
    
    metaCycModel.grRules{i}='';
    %Find out if this is an enzymatic reaction
    [a, b]=ismember(metaCycModel.rxns(i),metaCycEnzymes.rxns);
    if a
        %Find out all catalyzing enzymes, which are treated as isoenzymes
        I=find(metaCycEnzymes.rxnEnzymeMat(b,:));
        if ~isempty(I)
            
            grRule='';
            for j=1:numel(I)
                
                subgrRule='';
                %Find out if enzyme complex
                [c, d]=ismember(metaCycEnzymes.enzymes(I(j)),metaCycEnzymes.cplxs);
                if c %In cases of an enzyme complex
                    %With single subunit
                    if numel(metaCycEnzymes.cplxComp{d}.subunit)==1
                        subgrRule=metaCycEnzymes.cplxComp{d}.subunit{1};
                    %With multiple subunits
                    else
                        subgrRule=strjoin(metaCycEnzymes.cplxComp{d}.subunit,' and ');
                        subgrRule=strcat('(',subgrRule,')');
                    end
                    [~, geneIndex]=ismember(metaCycEnzymes.cplxComp{d}.subunit,metaCycModel.genes);
                    metaCycModel.rxnGeneMat(i,geneIndex)=1;
                    
                else %In cases of NOT an enzyme complex
                    subgrRule=metaCycEnzymes.enzymes(I(j));
                    metaCycModel.rxnGeneMat(i,I(j))=1;
                end
                
                %Generating grRules
                if ~strcmp(subgrRule,'')
                    if ~strcmp(grRule,'')
                        grRule=strcat(grRule,{' or '},subgrRule);
                    else
                        grRule=subgrRule;
                    end
                end
                
            end
            %strcat with a cell argument returns a cell, so unwrap it
            if iscell(grRule)
                metaCycModel.grRules{i}=grRule{1};
            else
                metaCycModel.grRules{i}=grRule;
            end
            
        end
        
    end
end
fprintf('done\n')

%Then get all metabolites
metaCycMets=getMetsFromMetaCyc(metacycPath);

%Add information about all metabolites to the model
[a, b]=ismember(metaCycModel.mets,metaCycMets.mets);
a=find(a);
b=b(a);

if ~isfield(metaCycModel,'metNames')
    metaCycModel.metNames=cell(numel(metaCycModel.mets),1);
    metaCycModel.metNames(:)={''};
end
metaCycModel.metNames(a)=metaCycMets.metNames(b);

if ~isfield(metaCycModel,'metFormulas')
    metaCycModel.metFormulas=cell(numel(metaCycModel.mets),1);
    metaCycModel.metFormulas(:)={''};
end
metaCycModel.metFormulas(a)=metaCycMets.metFormulas(b);

if ~isfield(metaCycModel,'metCharges')
    metaCycModel.metCharges=zeros(numel(metaCycModel.mets),1);
end
metaCycModel.metCharges(a)=metaCycMets.metCharges(b);

if ~isfield(metaCycModel,'inchis')
    metaCycModel.inchis=cell(numel(metaCycModel.mets),1);
    metaCycModel.inchis(:)={''};
end
metaCycModel.inchis(a)=metaCycMets.inchis(b);

if ~isfield(metaCycModel,'metMiriams')
    metaCycModel.metMiriams=cell(numel(metaCycModel.mets),1);
end
metaCycModel.metMiriams(a)=metaCycMets.metMiriams(b);

if ~isfield(metaCycModel,'keggid')
    metaCycModel.keggid=cell(numel(metaCycModel.mets),1);
end
metaCycModel.keggid(a)=metaCycMets.keggid(b);

%Put all metabolites in one compartment called 's' (for system). This is
%done just to be more compatible with the rest of the code
metaCycModel.comps={'s'};
metaCycModel.compNames={'System'};
metaCycModel.metComps=ones(numel(metaCycModel.mets),1);

%It could also be that the metabolite and reaction names are empty for some
%reason. In that case, use the ID instead
I=cellfun(@isempty,metaCycModel.metNames);
metaCycModel.metNames(I)=metaCycModel.mets(I);
I=cellfun(@isempty,metaCycModel.rxnNames);
metaCycModel.rxnNames(I)=metaCycModel.rxns(I);

end
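
The rule-building loop in the listing joins the subunits of a complex with "and" and treats alternative enzymes as isoenzymes joined with "or". The sketch below reproduces that logic on toy data so it runs without RAVEN. The variable `enzymeSubunits` and the gene names are hypothetical, and using `strjoin` for the outer "or" join is a simplification of the incremental `strcat` calls in the source.

```matlab
% Toy data (hypothetical): two enzymes catalyze the same reaction.
% Each entry lists the subunits of one enzyme; one subunit means a monomer.
enzymeSubunits = {{'geneA'}, {'geneB','geneC'}};

parts = cell(1,numel(enzymeSubunits));
for j = 1:numel(enzymeSubunits)
    subunits = enzymeSubunits{j};
    if numel(subunits) == 1
        % Single subunit: use the name as is
        parts{j} = subunits{1};
    else
        % Multiple subunits: all are required, so join with "and"
        parts{j} = ['(' strjoin(subunits,' and ') ')'];
    end
end

% Alternative enzymes are isoenzymes, so join them with "or"
grRule = strjoin(parts,' or ');
disp(grRule)   % geneA or (geneB and geneC)
```

The source wraps `' or '` in a cell before passing it to `strcat` because `strcat` strips trailing whitespace from character-array inputs but not from cell-array inputs. That is also why the listing has to unwrap a cell result afterwards.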
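
The metabolite annotation step relies on one `ismember` pattern: find which model metabolites exist in the database structure, copy the matching fields, and fall back to the ID where no name is available. The following sketch shows that pattern on toy data. The structures `model` and `db` and their contents are hypothetical stand-ins for `metaCycModel` and `metaCycMets`, and logical indexing is used in place of the `find` call in the source.

```matlab
% Toy data (hypothetical IDs and names, not read from MetaCyc files)
model.mets  = {'WATER'; 'CPD-X'; 'ATP'};
db.mets     = {'ATP'; 'WATER'};
db.metNames = {'ATP'; 'H2O'};

% Start with empty names, as the listing does when the field is missing
model.metNames = repmat({''},numel(model.mets),1);

% found(k) is true if model.mets{k} is in db.mets; loc(k) is its row there
[found,loc] = ismember(model.mets,db.mets);
model.metNames(found) = db.metNames(loc(found));

% Metabolites without a name fall back to their ID
noName = cellfun(@isempty,model.metNames);
model.metNames(noName) = model.mets(noName);

disp(model.metNames)   % H2O, CPD-X, ATP
```

The same index pair is reused in the listing for formulas, charges, InChIs, MIRIAM annotations and KEGG IDs, so the lookup is performed only once.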