Home > io > importModel.m

importModel

PURPOSE ^

importModel

SYNOPSIS ^

function model=importModel(fileName,removeExcMets,isSBML2COBRA,supressWarnings)

DESCRIPTION ^

 importModel
   Import a constraint-based model from a SBML file

   Input:
   fileName        a SBML file to import. A dialog window will open if 
                   no file name is specified.
   removeExcMets   true if exchange metabolites should be removed. This is
                   needed to be able to run simulations, but it could also
                   be done using simplifyModel at a later stage (opt,
                   default true)
   isSBML2COBRA    true if the SBML file is in the old COBRA Toolbox format
                   (SBML Level 2) (opt, default false)
   supressWarnings true if warnings regarding the model structure should
                   be supressed (opt, default false)

   Output:
   model
       id               model ID
       name      name of model contents
       annotation       additional information about model
       rxns             reaction ids
       mets             metabolite ids
       S                stoichiometric matrix
       lb               lower bounds
       ub               upper bounds
       rev              reversibility vector
       c                objective coefficients
       b                equality constraints for the metabolite equations
       comps            compartment ids
       compNames        compartment names
       compOutside      the id (as in comps) for the compartment
                        surrounding each of the compartments
       compMiriams      structure with MIRIAM information about the
                        compartments
       rxnNames         reaction description
       rxnComps         compartments for reactions
       grRules          reaction to gene rules in text form
       rxnGeneMat       reaction-to-gene mapping in sparse matrix form
       subSystems       subsystem name for each reaction
       eccodes          EC-codes for the reactions
       rxnMiriams       structure with MIRIAM information about the reactions
       rxnNotes         reaction notes
       rxnReferences    reaction references
       rxnConfidenceScores reaction confidence scores
       genes            list of all genes
       geneComps        compartments for genes
       geneMiriams      structure with MIRIAM information about the genes
       geneShortNames   gene alternative names (e.g. ERG10)
       metNames         metabolite description
       metComps         compartments for metabolites
       inchis           InChI-codes for metabolites
       metFormulas      metabolite chemical formula
       metMiriams       structure with MIRIAM information about the metabolites
       metCharges       metabolite charge
       unconstrained    true if the metabolite is an exchange metabolite

   Loads models in the COBRA Toolbox format and in the format used in
   the yeast consensus reconstruction. The resulting structure is compatible
   with COBRA Toolbox. A number of consistency checks are performed in order
   to ensure that the model is valid. Take these warnings seriously and modify the
   model structure to solve them. The RAVEN Toolbox is made to function
   only on consistent models, and the only checks performed are when the
   model is imported. You can use exportToExcelFormat, modify the model in
   Microsoft Excel and then reimport it using importExcelModel (or remake
   the SBML file using SBMLFromExcel)

   NOTE: This script requires that libSBML is installed.

   NOTE: All IDs are assumed to be named C_, M_, E_, R_ for compartments,
         metabolites, genes, and reactions. This is true for models
         generated by SBMLFromExcel and those that follow the yeast
         consensus network model formulation.

   Usage: model=importModel(fileName,removeExcMets,isSBML2COBRA,supressWarnings)

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SUBFUNCTIONS ^

SOURCE CODE ^

0001 function model=importModel(fileName,removeExcMets,isSBML2COBRA,supressWarnings)
0002 % importModel
0003 %   Import a constraint-based model from a SBML file
0004 %
0005 %   Input:
0006 %   fileName        a SBML file to import. A dialog window will open if
0007 %                   no file name is specified.
0008 %   removeExcMets   true if exchange metabolites should be removed. This is
0009 %                   needed to be able to run simulations, but it could also
0010 %                   be done using simplifyModel at a later stage (opt,
0011 %                   default true)
0012 %   isSBML2COBRA    true if the SBML file is in the old COBRA Toolbox format
0013 %                   (SBML Level 2) (opt, default false)
0014 %   supressWarnings true if warnings regarding the model structure should
0015 %                   be supressed (opt, default false)
0016 %
0017 %   Output:
0018 %   model
0019 %       id               model ID
0020 %       name      name of model contents
0021 %       annotation       additional information about model
0022 %       rxns             reaction ids
0023 %       mets             metabolite ids
0024 %       S                stoichiometric matrix
0025 %       lb               lower bounds
0026 %       ub               upper bounds
0027 %       rev              reversibility vector
0028 %       c                objective coefficients
0029 %       b                equality constraints for the metabolite equations
0030 %       comps            compartment ids
0031 %       compNames        compartment names
0032 %       compOutside      the id (as in comps) for the compartment
0033 %                        surrounding each of the compartments
0034 %       compMiriams      structure with MIRIAM information about the
0035 %                        compartments
0036 %       rxnNames         reaction description
0037 %       rxnComps         compartments for reactions
0038 %       grRules          reaction to gene rules in text form
0039 %       rxnGeneMat       reaction-to-gene mapping in sparse matrix form
0040 %       subSystems       subsystem name for each reaction
0041 %       eccodes          EC-codes for the reactions
0042 %       rxnMiriams       structure with MIRIAM information about the reactions
0043 %       rxnNotes         reaction notes
0044 %       rxnReferences    reaction references
0045 %       rxnConfidenceScores reaction confidence scores
0046 %       genes            list of all genes
0047 %       geneComps        compartments for genes
0048 %       geneMiriams      structure with MIRIAM information about the genes
0049 %       geneShortNames   gene alternative names (e.g. ERG10)
0050 %       metNames         metabolite description
0051 %       metComps         compartments for metabolites
0052 %       inchis           InChI-codes for metabolites
0053 %       metFormulas      metabolite chemical formula
0054 %       metMiriams       structure with MIRIAM information about the metabolites
0055 %       metCharges       metabolite charge
0056 %       unconstrained    true if the metabolite is an exchange metabolite
0057 %
0058 %   Loads models in the COBRA Toolbox format and in the format used in
0059 %   the yeast consensus reconstruction. The resulting structure is compatible
0060 %   with COBRA Toolbox. A number of consistency checks are performed in order
0061 %   to ensure that the model is valid. Take these warnings seriously and modify the
0062 %   model structure to solve them. The RAVEN Toolbox is made to function
0063 %   only on consistent models, and the only checks performed are when the
0064 %   model is imported. You can use exportToExcelFormat, modify the model in
0065 %   Microsoft Excel and then reimport it using importExcelModel (or remake
0066 %   the SBML file using SBMLFromExcel)
0067 %
0068 %   NOTE: This script requires that libSBML is installed.
0069 %
0070 %   NOTE: All IDs are assumed to be named C_, M_, E_, R_ for compartments,
0071 %         metabolites, genes, and reactions. This is true for models
0072 %         generated by SBMLFromExcel and those that follow the yeast
0073 %         consensus network model formulation.
0074 %
0075 %   Usage: model=importModel(fileName,removeExcMets,isSBML2COBRA,supressWarnings)
0076 if nargin<1 || isempty(fileName)
0077     [fileName, pathName] = uigetfile({'*.xml;*.sbml'}, 'Please select the model file');
0078     if fileName == 0
0079         error('You should select a model file')
0080     else
0081         fileName = fullfile(pathName,fileName);
0082     end
0083 end
0084 fileName=char(fileName);
0085 if nargin<2
0086     removeExcMets=true;
0087 end
0088 
0089 if nargin<3
0090     isSBML2COBRA=false;
0091 end
0092 
0093 if nargin<4
0094     supressWarnings=false;
0095 end
0096 
0097 if ~isfile(fileName)
0098     error('SBML file %s cannot be found',string(fileName));
0099 end
0100 
0101 %This is to match the order of the fields to those you get from importing
0102 %from Excel
0103 model=[];
0104 model.id=[];
0105 model.name=[];
0106 model.annotation=[];
0107 model.rxns={};
0108 model.mets={};
0109 model.S=[];
0110 model.lb=[];
0111 model.ub=[];
0112 model.rev=[];
0113 model.c=[];
0114 model.b=[];
0115 model.comps={};
0116 model.compNames={};
0117 model.compOutside={};
0118 model.compMiriams={};
0119 model.rxnNames={};
0120 model.rxnComps=[];
0121 model.grRules={};
0122 model.rxnGeneMat=[];
0123 model.subSystems={};
0124 model.eccodes={};
0125 model.rxnMiriams={};
0126 model.rxnNotes={};
0127 model.rxnReferences={};
0128 model.rxnConfidenceScores=[];
0129 model.genes={};
0130 model.geneComps=[];
0131 model.geneMiriams={};
0132 model.geneShortNames={};
0133 model.metNames={};
0134 model.metComps=[];
0135 model.inchis={};
0136 model.metFormulas={};
0137 model.metMiriams={};
0138 model.metCharges=[];
0139 model.unconstrained=[];
0140 
0141 %Load the model using libSBML
0142 [ravenDir,prevDir]=findRAVENroot();
0143 fileName=checkFileExistence(fileName,1);
0144 cd(fullfile(ravenDir,'software','libSBML'));
0145 modelSBML = TranslateSBML(fileName,0,0,[1 1]);
0146 cd(prevDir);
0147 
0148 if isempty(modelSBML)
0149     EM='There is a problem with the SBML file. Try using the SBML Validator at http://sbml.org/Facilities/Validator';
0150     dispEM(EM);
0151 end
0152 
0153 %Remove the preceding strings for reactions, compartments and
0154 %reactants/products in 'reaction' field. The strings for metabolites, genes
0155 %and complexes are not removed, as we will need them later to identify them
0156 %from 'species' field
0157 for i=1:numel(modelSBML.reaction)
0158     modelSBML.reaction(i).name=regexprep(modelSBML.reaction(i).name,'^R_','');
0159     modelSBML.reaction(i).id=regexprep(modelSBML.reaction(i).id,'^R_','');
0160     if isfield(modelSBML.reaction(i),'compartment')
0161         modelSBML.reaction(i).compartment=regexprep(modelSBML.reaction(i).compartment,'^C_','');
0162     end
0163     for j=1:numel(modelSBML.reaction(i).reactant)
0164         modelSBML.reaction(i).reactant(j).species=regexprep(modelSBML.reaction(i).reactant(j).species,'^M_','');
0165     end
0166     for j=1:numel(modelSBML.reaction(i).product)
0167         modelSBML.reaction(i).product(j).species=regexprep(modelSBML.reaction(i).product(j).species,'^M_','');
0168     end
0169 end
0170 
0171 %Retrieve compartment names and IDs
0172 compartmentNames=cell(numel(modelSBML.compartment),1);
0173 compartmentIDs=cell(numel(modelSBML.compartment),1);
0174 compartmentOutside=cell(numel(modelSBML.compartment),1);
0175 compartmentMiriams=cell(numel(modelSBML.compartment),1);
0176 
0177 if isfield(modelSBML.compartment,'sboTerm') && numel(unique([modelSBML.compartment.sboTerm])) == 1
0178     %If all the SBO terms are identical, don't add them to compMiriams
0179     modelSBML.compartment = rmfield(modelSBML.compartment,'sboTerm');
0180 end
0181     
0182 for i=1:numel(modelSBML.compartment)
0183     compartmentNames{i}=modelSBML.compartment(i).name;
0184     compartmentIDs{i}=regexprep(modelSBML.compartment(i).id,'^C_','');
0185     if isfield(modelSBML.compartment(i),'outside')
0186         if ~isempty(modelSBML.compartment(i).outside)
0187             compartmentOutside{i}=regexprep(modelSBML.compartment(i).outside,'^C_','');
0188         else
0189             compartmentOutside{i}='';
0190         end
0191     else
0192         compartmentOutside{i}=[];
0193     end
0194     
0195     if isfield(modelSBML.compartment(i),'annotation')
0196         compartmentMiriams{i}=parseMiriam(modelSBML.compartment(i).annotation);
0197     else
0198         compartmentMiriams{i}=[];
0199     end
0200     
0201     if isfield(modelSBML.compartment(i),'sboTerm') && ~(modelSBML.compartment(i).sboTerm==-1)
0202         compartmentMiriams{i} = addSBOtoMiriam(compartmentMiriams{i},modelSBML.compartment(i).sboTerm);
0203     end
0204 end
0205 
0206 %If there are no compartment names then use compartment id as name
0207 if all(cellfun(@isempty,compartmentNames))
0208     compartmentNames=compartmentIDs;
0209 end
0210 
0211 %Retrieve info on metabolites, genes, complexes
0212 metaboliteNames={};
0213 metaboliteIDs={};
0214 metaboliteCompartments={};
0215 metaboliteUnconstrained=[];
0216 metaboliteFormula={};
0217 metaboliteInChI={};
0218 metaboliteMiriams={};
0219 metaboliteCharges=[];
0220 
0221 geneNames={};
0222 geneIDs={};
0223 geneMiriams={};
0224 geneShortNames={};
0225 geneCompartments={};
0226 complexIDs={};
0227 complexNames={};
0228 
0229 %If the file is not a COBRA Toolbox model. According to the format
0230 %specified in the yeast consensus model both metabolites and genes are a
0231 %type of 'species'. The metabolites have names starting with 'M_' and genes
0232 %with 'E_'
0233 geneSBOs = [];
0234 metSBOs = [];
0235 %Regex of compartment names, later to be used to remove from metabolite
0236 %names if present as suffix.
0237 regexCompNames = ['\s?\[((' strjoin({modelSBML.compartment.name},')|(') '))\]$'];
0238 for i=1:numel(modelSBML.species)
0239     if ~isSBML2COBRA
0240         if length(modelSBML.species(i).id)>=2 && strcmpi(modelSBML.species(i).id(1:2),'E_')
0241             geneNames{numel(geneNames)+1,1}=modelSBML.species(i).name;
0242             
0243             %The "E_" is included in the ID. This is because it's only used
0244             %internally in this file and it makes the matching a little
0245             %smoother
0246             geneIDs{numel(geneIDs)+1,1}=modelSBML.species(i).id;
0247             geneCompartments{numel(geneCompartments)+1,1}=regexprep(modelSBML.species(i).compartment,'^C_','');
0248             
0249             %Get Miriam structure
0250             if isfield(modelSBML.species(i),'annotation')
0251                 %Get Miriam info
0252                 geneMiriam=parseMiriam(modelSBML.species(i).annotation);
0253                 geneMiriams{numel(geneMiriams)+1,1}=geneMiriam;
0254             else
0255                 geneMiriams{numel(geneMiriams)+1,1}=[];
0256             end
0257             
0258             %Protein short names (for example ERG10) are saved as SHORT
0259             %NAME: NAME in the notes-section of metabolites for SBML Level
0260             %2 and as PROTEIN_ASSOCIATION for each reaction in SBML Level 2
0261             %COBRA Toolbox format. For now only the SHORT NAME is loaded
0262             %and no mapping takes place
0263             if isfield(modelSBML.species(i),'notes')
0264                 geneShortNames{numel(geneShortNames)+1,1}=parseNote(modelSBML.species(i).notes,'SHORT NAME');
0265             else
0266                 geneShortNames{numel(geneShortNames)+1,1}='';
0267             end
0268             
0269             %Get SBO term
0270             if isfield(modelSBML.species(i),'sboTerm') && ~(modelSBML.species(i).sboTerm==-1)
0271                 geneSBOs(end+1,1) = modelSBML.species(i).sboTerm;
0272             end
0273         elseif length(modelSBML.species(i).id)>=2 && strcmpi(modelSBML.species(i).id(1:3),'Cx_')
0274             %If it's a complex keep the ID and name
0275             complexIDs=[complexIDs;modelSBML.species(i).id];
0276             complexNames=[complexNames;modelSBML.species(i).name];
0277         else
0278             %If it is not gene or complex, then it must be a metabolite
0279             metaboliteNames{numel(metaboliteNames)+1,1}=modelSBML.species(i).name;
0280             metaboliteIDs{numel(metaboliteIDs)+1,1}=regexprep(modelSBML.species(i).id,'^M_','');
0281             metaboliteCompartments{numel(metaboliteCompartments)+1,1}=regexprep(modelSBML.species(i).compartment,'^C_','');
0282             metaboliteUnconstrained(numel(metaboliteUnconstrained)+1,1)=modelSBML.species(i).boundaryCondition;
0283             
0284             %For each metabolite retrieve the formula and the InChI code if
0285             %available First add the InChI code and the formula from the
0286             %InChI. This allows for overwriting the formula by setting the
0287             %actual formula field
0288             if ~isempty(modelSBML.species(i).annotation)
0289                 %Get the formula if available
0290                 startString='>InChI=';
0291                 endString='</in:inchi>';
0292                 formStart=strfind(modelSBML.species(i).annotation,startString);
0293                 if isempty(formStart)
0294                     startString='InChI=';
0295                     endString='"/>';
0296                 end
0297                 formStart=strfind(modelSBML.species(i).annotation,startString);
0298                 if ~isempty(formStart)
0299                     formEnd=strfind(modelSBML.species(i).annotation,endString);
0300                     formEndIndex=find(formEnd>formStart, 1 );
0301                     formula=modelSBML.species(i).annotation(formStart+numel(startString):formEnd(formEndIndex)-1);
0302                     metaboliteInChI{numel(metaboliteInChI)+1,1}=formula;
0303                     
0304                     %The composition is most often present between the
0305                     %first and second "/" in the model. In some simple
0306                     %molecules, such as salts, there is no second "/". The
0307                     %formula is then assumed to be to the end of the string
0308                     compositionIndexes=strfind(formula,'/');
0309                     if numel(compositionIndexes)>1
0310                         metaboliteFormula{numel(metaboliteFormula)+1,1}=...
0311                             formula(compositionIndexes(1)+1:compositionIndexes(2)-1);
0312                     else
0313                         if numel(compositionIndexes)==1
0314                             %Probably a simple molecule which can have only
0315                             %one conformation
0316                             metaboliteFormula{numel(metaboliteFormula)+1,1}=...
0317                                 formula(compositionIndexes(1)+1:numel(formula));
0318                         else
0319                             metaboliteFormula{numel(metaboliteFormula)+1,1}='';
0320                         end
0321                     end
0322                 elseif isfield(modelSBML.species(i),'fbc_chemicalFormula')
0323                     metaboliteInChI{numel(metaboliteInChI)+1,1}='';
0324                     if ~isempty(modelSBML.species(i).fbc_chemicalFormula)
0325                         %Cannot extract InChi from formula, so remains
0326                         %empty
0327                         metaboliteFormula{numel(metaboliteFormula)+1,1}=modelSBML.species(i).fbc_chemicalFormula;
0328                     else
0329                         metaboliteFormula{numel(metaboliteFormula)+1,1}='';
0330                     end
0331                 else
0332                     metaboliteInChI{numel(metaboliteInChI)+1,1}='';
0333                     metaboliteFormula{numel(metaboliteFormula)+1,1}='';
0334                 end
0335                 
0336                 %Get Miriam info
0337                 metMiriam=parseMiriam(modelSBML.species(i).annotation);
0338                 metaboliteMiriams{numel(metaboliteMiriams)+1,1}=metMiriam;
0339             else
0340                 metaboliteInChI{numel(metaboliteInChI)+1,1}='';
0341                 if isfield(modelSBML.species(i),'notes')
0342                     metaboliteFormula{numel(metaboliteFormula)+1,1}=parseNote(modelSBML.species(i).notes,'FORMULA');
0343                 else
0344                     metaboliteFormula{numel(metaboliteFormula)+1,1}='';
0345                 end
0346                 metaboliteMiriams{numel(metaboliteMiriams)+1,1}=[];
0347             end
0348             if ~isempty(modelSBML.species(i).notes)
0349                 if ~isfield(modelSBML.species(i),'annotation')
0350                     metaboliteFormula{numel(metaboliteFormula)+1,1}=parseNote(modelSBML.species(i).notes,'FORMULA');
0351                 end
0352             elseif ~isfield(modelSBML.species(i),'annotation')
0353                 metaboliteFormula{numel(metaboliteFormula)+1,1}='';
0354             end
0355             %Get SBO term
0356             if isfield(modelSBML.species(i),'sboTerm') && ~(modelSBML.species(i).sboTerm==-1)
0357                 metSBOs(end+1,1) = modelSBML.species(i).sboTerm;
0358             end
0359         end
0360         
0361     elseif isSBML2COBRA
0362         %The metabolite names are assumed to be M_NAME_COMPOSITION or
0363         %_NAME_COMPOSITION or NAME_COMPOSITION or NAME. Regular expressions
0364         %are used that only NAME_COMPOSITION or NAME would be possible
0365         
0366         modelSBML.species(i).name=regexprep(modelSBML.species(i).name,'^M_','');
0367         modelSBML.species(i).name=regexprep(modelSBML.species(i).name,'^_','');
0368         underscoreIndex=strfind(modelSBML.species(i).name,'_');
0369         
0370         metaboliteNames{numel(metaboliteNames)+1,1}=modelSBML.species(i).name;
0371         
0372         metaboliteIDs{numel(metaboliteIDs)+1,1}=regexprep(modelSBML.species(i).id,'^M_','');
0373         metaboliteCompartments{numel(metaboliteCompartments)+1,1}=regexprep(modelSBML.species(i).compartment,'^C_','');
0374         
0375         %I think that COBRA doesn't set the boundary condition, but rather
0376         %uses name_b. Check for either
0377         metaboliteUnconstrained(numel(metaboliteUnconstrained)+1,1)=modelSBML.species(i).boundaryCondition;
0378         if strcmp(metaboliteIDs{end}(max(end-1,1):end),'_b')
0379             metaboliteUnconstrained(end)=1;
0380         end
0381         
0382         %Get the formula
0383         if max(underscoreIndex)<length(modelSBML.species(i).name)
0384             metaboliteFormula{numel(metaboliteFormula)+1,1}=modelSBML.species(i).name(max(underscoreIndex)+1:length(modelSBML.species(i).name));
0385         else
0386             metaboliteFormula{numel(metaboliteFormula)+1,1}='';
0387         end
0388         
0389         %The old COBRA version sometimes has composition information in the
0390         %notes instead
0391         if isfield(modelSBML.species(i),'notes') && ~isempty(parseNote(modelSBML.species(i).notes,'FORMULA'))
0392             metaboliteFormula{numel(metaboliteFormula)+1,1}=parseNote(modelSBML.species(i).notes,'FORMULA');
0393         end
0394         
0395         %Get Miriam info
0396         if ~isempty(modelSBML.species(i).annotation)
0397             metMiriam=parseMiriam(modelSBML.species(i).annotation);    
0398         else
0399             metMiriam=[];
0400         end
0401         metaboliteMiriams{numel(metaboliteMiriams)+1,1}=metMiriam;
0402         
0403         %Get SBO term
0404         if isfield(modelSBML.species(i),'sboTerm') && ~(modelSBML.species(i).sboTerm==-1)
0405             metSBOs(end+1,1) = modelSBML.species(i).sboTerm;
0406         end
0407     end
0408     %The following lines are executed regardless isSBML2COBRA setting
0409     if isempty(modelSBML.species(i).id) || ~strcmpi(modelSBML.species(i).id(1:2),'E_')
0410         if isempty(modelSBML.species(i).id) || ~strcmpi(modelSBML.species(i).id(1:3),'Cx_')
0411             %Remove trailing [compartment] from metabolite name if present
0412             metaboliteNames{end,1}=regexprep(metaboliteNames{end,1},regexCompNames,'');
0413             metaboliteNames{end,1}=regexprep(metaboliteNames{end,1},'^M_','');
0414             if isfield(modelSBML.species(i),'fbc_charge')
0415                 if ~isempty(modelSBML.species(i).fbc_charge) && modelSBML.species(i).isSetfbc_charge
0416                     metaboliteCharges(numel(metaboliteCharges)+1,1)=double(modelSBML.species(i).fbc_charge);
0417                 else
0418                     if isfield(modelSBML.species(i),'notes')
0419                         if strfind(modelSBML.species(i).notes,'CHARGE')
0420                             metaboliteCharges(numel(metaboliteCharges)+1,1)=str2double(parseNote(modelSBML.species(i).notes,'CHARGE'));
0421                         else
0422                             metaboliteCharges(numel(metaboliteCharges)+1,1)=NaN;
0423                         end
0424                     else
0425                         metaboliteCharges(numel(metaboliteCharges)+1,1)=NaN;
0426                     end
0427                 end
0428             elseif isfield(modelSBML.species(i),'notes')
0429                 if strfind(modelSBML.species(i).notes,'CHARGE')
0430                     metaboliteCharges(numel(metaboliteCharges)+1,1)=str2double(parseNote(modelSBML.species(i).notes,'CHARGE'));
0431                 else
0432                     metaboliteCharges(numel(metaboliteCharges)+1,1)=NaN;
0433                 end
0434             else
0435                 metaboliteCharges(numel(metaboliteCharges)+1,1)=NaN;
0436             end
0437             %Additional information from FBC format Chemical formula
0438             if isfield(modelSBML.species(i),'fbc_chemicalFormula')
0439                 if ~isempty(modelSBML.species(i).fbc_chemicalFormula)
0440                     metaboliteFormula{numel(metaboliteFormula),1}=modelSBML.species(i).fbc_chemicalFormula;
0441                 end
0442             end
0443         end
0444     end
0445 end
0446 
0447 %Add SBO terms to gene and metabolite miriam fields
0448 if numel(unique(geneSBOs)) > 1  % don't add if they're all identical
0449     for i = 1:numel(geneNames)
0450         geneMiriams{i} = addSBOtoMiriam(geneMiriams{i},geneSBOs(i));
0451     end
0452 end
0453 if numel(unique(metSBOs)) > 1
0454     for i = 1:numel(metaboliteNames)
0455         metaboliteMiriams{i} = addSBOtoMiriam(metaboliteMiriams{i},metSBOs(i));
0456     end
0457 end
0458 
0459 %Retrieve info on reactions
0460 reactionNames=cell(numel(modelSBML.reaction),1);
0461 reactionIDs=cell(numel(modelSBML.reaction),1);
0462 subsystems=cell(numel(modelSBML.reaction),1);
0463 eccodes=cell(numel(modelSBML.reaction),1);
0464 eccodes(:,:)=cellstr('');
0465 rxnconfidencescores=NaN(numel(modelSBML.reaction),1);
0466 rxnreferences=cell(numel(modelSBML.reaction),1);
0467 rxnreferences(:,:)=cellstr('');
0468 rxnnotes=cell(numel(modelSBML.reaction),1);
0469 rxnnotes(:,:)=cellstr('');
0470 grRules=cell(numel(modelSBML.reaction),1);
0471 grRules(:,:)=cellstr('');
0472 grRulesFromModifier=grRules;
0473 rxnComps=zeros(numel(modelSBML.reaction),1);
0474 rxnMiriams=cell(numel(modelSBML.reaction),1);
0475 reactionReversibility=zeros(numel(modelSBML.reaction),1);
0476 reactionUB=zeros(numel(modelSBML.reaction),1);
0477 reactionLB=zeros(numel(modelSBML.reaction),1);
0478 reactionObjective=zeros(numel(modelSBML.reaction),1);
0479 
0480 %Construct the stoichiometric matrix while the reaction info is read
0481 S=zeros(numel(metaboliteIDs),numel(modelSBML.reaction));
0482 
0483 counter=0;
0484 %If FBC, then bounds have parameter ids defined for the whole model
0485 if isfield(modelSBML,'parameter')
0486     parameter.name=cell(numel(modelSBML.parameter),1);
0487     parameter.name={modelSBML.parameter(:).id}';
0488     parameter.value={modelSBML.parameter(:).value}';
0489 end
0490 
0491 if isfield(modelSBML.reaction,'sboTerm') && numel(unique([modelSBML.reaction.sboTerm])) == 1
0492     %If all the SBO terms are identical, don't add them to rxnMiriams
0493     modelSBML.reaction = rmfield(modelSBML.reaction,'sboTerm');
0494 end
0495 
0496 for i=1:numel(modelSBML.reaction)
0497     
0498     %Check that the reaction doesn't produce a complex and nothing else. If
0499     %so, then jump to the next reaction. This is because I get the genes
0500     %for complexes from the names and not from the reactions that create
0501     %them. This only applies to the non-COBRA format
0502     if numel(modelSBML.reaction(i).product)==1
0503         if length(modelSBML.reaction(i).product(1).species)>=3
0504             if strcmp(modelSBML.reaction(i).product(1).species(1:3),'Cx_')==true
0505                 continue;
0506             end
0507         end
0508     end
0509     
0510     %It didn't look like a gene complex-forming reaction
0511     counter=counter+1;
0512     
0513     reactionNames{counter}=modelSBML.reaction(i).name;
0514     
0515     reactionIDs{counter}=modelSBML.reaction(i).id;
0516     reactionReversibility(counter)=modelSBML.reaction(i).reversible;
0517     
0518     %If model is FBC, first get parameter of bound and then replace it with
0519     %the correct value. Probably faster with replace(), but this was only
0520     %introduced in Matlab R2016b
0521     if isfield(modelSBML.reaction(i),'fbc_lowerFluxBound')
0522         lb=modelSBML.reaction(i).fbc_lowerFluxBound;
0523         ub=modelSBML.reaction(i).fbc_upperFluxBound;
0524         for n=1:numel(parameter.value)
0525             lb=regexprep(lb,parameter.name(n),num2str(parameter.value{n}));
0526             ub=regexprep(ub,parameter.name(n),num2str(parameter.value{n}));
0527         end
0528         if isempty(lb)
0529             lb='-Inf';
0530         end
0531         if isempty(ub)
0532             ub='Inf';
0533         end
0534         reactionLB(counter)=str2num(lb);
0535         reactionUB(counter)=str2num(ub);
0536         %The order of these parameters should not be hard coded
0537     elseif isfield(modelSBML.reaction(i).kineticLaw,'parameter')
0538         reactionLB(counter)=modelSBML.reaction(i).kineticLaw.parameter(1).value;
0539         reactionUB(counter)=modelSBML.reaction(i).kineticLaw.parameter(2).value;
0540         reactionObjective(counter)=modelSBML.reaction(i).kineticLaw.parameter(3).value;
0541     else
0542         if reactionReversibility(counter)==true
0543             reactionLB(counter)=-inf;
0544         else
0545             reactionLB(counter)=0;
0546         end
0547         reactionUB(counter)=inf;
0548         reactionObjective(counter)=0;
0549     end
0550     
0551     %Find the associated gene if available
0552     %If FBC, get gene association data from corresponding fields
0553     if isfield(modelSBML.reaction(i),'fbc_geneProductAssociation')
0554         if ~isempty(modelSBML.reaction(i).fbc_geneProductAssociation) && ~isempty(modelSBML.reaction(i).fbc_geneProductAssociation.fbc_association)
0555             grRules{counter}=modelSBML.reaction(i).fbc_geneProductAssociation.fbc_association.fbc_association;
0556         end
0557     elseif isfield(modelSBML.reaction(i),'notes')
0558         %This section was previously executed only if isSBML2COBRA is true. Now
0559         %it will be executed, if 'GENE_ASSOCIATION' is found in
0560         %modelSBML.reaction(i).notes
0561         if strfind(modelSBML.reaction(i).notes,'GENE_ASSOCIATION')
0562             geneAssociation=parseNote(modelSBML.reaction(i).notes,'GENE_ASSOCIATION');
0563         elseif strfind(modelSBML.reaction(i).notes,'GENE ASSOCIATION')
0564             geneAssociation=parseNote(modelSBML.reaction(i).notes,'GENE ASSOCIATION');
0565         else
0566             geneAssociation='';
0567         end
0568         if ~isempty(geneAssociation)
0569             %This adds the grRules. The gene list and rxnGeneMat are created
0570             %later
0571             grRules{counter}=geneAssociation;
0572         end
0573     end
0574     if isempty(grRules{counter}) && ~isempty(modelSBML.reaction(i).modifier)
0575         rules='';
0576         for j=1:numel(modelSBML.reaction(i).modifier)
0577             modifier=modelSBML.reaction(i).modifier(j).species;
0578             if ~isempty(modifier)
0579                 if strcmpi(modifier(1:2),'E_')
0580                     index=find(strcmp(modifier,geneIDs));
0581                     %This should be unique and in the geneIDs list,
0582                     %otherwise something is wrong
0583                     if numel(index)~=1
0584                         EM=['Could not get the gene association data from reaction ' reactionIDs{i}];
0585                         dispEM(EM);
0586                     end
0587                     if ~isempty(rules)
0588                         rules=[rules ' or (' geneNames{index} ')'];
0589                     else
0590                         rules=['(' geneNames{index} ')'];
0591                     end
0592                 elseif strcmp(modifier(1:2),'s_')
0593                     index=find(strcmp(modifier,metaboliteIDs));
0594                     %This should be unique and in the geneIDs list,
0595                     %otherwise something is wrong
0596                     if numel(index)~=1
0597                         EM=['Could not get the gene association data from reaction ' reactionIDs{i}];
0598                         dispEM(EM);
0599                     end
0600                     if ~isempty(rules)
0601                         rules=[rules ' or (' metaboliteIDs{index} ')'];
0602                     else
0603                         rules=['(' metaboliteIDs{index} ')'];
0604                     end
0605                 else
0606                     %It seems to be a complex. Add the corresponding
0607                     %genes from the name of the complex (not the
0608                     %reaction that creates it)
0609                     index=find(strcmp(modifier,complexIDs));
0610                     if numel(index)==1
0611                         if ~isempty(rules)
0612                             rules=[rules ' or (' strrep(complexNames{index},':',' and ') ')'];
0613                         else
0614                             rules=['(' strrep(complexNames{index},':',' and ') ')'];
0615                         end
0616                     else
0617                         %Could not find a complex
0618                         EM=['Could not get the gene association data from reaction ' reactionIDs{i}];
0619                         dispEM(EM);
0620                     end
0621                 end
0622             end
0623         end
0624         grRules{counter}=rules;
0625         grRulesFromModifier{counter}=rules;%Backup copy for grRules, useful to parse Yeast 7.6
0626     end
0627     
0628     %Add reaction compartment
0629     if isfield(modelSBML.reaction(i),'compartment')
0630         if ~isempty(modelSBML.reaction(i).compartment)
0631             rxnComp=modelSBML.reaction(i).compartment;
0632         else
0633             rxnComp='';
0634         end
0635     elseif isfield(modelSBML.reaction(i),'notes')
0636         rxnComp=parseNote(modelSBML.reaction(i).notes,'COMPARTMENT');
0637     end
0638     if ~isempty(rxnComp)
0639         %Find it in the compartment list
0640         [~, J]=ismember(rxnComp,compartmentIDs);
0641         rxnComps(counter)=J;
0642     end
0643     
0644     %Get other Miriam fields. This may include for example database indexes
0645     %to organism-specific databases. EC-codes are supported by the COBRA
0646     %Toolbox format and are therefore loaded separately
0647     if isSBML2COBRA==false
0648         miriamStruct=parseMiriam(modelSBML.reaction(i).annotation);
0649         rxnMiriams{counter}=miriamStruct;
0650         if isfield(modelSBML.reaction(i),'notes')
0651             subsystems{counter,1}=cellstr(parseNote(modelSBML.reaction(i).notes,'SUBSYSTEM'));
0652             subsystems{counter,1}(cellfun('isempty',subsystems{counter,1})) = [];
0653             if strfind(modelSBML.reaction(i).notes,'Confidence Level')
0654                 rxnconfidencescores(counter)=str2num(parseNote(modelSBML.reaction(i).notes,'Confidence Level'));
0655             end
0656             rxnreferences{counter,1}=parseNote(modelSBML.reaction(i).notes,'AUTHORS');
0657             rxnnotes{counter,1}=parseNote(modelSBML.reaction(i).notes,'NOTES');
0658         end
0659     end
0660     
0661     %Get SBO terms
0662     if isfield(modelSBML.reaction(i),'sboTerm') && ~(modelSBML.reaction(i).sboTerm==-1)
0663         rxnMiriams{counter} = addSBOtoMiriam(rxnMiriams{counter}, modelSBML.reaction(i).sboTerm);
0664     end
0665     
0666     %Get ec-codes
0667     eccode='';
0668     if ~isempty(modelSBML.reaction(i).annotation)
0669         if strfind(modelSBML.reaction(i).annotation,'urn:miriam:ec-code')
0670             eccode=parseAnnotation(modelSBML.reaction(i).annotation,'urn:miriam:',':','ec-code');
0671         elseif strfind(modelSBML.reaction(i).annotation,'http://identifiers.org/ec-code')
0672             eccode=parseAnnotation(modelSBML.reaction(i).annotation,'http://identifiers.org/','/','ec-code');
0673         elseif strfind(modelSBML.reaction(i).annotation,'https://identifiers.org/ec-code')
0674             eccode=parseAnnotation(modelSBML.reaction(i).annotation,'https://identifiers.org/','/','ec-code');
0675         end
0676     elseif isfield(modelSBML.reaction(i),'notes')
0677         if strfind(modelSBML.reaction(i).notes,'EC Number')
0678             eccode=[eccode parseNote(modelSBML.reaction(i).notes,'EC Number')];
0679         elseif strfind(modelSBML.reaction(i).notes,'PROTEIN_CLASS')
0680             eccode=[eccode parseNote(modelSBML.reaction(i).notes,'PROTEIN_CLASS')];
0681         end
0682     end
0683     eccodes{counter}=eccode;
0684     
0685     %Add all reactants
0686     for j=1:numel(modelSBML.reaction(i).reactant)
0687         %Get the index of the metabolite in metaboliteIDs. External
0688         %metabolites will be removed at a later stage
0689         metIndex=find(strcmp(modelSBML.reaction(i).reactant(j).species,metaboliteIDs),1);
0690         if isempty(metIndex)
0691             EM=['Could not find metabolite ' modelSBML.reaction(i).reactant(j).species ' in reaction ' reactionIDs{counter}];
0692             dispEM(EM);
0693         end
0694         S(metIndex,counter)=S(metIndex,counter)+modelSBML.reaction(i).reactant(j).stoichiometry*-1;
0695     end
0696     
0697     %Add all products
0698     for j=1:numel(modelSBML.reaction(i).product)
0699         %Get the index of the metabolite in metaboliteIDs.
0700         metIndex=find(strcmp(modelSBML.reaction(i).product(j).species,metaboliteIDs),1);
0701         if isempty(metIndex)
0702             EM=['Could not find metabolite ' modelSBML.reaction(i).product(j).species ' in reaction ' reactionIDs{counter}];
0703             dispEM(EM);
0704         end
0705         S(metIndex,counter)=S(metIndex,counter)+modelSBML.reaction(i).product(j).stoichiometry;
0706     end
0707 end
0708 
0709 %if FBC, objective function is separately defined. Multiple objective
0710 %functions can be defined, one is set as active
0711 if isfield(modelSBML, 'fbc_activeObjective')
0712     obj=modelSBML.fbc_activeObjective;
0713     for i=1:numel(modelSBML.fbc_objective)
0714         if strcmp(obj,modelSBML.fbc_objective(i).fbc_id)
0715             if ~isempty(modelSBML.fbc_objective(i).fbc_fluxObjective)
0716                 rxn=modelSBML.fbc_objective(i).fbc_fluxObjective.fbc_reaction;
0717                 rxn=regexprep(rxn,'^R_','');
0718                 idx=find(ismember(reactionIDs,rxn));
0719                 reactionObjective(idx)=modelSBML.fbc_objective(i).fbc_fluxObjective.fbc_coefficient;
0720             end
0721         end
0722     end
0723 end
0724 
0725 %subSystems can be stored as groups instead of in annotations
0726 if isfield(modelSBML,'groups_group')
0727     for i=1:numel(modelSBML.groups_group)
0728         groupreactions={modelSBML.groups_group(i).groups_member(:).groups_idRef};
0729         groupreactions=regexprep(groupreactions,'^R_','');
0730         [~, idx] = ismember(groupreactions, reactionIDs);
0731         if any(idx)
0732             for j=1:numel(idx)
0733                 if isempty(subsystems{idx(j)}) % First subsystem
0734                     subsystems{idx(j)} = {modelSBML.groups_group(i).groups_name};
0735                 else % Consecutive subsystems: concatenate
0736                     subsystems{idx(j)} = horzcat(subsystems{idx(j)}, modelSBML.groups_group(i).groups_name);
0737                 end
0738             end
0739         end
0740     end
0741 end
0742 
0743 %Shrink the structures if complex-forming reactions had to be skipped
0744 reactionNames=reactionNames(1:counter);
0745 reactionIDs=reactionIDs(1:counter);
0746 subsystems=subsystems(1:counter);
0747 eccodes=eccodes(1:counter);
0748 rxnconfidencescores=rxnconfidencescores(1:counter);
0749 rxnreferences=rxnreferences(1:counter);
0750 rxnnotes=rxnnotes(1:counter);
0751 grRules=grRules(1:counter);
0752 rxnMiriams=rxnMiriams(1:counter);
0753 reactionReversibility=reactionReversibility(1:counter);
0754 reactionUB=reactionUB(1:counter);
0755 reactionLB=reactionLB(1:counter);
0756 reactionObjective=reactionObjective(1:counter);
0757 S=S(:,1:counter);
0758 
0759 model.name=modelSBML.name;
0760 model.id=regexprep(modelSBML.id,'^M_',''); % COBRA adds M_ prefix
0761 model.rxns=reactionIDs;
0762 model.mets=metaboliteIDs;
0763 model.S=sparse(S);
0764 model.lb=reactionLB;
0765 model.ub=reactionUB;
0766 model.rev=reactionReversibility;
0767 model.c=reactionObjective;
0768 model.b=zeros(numel(metaboliteIDs),1);
0769 model.comps=compartmentIDs;
0770 model.compNames=compartmentNames;
0771 model.rxnConfidenceScores=rxnconfidencescores;
0772 model.rxnReferences=rxnreferences;
0773 model.rxnNotes=rxnnotes;
0774 
0775 %Load annotation if available. If there are several authors, only the first
0776 %author credentials are imported
0777 if isfield(modelSBML,'annotation')
0778     endString='</';
0779     I=strfind(modelSBML.annotation,endString);
0780     J=strfind(modelSBML.annotation,'<vCard:Family>');
0781     if any(J)
0782         model.annotation.familyName=modelSBML.annotation(J(1)+14:I(find(I>J(1),1))-1);
0783     end
0784     J=strfind(modelSBML.annotation,'<vCard:Given>');
0785     if any(J)
0786         model.annotation.givenName=modelSBML.annotation(J(1)+13:I(find(I>J(1),1))-1);
0787     end
0788     J=strfind(modelSBML.annotation,'<vCard:EMAIL>');
0789     if any(J)
0790         model.annotation.email=modelSBML.annotation(J(1)+13:I(find(I>J(1),1))-1);
0791     end
0792     J=strfind(modelSBML.annotation,'<vCard:Orgname>');
0793     if any(J)
0794         model.annotation.organization=modelSBML.annotation(J(1)+15:I(find(I>J(1),1))-1);
0795     end
0796     endString='"/>';
0797     I=strfind(modelSBML.annotation,endString);
0798     if strfind(modelSBML.annotation,'"urn:miriam:')
0799         J=strfind(modelSBML.annotation,'"urn:miriam:');
0800         if any(J)
0801             model.annotation.taxonomy=modelSBML.annotation(J+12:I(find(I>J,1))-1);
0802         end
0803     else
0804         J=strfind(modelSBML.annotation,'"http://identifiers.org/');
0805         if any(J)
0806             model.annotation.taxonomy=modelSBML.annotation(J+24:I(find(I>J,1))-1);
0807         else
0808             J=strfind(modelSBML.annotation,'"https://identifiers.org/');
0809             if any(J)
0810                 model.annotation.taxonomy=modelSBML.annotation(J+25:I(find(I>J,1))-1);
0811             end
0812         end
0813     end
0814 end
0815 if isfield(modelSBML,'notes')
0816     startString=strfind(modelSBML.notes,'xhtml">');
0817     endString=strfind(modelSBML.notes,'</body>');
0818     if any(startString) && any(endString)
0819         model.annotation.note=modelSBML.notes(startString+7:endString-1);
0820         model.annotation.note=regexprep(model.annotation.note,'<p>|</p>','');
0821         model.annotation.note=strtrim(model.annotation.note);
0822     end
0823 end
0824 
0825 if any(~cellfun(@isempty,compartmentOutside))
0826     model.compOutside=compartmentOutside;
0827 end
0828 
0829 model.rxnNames=reactionNames;
0830 model.metNames=metaboliteNames;
0831 
0832 %Match the compartments for metabolites
0833 [~, J]=ismember(metaboliteCompartments,model.comps);
0834 model.metComps=J;
0835 
0836 %If any genes have been loaded (only for the new format)
0837 if ~isempty(geneNames)
0838     %In some rare cases geneNames may not necessarily be used in grRules.
0839     %That is true for Yeast 7.6. It's therefore important to change gene
0840     %systematic names to geneIDs in sophisticated way. Gene systematic
0841     %names are not unique, since exactly the same name may be in different
0842     %compartments
0843     if all(cellfun(@isempty,strfind(grRules,geneNames{1})))
0844         geneShortNames=geneNames;
0845         %geneShortNames contain compartments as well, so these are removed
0846         geneShortNames=regexprep(geneShortNames,' \[.+$','');
0847         %grRules obtained from modifier fields contain geneNames. These are
0848         %changed into geneIDs. grRulesFromModifier is a good way to have
0849         %geneIDs and rxns association when it's important to resolve
0850         %systematic name ambiguities
0851         grRulesFromModifier=regexprep(regexprep(grRulesFromModifier,'\[|\]','_'),regexprep(geneNames,'\[|\]','_'),geneIDs);
0852         grRules=regexprep(regexprep(grRules,'\[|\]','_'),regexprep(geneNames,'\[|\]','_'),geneIDs);
0853         
0854         %Yeast 7.6 contains several metabolites, which were used in gene
0855         %associations. For that reason, the list of species ID is created
0856         %and we then check whether any of them have kegg.genes annotation
0857         %thereby obtaining systematic gene names
0858         geneShortNames=vertcat(geneShortNames,metaboliteNames);
0859         geneIDs=vertcat(geneIDs,metaboliteIDs);
0860         geneSystNames=extractMiriam(vertcat(geneMiriams,metaboliteMiriams),'kegg.genes');
0861         geneCompartments=vertcat(geneCompartments,metaboliteCompartments);
0862         geneMiriams=vertcat(geneMiriams,metaboliteMiriams);
0863         
0864         %Now we retain information for only these entries, which have
0865         %kegg.genes annotation
0866         geneShortNames=geneShortNames(~cellfun('isempty',geneSystNames));
0867         geneIDs=geneIDs(~cellfun('isempty',geneSystNames));
0868         geneSystNames=geneSystNames(~cellfun('isempty',geneSystNames));
0869         geneCompartments=geneCompartments(~cellfun('isempty',geneSystNames));
0870         geneMiriams=geneMiriams(~cellfun('isempty',geneSystNames));
0871         %Now we reorder geneIDs and geneSystNames by geneSystNames string
0872         %length
0873         geneNames=geneIDs;%Backuping geneIDs, since we need unsorted order for later
0874         [~, Indx] = sort(cellfun('size', geneSystNames, 2), 'descend');
0875         geneIDs = geneIDs(Indx);
0876         geneSystNames = geneSystNames(Indx);
0877         for i=1:numel(geneSystNames)
0878             for j=1:numel(grRules)
0879                 if strfind(grRules{j},geneSystNames{i})
0880                     if ~isempty(grRules{j})
0881                         if sum(ismember(geneSystNames,geneSystNames{i}))==1
0882                             grRules{j}=regexprep(grRules{j},geneSystNames{i},geneIDs{i});
0883                         elseif sum(ismember(geneSystNames,geneSystNames{i}))>1
0884                             counter=0;
0885                             ovrlpIDs=geneIDs(ismember(geneSystNames,geneSystNames{i}));
0886                             for k=1:numel(ovrlpIDs)
0887                                 if strfind(grRulesFromModifier{j},ovrlpIDs{k})
0888                                     counter=counter+1;
0889                                     grRules{j}=regexprep(grRules{j},geneSystNames{i},ovrlpIDs{k});
0890                                 end
0891                                 if counter>1
0892                                     EM=['Gene association is ambiguous for reaction ' modelSBML.reaction(j).id];
0893                                     dispEM(EM);
0894                                 end
0895                             end
0896                         end
0897                     end
0898                 end
0899             end
0900         end
0901     end
0902     model.genes=geneNames;
0903     model.grRules=grRules;
0904     [grRules,rxnGeneMat] = standardizeGrRules(model,true);
0905     model.grRules = grRules;
0906     model.rxnGeneMat = rxnGeneMat;
0907     
0908     %Match the compartments for genes
0909     [~, J]=ismember(geneCompartments,model.comps);
0910     model.geneComps=J;
0911 else
0912     if ~all(cellfun(@isempty,grRules))
0913         %If fbc_geneProduct exists, follow the specified gene order, such
0914         %that matching geneShortNames in function below will work
0915         if isfield(modelSBML,'fbc_geneProduct')
0916             genes={modelSBML.fbc_geneProduct.fbc_id};
0917             
0918             %Get gene Miriams if they were not retrieved above (this occurs
0919             %when genes are stored as fbc_geneProduct instead of species)
0920             if isempty(geneMiriams)
0921                 geneMiriams = cell(numel(genes),1);
0922                 if isfield(modelSBML.fbc_geneProduct,'sboTerm') && numel(unique([modelSBML.fbc_geneProduct.sboTerm])) == 1
0923                     %If all the SBO terms are identical, don't add them to geneMiriams
0924                     modelSBML.fbc_geneProduct = rmfield(modelSBML.fbc_geneProduct,'sboTerm');
0925                 end
0926                 for i = 1:numel(genes)
0927                     geneMiriams{i}=parseMiriam(modelSBML.fbc_geneProduct(i).annotation);
0928                     if isfield(modelSBML.fbc_geneProduct(i),'sboTerm') && ~(modelSBML.fbc_geneProduct(i).sboTerm==-1)
0929                         geneMiriams{i} = addSBOtoMiriam(geneMiriams{i},modelSBML.fbc_geneProduct(i).sboTerm);
0930                     end
0931                 end
0932             end
0933         else
0934             genes=getGeneList(grRules);
0935         end
0936         if strcmpi(genes{1}(1:2),'G_')
0937             genes=regexprep(genes,'^G_','');
0938             grRules=regexprep(grRules,'^G_','');
0939             grRules=regexprep(grRules,'\(G_','(');
0940             grRules=regexprep(grRules,' G_',' ');
0941         end
0942         model.genes=genes;
0943         model.grRules=grRules;
0944         [grRules,rxnGeneMat] = standardizeGrRules(model,true);
0945         model.grRules = grRules;
0946         model.rxnGeneMat = rxnGeneMat;
0947     end
0948 end
0949 
0950 if all(cellfun(@isempty,geneShortNames))
0951     if isfield(modelSBML,'fbc_geneProduct')
0952         for i=1:numel(genes)
0953             if ~isempty(modelSBML.fbc_geneProduct(i).fbc_label)
0954                 geneShortNames{i,1}=modelSBML.fbc_geneProduct(i).fbc_label;
0955             elseif ~isempty(modelSBML.fbc_geneProduct(i).fbc_name)
0956                 geneShortNames{i,1}=modelSBML.fbc_geneProduct(i).fbc_name;
0957             else
0958                 geneShortNames{i,1}='';
0959             end
0960         end
0961     end
0962 end
0963 
0964 %If any InChIs have been loaded
0965 if any(~cellfun(@isempty,metaboliteInChI))
0966     model.inchis=metaboliteInChI;
0967 end
0968 
0969 %If any formulas have been loaded
0970 if any(~cellfun(@isempty,metaboliteFormula))
0971     model.metFormulas=metaboliteFormula;
0972 end
0973 
0974 %If any charges have been loaded
0975 if ~isempty(metaboliteCharges)
0976     model.metCharges=metaboliteCharges;
0977 end
0978 
0979 %If any gene short names have been loaded
0980 if any(~cellfun(@isempty,geneShortNames))
0981     model.geneShortNames=geneShortNames;
0982 end
0983 
0984 %If any Miriam strings for compartments have been loaded
0985 if any(~cellfun(@isempty,compartmentMiriams))
0986     model.compMiriams=compartmentMiriams;
0987 end
0988 
0989 %If any Miriam strings for metabolites have been loaded
0990 if any(~cellfun(@isempty,metaboliteMiriams))
0991     model.metMiriams=metaboliteMiriams;
0992 end
0993 
0994 %If any subsystems have been loaded
0995 if any(~cellfun(@isempty,subsystems))
0996     model.subSystems=subsystems;
0997 end
0998 if any(rxnComps)
0999     if all(rxnComps)
1000         model.rxnComps=rxnComps;
1001     else
1002         if supressWarnings==false
1003             EM='The compartments for the following reactions could not be matched. Ignoring reaction compartment information';
1004             dispEM(EM,false,model.rxns(rxnComps==0));
1005         end
1006     end
1007 end
1008 
1009 %If any ec-codes have been loaded
1010 if any(~cellfun(@isempty,eccodes))
1011     model.eccodes=eccodes;
1012 end
1013 
1014 %If any Miriam strings for reactions have been loaded
1015 if any(~cellfun(@isempty,rxnMiriams))
1016     model.rxnMiriams=rxnMiriams;
1017 end
1018 
1019 %If any Miriam strings for genes have been loaded
1020 if any(~cellfun(@isempty,geneMiriams))
1021     model.geneMiriams=geneMiriams;
1022 end
1023 
1024 model.unconstrained=metaboliteUnconstrained;
1025 
1026 %Convert SBML IDs back into their original strings. Here we are using part
1027 %from convertSBMLID, originating from the COBRA Toolbox
1028 model.rxns=regexprep(model.rxns,'__([0-9]+)__','${char(str2num($1))}');
1029 model.mets=regexprep(model.mets,'__([0-9]+)__','${char(str2num($1))}');
1030 model.comps=regexprep(model.comps,'__([0-9]+)__','${char(str2num($1))}');
1031 model.grRules=regexprep(model.grRules,'__([0-9]+)__','${char(str2num($1))}');
1032 model.genes=regexprep(model.genes,'__([0-9]+)__','${char(str2num($1))}');
1033 model.id=regexprep(model.id,'__([0-9]+)__','${char(str2num($1))}');
1034 
1035 %Remove unused fields
1036 if isempty(model.annotation)
1037     model=rmfield(model,'annotation');
1038 end
1039 if isempty(model.compOutside)
1040     model=rmfield(model,'compOutside');
1041 end
1042 if isempty(model.compMiriams)
1043     model=rmfield(model,'compMiriams');
1044 end
1045 if isempty(model.rxnComps)
1046     model=rmfield(model,'rxnComps');
1047 end
1048 if isempty(model.grRules)
1049     model=rmfield(model,'grRules');
1050 end
1051 if isempty(model.rxnGeneMat)
1052     model=rmfield(model,'rxnGeneMat');
1053 end
1054 if isempty(model.subSystems)
1055     model=rmfield(model,'subSystems');
1056 else
1057     model.subSystems(cellfun(@isempty,subsystems))={{''}};
1058 end
1059 if isempty(model.eccodes)
1060     model=rmfield(model,'eccodes');
1061 end
1062 if isempty(model.rxnMiriams)
1063     model=rmfield(model,'rxnMiriams');
1064 end
1065 if cellfun(@isempty,model.rxnNotes)
1066     model=rmfield(model,'rxnNotes');
1067 end
1068 if cellfun(@isempty,model.rxnReferences)
1069     model=rmfield(model,'rxnReferences');
1070 end
1071 if isempty(model.rxnConfidenceScores) || all(isnan(model.rxnConfidenceScores))
1072     model=rmfield(model,'rxnConfidenceScores');
1073 end
1074 if isempty(model.genes)
1075     model=rmfield(model,'genes');
1076 elseif isrow(model.genes)
1077     model.genes=transpose(model.genes);
1078 end
1079 if isempty(model.geneComps)
1080     model=rmfield(model,'geneComps');
1081 end
1082 if isempty(model.geneMiriams)
1083     model=rmfield(model,'geneMiriams');
1084 end
1085 if isempty(model.geneShortNames)
1086     model=rmfield(model,'geneShortNames');
1087 end
1088 if isempty(model.inchis)
1089     model=rmfield(model,'inchis');
1090 end
1091 if isempty(model.metFormulas)
1092     model=rmfield(model,'metFormulas');
1093 end
1094 if isempty(model.metMiriams)
1095     model=rmfield(model,'metMiriams');
1096 end
1097 if ~any(model.metCharges)
1098     model=rmfield(model,'metCharges');
1099 end
1100 
1101 %This just removes the grRules if no genes have been loaded
1102 if ~isfield(model,'genes') && isfield(model,'grRules')
1103     model=rmfield(model,'grRules');
1104 end
1105 
1106 %Print warnings about bad structure
1107 if supressWarnings==false
1108     checkModelStruct(model,false);
1109 end
1110 
1111 if removeExcMets==true
1112     model=simplifyModel(model);
1113 end
1114 end
1115 
1116 function matchGenes=getGeneList(grRules)
1117 %Constructs the list of unique genes from grRules
1118 
1119 %Assumes that everything that isn't a paranthesis, " AND " or " or " is a
1120 %gene name
1121 genes=strrep(grRules,'(','');
1122 genes=strrep(genes,')','');
1123 genes=strrep(genes,' or ',' ');
1124 genes=strrep(genes,' and ',' ');
1125 genes=strrep(genes,' OR ',' ');
1126 genes=strrep(genes,' AND ',' ');
1127 genes=regexp(genes,' ','split');
1128 
1129 allNames={};
1130 for i=1:numel(genes)
1131     allNames=[allNames genes{i}];
1132 end
1133 matchGenes=unique(allNames)';
1134 
1135 %Remove the empty element if present
1136 if isempty(matchGenes{1})
1137     matchGenes(1)=[];
1138 end
1139 end
1140 
1141 function fieldContent=parseNote(searchString,fieldName)
1142 %The function obtains the particular information from 'notes' field, using
1143 %fieldName as the dummy string
1144 
1145 fieldContent='';
1146 
1147 if strfind(searchString,fieldName)
1148     [~,targetString] = regexp(searchString,['<p>' fieldName '.*?</p>'],'tokens','match');
1149     targetString=regexprep(targetString,'<p>|</p>','');
1150     targetString=regexprep(targetString,[fieldName, ':'],'');
1151     for i=1:numel(targetString)
1152         fieldContent=[fieldContent ';' strtrim(targetString{1,i})];
1153     end
1154     fieldContent=regexprep(fieldContent,'^;|;$','');
1155 else
1156     fieldContent='';
1157 end
1158 end
1159 
1160 function fieldContent=parseAnnotation(searchString,startString,midString,fieldName)
1161 
1162 fieldContent='';
1163 
1164 %Removing whitespace characters from the ending strings, which may occur in
1165 %several cases
1166 searchString=regexprep(searchString,'" />','"/>');
1167 [~,targetString] = regexp(searchString,['<rdf:li rdf:resource="' startString fieldName midString '.*?"/>'],'tokens','match');
1168 targetString=regexprep(targetString,'<rdf:li rdf:resource="|"/>','');
1169 targetString=regexprep(targetString,startString,'');
1170 targetString=regexprep(targetString,[fieldName midString],'');
1171 
1172 for i=1:numel(targetString)
1173     fieldContent=[fieldContent ';' strtrim(targetString{1,i})];
1174 end
1175 
1176 fieldContent=regexprep(fieldContent,'^;|;$','');
1177 end
1178 
1179 function miriamStruct=parseMiriam(searchString)
1180 %Generates miriam structure from annotation field
1181 
1182 %Finding whether miriams are written in the old or the new way
1183 if strfind(searchString,'urn:miriam:')
1184     startString='urn:miriam:';
1185     midString=':';
1186 elseif strfind(searchString,'http://identifiers.org/')
1187     startString='http://identifiers.org/';
1188     midString='/';
1189 elseif strfind(searchString,'https://identifiers.org/')
1190     startString='https://identifiers.org/';
1191     midString='/';
1192 else
1193     miriamStruct=[];
1194     return;
1195 end
1196 
1197 miriamStruct=[];
1198 
1199 searchString=regexprep(searchString,'" />','"/>');
1200 [~,targetString] = regexp(searchString,'<rdf:li rdf:resource=".*?"/>','tokens','match');
1201 targetString=regexprep(targetString,'<rdf:li rdf:resource="|"/>','');
1202 targetString=regexprep(targetString,startString,'');
1203 targetString=regexprep(targetString,midString,'/','once');
1204 
1205 counter=0;
1206 for i=1:numel(targetString)
1207     if isempty(regexp(targetString{1,i},'inchi|ec-code', 'once'))
1208         counter=counter+1;
1209         miriamStruct.name{counter,1} = regexprep(targetString{1,i},'/.+','','once');
1210         miriamStruct.value{counter,1} = regexprep(targetString{1,i},[miriamStruct.name{counter,1} '/'],'','once');
1211         miriamStruct.name{counter,1} = regexprep(miriamStruct.name{counter,1},'^obo\.','');
1212     end
1213 end
1214 end
1215 
1216 function miriam = addSBOtoMiriam(miriam,sboTerm)
1217 %Appends SBO term to miriam structure
1218 
1219 sboTerm = {['SBO:' sprintf('%07u',sboTerm)]};  % convert to proper format
1220 if isempty(miriam)
1221     miriam.name = {'sbo'};
1222     miriam.value = sboTerm;
1223 elseif any(strcmp('sbo',miriam.name))
1224     currSbo = strcmp('sbo',miriam.name);
1225     miriam.value(currSbo) = sboTerm;
1226 else
1227     miriam.name(end+1) = {'sbo'};
1228     miriam.value(end+1) = sboTerm;
1229 end
1230 end

Generated by m2html © 2005