Home > io > importModel.m

importModel

PURPOSE ^

importModel

SYNOPSIS ^

function model=importModel(fileName,removeExcMets,removePrefix,supressWarnings)

DESCRIPTION ^

 importModel
   Import a constraint-based model from an SBML file.

 Input:
   fileName        a SBML file to import. A dialog window will open if
                   no file name is specified.
   removeExcMets   true if exchange metabolites should be removed. This is
                   needed to be able to run simulations, but it could also
                   be done using simplifyModel at a later stage (optional,
                   default true)
   removePrefix    true if identifier prefixes should be removed when
                   loading the model: G_ for genes, R_ for reactions,
                   M_ for metabolites, and C_ for compartments. These are
                   only removed if all identifiers of a certain type
                   contain the prefix. (optional, default true)
   supressWarnings true if warnings regarding the model structure should
                   be supressed (optional, default false)

 Output:
   model
       id               model ID
       name             name of model contents
       annotation       additional information about model
       rxns             reaction ids
       mets             metabolite ids
       S                stoichiometric matrix
       lb               lower bounds
       ub               upper bounds
       rev              reversibility vector
       c                objective coefficients
       b                equality constraints for the metabolite equations
       comps            compartment ids
       compNames        compartment names
       compOutside      the id (as in comps) for the compartment
                        surrounding each of the compartments
       compMiriams      structure with MIRIAM information about the
                        compartments
       rxnNames         reaction description
       rxnComps         compartments for reactions
       grRules          reaction to gene rules in text form
       rxnGeneMat       reaction-to-gene mapping in sparse matrix form
       subSystems       subsystem name for each reaction
       eccodes          EC-codes for the reactions
       rxnMiriams       structure with MIRIAM information about the reactions
       rxnNotes         reaction notes
       rxnReferences    reaction references
       rxnConfidenceScores reaction confidence scores
       genes            list of all genes
       geneComps        compartments for genes
       geneMiriams      structure with MIRIAM information about the genes
       geneShortNames   gene alternative names (e.g. ERG10)
       proteins         protein associated to each gene
       metNames         metabolite description
       metComps         compartments for metabolites
       inchis           InChI-codes for metabolites
       metFormulas      metabolite chemical formula
       metMiriams       structure with MIRIAM information about the metabolites
       metCharges       metabolite charge
       unconstrained    true if the metabolite is an exchange metabolite

 Note: A number of consistency checks are performed in order to ensure that the
 model is valid. Take these warnings seriously and modify the model
 structure to solve them.

 Usage: model = importModel(fileName, removeExcMets, removePrefix, supressWarnings)

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SUBFUNCTIONS ^

SOURCE CODE ^

0001 function model=importModel(fileName,removeExcMets,removePrefix,supressWarnings)
0002 % importModel
0003 %   Import a constraint-based model from an SBML file.
0004 %
0005 % Input:
0006 %   fileName        a SBML file to import. A dialog window will open if
0007 %                   no file name is specified.
0008 %   removeExcMets   true if exchange metabolites should be removed. This is
0009 %                   needed to be able to run simulations, but it could also
0010 %                   be done using simplifyModel at a later stage (optional,
0011 %                   default true)
0012 %   removePrefix    true if identifier prefixes should be removed when
0013 %                   loading the model: G_ for genes, R_ for reactions,
0014 %                   M_ for metabolites, and C_ for compartments. These are
0015 %                   only removed if all identifiers of a certain type
0016 %                   contain the prefix. (optional, default true)
0017 %   supressWarnings true if warnings regarding the model structure should
0018 %                   be supressed (optional, default false)
0019 %
0020 % Output:
0021 %   model
0022 %       id               model ID
0023 %       name             name of model contents
0024 %       annotation       additional information about model
0025 %       rxns             reaction ids
0026 %       mets             metabolite ids
0027 %       S                stoichiometric matrix
0028 %       lb               lower bounds
0029 %       ub               upper bounds
0030 %       rev              reversibility vector
0031 %       c                objective coefficients
0032 %       b                equality constraints for the metabolite equations
0033 %       comps            compartment ids
0034 %       compNames        compartment names
0035 %       compOutside      the id (as in comps) for the compartment
0036 %                        surrounding each of the compartments
0037 %       compMiriams      structure with MIRIAM information about the
0038 %                        compartments
0039 %       rxnNames         reaction description
0040 %       rxnComps         compartments for reactions
0041 %       grRules          reaction to gene rules in text form
0042 %       rxnGeneMat       reaction-to-gene mapping in sparse matrix form
0043 %       subSystems       subsystem name for each reaction
0044 %       eccodes          EC-codes for the reactions
0045 %       rxnMiriams       structure with MIRIAM information about the reactions
0046 %       rxnNotes         reaction notes
0047 %       rxnReferences    reaction references
0048 %       rxnConfidenceScores reaction confidence scores
0049 %       genes            list of all genes
0050 %       geneComps        compartments for genes
0051 %       geneMiriams      structure with MIRIAM information about the genes
0052 %       geneShortNames   gene alternative names (e.g. ERG10)
0053 %       proteins         protein associated to each gene
0054 %       metNames         metabolite description
0055 %       metComps         compartments for metabolites
0056 %       inchis           InChI-codes for metabolites
0057 %       metFormulas      metabolite chemical formula
0058 %       metMiriams       structure with MIRIAM information about the metabolites
0059 %       metCharges       metabolite charge
0060 %       unconstrained    true if the metabolite is an exchange metabolite
0061 %
0062 % Note: A number of consistency checks are performed in order to ensure that the
0063 % model is valid. Take these warnings seriously and modify the model
0064 % structure to solve them.
0065 %
0066 % Usage: model = importModel(fileName, removeExcMets, removePrefix, supressWarnings)
0067 
0068 if nargin<1 || isempty(fileName)
0069     [fileName, pathName] = uigetfile({'*.xml;*.sbml'}, 'Please select the model file');
0070     if fileName == 0
0071         error('You should select a model file')
0072     else
0073         fileName = fullfile(pathName,fileName);
0074     end
0075 end
0076 fileName=char(fileName);
0077 if nargin<2 || isempty(removeExcMets)
0078     removeExcMets=true;
0079 end
0080 
0081 if nargin<3 || isempty(removePrefix)
0082     removePrefix=true;
0083 end
0084 
0085 if nargin<4
0086     supressWarnings=false;
0087 end
0088 
0089 fileName=checkFileExistence(fileName,1);
0090 % If path contains non-ASCII characters, copy file to tempdir first, as
0091 % libSBML is known to have problems with this on Windows:
0092 % https://sbml.org/software/libsbml/libsbml-docs/known-pitfalls/#matlab-on-windows-has-issues-with-unicode-filenames
0093 if ispc && any(double(fileName)>128)
0094     [~,originalFile,ext] = fileparts(fileName);
0095     tempFile = fullfile(tempdir,[originalFile ext]);
0096     copyfile(fileName,tempFile);
0097     fileName = tempFile;
0098 end
0099 
0100 %This is to match the order of the fields to those you get from importing
0101 %from Excel
0102 model=[];
0103 model.id=[];
0104 model.name=[];
0105 model.annotation=[];
0106 model.rxns={};
0107 model.mets={};
0108 model.S=[];
0109 model.lb=[];
0110 model.ub=[];
0111 model.rev=[];
0112 model.c=[];
0113 model.b=[];
0114 model.comps={};
0115 model.compNames={};
0116 model.compOutside={};
0117 model.compMiriams={};
0118 model.rxnNames={};
0119 model.rxnComps=[];
0120 model.grRules={};
0121 model.rxnGeneMat=[];
0122 model.subSystems={};
0123 model.eccodes={};
0124 model.rxnMiriams={};
0125 model.rxnNotes={};
0126 model.rxnReferences={};
0127 model.rxnConfidenceScores=[];
0128 model.genes={};
0129 model.geneComps=[];
0130 model.geneMiriams={};
0131 model.geneShortNames={};
0132 model.proteins={};
0133 model.metNames={};
0134 model.metComps=[];
0135 model.inchis={};
0136 model.metFormulas={};
0137 model.metMiriams={};
0138 model.metCharges=[];
0139 model.unconstrained=[];
0140 
0141 %Load the model using libSBML
0142 [modelSBML,errorMsg] = TranslateSBML_RAVEN(fileName,0,0,[1 1]);
0143 if exist('tempFile','var')
0144     delete(tempFile)
0145 end
0146 
0147 if isempty(modelSBML)
0148     EM=['There is a problem with the SBML file. Try using the SBML Validator at http://sbml.org/Facilities/Validator.\nlibSBML reports: ', errorMsg.message];
0149     dispEM(EM);
0150 end
0151 
0152 %Retrieve compartment names and IDs
0153 compartmentNames=cell(numel(modelSBML.compartment),1);
0154 compartmentIDs=cell(numel(modelSBML.compartment),1);
0155 compartmentOutside=cell(numel(modelSBML.compartment),1);
0156 compartmentMiriams=cell(numel(modelSBML.compartment),1);
0157 
0158 if isfield(modelSBML.compartment,'sboTerm') && numel(unique([modelSBML.compartment.sboTerm])) == 1
0159     %If all the SBO terms are identical, don't add them to compMiriams
0160     modelSBML.compartment = rmfield(modelSBML.compartment,'sboTerm');
0161 end
0162 
0163 for i=1:numel(modelSBML.compartment)
0164     compartmentNames{i}=modelSBML.compartment(i).name;
0165     compartmentIDs{i}=modelSBML.compartment(i).id;
0166     if isfield(modelSBML.compartment(i),'outside')
0167         if ~isempty(modelSBML.compartment(i).outside)
0168             compartmentOutside{i}=modelSBML.compartment(i).outside;
0169         else
0170             compartmentOutside{i}='';
0171         end
0172     else
0173         compartmentOutside{i}=[];
0174     end
0175 
0176     if isfield(modelSBML.compartment(i),'annotation')
0177         compartmentMiriams{i}=parseMiriam(modelSBML.compartment(i).annotation);
0178     else
0179         compartmentMiriams{i}=[];
0180     end
0181 
0182     if isfield(modelSBML.compartment(i),'sboTerm') && ~(modelSBML.compartment(i).sboTerm==-1)
0183         compartmentMiriams{i} = addSBOtoMiriam(compartmentMiriams{i},modelSBML.compartment(i).sboTerm);
0184     end
0185 end
0186 
0187 %If there are no compartment names then use compartment id as name
0188 if all(cellfun(@isempty,compartmentNames))
0189     compartmentNames=compartmentIDs;
0190 end
0191 
0192 %Retrieve info on metabolites, genes, complexes
0193 metaboliteNames={};
0194 metaboliteIDs={};
0195 metaboliteCompartments={};
0196 metaboliteUnconstrained=[];
0197 metaboliteFormula={};
0198 metaboliteInChI={};
0199 metaboliteMiriams={};
0200 metaboliteCharges=[];
0201 
0202 geneNames={};
0203 geneIDs={};
0204 geneMiriams={};
0205 geneShortNames={};
0206 proteins={};
0207 geneCompartments={};
0208 complexIDs={};
0209 complexNames={};
0210 
0211 %If the file is not a COBRA Toolbox model. According to the format
0212 %specified in the yeast consensus model both metabolites and genes are a
0213 %type of 'species'. The metabolites have names starting with 'M_' and genes
0214 %with 'E_'
0215 geneSBOs = [];
0216 metSBOs = [];
0217 %Regex of compartment names, later to be used to remove from metabolite
0218 %names if present as suffix.
0219 regexCompNames = ['\s?\[((' strjoin({modelSBML.compartment.name},')|(') '))\]$'];
0220 for i=1:numel(modelSBML.species)
0221     if length(modelSBML.species(i).id)>=2 && strcmpi(modelSBML.species(i).id(1:2),'E_')
0222         geneNames{numel(geneNames)+1,1}=modelSBML.species(i).name;
0223 
0224         %The "E_" is included in the ID. This is because it's only used
0225         %internally in this file and it makes the matching a little
0226         %smoother
0227         geneIDs{numel(geneIDs)+1,1}=modelSBML.species(i).id;
0228         geneCompartments{numel(geneCompartments)+1,1}=modelSBML.species(i).compartment;
0229 
0230         %Get Miriam structure
0231         if isfield(modelSBML.species(i),'annotation')
0232             %Get Miriam info
0233             geneMiriam=parseMiriam(modelSBML.species(i).annotation);
0234             geneMiriams{numel(geneMiriams)+1,1}=geneMiriam;
0235         else
0236             geneMiriams{numel(geneMiriams)+1,1}=[];
0237         end
0238 
0239         %Protein short names (for example ERG10) are saved as SHORT
0240         %NAME: NAME in the notes-section of metabolites for SBML Level
0241         %2 and as PROTEIN_ASSOCIATION for each reaction in SBML Level 2
0242         %COBRA Toolbox format. For now only the SHORT NAME is loaded
0243         %and no mapping takes place
0244         if isfield(modelSBML.species(i),'notes')
0245             geneShortNames{numel(geneShortNames)+1,1}=parseNote(modelSBML.species(i).notes,'SHORT NAME');
0246         else
0247             geneShortNames{numel(geneShortNames)+1,1}='';
0248         end
0249 
0250         %Get SBO term
0251         if isfield(modelSBML.species(i),'sboTerm') && ~(modelSBML.species(i).sboTerm==-1)
0252             geneSBOs(end+1,1) = modelSBML.species(i).sboTerm;
0253         end
0254     elseif length(modelSBML.species(i).id)>=2 && strcmpi(modelSBML.species(i).id(1:3),'Cx_')
0255         %If it's a complex keep the ID and name
0256         complexIDs=[complexIDs;modelSBML.species(i).id];
0257         complexNames=[complexNames;modelSBML.species(i).name];
0258     else
0259         %If it is not gene or complex, then it must be a metabolite
0260         metaboliteNames{numel(metaboliteNames)+1,1}=modelSBML.species(i).name;
0261         metaboliteIDs{numel(metaboliteIDs)+1,1}=modelSBML.species(i).id;
0262         metaboliteCompartments{numel(metaboliteCompartments)+1,1}=modelSBML.species(i).compartment;
0263         metaboliteUnconstrained(numel(metaboliteUnconstrained)+1,1)=modelSBML.species(i).boundaryCondition;
0264 
0265         %For each metabolite retrieve the formula and the InChI code if
0266         %available First add the InChI code and the formula from the
0267         %InChI. This allows for overwriting the formula by setting the
0268         %actual formula field
0269         if ~isempty(modelSBML.species(i).annotation)
0270             %Get the formula if available
0271             startString='>InChI=';
0272             endString='</in:inchi>';
0273             formStart=strfind(modelSBML.species(i).annotation,startString);
0274             if isempty(formStart)
0275                 startString='InChI=';
0276                 endString='"/>';
0277             end
0278             formStart=strfind(modelSBML.species(i).annotation,startString);
0279             if ~isempty(formStart)
0280                 formEnd=strfind(modelSBML.species(i).annotation,endString);
0281                 formEndIndex=find(formEnd>formStart, 1 );
0282                 formula=modelSBML.species(i).annotation(formStart+numel(startString):formEnd(formEndIndex)-1);
0283                 metaboliteInChI{numel(metaboliteInChI)+1,1}=formula;
0284 
0285                 %The composition is most often present between the
0286                 %first and second "/" in the model. In some simple
0287                 %molecules, such as salts, there is no second "/". The
0288                 %formula is then assumed to be to the end of the string
0289                 compositionIndexes=strfind(formula,'/');
0290                 if numel(compositionIndexes)>1
0291                     metaboliteFormula{numel(metaboliteFormula)+1,1}=...
0292                         formula(compositionIndexes(1)+1:compositionIndexes(2)-1);
0293                 else
0294                     if numel(compositionIndexes)==1
0295                         %Probably a simple molecule which can have only
0296                         %one conformation
0297                         metaboliteFormula{numel(metaboliteFormula)+1,1}=...
0298                             formula(compositionIndexes(1)+1:numel(formula));
0299                     else
0300                         metaboliteFormula{numel(metaboliteFormula)+1,1}='';
0301                     end
0302                 end
0303             elseif isfield(modelSBML.species(i),'fbc_chemicalFormula')
0304                 metaboliteInChI{numel(metaboliteInChI)+1,1}='';
0305                 if ~isempty(modelSBML.species(i).fbc_chemicalFormula)
0306                     %Cannot extract InChi from formula, so remains
0307                     %empty
0308                     metaboliteFormula{numel(metaboliteFormula)+1,1}=modelSBML.species(i).fbc_chemicalFormula;
0309                 else
0310                     metaboliteFormula{numel(metaboliteFormula)+1,1}='';
0311                 end
0312             else
0313                 metaboliteInChI{numel(metaboliteInChI)+1,1}='';
0314                 metaboliteFormula{numel(metaboliteFormula)+1,1}='';
0315             end
0316 
0317             %Get Miriam info
0318             metMiriam=parseMiriam(modelSBML.species(i).annotation);
0319             metaboliteMiriams{numel(metaboliteMiriams)+1,1}=metMiriam;
0320         else
0321             metaboliteInChI{numel(metaboliteInChI)+1,1}='';
0322             if isfield(modelSBML.species(i),'notes')
0323                 metaboliteFormula{numel(metaboliteFormula)+1,1}=parseNote(modelSBML.species(i).notes,'FORMULA');
0324             else
0325                 metaboliteFormula{numel(metaboliteFormula)+1,1}='';
0326             end
0327             metaboliteMiriams{numel(metaboliteMiriams)+1,1}=[];
0328         end
0329         if ~isempty(modelSBML.species(i).notes)
0330             if ~isfield(modelSBML.species(i),'annotation')
0331                 metaboliteFormula{numel(metaboliteFormula)+1,1}=parseNote(modelSBML.species(i).notes,'FORMULA');
0332             end
0333         elseif ~isfield(modelSBML.species(i),'annotation')
0334             metaboliteFormula{numel(metaboliteFormula)+1,1}='';
0335         end
0336         %Get SBO term
0337         if isfield(modelSBML.species(i),'sboTerm') && ~(modelSBML.species(i).sboTerm==-1)
0338             metSBOs(end+1,1) = modelSBML.species(i).sboTerm;
0339         end
0340     end
0341 
0342     %The following lines are executed regardless isSBML2COBRA setting
0343     if isempty(modelSBML.species(i).id) || ~strcmpi(modelSBML.species(i).id(1:2),'E_')
0344         if isempty(modelSBML.species(i).id) || ~strcmpi(modelSBML.species(i).id(1:3),'Cx_')
0345             %Remove trailing [compartment] from metabolite name if present
0346             metaboliteNames{end,1}=regexprep(metaboliteNames{end,1},regexCompNames,'');
0347             metaboliteNames{end,1}=metaboliteNames{end,1};
0348             if isfield(modelSBML.species(i),'fbc_charge')
0349                 if ~isempty(modelSBML.species(i).fbc_charge) && modelSBML.species(i).isSetfbc_charge
0350                     metaboliteCharges(numel(metaboliteCharges)+1,1)=double(modelSBML.species(i).fbc_charge);
0351                 else
0352                     if isfield(modelSBML.species(i),'notes')
0353                         if strfind(modelSBML.species(i).notes,'CHARGE')
0354                             metaboliteCharges(numel(metaboliteCharges)+1,1)=str2double(parseNote(modelSBML.species(i).notes,'CHARGE'));
0355                         else
0356                             metaboliteCharges(numel(metaboliteCharges)+1,1)=NaN;
0357                         end
0358                     else
0359                         metaboliteCharges(numel(metaboliteCharges)+1,1)=NaN;
0360                     end
0361                 end
0362             elseif isfield(modelSBML.species(i),'notes')
0363                 if strfind(modelSBML.species(i).notes,'CHARGE')
0364                     metaboliteCharges(numel(metaboliteCharges)+1,1)=str2double(parseNote(modelSBML.species(i).notes,'CHARGE'));
0365                 else
0366                     metaboliteCharges(numel(metaboliteCharges)+1,1)=NaN;
0367                 end
0368             else
0369                 metaboliteCharges(numel(metaboliteCharges)+1,1)=NaN;
0370             end
0371             %Additional information from FBC format Chemical formula
0372             if isfield(modelSBML.species(i),'fbc_chemicalFormula')
0373                 if ~isempty(modelSBML.species(i).fbc_chemicalFormula)
0374                     metaboliteFormula{numel(metaboliteFormula),1}=modelSBML.species(i).fbc_chemicalFormula;
0375                 end
0376             end
0377         end
0378     end
0379 end
0380 
0381 %Add SBO terms to gene and metabolite miriam fields
0382 if numel(unique(geneSBOs)) > 1  % don't add if they're all identical
0383     for i = 1:numel(geneNames)
0384         geneMiriams{i} = addSBOtoMiriam(geneMiriams{i},geneSBOs(i));
0385     end
0386 end
0387 if numel(unique(metSBOs)) > 1
0388     for i = 1:numel(metaboliteNames)
0389         metaboliteMiriams{i} = addSBOtoMiriam(metaboliteMiriams{i},metSBOs(i));
0390     end
0391 end
0392 
0393 %Retrieve info on reactions
0394 reactionNames=cell(numel(modelSBML.reaction),1);
0395 reactionIDs=cell(numel(modelSBML.reaction),1);
0396 subsystems=cell(numel(modelSBML.reaction),1);
0397 eccodes=cell(numel(modelSBML.reaction),1);
0398 eccodes(:,:)=cellstr('');
0399 rxnconfidencescores=NaN(numel(modelSBML.reaction),1);
0400 rxnreferences=cell(numel(modelSBML.reaction),1);
0401 rxnreferences(:,:)=cellstr('');
0402 rxnnotes=cell(numel(modelSBML.reaction),1);
0403 rxnnotes(:,:)=cellstr('');
0404 grRules=cell(numel(modelSBML.reaction),1);
0405 grRules(:,:)=cellstr('');
0406 grRulesFromModifier=grRules;
0407 rxnComps=zeros(numel(modelSBML.reaction),1);
0408 rxnMiriams=cell(numel(modelSBML.reaction),1);
0409 reactionReversibility=zeros(numel(modelSBML.reaction),1);
0410 reactionUB=zeros(numel(modelSBML.reaction),1);
0411 reactionLB=zeros(numel(modelSBML.reaction),1);
0412 reactionObjective=zeros(numel(modelSBML.reaction),1);
0413 
0414 %Construct the stoichiometric matrix while the reaction info is read
0415 S=zeros(numel(metaboliteIDs),numel(modelSBML.reaction));
0416 
0417 counter=0;
0418 %If FBC, then bounds have parameter ids defined for the whole model
0419 if isfield(modelSBML,'parameter')
0420     parameter.name=cell(numel(modelSBML.parameter),1);
0421     parameter.name={modelSBML.parameter(:).id}';
0422     parameter.value={modelSBML.parameter(:).value}';
0423 end
0424 
0425 if isfield(modelSBML.reaction,'sboTerm') && numel(unique([modelSBML.reaction.sboTerm])) == 1
0426     %If all the SBO terms are identical, don't add them to rxnMiriams
0427     modelSBML.reaction = rmfield(modelSBML.reaction,'sboTerm');
0428 end
0429 
0430 for i=1:numel(modelSBML.reaction)
0431 
0432     %Check that the reaction doesn't produce a complex and nothing else. If
0433     %so, then jump to the next reaction. This is because I get the genes
0434     %for complexes from the names and not from the reactions that create
0435     %them. This only applies to the non-COBRA format
0436     if numel(modelSBML.reaction(i).product)==1
0437         if length(modelSBML.reaction(i).product(1).species)>=3
0438             if strcmp(modelSBML.reaction(i).product(1).species(1:3),'Cx_')==true
0439                 continue;
0440             end
0441         end
0442     end
0443 
0444     %It didn't look like a gene complex-forming reaction
0445     counter=counter+1;
0446 
0447     reactionNames{counter}=modelSBML.reaction(i).name;
0448 
0449     reactionIDs{counter}=modelSBML.reaction(i).id;
0450     reactionReversibility(counter)=modelSBML.reaction(i).reversible;
0451 
0452     %If model is FBC, first get parameter of bound and then replace it with
0453     %the correct value. Probably faster with replace(), but this was only
0454     %introduced in Matlab R2016b
0455     if isfield(modelSBML.reaction(i),'fbc_lowerFluxBound')
0456         lb=modelSBML.reaction(i).fbc_lowerFluxBound;
0457         ub=modelSBML.reaction(i).fbc_upperFluxBound;
0458         for n=1:numel(parameter.value)
0459             lb=regexprep(lb,parameter.name(n),num2str(parameter.value{n}));
0460             ub=regexprep(ub,parameter.name(n),num2str(parameter.value{n}));
0461         end
0462         if isempty(lb)
0463             lb='-Inf';
0464         end
0465         if isempty(ub)
0466             ub='Inf';
0467         end
0468         reactionLB(counter)=str2num(lb);
0469         reactionUB(counter)=str2num(ub);
0470         %The order of these parameters should not be hard coded
0471     elseif isfield(modelSBML.reaction(i).kineticLaw,'parameter')
0472         reactionLB(counter)=modelSBML.reaction(i).kineticLaw.parameter(1).value;
0473         reactionUB(counter)=modelSBML.reaction(i).kineticLaw.parameter(2).value;
0474         reactionObjective(counter)=modelSBML.reaction(i).kineticLaw.parameter(3).value;
0475     else
0476         if reactionReversibility(counter)==true
0477             reactionLB(counter)=-inf;
0478         else
0479             reactionLB(counter)=0;
0480         end
0481         reactionUB(counter)=inf;
0482         reactionObjective(counter)=0;
0483     end
0484 
0485     %Find the associated gene if available
0486     %If FBC, get gene association data from corresponding fields
0487     if isfield(modelSBML.reaction(i),'fbc_geneProductAssociation')
0488         if ~isempty(modelSBML.reaction(i).fbc_geneProductAssociation) && ~isempty(modelSBML.reaction(i).fbc_geneProductAssociation.fbc_association)
0489             grRules{counter}=modelSBML.reaction(i).fbc_geneProductAssociation.fbc_association.fbc_association;
0490         end
0491     elseif isfield(modelSBML.reaction(i),'notes')
0492         %This section was previously executed only if isSBML2COBRA is true. Now
0493         %it will be executed, if 'GENE_ASSOCIATION' is found in
0494         %modelSBML.reaction(i).notes
0495         if strfind(modelSBML.reaction(i).notes,'GENE_ASSOCIATION')
0496             geneAssociation=parseNote(modelSBML.reaction(i).notes,'GENE_ASSOCIATION');
0497         elseif strfind(modelSBML.reaction(i).notes,'GENE ASSOCIATION')
0498             geneAssociation=parseNote(modelSBML.reaction(i).notes,'GENE ASSOCIATION');
0499         else
0500             geneAssociation='';
0501         end
0502         if ~isempty(geneAssociation)
0503             %This adds the grRules. The gene list and rxnGeneMat are created
0504             %later
0505             grRules{counter}=geneAssociation;
0506         end
0507     end
0508     if isempty(grRules{counter}) && ~isempty(modelSBML.reaction(i).modifier)
0509         rules='';
0510         for j=1:numel(modelSBML.reaction(i).modifier)
0511             modifier=modelSBML.reaction(i).modifier(j).species;
0512             if ~isempty(modifier)
0513                 if strcmpi(modifier(1:2),'E_')
0514                     index=find(strcmp(modifier,geneIDs));
0515                     %This should be unique and in the geneIDs list,
0516                     %otherwise something is wrong
0517                     if numel(index)~=1
0518                         EM=['Could not get the gene association data from reaction ' reactionIDs{i}];
0519                         dispEM(EM);
0520                     end
0521                     if ~isempty(rules)
0522                         rules=[rules ' or (' geneNames{index} ')'];
0523                     else
0524                         rules=['(' geneNames{index} ')'];
0525                     end
0526                 elseif strcmp(modifier(1:2),'s_')
0527                     index=find(strcmp(modifier,metaboliteIDs));
0528                     %This should be unique and in the geneIDs list,
0529                     %otherwise something is wrong
0530                     if numel(index)~=1
0531                         EM=['Could not get the gene association data from reaction ' reactionIDs{i}];
0532                         dispEM(EM);
0533                     end
0534                     if ~isempty(rules)
0535                         rules=[rules ' or (' metaboliteIDs{index} ')'];
0536                     else
0537                         rules=['(' metaboliteIDs{index} ')'];
0538                     end
0539                 else
0540                     %It seems to be a complex. Add the corresponding
0541                     %genes from the name of the complex (not the
0542                     %reaction that creates it)
0543                     index=find(strcmp(modifier,complexIDs));
0544                     if numel(index)==1
0545                         if ~isempty(rules)
0546                             rules=[rules ' or (' strrep(complexNames{index},':',' and ') ')'];
0547                         else
0548                             rules=['(' strrep(complexNames{index},':',' and ') ')'];
0549                         end
0550                     else
0551                         %Could not find a complex
0552                         EM=['Could not get the gene association data from reaction ' reactionIDs{i}];
0553                         dispEM(EM);
0554                     end
0555                 end
0556             end
0557         end
0558         grRules{counter}=rules;
0559         grRulesFromModifier{counter}=rules;%Backup copy for grRules, useful to parse Yeast 7.6
0560     end
0561 
0562     %Add reaction compartment
0563     if isfield(modelSBML.reaction(i),'compartment')
0564         if ~isempty(modelSBML.reaction(i).compartment)
0565             rxnComp=modelSBML.reaction(i).compartment;
0566         else
0567             rxnComp='';
0568         end
0569     elseif isfield(modelSBML.reaction(i),'notes')
0570         rxnComp=parseNote(modelSBML.reaction(i).notes,'COMPARTMENT');
0571     end
0572     if ~isempty(rxnComp)
0573         %Find it in the compartment list
0574         [~, J]=ismember(rxnComp,compartmentIDs);
0575         rxnComps(counter)=J;
0576     end
0577 
0578 
0579     miriamStruct=parseMiriam(modelSBML.reaction(i).annotation);
0580     rxnMiriams{counter}=miriamStruct;
0581     if isfield(modelSBML.reaction(i),'notes')
0582         subsystems{counter,1}=cellstr(parseNote(modelSBML.reaction(i).notes,'SUBSYSTEM'));
0583         subsystems{counter,1}(cellfun('isempty',subsystems{counter,1})) = [];
0584         if strfind(modelSBML.reaction(i).notes,'Confidence Level')
0585             confScore = parseNote(modelSBML.reaction(i).notes,'Confidence Level');
0586             if isempty(confScore)
0587                 confScore = 0;
0588             end
0589             rxnconfidencescores(counter)=str2double(confScore);
0590         end
0591         rxnreferences{counter,1}=parseNote(modelSBML.reaction(i).notes,'AUTHORS');
0592         rxnnotes{counter,1}=parseNote(modelSBML.reaction(i).notes,'NOTES');
0593     end
0594 
0595     %Get SBO terms
0596     if isfield(modelSBML.reaction(i),'sboTerm') && ~(modelSBML.reaction(i).sboTerm==-1)
0597         rxnMiriams{counter} = addSBOtoMiriam(rxnMiriams{counter}, modelSBML.reaction(i).sboTerm);
0598     end
0599 
0600     %Get ec-codes
0601     eccode='';
0602     if ~isempty(modelSBML.reaction(i).annotation)
0603         if strfind(modelSBML.reaction(i).annotation,'urn:miriam:ec-code')
0604             eccode=parseAnnotation(modelSBML.reaction(i).annotation,'urn:miriam:',':','ec-code');
0605         elseif strfind(modelSBML.reaction(i).annotation,'http://identifiers.org/ec-code')
0606             eccode=parseAnnotation(modelSBML.reaction(i).annotation,'http://identifiers.org/','/','ec-code');
0607         elseif strfind(modelSBML.reaction(i).annotation,'https://identifiers.org/ec-code')
0608             eccode=parseAnnotation(modelSBML.reaction(i).annotation,'https://identifiers.org/','/','ec-code');
0609         end
0610     elseif isfield(modelSBML.reaction(i),'notes')
0611         if strfind(modelSBML.reaction(i).notes,'EC Number')
0612             eccode=[eccode parseNote(modelSBML.reaction(i).notes,'EC Number')];
0613         elseif strfind(modelSBML.reaction(i).notes,'PROTEIN_CLASS')
0614             eccode=[eccode parseNote(modelSBML.reaction(i).notes,'PROTEIN_CLASS')];
0615         end
0616     end
0617     eccodes{counter}=eccode;
0618 
0619     %Add all reactants
0620     for j=1:numel(modelSBML.reaction(i).reactant)
0621         %Get the index of the metabolite in metaboliteIDs. External
0622         %metabolites will be removed at a later stage
0623         metIndex=find(strcmp(modelSBML.reaction(i).reactant(j).species,metaboliteIDs),1);
0624         if isempty(metIndex)
0625             EM=['Could not find metabolite ' modelSBML.reaction(i).reactant(j).species ' in reaction ' reactionIDs{counter}];
0626             dispEM(EM);
0627         end
0628         S(metIndex,counter)=S(metIndex,counter)+modelSBML.reaction(i).reactant(j).stoichiometry*-1;
0629     end
0630 
0631     %Add all products
0632     for j=1:numel(modelSBML.reaction(i).product)
0633         %Get the index of the metabolite in metaboliteIDs.
0634         metIndex=find(strcmp(modelSBML.reaction(i).product(j).species,metaboliteIDs),1);
0635         if isempty(metIndex)
0636             EM=['Could not find metabolite ' modelSBML.reaction(i).product(j).species ' in reaction ' reactionIDs{counter}];
0637             dispEM(EM);
0638         end
0639         S(metIndex,counter)=S(metIndex,counter)+modelSBML.reaction(i).product(j).stoichiometry;
0640     end
0641 end
0642 
0643 %if FBC, objective function is separately defined. Multiple objective
0644 %functions can be defined, one is set as active
0645 if isfield(modelSBML, 'fbc_activeObjective')
0646     obj=modelSBML.fbc_activeObjective;
0647     for i=1:numel(modelSBML.fbc_objective)
0648         if strcmp(obj,modelSBML.fbc_objective(i).fbc_id)
0649             if ~isempty(modelSBML.fbc_objective(i).fbc_fluxObjective)
0650                 rxn=modelSBML.fbc_objective(i).fbc_fluxObjective.fbc_reaction;
0651                 idx=find(ismember(reactionIDs,rxn));
0652                 reactionObjective(idx)=modelSBML.fbc_objective(i).fbc_fluxObjective.fbc_coefficient;
0653             end
0654         end
0655     end
0656 end
0657 
0658 %subSystems can be stored as groups instead of in annotations
0659 if isfield(modelSBML,'groups_group')
0660     for i=1:numel(modelSBML.groups_group)
0661         groupreactions={modelSBML.groups_group(i).groups_member(:).groups_idRef};
0662         [~, idx] = ismember(groupreactions, reactionIDs);
0663         if any(idx)
0664             for j=1:numel(idx)
0665                 if isempty(subsystems{idx(j)}) % First subsystem
0666                     subsystems{idx(j)} = {modelSBML.groups_group(i).groups_name};
0667                 else % Consecutive subsystems: concatenate
0668                     subsystems{idx(j)} = horzcat(subsystems{idx(j)}, modelSBML.groups_group(i).groups_name);
0669                 end
0670             end
0671         end
0672     end
0673 end
0674 
0675 %Shrink the structures if complex-forming reactions had to be skipped
0676 reactionNames=reactionNames(1:counter);
0677 reactionIDs=reactionIDs(1:counter);
0678 subsystems=subsystems(1:counter);
0679 eccodes=eccodes(1:counter);
0680 rxnconfidencescores=rxnconfidencescores(1:counter);
0681 rxnreferences=rxnreferences(1:counter);
0682 rxnnotes=rxnnotes(1:counter);
0683 grRules=grRules(1:counter);
0684 rxnMiriams=rxnMiriams(1:counter);
0685 reactionReversibility=reactionReversibility(1:counter);
0686 reactionUB=reactionUB(1:counter);
0687 reactionLB=reactionLB(1:counter);
0688 reactionObjective=reactionObjective(1:counter);
0689 S=S(:,1:counter);
0690 
0691 model.name=modelSBML.name;
0692 model.id=modelSBML.id;
0693 model.rxns=reactionIDs;
0694 model.mets=metaboliteIDs;
0695 model.S=sparse(S);
0696 model.lb=reactionLB;
0697 model.ub=reactionUB;
0698 model.rev=reactionReversibility;
0699 model.c=reactionObjective;
0700 model.b=zeros(numel(metaboliteIDs),1);
0701 model.comps=compartmentIDs;
0702 model.compNames=compartmentNames;
0703 model.rxnConfidenceScores=rxnconfidencescores;
0704 model.rxnReferences=rxnreferences;
0705 model.rxnNotes=rxnnotes;
0706 
0707 %Load annotation if available. If there are several authors, only the first
0708 %author credentials are imported
0709 if isfield(modelSBML,'annotation')
0710     endString='</';
0711     I=strfind(modelSBML.annotation,endString);
0712     J=strfind(modelSBML.annotation,'<vCard:Family>');
0713     if any(J)
0714         model.annotation.familyName=modelSBML.annotation(J(1)+14:I(find(I>J(1),1))-1);
0715     end
0716     J=strfind(modelSBML.annotation,'<vCard:Given>');
0717     if any(J)
0718         model.annotation.givenName=modelSBML.annotation(J(1)+13:I(find(I>J(1),1))-1);
0719     end
0720     J=strfind(modelSBML.annotation,'<vCard:EMAIL>');
0721     if any(J)
0722         model.annotation.email=modelSBML.annotation(J(1)+13:I(find(I>J(1),1))-1);
0723     end
0724     J=strfind(modelSBML.annotation,'<vCard:Orgname>');
0725     if any(J)
0726         model.annotation.organization=modelSBML.annotation(J(1)+15:I(find(I>J(1),1))-1);
0727     end
0728     endString='"/>';
0729     I=strfind(modelSBML.annotation,endString);
0730     if strfind(modelSBML.annotation,'"urn:miriam:')
0731         J=strfind(modelSBML.annotation,'"urn:miriam:');
0732         if any(J)
0733             model.annotation.taxonomy=modelSBML.annotation(J+12:I(find(I>J,1))-1);
0734         end
0735     else
0736         J=strfind(modelSBML.annotation,'"http://identifiers.org/');
0737         if any(J)
0738             model.annotation.taxonomy=modelSBML.annotation(J+24:I(find(I>J,1))-1);
0739         else
0740             J=strfind(modelSBML.annotation,'"https://identifiers.org/');
0741             if any(J)
0742                 model.annotation.taxonomy=modelSBML.annotation(J+25:I(find(I>J,1))-1);
0743             end
0744         end
0745     end
0746 end
0747 if isfield(modelSBML,'notes')
0748     startString=strfind(modelSBML.notes,'xhtml">');
0749     endString=strfind(modelSBML.notes,'</body>');
0750     if any(startString) && any(endString)
0751         model.annotation.note=modelSBML.notes(startString+7:endString-1);
0752         model.annotation.note=regexprep(model.annotation.note,'<p>|</p>','');
0753         model.annotation.note=strtrim(model.annotation.note);
0754         if regexp(model.annotation.note,'This file was generated using the exportModel function in RAVEN Toolbox \d\.\d and OutputSBML in libSBML')
0755             model.annotation=rmfield(model.annotation,'note'); % Default note added when running exportModel
0756         end
0757     end
0758 end
0759 
0760 if any(~cellfun(@isempty,compartmentOutside))
0761     model.compOutside=compartmentOutside;
0762 end
0763 
0764 model.rxnNames=reactionNames;
0765 model.metNames=metaboliteNames;
0766 
0767 %Match the compartments for metabolites
0768 [~, J]=ismember(metaboliteCompartments,model.comps);
0769 model.metComps=J;
0770 
0771 %If any genes have been loaded (only for the new format)
0772 if ~isempty(geneNames)
0773     %In some rare cases geneNames may not necessarily be used in grRules.
0774     %That is true for Yeast 7.6. It's therefore important to change gene
0775     %systematic names to geneIDs in sophisticated way. Gene systematic
0776     %names are not unique, since exactly the same name may be in different
0777     %compartments
0778     if all(cellfun(@isempty,strfind(grRules,geneNames{1})))
0779         geneShortNames=geneNames;
0780         %geneShortNames contain compartments as well, so these are removed
0781         geneShortNames=regexprep(geneShortNames,' \[.+$','');
0782         %grRules obtained from modifier fields contain geneNames. These are
0783         %changed into geneIDs. grRulesFromModifier is a good way to have
0784         %geneIDs and rxns association when it's important to resolve
0785         %systematic name ambiguities
0786         grRulesFromModifier=regexprep(regexprep(grRulesFromModifier,'\[|\]','_'),regexprep(geneNames,'\[|\]','_'),geneIDs);
0787         grRules=regexprep(regexprep(grRules,'\[|\]','_'),regexprep(geneNames,'\[|\]','_'),geneIDs);
0788 
0789         %Yeast 7.6 contains several metabolites, which were used in gene
0790         %associations. For that reason, the list of species ID is created
0791         %and we then check whether any of them have kegg.genes annotation
0792         %thereby obtaining systematic gene names
0793         geneShortNames=vertcat(geneShortNames,metaboliteNames);
0794         geneIDs=vertcat(geneIDs,metaboliteIDs);
0795         geneSystNames=extractMiriam(vertcat(geneMiriams,metaboliteMiriams),'kegg.genes');
0796         geneCompartments=vertcat(geneCompartments,metaboliteCompartments);
0797         geneMiriams=vertcat(geneMiriams,metaboliteMiriams);
0798 
0799         %Now we retain information for only these entries, which have
0800         %kegg.genes annotation
0801         geneShortNames=geneShortNames(~cellfun('isempty',geneSystNames));
0802         geneIDs=geneIDs(~cellfun('isempty',geneSystNames));
0803         geneSystNames=geneSystNames(~cellfun('isempty',geneSystNames));
0804         geneCompartments=geneCompartments(~cellfun('isempty',geneSystNames));
0805         geneMiriams=geneMiriams(~cellfun('isempty',geneSystNames));
0806         %Now we reorder geneIDs and geneSystNames by geneSystNames string
0807         %length
0808         geneNames=geneIDs;%Backuping geneIDs, since we need unsorted order for later
0809         [~, Indx] = sort(cellfun('size', geneSystNames, 2), 'descend');
0810         geneIDs = geneIDs(Indx);
0811         geneSystNames = geneSystNames(Indx);
0812         for i=1:numel(geneSystNames)
0813             for j=1:numel(grRules)
0814                 if strfind(grRules{j},geneSystNames{i})
0815                     if ~isempty(grRules{j})
0816                         if sum(ismember(geneSystNames,geneSystNames{i}))==1
0817                             grRules{j}=regexprep(grRules{j},geneSystNames{i},geneIDs{i});
0818                         elseif sum(ismember(geneSystNames,geneSystNames{i}))>1
0819                             counter=0;
0820                             ovrlpIDs=geneIDs(ismember(geneSystNames,geneSystNames{i}));
0821                             for k=1:numel(ovrlpIDs)
0822                                 if strfind(grRulesFromModifier{j},ovrlpIDs{k})
0823                                     counter=counter+1;
0824                                     grRules{j}=regexprep(grRules{j},geneSystNames{i},ovrlpIDs{k});
0825                                 end
0826                                 if counter>1
0827                                     EM=['Gene association is ambiguous for reaction ' modelSBML.reaction(j).id];
0828                                     dispEM(EM);
0829                                 end
0830                             end
0831                         end
0832                     end
0833                 end
0834             end
0835         end
0836     end
0837     model.genes=geneNames;
0838     model.grRules=grRules;
0839     [grRules,rxnGeneMat] = standardizeGrRules(model,true);
0840     model.grRules = grRules;
0841     model.rxnGeneMat = rxnGeneMat;
0842 
0843     %Match the compartments for genes
0844     [~, J]=ismember(geneCompartments,model.comps);
0845     model.geneComps=J;
0846 else
0847     if ~all(cellfun(@isempty,grRules))
0848         %If fbc_geneProduct exists, follow the specified gene order, such
0849         %that matching geneShortNames in function below will work
0850         if isfield(modelSBML,'fbc_geneProduct')
0851             genes={modelSBML.fbc_geneProduct.fbc_id};
0852 
0853             %Get gene Miriams if they were not retrieved above (this occurs
0854             %when genes are stored as fbc_geneProduct instead of species)
0855             if isempty(geneMiriams)
0856                 geneMiriams = cell(numel(genes),1);
0857                 if isfield(modelSBML.fbc_geneProduct,'sboTerm') && numel(unique([modelSBML.fbc_geneProduct.sboTerm])) == 1
0858                     %If all the SBO terms are identical, don't add them to geneMiriams
0859                     modelSBML.fbc_geneProduct = rmfield(modelSBML.fbc_geneProduct,'sboTerm');
0860                 end
0861                 for i = 1:numel(genes)
0862                     geneMiriams{i}=parseMiriam(modelSBML.fbc_geneProduct(i).annotation);
0863                     if isfield(modelSBML.fbc_geneProduct(i),'sboTerm') && ~(modelSBML.fbc_geneProduct(i).sboTerm==-1)
0864                         geneMiriams{i} = addSBOtoMiriam(geneMiriams{i},modelSBML.fbc_geneProduct(i).sboTerm);
0865                     end
0866                 end
0867             end
0868             proteins={modelSBML.fbc_geneProduct.fbc_name};
0869         else
0870             genes=getGeneList(grRules);
0871         end
0872         model.genes=genes;
0873         model.grRules=grRules;
0874         [grRules,rxnGeneMat] = standardizeGrRules(model,true);
0875         model.grRules = grRules;
0876         model.rxnGeneMat = rxnGeneMat;
0877     end
0878 end
0879 
0880 if all(cellfun(@isempty,geneShortNames))
0881     if isfield(modelSBML,'fbc_geneProduct')
0882         for i=1:numel(genes)
0883             if ~isempty(modelSBML.fbc_geneProduct(i).fbc_label)
0884                 geneShortNames{i,1}=modelSBML.fbc_geneProduct(i).fbc_label;
0885             elseif ~isempty(modelSBML.fbc_geneProduct(i).fbc_name)
0886                 geneShortNames{i,1}=modelSBML.fbc_geneProduct(i).fbc_name;
0887             else
0888                 geneShortNames{i,1}='';
0889             end
0890         end
0891     end
0892 end
0893 
0894 %If any InChIs have been loaded
0895 if any(~cellfun(@isempty,metaboliteInChI))
0896     model.inchis=metaboliteInChI;
0897 end
0898 
0899 %If any formulas have been loaded
0900 if any(~cellfun(@isempty,metaboliteFormula))
0901     model.metFormulas=metaboliteFormula;
0902 end
0903 
0904 %If any charges have been loaded
0905 if ~isempty(metaboliteCharges)
0906     model.metCharges=metaboliteCharges;
0907 end
0908 
0909 %If any gene short names have been loaded
0910 if any(~cellfun(@isempty,geneShortNames))
0911     model.geneShortNames=geneShortNames;
0912 end
0913 
0914 %If any Miriam strings for compartments have been loaded
0915 if any(~cellfun(@isempty,compartmentMiriams))
0916     model.compMiriams=compartmentMiriams;
0917 end
0918 
0919 %If any Miriam strings for metabolites have been loaded
0920 if any(~cellfun(@isempty,metaboliteMiriams))
0921     model.metMiriams=metaboliteMiriams;
0922 end
0923 
0924 %If any subsystems have been loaded
0925 if any(~cellfun(@isempty,subsystems))
0926     model.subSystems=subsystems;
0927 end
0928 if any(rxnComps)
0929     if all(rxnComps)
0930         model.rxnComps=rxnComps;
0931     else
0932         if supressWarnings==false
0933             EM='The compartments for the following reactions could not be matched. Ignoring reaction compartment information';
0934             dispEM(EM,false,model.rxns(rxnComps==0));
0935         end
0936     end
0937 end
0938 
0939 %If any ec-codes have been loaded
0940 if any(~cellfun(@isempty,eccodes))
0941     model.eccodes=eccodes;
0942 end
0943 
0944 %If any Miriam strings for reactions have been loaded
0945 if any(~cellfun(@isempty,rxnMiriams))
0946     model.rxnMiriams=rxnMiriams;
0947 end
0948 
0949 %If any Miriam strings for genes have been loaded
0950 if any(~cellfun(@isempty,geneMiriams))
0951     model.geneMiriams=geneMiriams;
0952 end
0953 
0954 %If any protein strings have been loaded
0955 if any(~cellfun(@isempty,proteins))
0956     proteins = reshape(proteins,[],1);
0957     model.proteins=proteins;
0958 end
0959 
0960 model.unconstrained=metaboliteUnconstrained;
0961 
0962 %Convert SBML IDs back into their original strings. Here we are using part
0963 %from convertSBMLID, originating from the COBRA Toolbox
0964 model.rxns=regexprep(model.rxns,'__([0-9]+)__','${char(str2num($1))}');
0965 model.mets=regexprep(model.mets,'__([0-9]+)__','${char(str2num($1))}');
0966 model.comps=regexprep(model.comps,'__([0-9]+)__','${char(str2num($1))}');
0967 model.grRules=regexprep(model.grRules,'__([0-9]+)__','${char(str2num($1))}');
0968 model.genes=regexprep(model.genes,'__([0-9]+)__','${char(str2num($1))}');
0969 model.id=regexprep(model.id,'__([0-9]+)__','${char(str2num($1))}');
0970 
0971 if removePrefix
0972     [model, hasChanged]=removeIdentifierPrefix(model);
0973     dispEM(['The following fields have prefixes removed from all entries. '...
0974     'If this is undesired, run importModel with removePrefix as false. Example: '...
0975     'importModel(''filename.xml'',[],false);'],false,hasChanged)
0976 end
0977 
0978 %Remove unused fields
0979 if isempty(model.annotation)
0980     model=rmfield(model,'annotation');
0981 end
0982 if isempty(model.compOutside)
0983     model=rmfield(model,'compOutside');
0984 end
0985 if isempty(model.compMiriams)
0986     model=rmfield(model,'compMiriams');
0987 end
0988 if isempty(model.rxnComps)
0989     model=rmfield(model,'rxnComps');
0990 end
0991 if isempty(model.grRules)
0992     model=rmfield(model,'grRules');
0993 end
0994 if isempty(model.rxnGeneMat)
0995     model=rmfield(model,'rxnGeneMat');
0996 end
0997 if isempty(model.subSystems)
0998     model=rmfield(model,'subSystems');
0999 else
1000     model.subSystems(cellfun(@isempty,subsystems))={{''}};
1001 end
1002 if isempty(model.eccodes)
1003     model=rmfield(model,'eccodes');
1004 end
1005 if isempty(model.rxnMiriams)
1006     model=rmfield(model,'rxnMiriams');
1007 end
1008 if cellfun(@isempty,model.rxnNotes)
1009     model=rmfield(model,'rxnNotes');
1010 end
1011 if cellfun(@isempty,model.rxnReferences)
1012     model=rmfield(model,'rxnReferences');
1013 end
1014 if isempty(model.rxnConfidenceScores) || all(isnan(model.rxnConfidenceScores))
1015     model=rmfield(model,'rxnConfidenceScores');
1016 end
1017 if isempty(model.genes)
1018     model=rmfield(model,'genes');
1019 elseif isrow(model.genes)
1020     model.genes=transpose(model.genes);
1021 end
1022 if isempty(model.geneComps)
1023     model=rmfield(model,'geneComps');
1024 end
1025 if isempty(model.geneMiriams)
1026     model=rmfield(model,'geneMiriams');
1027 end
1028 if isempty(model.geneShortNames)
1029     model=rmfield(model,'geneShortNames');
1030 end
1031 if isempty(model.proteins)
1032     model=rmfield(model,'proteins');
1033 end
1034 if isempty(model.inchis)
1035     model=rmfield(model,'inchis');
1036 end
1037 if isempty(model.metFormulas)
1038     model=rmfield(model,'metFormulas');
1039 end
1040 if isempty(model.metMiriams)
1041     model=rmfield(model,'metMiriams');
1042 end
1043 if ~any(model.metCharges)
1044     model=rmfield(model,'metCharges');
1045 end
1046 
1047 %This just removes the grRules if no genes have been loaded
1048 if ~isfield(model,'genes') && isfield(model,'grRules')
1049     model=rmfield(model,'grRules');
1050 end
1051 
1052 %Print warnings about bad structure
1053 if supressWarnings==false
1054     checkModelStruct(model,false);
1055 end
1056 
1057 if removeExcMets==true
1058     model=simplifyModel(model);
1059 end
1060 end
1061 
1062 function matchGenes=getGeneList(grRules)
1063 %Constructs the list of unique genes from grRules
1064 
1065 %Assumes that everything that isn't a paranthesis, " AND " or " or " is a
1066 %gene name
1067 genes=strrep(grRules,'(','');
1068 genes=strrep(genes,')','');
1069 genes=strrep(genes,' or ',' ');
1070 genes=strrep(genes,' and ',' ');
1071 genes=strrep(genes,' OR ',' ');
1072 genes=strrep(genes,' AND ',' ');
1073 genes=regexp(genes,' ','split');
1074 
1075 allNames={};
1076 for i=1:numel(genes)
1077     allNames=[allNames genes{i}];
1078 end
1079 matchGenes=unique(allNames)';
1080 
1081 %Remove the empty element if present
1082 if isempty(matchGenes{1})
1083     matchGenes(1)=[];
1084 end
1085 end
1086 
1087 function fieldContent=parseNote(searchString,fieldName)
1088 %The function obtains the particular information from 'notes' field, using
1089 %fieldName as the dummy string
1090 
1091 fieldContent='';
1092 
1093 if strfind(searchString,fieldName)
1094     [~,targetString] = regexp(searchString,['<p>' fieldName '.*?</p>'],'tokens','match');
1095     targetString=regexprep(targetString,'<p>|</p>','');
1096     targetString=regexprep(targetString,[fieldName, ':'],'');
1097     for i=1:numel(targetString)
1098         fieldContent=[fieldContent ';' strtrim(targetString{1,i})];
1099     end
1100     fieldContent=regexprep(fieldContent,'^;|;$','');
1101 else
1102     fieldContent='';
1103 end
1104 end
1105 
1106 function fieldContent=parseAnnotation(searchString,startString,midString,fieldName)
1107 
1108 fieldContent='';
1109 
1110 %Removing whitespace characters from the ending strings, which may occur in
1111 %several cases
1112 searchString=regexprep(searchString,'" />','"/>');
1113 [~,targetString] = regexp(searchString,['<rdf:li rdf:resource="' startString fieldName midString '.*?"/>'],'tokens','match');
1114 targetString=regexprep(targetString,'<rdf:li rdf:resource="|"/>','');
1115 targetString=regexprep(targetString,startString,'');
1116 targetString=regexprep(targetString,[fieldName midString],'');
1117 
1118 for i=1:numel(targetString)
1119     fieldContent=[fieldContent ';' strtrim(targetString{1,i})];
1120 end
1121 
1122 fieldContent=regexprep(fieldContent,'^;|;$','');
1123 end
1124 
1125 function miriamStruct=parseMiriam(searchString)
1126 %Generates miriam structure from annotation field
1127 
1128 %Finding whether miriams are written in the old or the new way
1129 if strfind(searchString,'urn:miriam:')
1130     startString='urn:miriam:';
1131     midString=':';
1132 elseif strfind(searchString,'http://identifiers.org/')
1133     startString='http://identifiers.org/';
1134     midString='/';
1135 elseif strfind(searchString,'https://identifiers.org/')
1136     startString='https://identifiers.org/';
1137     midString='/';
1138 else
1139     miriamStruct=[];
1140     return;
1141 end
1142 
1143 miriamStruct=[];
1144 
1145 searchString=regexprep(searchString,'" />','"/>');
1146 [~,targetString] = regexp(searchString,'<rdf:li rdf:resource=".*?"/>','tokens','match');
1147 targetString=regexprep(targetString,'<rdf:li rdf:resource="|"/>','');
1148 targetString=regexprep(targetString,startString,'');
1149 targetString=regexprep(targetString,midString,'/','once');
1150 
1151 counter=0;
1152 for i=1:numel(targetString)
1153     if isempty(regexp(targetString{1,i},'inchi|ec-code', 'once'))
1154         counter=counter+1;
1155         miriamStruct.name{counter,1} = regexprep(targetString{1,i},'/.+','','once');
1156         miriamStruct.value{counter,1} = regexprep(targetString{1,i},[miriamStruct.name{counter,1} '/'],'','once');
1157         miriamStruct.name{counter,1} = regexprep(miriamStruct.name{counter,1},'^obo\.','');
1158     end
1159 end
1160 end
1161 
1162 function miriam = addSBOtoMiriam(miriam,sboTerm)
1163 %Appends SBO term to miriam structure
1164 
1165 sboTerm = {['SBO:' sprintf('%07u',sboTerm)]};  % convert to proper format
1166 if isempty(miriam)
1167     miriam.name = {'sbo'};
1168     miriam.value = sboTerm;
1169 elseif any(strcmp('sbo',miriam.name))
1170     currSbo = strcmp('sbo',miriam.name);
1171     miriam.value(currSbo) = sboTerm;
1172 else
1173     miriam.name(end+1) = {'sbo'};
1174     miriam.value(end+1) = sboTerm;
1175 end
1176 end

Generated by m2html © 2005