getGenesFromGrRules  Extract gene list and rxnGeneMat from grRules array.

USAGE:

  [genes,rxnGeneMat] = getGenesFromGrRules(grRules, originalGenes);

INPUTS:

  grRules         A cell array of model grRules, from which a list of genes
                  is to be extracted.
                  NOTE: Boolean operators can be text ("and", "or") or
                        symbolic ("&", "|"), but there must be a space
                        between operators and gene names/IDs.

  originalGenes   (Optional) The original gene list from the model, used as
                  a reference. If provided, it must contain exactly the
                  genes found in grRules (otherwise an error is raised), and
                  genes is returned in the order of originalGenes.

OUTPUTS:

  genes           A unique list of all gene IDs that appear in grRules.

  rxnGeneMat      (Optional) A binary matrix indicating which genes
                  participate in each reaction, where rows correspond to
                  reactions (entries in grRules) and columns correspond to
                  genes.
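The input format described above can be illustrated with a small call. This is a minimal sketch, assuming getGenesFromGrRules is on the MATLAB path; the gene IDs (G1 to G4) and the variable name fused are made up for illustration. It also shows what the spacing NOTE implies: an operator written without surrounding spaces is not recognised as a separator.

```matlab
% Hypothetical grRules for four reactions (the third has no gene association)
grRules = {'G1 and G2'; '(G2 or G3) and G4'; ''; 'G1'};

[genes, rxnGeneMat] = getGenesFromGrRules(grRules);

disp(genes);             % G1, G2, G3, G4 (column cell array)
disp(full(rxnGeneMat));  % 4 reactions x 4 genes, one row per grRules entry

% Without spaces around the operator, the rule is not split,
% so the whole string is returned as a single "gene" ID.
fused = getGenesFromGrRules({'G1&G2'});
disp(fused);             % G1&G2
```

For the rules above, the rows of rxnGeneMat should be [1 1 0 0], [0 1 1 1], [0 0 0 0] and [1 0 0 0], with columns in the order of genes.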
function [genes,rxnGeneMat] = getGenesFromGrRules(grRules, originalGenes)
%getGenesFromGrRules  Extract gene list and rxnGeneMat from grRules array.
%
% USAGE:
%
%   [genes,rxnGeneMat] = getGenesFromGrRules(grRules, originalGenes);
%
% INPUTS:
%
%   grRules         A cell array of model grRules, from which a list of genes
%                   is to be extracted.
%                   NOTE: Boolean operators can be text ("and", "or") or
%                         symbolic ("&", "|"), but there must be a space
%                         between operators and gene names/IDs.
%   originalGenes   The original gene list from the model as reference
%
% OUTPUTS:
%
%   genes           A unique list of all gene IDs that appear in grRules.
%
%   rxnGeneMat      (Optional) A binary matrix indicating which genes
%                   participate in each reaction, where rows correspond to
%                   reactions (entries in grRules) and columns correspond to
%                   genes.
%


% handle input arguments
if nargin < 2
    originalGenes = [];
end

% check if the grRules use written or symbolic boolean operators
if any(contains(grRules,{' & ',' | '}))
    % fix some potential missing spaces between parentheses and &/|
    grRules = regexprep(grRules,'\)&',') &');    % ")&" -> ") &"
    grRules = regexprep(grRules,'&\(','& (');    % "&(" -> "& ("
    grRules = regexprep(grRules,'\)\|',') |');   % ")|" -> ") |"
    grRules = regexprep(grRules,'\|\(','| (');   % "|(" -> "| ("
else
    % fix some potential missing spaces between parentheses and AND/OR
    grRules = regexprep(grRules,'\)and',') and');  % ")and" -> ") and"
    grRules = regexprep(grRules,'and\(','and (');  % "and(" -> "and ("
    grRules = regexprep(grRules,'\)or',') or');    % ")or" -> ") or"
    grRules = regexprep(grRules,'or\(','or (');    % "or(" -> "or ("

    % convert "and" to "&" and "or" to "|" (easier to work with symbols)
    grRules = regexprep(grRules, ' or ', ' | ');
    grRules = regexprep(grRules, ' and ', ' & ');
end

% extract list of genes from each reaction
rxnGenes = cellfun(@(r) regexprep(unique(strsplit(r,{' | ',' & '})),'[\(\) ]+',''),grRules,'UniformOutput',false);

% construct new gene list
nonEmpty = ~cellfun(@isempty,rxnGenes);
genes = unique(transpose([rxnGenes{nonEmpty}]));
genes(cellfun(@isempty,genes)) = [];

if ~isempty(originalGenes)
    if ~isequal(sort(originalGenes), sort(genes))
        error('The grRules and original gene list are inconsistent!');
    else
        genes = originalGenes;
    end
end

% construct new rxnGeneMat (if requested)
if nargout > 1
    rxnGeneCell = cellfun(@(rg) ismember(genes,rg),rxnGenes,'UniformOutput',false);
    rxnGeneMat = sparse(double(horzcat(rxnGeneCell{:})'));
end
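The consistency check against originalGenes in the listing above can be exercised as follows. This is a sketch under assumptions: getGenesFromGrRules is on the MATLAB path, the gene IDs are invented, and originalGenes is given as a column cell array (the usual orientation of a model's gene list), since the isequal comparison is made against a column vector of genes.

```matlab
% Hypothetical rules and a reference gene list in a non-alphabetical order
grRules       = {'G2 and G1'; 'G3'};
originalGenes = {'G3'; 'G1'; 'G2'};

% Same set of genes: the output keeps the order of originalGenes,
% and the columns of rxnGeneMat follow that order as well.
[genes, rxnGeneMat] = getGenesFromGrRules(grRules, originalGenes);
disp(genes);             % G3, G1, G2
disp(full(rxnGeneMat));  % rows: [0 1 1] and [1 0 0]

% A reference list that does not match the genes in grRules raises an error.
caughtMismatch = false;
try
    getGenesFromGrRules(grRules, {'G1'; 'G2'});
catch err
    caughtMismatch = true;
    disp(err.message);   % The grRules and original gene list are inconsistent!
end
```

Passing originalGenes is useful when the column order of rxnGeneMat has to stay aligned with an existing model gene list rather than being re-sorted alphabetically.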