randomSampling Performs random sampling of the solution space, as described in Bordel et al. (2010) PLOS Compt Biol (doi:10.1371/journal.pcbi.1000859). Input: model a model structure nSamples the number of solutions to return (opt, default 1000) replaceBoundsWithInf replace the largest upper bounds with Inf and the smallest lower bounds with -Inf. This is needed in order to get solutions without loops if your model has for example 1000/-1000 as arbitary large bounds. If your model only has "biologically relevant" bounds, then set this to false (opt, default true) supressErrors the program will halt if it has problems finding non-zero solutions which are not involved in loops. This could be because the constraints on the model are too relaxed (such as unlimited glucose uptake) or too strict (such as too many and too narrow constraints) (opt, default false) runParallel speed up calculations by parallel processing. Requires MATLAB Parallel Computing Toolbox. If this is not installed, the calculations will not be parallelized, regardless what is indicated as runParallel. (opt, default true) goodRxns double vector of indexes of those reactions that are not involved in loops and can be used as random objective functions, as generated by a previous run of randomSampling on the same model (opt, default empty) minFlux determines if a second optimization should be performed for each random sample, to minimize the number of fluxes and thereby preventing loops. Typically, loops are averaged out when a large number of samples are taken, but this is not always the case (opt, default false) Output: solutions matrix with the solutions goodRxns double vector of indexes of those reactions that are not involved in loops and can be used as random objective functions The solutions are generated by maximizing (with random weights) for a random set of three reactions. For reversible reactions it randomly chooses between maximizing and minimizing. Usage: solutions=randomSampling(model,nSamples,replaceBoundsWithInf,supressErrors,runParallel,goodRxns,minFlux)
0001 function [solutions, goodRxns]=randomSampling(model,nSamples,replaceBoundsWithInf,supressErrors,runParallel,goodRxns,minFlux) 0002 % randomSampling 0003 % Performs random sampling of the solution space, as described in Bordel 0004 % et al. (2010) PLOS Compt Biol (doi:10.1371/journal.pcbi.1000859). 0005 % 0006 % Input: 0007 % model a model structure 0008 % nSamples the number of solutions to return 0009 % (opt, default 1000) 0010 % replaceBoundsWithInf replace the largest upper bounds with Inf and 0011 % the smallest lower bounds with -Inf. This is 0012 % needed in order to get solutions without loops 0013 % if your model has for example 1000/-1000 as 0014 % arbitary large bounds. If your model only has 0015 % "biologically relevant" bounds, then set this 0016 % to false (opt, default true) 0017 % supressErrors the program will halt if it has problems 0018 % finding non-zero solutions which are not 0019 % involved in loops. This could be because the 0020 % constraints on the model are too relaxed (such 0021 % as unlimited glucose uptake) or too strict 0022 % (such as too many and too narrow constraints) 0023 % (opt, default false) 0024 % runParallel speed up calculations by parallel processing. 0025 % Requires MATLAB Parallel Computing Toolbox. If 0026 % this is not installed, the calculations will 0027 % not be parallelized, regardless what is 0028 % indicated as runParallel. (opt, default true) 0029 % goodRxns double vector of indexes of those reactions 0030 % that are not involved in loops and can be used 0031 % as random objective functions, as generated by 0032 % a previous run of randomSampling on the same 0033 % model (opt, default empty) 0034 % minFlux determines if a second optimization should be 0035 % performed for each random sample, to minimize 0036 % the number of fluxes and thereby preventing 0037 % loops. Typically, loops are averaged out when a 0038 % large number of samples are taken, but this is 0039 % not always the case (opt, default false) 0040 % 0041 % 0042 % Output: 0043 % solutions matrix with the solutions 0044 % goodRxns double vector of indexes of those reactions 0045 % that are not involved in loops and can be used 0046 % as random objective functions 0047 % 0048 % The solutions are generated by maximizing (with random weights) for a 0049 % random set of three reactions. For reversible reactions it randomly 0050 % chooses between maximizing and minimizing. 0051 % 0052 % Usage: solutions=randomSampling(model,nSamples,replaceBoundsWithInf,supressErrors,runParallel,goodRxns,minFlux) 0053 0054 if nargin<2 | isempty(nSamples) 0055 nSamples=1000; 0056 end 0057 if nargin<3 | isempty(replaceBoundsWithInf) 0058 replaceBoundsWithInf=true; 0059 end 0060 if nargin<4 | isempty(supressErrors) 0061 supressErrors=false; 0062 end 0063 if nargin<5 | isempty(runParallel) 0064 runParallel=true; 0065 end 0066 if nargin<7 | isempty(minFlux) 0067 minFlux=false; 0068 end 0069 0070 [ps, oldPoolAutoCreateSetting] = parallelPoolRAVEN(runParallel); 0071 0072 nObjRxns=2; %Number of reactions in the objective function in each iteration 0073 0074 %Simplify the model to speed stuff up a little. Keep original mapping 0075 originalRxns=model.rxns; 0076 model=simplifyModel(model,false,false,true,true); 0077 0078 %Then change the bounds to +/- Inf. This is needed in order to not have 0079 %loops in the solutions 0080 if replaceBoundsWithInf==true 0081 model.ub(model.ub==max(model.ub))=Inf; 0082 if min(model.lb)<0 % Only negative lower bounds should be set to -Inf 0083 model.lb(model.lb==min(model.lb))=-Inf; 0084 end 0085 end 0086 0087 %Check that the model is feasible given the constraints 0088 [sol,~]=solveLP(model); 0089 if isempty(sol.x) 0090 EM='The model has no feasible solution, likely due to incompatible constraints'; 0091 dispEM(EM); 0092 elseif sol.f==0 && showProgress 0093 warning('The model objective function cannot reach a non-zero value. This might be intended, so randomSampling will continue, but this could indicate problems with your model') 0094 end 0095 0096 [~,hsSol]=solveLP(model); 0097 nRxns = numel(model.rxns); 0098 %Reactions which can be involved in loops should not be optimized for. 0099 %Check which reactions reach an arbitary high upper bound 0100 if nargin<6 || isempty(goodRxns) 0101 goodRxns = true(numel(model.rxns),numel(model.rxns)); 0102 goodRxns = num2cell(goodRxns,1); 0103 PB = ProgressBar2(nRxns,'Prepare goodRxns not involved in loops','cli'); 0104 parfor (i=1:nRxns) 0105 testModel=setParam(model,'eq',model.rxns(i),1000); 0106 sol=solveLP(testModel,0,[],hsSol); 0107 if ~isempty(sol.f) 0108 notGood = abs(sol.x)>999; 0109 goodRxns{i}(notGood)=false; 0110 else 0111 %If the reaction is reversible, also check in that direction 0112 if model.rev(i) 0113 testModel=setParam(model,'eq',model.rxns(i),-1000); 0114 sol=solveLP(testModel,0,[],hsSol); 0115 if ~isempty(sol.f) 0116 notGood = abs(sol.x)>999; 0117 goodRxns{i}(notGood)=false; 0118 end 0119 end 0120 end 0121 count(PB); 0122 end 0123 goodRxns = cell2mat(goodRxns); 0124 goodRxns = find(prod(goodRxns,2)); 0125 end 0126 0127 %Reserve space for a solution matrix 0128 sols = zeros(numel(model.rxns),nSamples); 0129 sols = num2cell(sols,1); 0130 0131 %Main loop 0132 PB = ProgressBar2(nSamples,'Performing random sampling','cli'); 0133 parfor i=1:nSamples 0134 badSolutions = 1; 0135 tmpModel = model; 0136 while lt(0,badSolutions) 0137 rxns = randsample(numel(goodRxns),nObjRxns); 0138 tmpModel.c = zeros(numel(tmpModel.rxns),1); 0139 multipliers = randsample([-1 1],nObjRxns,true); 0140 multipliers(tmpModel.rev(goodRxns(rxns))==0) = 1; 0141 tmpModel.c(goodRxns(rxns)) = rand(nObjRxns,1).*multipliers; 0142 if true(minFlux) 0143 sol=solveLP(tmpModel,1,[],hsSol); 0144 else 0145 sol=solveLP(tmpModel,0,[],hsSol); 0146 end 0147 if any(sol.x) && abs(sol.f)>10^-8 0148 sols{i} = sol.x; 0149 badSolutions = 0; 0150 else 0151 badSolutions = badSolutions+1; 0152 %If it only finds bad solutions then throw an error. 0153 if badSolutions == 100 && supressErrors==false 0154 error('The program is having problems finding non-zero solutions (ignoring reactions that might be involved in loops). Review the constraints on your model. Set supressErrors to true to ignore this error'); 0155 end 0156 end 0157 end 0158 count(PB); 0159 end 0160 0161 %Map to original model 0162 sols = cell2mat(sols); 0163 [~, I]=ismember(model.rxns,originalRxns); 0164 solutions=zeros(numel(originalRxns),nSamples); 0165 solutions(I,:)=sols; 0166 solutions=sparse(solutions); 0167 % Reset original Parallel setting 0168 ps.Pool.AutoCreate = oldPoolAutoCreateSetting; 0169 end 0170 0171 %To use instead of the normal Matlab randsample function. This is in order 0172 %to not depend on the Matlab statistical toolbox. 0173 function I=randsample(n,k,replacement) 0174 if nargin<3 0175 replacement=false; 0176 end 0177 %n can be a integer, which leads to I being sampled from 1:n, or it can be 0178 %a population to sample from. 0179 if numel(n)==1 && isnumeric(n) 0180 n=1:n; 0181 end 0182 %Loop and get random numbers until the list is unique. This is only a good 0183 %option is the number of samples is small compared to the population. There 0184 %are several checks that should be made here, for example regarding size 0185 %and that the number of samples is <=population size if replacement==false. 0186 %This is not the case in randomSampling, so such checks are ignored 0187 while true 0188 J=randi(numel(n),[k,1]); 0189 if replacement==true || numel(J)==numel(unique(J)) 0190 I=n(J); 0191 break; 0192 end 0193 end 0194 I=I(:); 0195 end