Home > core > randomSampling.m

randomSampling

PURPOSE ^

randomSampling

SYNOPSIS ^

function [solutions, goodRxns]=randomSampling(model,nSamples,replaceBoundsWithInf,supressErrors,runParallel,goodRxns,minFlux)

DESCRIPTION ^

 randomSampling
   Performs random sampling of the solution space, as described in Bordel
   et al. (2010) PLOS Compt Biol (doi:10.1371/journal.pcbi.1000859).

 Input:
   model                   a model structure
   nSamples                the number of solutions to return
                           (optional, default 1000)
   replaceBoundsWithInf    replace the largest upper bounds with Inf and
                           the smallest lower bounds with -Inf. This is
                           needed in order to get solutions without loops
                           if your model has for example 1000/-1000 as
                           arbitary large bounds. If your model only has
                           "biologically relevant" bounds, then set this
                           to false (optional, default true)
   supressErrors           the program will halt if it has problems
                           finding non-zero solutions which are not
                           involved in loops. This could be because the
                           constraints on the model are too relaxed (such
                           as unlimited glucose uptake) or too strict
                           (such as too many and too narrow constraints)
                           (optional, default false)
   runParallel             speed up calculations by parallel processing.
                           Requires MATLAB Parallel Computing Toolbox. If
                           this is not installed, the calculations will
                           not be parallelized, regardless what is
                           indicated as runParallel. (optional, default
                           true)
   goodRxns                double vector of indexes of those reactions
                           that are not involved in loops and can be used
                           as random objective functions, as generated by
                           a previous run of randomSampling on the same
                           model (optional, default empty)
   minFlux                 determines if a second optimization should be
                           performed for each random sample, to minimize
                           the number of fluxes and thereby preventing
                           loops. Typically, loops are averaged out when a
                           large number of samples are taken, but this is
                           not always the case (optional, default false)

 Output:
   solutions               matrix with the solutions
   goodRxns                double vector of indexes of those reactions
                           that are not involved in loops and can be used
                           as random objective functions

 The solutions are generated by maximizing (with random weights) for a
 random set of three reactions. For reversible reactions it randomly
 chooses between maximizing and minimizing.

 Usage: solutions = randomSampling(model, nSamples, replaceBoundsWithInf,...
                       supressErrors, runParallel, goodRxns, minFlux)

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SUBFUNCTIONS ^

SOURCE CODE ^

0001 function [solutions, goodRxns]=randomSampling(model,nSamples,replaceBoundsWithInf,supressErrors,runParallel,goodRxns,minFlux)
0002 % randomSampling
0003 %   Performs random sampling of the solution space, as described in Bordel
0004 %   et al. (2010) PLOS Compt Biol (doi:10.1371/journal.pcbi.1000859).
0005 %
0006 % Input:
0007 %   model                   a model structure
0008 %   nSamples                the number of solutions to return
0009 %                           (optional, default 1000)
0010 %   replaceBoundsWithInf    replace the largest upper bounds with Inf and
0011 %                           the smallest lower bounds with -Inf. This is
0012 %                           needed in order to get solutions without loops
0013 %                           if your model has for example 1000/-1000 as
0014 %                           arbitary large bounds. If your model only has
0015 %                           "biologically relevant" bounds, then set this
0016 %                           to false (optional, default true)
0017 %   supressErrors           the program will halt if it has problems
0018 %                           finding non-zero solutions which are not
0019 %                           involved in loops. This could be because the
0020 %                           constraints on the model are too relaxed (such
0021 %                           as unlimited glucose uptake) or too strict
0022 %                           (such as too many and too narrow constraints)
0023 %                           (optional, default false)
0024 %   runParallel             speed up calculations by parallel processing.
0025 %                           Requires MATLAB Parallel Computing Toolbox. If
0026 %                           this is not installed, the calculations will
0027 %                           not be parallelized, regardless what is
0028 %                           indicated as runParallel. (optional, default
0029 %                           true)
0030 %   goodRxns                double vector of indexes of those reactions
0031 %                           that are not involved in loops and can be used
0032 %                           as random objective functions, as generated by
0033 %                           a previous run of randomSampling on the same
0034 %                           model (optional, default empty)
0035 %   minFlux                 determines if a second optimization should be
0036 %                           performed for each random sample, to minimize
0037 %                           the number of fluxes and thereby preventing
0038 %                           loops. Typically, loops are averaged out when a
0039 %                           large number of samples are taken, but this is
0040 %                           not always the case (optional, default false)
0041 %
0042 % Output:
0043 %   solutions               matrix with the solutions
0044 %   goodRxns                double vector of indexes of those reactions
0045 %                           that are not involved in loops and can be used
0046 %                           as random objective functions
0047 %
0048 % The solutions are generated by maximizing (with random weights) for a
0049 % random set of three reactions. For reversible reactions it randomly
0050 % chooses between maximizing and minimizing.
0051 %
0052 % Usage: solutions = randomSampling(model, nSamples, replaceBoundsWithInf,...
0053 %                       supressErrors, runParallel, goodRxns, minFlux)
0054 
0055 if nargin<2 | isempty(nSamples)
0056     nSamples=1000;
0057 end
0058 if nargin<3 | isempty(replaceBoundsWithInf)
0059     replaceBoundsWithInf=true;
0060 end
0061 if nargin<4 | isempty(supressErrors)
0062     supressErrors=false;
0063 end
0064 if nargin<5 | isempty(runParallel)
0065     runParallel=true;
0066 end
0067 if nargin<7 | isempty(minFlux)
0068     minFlux=false;
0069 end
0070 
0071 [ps, oldPoolAutoCreateSetting] = parallelPoolRAVEN(runParallel);
0072 
0073 nObjRxns=2; %Number of reactions in the objective function in each iteration
0074 
0075 %Simplify the model to speed stuff up a little. Keep original mapping
0076 originalRxns=model.rxns;
0077 model=simplifyModel(model,false,false,true,true);
0078 
0079 %Then change the bounds to +/- Inf. This is needed in order to not have
0080 %loops in the solutions
0081 if replaceBoundsWithInf==true
0082     model.ub(model.ub==max(model.ub))=Inf;
0083     if min(model.lb)<0 % Only negative lower bounds should be set to -Inf
0084         model.lb(model.lb==min(model.lb))=-Inf;
0085     end
0086 end
0087 
0088 %Check that the model is feasible given the constraints
0089 [sol,~]=solveLP(model);
0090 if isempty(sol.x)
0091     EM='The model has no feasible solution, likely due to incompatible constraints';
0092     dispEM(EM);
0093 elseif sol.f==0 && showProgress
0094     warning('The model objective function cannot reach a non-zero value. This might be intended, so randomSampling will continue, but this could indicate problems with your model')
0095 end
0096 
0097 [~,hsSol]=solveLP(model);
0098 nRxns = numel(model.rxns);
0099 %Reactions which can be involved in loops should not be optimized for.
0100 %Check which reactions reach an arbitary high upper bound
0101 if nargin<6 || isempty(goodRxns)
0102     goodRxns = true(numel(model.rxns),numel(model.rxns));
0103     goodRxns = num2cell(goodRxns,1);
0104     PB = ProgressBar2(nRxns,'Prepare goodRxns not involved in loops','cli');
0105     parfor (i=1:nRxns)
0106         testModel=setParam(model,'eq',model.rxns(i),1000);
0107         sol=solveLP(testModel,0,[],hsSol);
0108         if ~isempty(sol.f)
0109             notGood = abs(sol.x)>999;
0110             goodRxns{i}(notGood)=false;
0111         else
0112             %If the reaction is reversible, also check in that direction
0113             if model.rev(i)
0114                 testModel=setParam(model,'eq',model.rxns(i),-1000);
0115                 sol=solveLP(testModel,0,[],hsSol);
0116                 if ~isempty(sol.f)
0117                     notGood = abs(sol.x)>999;
0118                     goodRxns{i}(notGood)=false;
0119                 end
0120             end
0121         end
0122         count(PB);
0123     end
0124     goodRxns = cell2mat(goodRxns);
0125     goodRxns = find(prod(goodRxns,2));
0126 end
0127 
0128 %Reserve space for a solution matrix
0129 sols = zeros(numel(model.rxns),nSamples);
0130 sols = num2cell(sols,1);
0131 
0132 %Main loop
0133 if nSamples > 0
0134 
0135     PB = ProgressBar2(nSamples,'Performing random sampling','cli');
0136     parfor i=1:nSamples
0137         badSolutions = 1;
0138         tmpModel = model;
0139         while lt(0,badSolutions)
0140             rxns = randsample(numel(goodRxns),nObjRxns);
0141             tmpModel.c = zeros(numel(tmpModel.rxns),1);
0142             multipliers = randsample([-1 1],nObjRxns,true);
0143             multipliers(tmpModel.rev(goodRxns(rxns))==0) = 1;
0144             tmpModel.c(goodRxns(rxns)) = rand(nObjRxns,1).*multipliers;
0145             if true(minFlux)
0146                 sol=solveLP(tmpModel,1,[],hsSol);
0147             else
0148                 sol=solveLP(tmpModel,0,[],hsSol);
0149             end
0150             if any(sol.x) && abs(sol.f)>10^-8
0151                 sols{i} = sol.x;
0152                 badSolutions = 0;
0153             else
0154                 badSolutions = badSolutions+1;
0155                 %If it only finds bad solutions then throw an error.
0156                 if badSolutions == 100 && supressErrors==false
0157                     error('The program is having problems finding non-zero solutions (ignoring reactions that might be involved in loops). Review the constraints on your model. Set supressErrors to true to ignore this error');
0158                 end
0159             end
0160         end
0161         count(PB);
0162     end
0163 
0164     %Map to original model
0165     sols = cell2mat(sols);
0166     [~, I]=ismember(model.rxns,originalRxns);
0167     solutions=zeros(numel(originalRxns),nSamples);
0168     solutions(I,:)=sols;
0169 else
0170     solutions=[];
0171 end
0172 % Reset original Parallel setting
0173 ps.Pool.AutoCreate = oldPoolAutoCreateSetting;
0174 end
0175 
0176 %To use instead of the normal Matlab randsample function. This is in order
0177 %to not depend on the Matlab statistical toolbox.
0178 function I=randsample(n,k,replacement)
0179 if nargin<3
0180     replacement=false;
0181 end
0182 %n can be a integer, which leads to I being sampled from 1:n, or it can be
0183 %a population to sample from.
0184 if numel(n)==1 && isnumeric(n)
0185     n=1:n;
0186 end
0187 %Loop and get random numbers until the list is unique. This is only a good
0188 %option is the number of samples is small compared to the population. There
0189 %are several checks that should be made here, for example regarding size
0190 %and that the number of samples is <=population size if replacement==false.
0191 %This is not the case in randomSampling, so such checks are ignored
0192 while true
0193     J=randi(numel(n),[k,1]);
0194     if replacement==true || numel(J)==numel(unique(J))
0195         I=n(J);
0196         break;
0197     end
0198 end
0199 I=I(:);
0200 end

Generated by m2html © 2005