analyzeSampling Compares the significance of change in flux between two conditions with the significance of change in gene expression Tex a vector of t-scores for the change in gene expression for each reaction. This score could be the Student t between the two conditions, or you can calculate it from a p-value (by computing the inverse of the so called error function). If you choose the second alternative you should be aware that the transcripts that increased in expression level should have positive values and those who decreased in expression level should have negative values (the p-values only tell you if the fluxes changed or not but not in which direction) df the degrees of freedom in the t-test solutionsA random solutions for the reference condition (as generated by randomSampling) solutionsB random solutions for the test condition (as generated by randomSampling) printResults prints the most significant reactions in each category (optional, default false) scores a Nx3 column matrix with the probabilities of a reaction: 1) changing both in flux and expression in the same direction 2) changing in expression but not in flux 3) changing in flux but not in expression or changing in opposed directions in flux and expression. Usage: scores=analyzeSampling(Tex, df, solutionsA, solutionsB, printResults)
0001 function scores=analyzeSampling(Tex, df, solutionsA, solutionsB, printResults) 0002 % analyzeSampling 0003 % Compares the significance of change in flux between two conditions with 0004 % the significance of change in gene expression 0005 % 0006 % Tex a vector of t-scores for the change in gene expression 0007 % for each reaction. This score could be the Student t 0008 % between the two conditions, or you can calculate it from 0009 % a p-value (by computing the inverse of the so called error 0010 % function). If you choose the second alternative you should 0011 % be aware that the transcripts that increased in expression 0012 % level should have positive values and those who decreased 0013 % in expression level should have negative values (the 0014 % p-values only tell you if the fluxes changed or not but 0015 % not in which direction) 0016 % df the degrees of freedom in the t-test 0017 % solutionsA random solutions for the reference condition (as 0018 % generated by randomSampling) 0019 % solutionsB random solutions for the test condition (as generated 0020 % by randomSampling) 0021 % printResults prints the most significant reactions in each category 0022 % (optional, default false) 0023 % 0024 % scores a Nx3 column matrix with the probabilities of a reaction: 0025 % 1) changing both in flux and expression in the same direction 0026 % 2) changing in expression but not in flux 0027 % 3) changing in flux but not in expression or changing 0028 % in opposed directions in flux and expression. 0029 % 0030 % Usage: scores=analyzeSampling(Tex, df, solutionsA, solutionsB, printResults) 0031 0032 if nargin<5 0033 printResults=false; 0034 end 0035 0036 nRxns=numel(Tex); 0037 pM=zeros(nRxns,1); 0038 pH=zeros(nRxns,1); 0039 pR=zeros(nRxns,1); 0040 0041 %Check that the number of reactions is the same in both expression and flux 0042 if nRxns~=size(solutionsA,1) 0043 EM='The number of reactions must be the same in Tex as in solutionsA'; 0044 dispEM(EM); 0045 end 0046 0047 %Get the Z-score and mean for the solutions 0048 mA=mean(solutionsA,2); 0049 mB=mean(solutionsB,2); 0050 Zf=getFluxZ(solutionsA, solutionsB); 0051 0052 %Clear up the tex if there are elements that are NaN or +/- Inf. 0053 I=isnan(Tex) | isinf(Tex); 0054 if any(I) 0055 EM='There are t-scores that are NaN or +/- Inf. These values are changed to 0.0'; 0056 dispEM(EM,false); 0057 end 0058 Tex(I)=0; 0059 0060 for i=1:nRxns 0061 %Check how the flux has changed. The means are needed because to 0062 %differentiate between a positive flux changing to a smaller flux and a 0063 %negative flux changing to a more negative flux (which is a larger 0064 %flux) 0065 I=mB(i)/mA(i); 0066 if I<0 0067 pM(i)=erf(abs(Zf(i))); 0068 pH(i)=(1-pM(i))*(2*tcdf(abs(Tex(i)),df)-1); 0069 pR(i)=0; 0070 else 0071 if mB(i)<0 0072 Zf(i)=Zf(i)*-1; 0073 end 0074 end 0075 0076 I=Zf(i)/Tex(i); 0077 if I<0 0078 pM(i)=erf(abs(Zf(i))); 0079 pH(i)=(1-pM(i))*(2*tcdf(abs(Tex(i)),df)-1); 0080 pR(i)=0; 0081 else 0082 pR(i)=erf(abs(Zf(i)))*(2*tcdf(abs(Tex(i)),df)-1); 0083 pH(i)=(2*tcdf(abs(Tex(i)),df)-1)*(1-erf(abs(Zf(i)))); 0084 pM(i)=erf(abs(Zf(i)))*(1-(2*tcdf(abs(Tex(i)),df)-1)); 0085 end 0086 end 0087 0088 scores=[pR pH pM]; 0089 0090 if printResults==true 0091 fprintf('TOP SCORING REACTIONS\n\n'); 0092 %The top 10 hits in the first category 0093 [I, J]=sort(pR,'descend'); 0094 fprintf('Reactions which change both in flux and expression in the same direction\nReaction\tProbability\n'); 0095 for i=1:10 0096 fprintf([num2str(J(i)) '\t' num2str(I(i)) '\n']); 0097 end 0098 0099 %The top 10 hits in the first category 0100 [I, J]=sort(pH,'descend'); 0101 fprintf('\nReactions which change in expression but not in flux\nReaction\tProbability\n'); 0102 for i=1:10 0103 fprintf([num2str(J(i)) '\t' num2str(I(i)) '\n']); 0104 end 0105 0106 %The top 10 hits in the first category 0107 [I, J]=sort(pM,'descend'); 0108 fprintf('\nReactions which change in flux but not in expression, or in opposed directions in flux and expression\nReaction\tProbability\n'); 0109 for i=1:10 0110 fprintf([num2str(J(i)) '\t' num2str(I(i)) '\n']); 0111 end 0112 end 0113 end 0114 0115 function p = tcdf(t,v,uflag) 0116 % TCDF Student's T cumulative distribution function (cdf). 0117 % 0118 % P = TCDF(X,V) computes the cdf for Student's T distribution 0119 % with V degrees of freedom, at the values in X. V must be a 0120 % scalar or have the same size as T. 0121 % 0122 % P = TCDF(X,V,'upper') computes it for the upper tail instead 0123 % of the lower tail. 0124 % 0125 % This is an alternative to the TCDF function that is implemented 0126 % in the Matlab statistics toolbox. This version originates from 0127 % http://www.statsci.org/matlab/statbox.html and originally was called TP. 0128 % It has been renamed to TCDF for drop-in compatibility with the Matlab 0129 % version. 0130 % 0131 % Gordon Smyth, University of Queensland, gks@maths.uq.edu.au 0132 % 3 Apr 97 0133 % 0134 % NaN compatible - Markus Bauer and Eric Maris, FCDC 0135 % 27 Jan 2005 0136 % 0137 % fixed bug concerning NaN compatibility 0138 % 21 Aug 2006, Markus Siegel 0139 % 0140 % added support for upper tail, see http://bugzilla.fieldtriptoolbox.org/show_bug.cgi?id=3045 0141 % 13 Jan 2016, Robert Oostenveld 0142 % 0143 % taken from https://github.com/fieldtrip/fieldtrip/blob/master/external/stats/tcdf.m 0144 % 05 May 2022 0145 0146 if v <= 0, error('Degrees of freedom must be positive.'); end 0147 0148 % resize v if necessary 0149 if all(size(v)==1) 0150 v = ones(size(t))*v; 0151 end 0152 0153 %check for NaN's - don't do calculations on them, give those out as NaNs 0154 if any( not(isfinite(t(:))) | not(isfinite(v(:))) ) 0155 sel = find(isfinite(t) & isfinite(v)); 0156 x=nan(size(t)); 0157 p=nan(size(t)); 0158 x(sel) = t(sel).^2 ./ (v(sel) + t(sel).^2) ; 0159 p(sel) = 0.5 .* ( 1 + sign(t(sel)) .* betainc( x(sel), 0.5, 0.5*v(sel) ) ); 0160 else 0161 x = t.^2 ./ (v + t.^2) ; 0162 p = 0.5 .* ( 1 + sign(t) .* betainc( x, 0.5, 0.5*v ) ); 0163 end 0164 0165 if nargin==3 0166 if strcmp(uflag, 'upper') 0167 % compute it for the upper tail instead of the lower tail 0168 p = 1-p; 0169 else 0170 error('incorrect specification of uflag'); 0171 end 0172 end 0173 end