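Help request: looking for anyone experienced with CQN (Conditional Quantile Normalization)
CQN (conditional quantile normalization) for RNA-Seq data
I'll attach the CQN help content first and then get to my question. If you have used CQN (conditional quantile normalization), please share your thoughts in the comments. Thanks in advance to all readers! (If anything in this post is stated incorrectly, please point it out so we can all improve together.) Anyone who can clear this up will be thanked.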
Examples
library(cqn)
data(montgomery.subset)
data(sizeFactors.subset)
data(uCovar)
cqn.subset <- cqn(montgomery.subset,                 # expression count matrix
                  lengths = uCovar$length,           # length of each gene
                  x = uCovar$gccontent,              # GC content of each gene (column is named gccontent)
                  sizeFactors = sizeFactors.subset,  # library sizes, optional
                  verbose = TRUE)
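This is the example from the R help page. My question: where does this GC content come from? I'm stuck on this and would appreciate help with the background. Anyone who can clear this up will be thanked.
The help documentation does not explain it either.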
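One possible reading, offered as an assumption rather than something the help page states: the GC content would be the fraction of G and C bases in each gene's sequence, computed by the user from an annotation or reference sequence before calling cqn. A minimal base-R sketch under that assumption follows; `gc_content` and the toy sequences are hypothetical and are not part of the cqn package.

```r
# Fraction of G and C bases in one DNA sequence string.
# Hypothetical helper for illustration; not provided by cqn.
gc_content <- function(seq) {
  bases <- strsplit(toupper(seq), "")[[1]]
  # Keep only unambiguous bases so N and other symbols do not dilute the ratio
  acgt <- bases[bases %in% c("A", "C", "G", "T")]
  if (length(acgt) == 0) return(NA_real_)
  sum(acgt %in% c("G", "C")) / length(acgt)
}

# Made-up sequences standing in for real per-gene sequences
toy_genes <- c(geneA = "ATGCGC", geneB = "ATATAT", geneC = "GGCCNN")

gc   <- vapply(toy_genes, gc_content, numeric(1))  # one value per gene, in [0, 1]
lens <- nchar(toy_genes)                           # sequence length of each gene
```

Under this assumption, `gc` and `lens` play the roles of `x` and `lengths`, with one entry per row of the count matrix. For real data the sequences would have to come from a genome or transcript annotation; Bioconductor's Biostrings package has `letterFrequency()` for base counting at that scale.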

Description
This function implements CQN (conditional quantile normalization) for RNA-Seq data.
Usage
cqn(counts, x, lengths, sizeFactors = NULL, subindex = NULL, tau = 0.5, sqn = TRUE,
    lengthMethod = c("smooth", "fixed"), verbose = FALSE)
## S3 method for class 'cqn'
print(x, ...)
Arguments
counts
An object that can be coerced to a matrix of region by sample counts. Ought to have integer values.
x
This is a covariate whose systematic influence on the counts will be removed. Typically the GC content. Has to have the same length as the number of rows of counts.
lengths
The lengths (in bp) of the regions in counts. Has to have the same length as the number of rows of counts.
sizeFactors
An optional vector of sizeFactors, i.e. the sequencing effort of the various samples. If NULL this is calculated as the column sums of counts.
subindex
An optional vector of indices into the rows of counts. If not given, this becomes the indices of genes with row means of counts greater than 50.
tau
This argument is passed to rq, it indicates what quantile is being fit. The default should only be changed by expert users.
sqn
This argument indicates whether the residuals from the systematic fit are (subset) quantile normalized. The default should only be changed by expert users.
lengthMethod
Should length enter the model as a smooth function or not.
verbose
Is the function verbose?
...
Not used.
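To make the two NULL defaults in the argument list concrete, here is a small base-R sketch that follows the wording of the help text (column sums for sizeFactors, row means above 50 for subindex). The count matrix is made up, and this mirrors the description only; it is not taken from the cqn source.

```r
# Toy count matrix: 4 regions (rows) by 3 samples (columns); values are made up
counts <- matrix(c(120, 80, 100,
                     3,  5,   1,
                    60, 70,  65,
                    10, 90,  20),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:4), paste0("s", 1:3)))

# Default described for sizeFactors = NULL: column sums of counts
size_factors <- colSums(counts)

# Default described for subindex = NULL: rows whose mean count exceeds 50
sub_index <- which(rowMeans(counts) > 50)
```

Here gene2 and gene4 have row means of 3 and 40, so only gene1 and gene3 would be used for the fit under the described default.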
Details
These functions implement the CQN (conditional quantile normalization) for RNA-Seq data. The functions remove a single systematic effect, contained in the argument x, which will typically be GC content. The effect of lengths will either be modelled as a smooth function (which we recommend), if you are using lengthMethod = "smooth" or as an offset (equivalent to modelling using RPKMs), if you are using lengthMethod = "fixed". Length can be completely removed from the model by having lengthMethod = "fixed" and setting all lengths to 1000.
Final corrected values are equal to value$y + value$offset.
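The remark about setting all lengths to 1000 can be checked with plain RPKM arithmetic: RPKM divides by length in kilobases, so a length of 1000 bp contributes a factor of 1 and drops out on the log scale. The sketch below shows only that arithmetic; `log2_rpkm` is a hypothetical helper, the numbers are made up, and none of this is cqn's internal code.

```r
# log2 of reads per kilobase per million.
# Hypothetical helper for illustration; not cqn's internal code.
log2_rpkm <- function(counts, lengths, size_factors) {
  per_million  <- sweep(counts, 2, size_factors / 1e6, "/")  # scale columns by depth
  per_kilobase <- sweep(per_million, 1, lengths / 1e3, "/")  # scale rows by length
  log2(per_kilobase)
}

# Made-up counts: 2 genes by 2 samples
counts <- matrix(c(200, 400,
                    50, 100),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("gene1", "gene2"), c("s1", "s2")))
size_factors <- c(s1 = 1e6, s2 = 2e6)

with_length    <- log2_rpkm(counts, lengths = c(2000, 500), size_factors)
without_length <- log2_rpkm(counts, lengths = c(1000, 1000), size_factors)

# Reads per million with no length term at all
log2_rpm <- log2(sweep(counts, 2, size_factors / 1e6, "/"))
```

With all lengths at 1000, `without_length` matches `log2_rpm`, which is the sense in which length is removed. For a fitted object such as cqn.subset from the Examples, the help's formula for the final values would read cqn.subset$y + cqn.subset$offset.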
Value
A list with the following components
counts
The value of argument counts.
x
The value of argument x.
lengths
The value of argument lengths.
sizeFactors
The value of argument sizeFactors. In case the argument was NULL, this is the value used internally.
subindex
The value of argument subindex. In case the argument was NULL, this is the value used internally.
y
The dependent value used in the systematic effect fit. Equal to log2 transformed reads per million.
offset
The estimated offset.
offset0
A single number used internally for identifiability.
glm.offset
An offset useful for supplying to a GLM type model function. It is on the natural log scale and includes correcting for sizeFactors.
func1
The estimated effect of function 1 (argument x). This is a matrix of function values on a grid. Columns are samples and rows are grid points.
grid1
The grid points on which function 1 (argument x) was evaluated.
knots1
The knots used for function 1 (argument x).
func2
The estimated effect of function 2 (lengths). This is a matrix of function values on a grid. Columns are samples and rows are grid points.
grid2
The grid points on which function 2 (lengths) was evaluated.
knots2
The knots used for function 2 (lengths).
call
The call.
Note
Internally, the function uses a custom implementation of subset quantile normalization, contained in the (not exported) SQN2 function.
Author(s)
Kasper Daniel Hansen, Zhijin Wu
References
KD Hansen, RA Irizarry, and Z Wu, Removing technical variability in RNA-seq data using conditional quantile normalization. Biostatistics 2012 vol. 13(2) pp. 204-216.
See Also
The package vignette.
Posted by: 天馬行空的坦克兵
2021-06-17