Filters genes using the expression threshold, then uses the coefficients of the highest-variance PCA components to determine the variability of each gene and select the most variable ones.

SelectData(M, gene_expression_threshold, n_features)

Arguments

M

a matrix of expression values for each gene (rows) and cell (columns)

gene_expression_threshold

given n cells and gene_expression_threshold = m, genes expressed in more than n - m cells or in fewer than m cells are not considered

n_features

number of genes to retrieve

Value

a table of features (rows) and samples (columns)
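
Examples

The following is a minimal sketch of the two steps described above, written in plain Python for illustration. It is not the package's implementation. Several points are assumptions: a gene counts as "expressed" in a cell when its value is greater than zero, only the first principal component is used, and a gene's variability is taken to be the absolute value of its coefficient on that component. The helper names (filter_by_expression, first_pc_loadings, select_data_sketch) are hypothetical.

```python
def filter_by_expression(M, gene_expression_threshold):
    """Return indices of genes expressed in at least m and at most n - m cells.

    Assumption: a gene is 'expressed' in a cell when its value is > 0.
    """
    n_cells = len(M[0])
    m = gene_expression_threshold
    counts = [sum(1 for value in row if value > 0) for row in M]
    return [i for i, count in enumerate(counts) if m <= count <= n_cells - m]


def first_pc_loadings(rows, n_iter=200):
    """Approximate the per-gene coefficients of the first principal component.

    Uses power iteration on the gene-by-gene scatter matrix of the
    row-centered data (genes are the variables, cells the observations).
    """
    centered = []
    for row in rows:
        mean = sum(row) / len(row)
        centered.append([value - mean for value in row])
    g = len(centered)
    scatter = [[sum(a * b for a, b in zip(centered[i], centered[j]))
                for j in range(g)] for i in range(g)]
    v = [1.0 / g ** 0.5] * g  # deterministic starting vector
    for _ in range(n_iter):
        w = [sum(scatter[i][j] * v[j] for j in range(g)) for i in range(g)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:  # no variance left to explain
            break
        v = [x / norm for x in w]
    return v


def select_data_sketch(M, gene_expression_threshold, n_features):
    """Filter genes, rank them by |first-PC coefficient|, keep the top n_features.

    Returns the selected gene indices and the corresponding rows of M.
    """
    kept = filter_by_expression(M, gene_expression_threshold)
    if not kept:
        return [], []
    loadings = first_pc_loadings([M[i] for i in kept])
    ranked = sorted(zip(kept, loadings), key=lambda p: abs(p[1]), reverse=True)
    chosen = sorted(i for i, _ in ranked[:n_features])
    return chosen, [M[i] for i in chosen]


# Toy input: 5 genes (rows) by 4 cells (columns).
M = [
    [0, 0, 0, 0],  # expressed in 0 cells: fewer than m
    [5, 5, 5, 5],  # expressed in all 4 cells: more than n - m
    [0, 9, 0, 9],  # expressed in 2 cells, large spread
    [0, 1, 0, 1],  # expressed in 2 cells, small spread
    [3, 0, 0, 0],  # expressed in 1 cell
]
kept = filter_by_expression(M, 1)
chosen, table = select_data_sketch(M, 1, 1)
```

With gene_expression_threshold = 1 and four cells, the filter keeps only genes expressed in one to three cells. The ranking step then favours the gene whose values spread most along the dominant axis of variation. The actual function may combine several components or weight them differently.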