Seurat Function Quick Reference

30+ core functions organized by category: a reference for single-cell analysis

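⚠️ Disclaimer: This content is for medical education and reference only and is not a basis for clinical diagnosis. Actual clinical decisions should take into account the patient's specific circumstances and multidisciplinary input.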

1 Quality Control (QC) Functions

CreateSeuratObject()

Basic

Creates a Seurat object; the starting point of every analysis

CreateSeuratObject(
  counts,
  project = "Project",
  min.cells = 3,
  min.features = 200
)

PercentageFeatureSet()

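Common

Computes the percentage of counts that come from a given feature set (for example, mitochondrial genes)

# Percentage of counts from mitochondrial genes (human symbols; mouse genes use "^mt-")
obj[["percent.mt"]] <- PercentageFeatureSet(obj, pattern = "^MT-")

# Percentage of counts from ribosomal protein genes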
obj[["percent.rb"]] <- PercentageFeatureSet(obj, pattern = "^RP[SL]")
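
To make the computed quantity concrete, here is a minimal base-R sketch of the arithmetic on a toy matrix. The matrix `toy_counts` and the helper `percent_feature_set` are hypothetical names introduced for illustration; this is not Seurat's implementation, only the ratio it reports: counts from matching genes divided by the cell's total counts, times 100.

```r
# Toy genes-by-cells count matrix (hypothetical data for illustration).
toy_counts <- matrix(
  c(10,  0,
     5,  5,
    85, 15),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("MT-CO1", "MT-ND1", "ACTB"), c("cell1", "cell2"))
)

# Percentage of each cell's counts that come from genes matching `pattern`.
percent_feature_set <- function(counts, pattern) {
  hits <- grepl(pattern, rownames(counts))
  colSums(counts[hits, , drop = FALSE]) / colSums(counts) * 100
}

percent_mt <- percent_feature_set(toy_counts, "^MT-")
print(percent_mt)
```

In this toy data, cell1 has 15 of 100 counts in MT- genes and cell2 has 5 of 20, so the second cell would fail a `percent.mt < 20` filter.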

subset()

Common

Filters out low-quality cells

# The filter is an unquoted expression evaluated against the cell metadata
obj <- subset(obj,
  subset = nFeature_RNA > 200 & nFeature_RNA < 5000 & percent.mt < 20
)
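
The filter logic can be previewed on plain metadata before touching the object. The sketch below uses a hypothetical data frame `meta` standing in for the per-cell metadata; the thresholds are the ones shown above and should be tuned per dataset.

```r
# Hypothetical per-cell QC metadata (made-up values for illustration).
meta <- data.frame(
  nFeature_RNA = c(150, 1200, 6000, 2500),
  percent.mt   = c(5, 8, 3, 35),
  row.names    = c("cell1", "cell2", "cell3", "cell4")
)

# Same logical condition as the subset() call, evaluated on the data frame.
keep <- with(meta, nFeature_RNA > 200 & nFeature_RNA < 5000 & percent.mt < 20)

kept_cells <- rownames(meta)[keep]
print(table(keep))   # how many cells pass or fail
print(kept_cells)
```

Counting how many cells each threshold removes before filtering is a cheap way to catch a cutoff that is too aggressive.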

2 Normalization Functions

NormalizeData()

Basic

Log-normalization

obj <- NormalizeData(
  obj,
  normalization.method = "LogNormalize",
  scale.factor = 10000
)
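
As a rough illustration of what "LogNormalize" computes, the base-R sketch below divides each cell's counts by that cell's total, multiplies by the scale factor, and applies `log1p`. The matrix `toy_counts` and the helper `log_normalize` are hypothetical; Seurat operates on sparse matrices internally, so treat this as the formula only.

```r
# Toy genes-by-cells counts (hypothetical). Columns are cells.
toy_counts <- matrix(
  c(1, 3, 0,    # cell1
    4, 9, 2),   # cell2
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("cell1", "cell2"))
)

# log1p(count / total counts in the cell * scale.factor)
log_normalize <- function(counts, scale.factor = 10000) {
  per_cell_fraction <- sweep(counts, 2, colSums(counts), "/")
  log1p(per_cell_fraction * scale.factor)
}

norm <- log_normalize(toy_counts)
print(round(norm, 3))
```

Undoing the log with `expm1` gives columns that each sum to the scale factor, which is why cells with different sequencing depth become comparable.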

SCTransform()

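Recommended

Variance-stabilizing transformation, the method recommended for Seurat v5; a single call replaces NormalizeData, FindVariableFeatures, and ScaleData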

obj <- SCTransform(
  obj,
  vars.to.regress = "percent.mt",
  verbose = FALSE
)

FindVariableFeatures()

Common

Identifies highly variable genes (HVGs)

obj <- FindVariableFeatures(
  obj,
  selection.method = "vst",
  nfeatures = 2000
)

ScaleData()

Common

Scales the data (each gene to mean 0 and variance 1)

obj <- ScaleData(
  obj,
  features = rownames(obj),
  vars.to.regress = "percent.mt"
)
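
The "mean 0, variance 1" scaling is a per-gene z-score. The sketch below shows it in base R on a hypothetical matrix `mat`; the helper `scale_genes` is an illustrative name, it ignores `vars.to.regress`, and the clipping at 10 mirrors the default of ScaleData's `scale.max` argument.

```r
# Toy normalized expression, genes in rows and cells in columns (hypothetical).
mat <- matrix(
  c( 1,  2,  3,  4,
    10, 20, 30, 40),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), paste0("cell", 1:4))
)

# Center and scale each gene (row), then clip large values.
scale_genes <- function(mat, scale.max = 10) {
  z <- (mat - rowMeans(mat)) / apply(mat, 1, sd)
  z[z > scale.max] <- scale.max
  z
}

z <- scale_genes(mat)
print(round(z, 3))
```

Both genes end up on the same scale even though geneB's raw values are ten times larger, which keeps highly expressed genes from dominating the PCA.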

3 Dimensionality Reduction Functions

RunPCA()

Common

Principal component analysis (linear dimensionality reduction)

obj <- RunPCA(
  obj,
  features = VariableFeatures(obj),
  npcs = 50
)

RunUMAP()

Common

UMAP nonlinear dimensionality reduction

obj <- RunUMAP(
  obj,
  dims = 1:20,
  reduction = "pca"
)

RunTSNE()

Optional

t-SNE nonlinear dimensionality reduction

obj <- RunTSNE(
  obj,
  dims = 1:20,
  reduction = "pca"
)

ElbowPlot()

Visualization

Elbow plot; helps choose the number of PCs

ElbowPlot(obj, ndims = 50)
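
The elbow plot shows the standard deviation of each principal component, and the "elbow" is where additional PCs stop adding much. The base-R sketch below reproduces that idea with `prcomp` on simulated data; the simulated matrix and all variable names are hypothetical, and Seurat's own PCA routine is not used here.

```r
set.seed(1)
n_cells <- 100
n_genes <- 20

# Three hidden factors drive most of the variation; the rest is noise.
factors  <- matrix(rnorm(n_cells * 3), ncol = 3)
loadings <- matrix(rnorm(3 * n_genes, sd = 3), nrow = 3)
expr <- factors %*% loadings + matrix(rnorm(n_cells * n_genes), ncol = n_genes)

pca <- prcomp(expr, center = TRUE, scale. = TRUE)

pc_sd <- pca$sdev                        # the values an elbow plot displays
var_explained <- pc_sd^2 / sum(pc_sd^2)  # fraction of variance per PC

print(round(cumsum(var_explained)[1:6], 3))
```

With three simulated factors, the cumulative variance should level off after about the third PC; on real data the drop is usually less sharp, so the choice of `dims` remains a judgment call.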

4 Clustering Functions

FindNeighbors()

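Common

Builds the k-nearest-neighbor (KNN) graph and the shared-nearest-neighbor (SNN) graph used for clustering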

obj <- FindNeighbors(
  obj,
  dims = 1:20,
  reduction = "pca"
)
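
For intuition about the neighbor graph, the following base-R sketch builds a KNN adjacency matrix from a small embedding and weights each pair of cells by the Jaccard overlap of their neighborhoods. The embedding `emb`, the choice `k = 3`, and all variable names are assumptions for illustration; Seurat uses faster neighbor search and also prunes weak edges, neither of which is shown.

```r
set.seed(42)

# Hypothetical 2-D embedding: two well-separated groups of five cells.
emb <- rbind(
  matrix(rnorm(10, mean = 0, sd = 0.1), ncol = 2),
  matrix(rnorm(10, mean = 5, sd = 0.1), ncol = 2)
)
rownames(emb) <- paste0("cell", 1:10)

k <- 3
n <- nrow(emb)
dist_mat <- as.matrix(dist(emb))

# KNN adjacency: row i marks the k closest cells to cell i (itself included).
adj <- matrix(0, n, n, dimnames = list(rownames(emb), rownames(emb)))
for (i in seq_len(n)) {
  neighbours <- order(dist_mat[i, ])[seq_len(k)]
  adj[i, neighbours] <- 1
}

# Jaccard overlap of neighbourhoods: |A and B| / |A or B|, with |A| = |B| = k.
shared <- adj %*% t(adj)
snn <- shared / (2 * k - shared)

print(round(snn[1:5, 1:5], 2))
```

Cells in different groups share no neighbors and get weight 0, which is what lets graph-based clustering separate them.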

FindClusters()

Common

Louvain/Leiden clustering

obj <- FindClusters(
  obj,
  resolution = 0.5,
  algorithm = 1  # 1 = Louvain, 2 = Louvain with multilevel refinement, 3 = SLM, 4 = Leiden
)

FindAllMarkers()

Common

Finds marker genes for every cluster

markers <- FindAllMarkers(
  obj,
  only.pos = TRUE,
  min.pct = 0.25,
  logfc.threshold = 0.25
)
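
A common next step is to keep only the top few markers per cluster. The sketch below does this in base R on a mock table that uses the `cluster`, `gene`, `avg_log2FC`, and `p_val_adj` columns found in FindAllMarkers output; the values and the helper `top_n_per_cluster` are hypothetical.

```r
# Mock marker table (made-up values) shaped like FindAllMarkers output.
markers <- data.frame(
  cluster    = factor(c("0", "0", "0", "1", "1", "1")),
  gene       = c("CD3D", "CD3E", "IL7R", "MS4A1", "CD79A", "CD79B"),
  avg_log2FC = c(2.1, 1.8, 0.9, 3.0, 2.5, 1.1),
  p_val_adj  = c(1e-50, 1e-40, 1e-5, 1e-60, 1e-45, 1e-8),
  stringsAsFactors = FALSE
)

# Keep the n genes with the largest log fold-change within each cluster.
top_n_per_cluster <- function(markers, n = 2) {
  by_cluster <- split(markers, markers$cluster)
  top <- lapply(by_cluster, function(df) {
    head(df[order(-df$avg_log2FC), ], n)
  })
  do.call(rbind, top)
}

top_markers <- top_n_per_cluster(markers, n = 2)
print(top_markers$gene)
```

The resulting `top_markers$gene` vector is the kind of input that plotting functions such as DoHeatmap or DotPlot take in their `features` argument.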

FindMarkers()

Common

Finds genes differentially expressed between specified clusters (here cluster 1 versus cluster 2)

markers <- FindMarkers(
  obj,
  ident.1 = 1,
  ident.2 = 2,
  test.use = "wilcox"
)
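
The `"wilcox"` option refers to the Wilcoxon rank-sum test. For a single gene, the comparison looks roughly like the base-R sketch below; the simulated expression vectors are hypothetical, and Seurat's filtering, fold-change calculation, and multiple-testing correction are not reproduced.

```r
set.seed(7)

# Simulated normalized expression of one gene in two clusters (hypothetical).
expr_cluster1 <- rnorm(50, mean = 2.0, sd = 0.5)
expr_cluster2 <- rnorm(50, mean = 1.0, sd = 0.5)

# Two-sample Wilcoxon rank-sum test, cluster 1 versus cluster 2.
test <- wilcox.test(expr_cluster1, expr_cluster2)

# Positive values mean higher expression in cluster 1 (ident.1).
mean_diff <- mean(expr_cluster1) - mean(expr_cluster2)

print(test$p.value)
print(mean_diff)
```

FindMarkers runs this kind of test for every gene that passes its filters, so the reported p-values need the adjusted column before being interpreted.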

5 Visualization Functions

DimPlot()

Common

Dimensionality-reduction plot (UMAP/t-SNE)

DimPlot(
  obj,
  reduction = "umap",
  group.by = "seurat_clusters",
  label = TRUE,
  pt.size = 0.5
)

FeaturePlot()

Common

Shows gene expression on the dimensionality-reduction plot

FeaturePlot(
  obj,
  features = c("CD3D", "CD8A"),
  reduction = "umap",
  min.cutoff = "q10",
  max.cutoff = "q90"
)

VlnPlot()

Common

Violin plot; shows the distribution of gene expression or of per-cell QC metrics

VlnPlot(
  obj,
  features = c("nFeature_RNA", "percent.mt"),
  ncol = 2,
  pt.size = 0.1
)

DotPlot()

Common

Dot plot; shows gene expression across clusters

DotPlot(
  obj,
  features = c("CD3D", "CD8A", "MS4A1")
) + RotatedAxis()

DoHeatmap()

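Common

Heatmap; shows gene expression patterns

# top_markers is assumed to be a data frame of selected marker genes,
# e.g. filtered from the FindAllMarkers() output
DoHeatmap(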
  obj,
  features = top_markers$gene,
  size = 3
) + NoLegend()

RidgePlot()

Optional

Ridge plot; shows the distribution of gene expression

RidgePlot(
  obj,
  features = "CD3D",
  group.by = "seurat_clusters"
)

6 Advanced Functions

FindIntegrationAnchors()

Integration

Finds integration anchors for combining multiple datasets

anchors <- FindIntegrationAnchors(
  object.list = dataset_list,
  dims = 1:30
)

IntegrateData()

Integration

Integrates multiple datasets and corrects for batch effects

combined <- IntegrateData(
  anchorset = anchors,
  dims = 1:30
)

RenameIdents()

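Annotation

Renames cluster identities (cell type annotation)

# Assign the result back; RenameIdents returns a modified copy of the object
obj <- RenameIdents(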
  obj,
  '0' = 'T cells',
  '1' = 'B cells'
)

subset()

Subsets by cluster or other metadata

# Extract a specific cluster
t_cells <- subset(obj, idents = 'T cells')

# Extract several clusters
sub <- subset(obj, idents = c(1, 2))

merge()

Merges multiple Seurat objects

merged <- merge(
  x = obj1,
  y = obj2,
  add.cell.ids = c('batch1', 'batch2'),
  project = 'combined'
)
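
The reason for `add.cell.ids` is that barcodes from different samples can collide. The base-R sketch below shows the problem and the prefixing that resolves it, with the ID and barcode joined by an underscore; the barcode vectors are hypothetical.

```r
# Hypothetical barcodes from two batches; "AAAC-1" appears in both.
cells_batch1 <- c("AAAC-1", "AAAG-1")
cells_batch2 <- c("AAAC-1", "AACT-1")

# Without prefixes the combined cell names are not unique.
has_collision <- anyDuplicated(c(cells_batch1, cells_batch2)) > 0

# Prefixing each batch with its ID makes every cell name unique.
prefixed <- c(
  paste("batch1", cells_batch1, sep = "_"),
  paste("batch2", cells_batch2, sep = "_")
)

print(has_collision)
print(prefixed)
```

The prefix also makes it easy to recover the batch of origin later by splitting the cell name on the first underscore.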

💡 Usage Tips

🚀 Recommended workflow for beginners

  1. CreateSeuratObject
  2. PercentageFeatureSet (QC)
  3. SCTransform (all-in-one normalization)
  4. RunPCA → RunUMAP
  5. FindNeighbors → FindClusters
  6. DimPlot for visualization
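
The steps above can be strung together roughly as follows. This is a sketch, not a tuned pipeline: it assumes the Seurat package is installed and that `counts` is a genes-by-cells count matrix with human gene symbols; the wrapper name `run_basic_workflow` and its defaults are hypothetical, with parameter values copied from the snippets in this reference.

```r
# Wraps steps 1 to 5; nothing runs until the function is called with real data.
run_basic_workflow <- function(counts, dims = 1:20, resolution = 0.5) {
  # 1. Create the object
  obj <- Seurat::CreateSeuratObject(
    counts, project = "Project", min.cells = 3, min.features = 200
  )

  # 2. QC metric: percentage of mitochondrial counts
  obj[["percent.mt"]] <- Seurat::PercentageFeatureSet(obj, pattern = "^MT-")

  # 3. Normalization, variable features, and scaling in one call
  obj <- Seurat::SCTransform(obj, vars.to.regress = "percent.mt", verbose = FALSE)

  # 4. Linear, then nonlinear dimensionality reduction
  obj <- Seurat::RunPCA(obj, npcs = 50, verbose = FALSE)
  obj <- Seurat::RunUMAP(obj, dims = dims, reduction = "pca")

  # 5. Neighbor graph and clustering
  obj <- Seurat::FindNeighbors(obj, dims = dims, reduction = "pca")
  obj <- Seurat::FindClusters(obj, resolution = resolution)

  obj
}
```

Step 6 would then be something like `Seurat::DimPlot(run_basic_workflow(counts), reduction = "umap", label = TRUE)`. Filtering cells on the QC metrics with `subset()` is left out of the wrapper because suitable thresholds depend on the dataset.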

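📚 Common Issues

  • Too many cells: test on a subset first using subset
  • Poor clustering: adjust the resolution parameter
  • Batch effects: use Harmony or IntegrateData
  • No markers found: lower the logfc.threshold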
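
For the resolution tip, one way to see the effect is to sweep several values and count the resulting clusters. The helper below is a hypothetical sketch that assumes Seurat is installed and that FindNeighbors has already been run on `obj`; the default resolution values are arbitrary.

```r
# Returns the number of clusters found at each resolution.
count_clusters_by_resolution <- function(obj,
                                         resolutions = c(0.2, 0.5, 0.8, 1.2)) {
  n_clusters <- vapply(resolutions, function(res) {
    clustered <- Seurat::FindClusters(obj, resolution = res, verbose = FALSE)
    nlevels(Seurat::Idents(clustered))
  }, integer(1))

  stats::setNames(n_clusters, resolutions)
}
```

Higher resolution generally yields more clusters, so a value where the count is stable across neighboring settings is a reasonable starting point.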