1 Quality control (QC) functions
CreateSeuratObject()
[Basic] Create a Seurat object; this is the starting point for every analysis.
CreateSeuratObject( counts, project = "Project", min.cells = 3, min.features = 200 )
PercentageFeatureSet()
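[Common] Compute the percentage of counts that come from a given gene set (for example, mitochondrial genes).
# Percentage of mitochondrial genes ("^MT-" matches human symbols; mouse symbols use "^mt-")
obj[["percent.mt"]] <- PercentageFeatureSet(obj, pattern = "^MT-")
# Percentage of ribosomal genes
obj[["percent.rb"]] <- PercentageFeatureSet(obj, pattern = "^RP[SL]")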
subset()
[Common] Filter out low-quality cells. The condition is passed as an unquoted expression, not as a string.
obj <- subset( obj, subset = nFeature_RNA > 200 & nFeature_RNA < 5000 & percent.mt < 20 )
2 Normalization functions
NormalizeData()
[Basic] Log-normalization.
obj <- NormalizeData( obj, normalization.method = "LogNormalize", scale.factor = 10000 )
SCTransform()
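[Recommended] Variance-stabilizing transformation; the method this guide recommends for Seurat v5. It takes the place of NormalizeData(), FindVariableFeatures() and ScaleData().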
obj <- SCTransform( obj, vars.to.regress = "percent.mt", verbose = FALSE )
FindVariableFeatures()
[Common] Identify highly variable genes (HVGs).
obj <- FindVariableFeatures( obj, selection.method = "vst", nfeatures = 2000 )
ScaleData()
[Common] Scale the data so that each gene has mean 0 and variance 1.
obj <- ScaleData( obj, features = rownames(obj), vars.to.regress = "percent.mt" )
3 Dimensionality reduction functions
RunPCA()
[Common] Principal component analysis (linear dimensionality reduction).
obj <- RunPCA( obj, features = VariableFeatures(obj), npcs = 50 )
RunUMAP()
[Common] UMAP (nonlinear dimensionality reduction).
obj <- RunUMAP( obj, dims = 1:20, reduction = "pca" )
RunTSNE()
[Optional] t-SNE (nonlinear dimensionality reduction).
obj <- RunTSNE( obj, dims = 1:20, reduction = "pca" )
ElbowPlot()
[Visualization] Elbow plot; helps choose how many PCs to keep.
ElbowPlot(obj, ndims = 50)
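The elbow plot is read by eye. As a complement, the same per-PC standard deviations can be summarized numerically. The sketch below is a heuristic, not part of Seurat: `pcs_for_variance` is a hypothetical helper and the 90% cutoff is an arbitrary assumption. It works on any numeric vector of standard deviations, such as the one returned by `Stdev(obj, reduction = "pca")`.

```r
# Hypothetical helper (not a Seurat function): smallest number of PCs whose
# cumulative variance reaches `target` of the variance captured by the
# PCs that were computed. The 0.90 default is an illustrative cutoff only.
pcs_for_variance <- function(stdev, target = 0.90) {
  stopifnot(is.numeric(stdev), length(stdev) > 0, target > 0, target < 1)
  variance <- stdev^2
  cumulative <- cumsum(variance) / sum(variance)
  which(cumulative >= target)[1]
}

# Toy standard deviations standing in for Stdev(obj, reduction = "pca")
toy_stdev <- c(5, 4, 3, 2, 1, 1, 1, 1)
n_pcs <- pcs_for_variance(toy_stdev)  # 4 for this toy vector
```

With a real object the call would be `pcs_for_variance(Stdev(obj, reduction = "pca"))`, and the result could inform the `dims` argument of RunUMAP() and FindNeighbors(). Note that the fraction is relative to the computed PCs only, not to the total variance of the data, so it should be checked against the plot.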
4 Clustering functions
FindNeighbors()
[Common] Build the nearest-neighbor (KNN/SNN) graph.
obj <- FindNeighbors( obj, dims = 1:20, reduction = "pca" )
FindClusters()
[Common] Louvain or Leiden clustering.
obj <- FindClusters(
  obj,
  resolution = 0.5,
  algorithm = 1  # 1 = original Louvain, 4 = Leiden (3 is SLM)
)
FindAllMarkers()
[Common] Find marker genes for every cluster.
markers <- FindAllMarkers( obj, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25 )
FindMarkers()
[Common] Find marker genes for a specified cluster (here, cluster 1 compared against cluster 2).
markers <- FindMarkers( obj, ident.1 = 1, ident.2 = 2, test.use = "wilcox" )
5 Visualization functions
DimPlot()
[Common] Plot cells in a reduced-dimension embedding (UMAP or t-SNE).
DimPlot( obj, reduction = "umap", group.by = "seurat_clusters", label = TRUE, pt.size = 0.5 )
FeaturePlot()
[Common] Show gene expression on the reduced-dimension plot.
FeaturePlot( obj, features = c("CD3D", "CD8A"), reduction = "umap", min.cutoff = "q10", max.cutoff = "q90" )
VlnPlot()
[Common] Violin plot; shows the distribution of gene expression or of a QC metric.
VlnPlot( obj, features = c("nFeature_RNA", "percent.mt"), ncol = 2, pt.size = 0.1 )
DotPlot()
[Common] Dot plot; shows how genes are expressed across clusters.
DotPlot( obj, features = c("CD3D", "CD8A", "MS4A1") ) + RotatedAxis()
DoHeatmap()
[Common] Heatmap; shows gene expression patterns.
DoHeatmap( obj, features = top_markers$gene, size = 3 ) + NoLegend()
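The call above uses `top_markers`, which is not defined elsewhere in this guide. One way to build it, sketched here in base R, is to keep the genes with the highest fold change per cluster from a FindAllMarkers() result. The toy data frame only imitates the `cluster`, `gene` and `avg_log2FC` columns of that result; `top_n_per_cluster` is a hypothetical helper, and the number of genes kept per cluster is an arbitrary choice.

```r
# Hypothetical helper (not a Seurat function): keep the n genes with the
# highest avg_log2FC within each cluster of a FindAllMarkers()-style table.
top_n_per_cluster <- function(markers, n = 10) {
  # Sort by cluster, then by decreasing fold change within each cluster
  ordered <- markers[order(markers$cluster, -markers$avg_log2FC), ]
  # Position of each row within its cluster after sorting
  rank_in_cluster <- ave(seq_len(nrow(ordered)), ordered$cluster, FUN = seq_along)
  ordered[rank_in_cluster <= n, , drop = FALSE]
}

# Toy table imitating the relevant columns of a FindAllMarkers() result
toy_markers <- data.frame(
  cluster    = factor(c("0", "0", "0", "1", "1", "1")),
  gene       = c("CD3D", "CD3E", "IL7R", "MS4A1", "CD79A", "CD79B"),
  avg_log2FC = c(2.1, 1.4, 1.9, 3.0, 2.5, 0.8)
)
top_markers <- top_n_per_cluster(toy_markers, n = 2)
top_markers$gene
```

With real data, `toy_markers` would be replaced by the `markers` table returned by FindAllMarkers(), and `top_markers$gene` could then be passed to DoHeatmap() as shown above.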
RidgePlot()
[Optional] Ridge plot; shows the distribution of gene expression.
RidgePlot( obj, features = "CD3D", group.by = "seurat_clusters" )
6 Advanced functions
FindIntegrationAnchors()
[Integration] Find integration anchors, used to integrate multiple datasets.
anchors <- FindIntegrationAnchors( object.list = dataset_list, dims = 1:30 )
IntegrateData()
[Integration] Integrate multiple datasets and correct for batch effects.
combined <- IntegrateData( anchorset = anchors, dims = 1:30 )
RenameIdents()
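[Annotation] Rename cluster identities (cell type annotation). Assign the result back to the object, otherwise the new names are lost.
obj <- RenameIdents( obj, '0' = 'T cells', '1' = 'B cells' )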
subset()
Subset by cluster or by other metadata.
# Extract a specific cluster
t_cells <- subset(obj, idents = 'T cells')
# Extract several clusters
sub <- subset(obj, idents = c(1, 2))
merge()
Merge multiple Seurat objects.
merged <- merge( x = obj1, y = obj2, add.cell.ids = c('batch1', 'batch2'), project = 'combined' )
💡 Usage tips
🚀 Recommended workflow for beginners
- 1. CreateSeuratObject
- 2. PercentageFeatureSet (QC)
- 3. SCTransform (all-in-one)
- 4. RunPCA → RunUMAP
- 5. FindNeighbors → FindClusters
- 6. DimPlot (visualization)
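The six steps above can be strung together roughly as follows. This is a minimal sketch, assuming `counts` is a gene-by-cell count matrix with human gene symbols. `run_basic_workflow` is a hypothetical wrapper, and the QC cutoffs, 20 PCs and resolution 0.5 are simply the example values used earlier in this guide, not tuned recommendations. Calls are namespaced with `Seurat::` so the function can be defined before the package is attached.

```r
# Hypothetical wrapper around the beginner workflow; not part of Seurat.
run_basic_workflow <- function(counts, dims = 1:20, resolution = 0.5) {
  # 1. Create the object (drops genes seen in < 3 cells, cells with < 200 genes)
  obj <- Seurat::CreateSeuratObject(
    counts, project = "Project", min.cells = 3, min.features = 200
  )
  # 2. QC: mitochondrial percentage, then filter low-quality cells
  #    ("^MT-" assumes human gene symbols)
  obj[["percent.mt"]] <- Seurat::PercentageFeatureSet(obj, pattern = "^MT-")
  obj <- subset(
    obj,
    subset = nFeature_RNA > 200 & nFeature_RNA < 5000 & percent.mt < 20
  )
  # 3. Normalize, select variable genes and scale in one step
  obj <- Seurat::SCTransform(obj, vars.to.regress = "percent.mt", verbose = FALSE)
  # 4. Linear, then nonlinear dimensionality reduction
  obj <- Seurat::RunPCA(obj, npcs = 50, verbose = FALSE)
  obj <- Seurat::RunUMAP(obj, dims = dims, reduction = "pca")
  # 5. Neighbor graph and clustering
  obj <- Seurat::FindNeighbors(obj, dims = dims, reduction = "pca")
  obj <- Seurat::FindClusters(obj, resolution = resolution)
  obj
}

# Usage (requires the Seurat package and a real count matrix):
# obj <- run_basic_workflow(counts)
# 6. Seurat::DimPlot(obj, reduction = "umap", label = TRUE)
```

Wrapping the steps in one function keeps `dims` and `resolution` in a single place, which makes it easier to rerun the whole chain when those values change.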
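📚 Common problems
- Too many cells: take a subset with subset() and test on that first
- Poor clustering: adjust the resolution parameter
- Batch effects: use Harmony or IntegrateData
- No markers found: relax the logfc.threshold cutoff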
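For the first item in the list above, one simple way to get a test subset is to draw a random sample of cell barcodes and pass it to the `cells` argument of subset(). The sketch below shows only the sampling step in base R; `sample_cells` is a hypothetical helper, and the toy barcodes stand in for `colnames(obj)`.

```r
# Hypothetical helper (not a Seurat function): draw a reproducible random
# sample of at most n cell barcodes. Note that it resets the global RNG seed.
sample_cells <- function(cell_names, n, seed = 1) {
  set.seed(seed)
  sample(cell_names, size = min(n, length(cell_names)))
}

# Toy barcodes standing in for colnames(obj)
toy_cells <- paste0("cell_", 1:100)
keep <- sample_cells(toy_cells, n = 10)

# With a real object (requires Seurat):
# small <- subset(obj, cells = sample_cells(colnames(obj), n = 2000))
```

Random sampling keeps the cluster proportions roughly intact, but rare populations may drop out of a small sample, so results on the subset are only a rough preview.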