{"id":89194,"date":"2026-05-09T16:16:20","date_gmt":"2026-05-09T16:16:20","guid":{"rendered":"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/"},"modified":"2026-05-09T16:16:20","modified_gmt":"2026-05-09T16:16:20","slug":"how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery","status":"publish","type":"post","link":"https:\/\/youzum.net\/it\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/","title":{"rendered":"How to Build a Single-Cell RNA-seq Analysis Pipeline with Scanpy for PBMC Clustering, Annotation, and Trajectory Discovery"},"content":{"rendered":"<p>In this tutorial, we perform an advanced single-cell RNA-seq analysis workflow using <a href=\"https:\/\/github.com\/scverse\/scanpy\"><strong>Scanpy<\/strong><\/a> on the PBMC-3k benchmark dataset. We start by loading the dataset, inspecting its structure, and applying quality control checks to evaluate gene counts, total counts, mitochondrial content, and ribosomal gene signals. We then filter low-quality cells and genes, detect potential doublets with Scrublet, normalize the data, apply log transformation, and identify highly variable genes for downstream analysis. Also, we score cell-cycle phases, regress out unwanted technical variation, scale the data, and reduce dimensionality using PCA, UMAP, and t-SNE. We also cluster cells with the Leiden algorithm, identify marker genes, annotate cell populations using canonical PBMC markers, explore trajectory structure with PAGA and diffusion pseudotime, calculate a custom interferon-response score, and finally save the fully analyzed AnnData object for future use.<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\"no-line-numbers\"><code class=\"no-wrap language-php\">!pip install -q scanpy leidenalg python-igraph scrublet\n\n\nimport scanpy as sc\nimport numpy as np\nimport pandas as pd\nimport matplotlib.pyplot as plt\nimport warnings\nwarnings.filterwarnings(\"ignore\")\n\n\nsc.settings.verbosity = 3\nsc.settings.set_figure_params(dpi=80, facecolor=\"white\", figsize=(5, 5))\nsc.logging.print_header()\n\n\nadata = sc.datasets.pbmc3k()\nadata.var_names_make_unique()\nprint(adata)\n\n\nadata.var[\"mt\"]   = adata.var_names.str.startswith(\"MT-\")\nadata.var[\"ribo\"] = adata.var_names.str.startswith((\"RPS\", \"RPL\"))\nsc.pp.calculate_qc_metrics(\n   adata, qc_vars=[\"mt\", \"ribo\"], percent_top=None, log1p=False, inplace=True\n)\n\n\nsc.pl.violin(\n   adata,\n   [\"n_genes_by_counts\", \"total_counts\", \"pct_counts_mt\"],\n   jitter=0.4, multi_panel=True,\n)\nsc.pl.scatter(adata, x=\"total_counts\", y=\"pct_counts_mt\")\nsc.pl.scatter(adata, x=\"total_counts\", y=\"n_genes_by_counts\")<\/code><\/pre>\n<\/div>\n<\/div>\n<p>We install the required single-cell analysis libraries and import Scanpy, NumPy, Pandas, Matplotlib, and warning controls. We load the PBMC-3k benchmark dataset, make gene names unique, and inspect the AnnData object structure. We then calculate quality control metrics for mitochondrial and ribosomal genes and visualize count-level quality patterns using violin and scatter plots.<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\"no-line-numbers\"><code class=\"no-wrap language-php\">sc.pp.filter_cells(adata, min_genes=200)\nsc.pp.filter_genes(adata, min_cells=3)\nadata = adata[adata.obs.n_genes_by_counts &lt; 2500, :].copy()\nadata = adata[adata.obs.pct_counts_mt &lt; 5, :].copy()\n\n\nsc.pp.scrublet(adata)\nprint(\"Predicted doublets:\", int(adata.obs[\"predicted_doublet\"].sum()))\nadata = adata[~adata.obs[\"predicted_doublet\"], :].copy()\n\n\nadata.layers[\"counts\"] = adata.X.copy()\nsc.pp.normalize_total(adata, target_sum=1e4)\nsc.pp.log1p(adata)\n\n\nsc.pp.highly_variable_genes(adata, min_mean=0.0125, max_mean=3, min_disp=0.5)\nsc.pl.highly_variable_genes(adata)\nadata.raw = adata\nadata = adata[:, adata.var.highly_variable].copy()<\/code><\/pre>\n<\/div>\n<\/div>\n<p>We filter out low-quality cells and rarely detected genes to improve the reliability of the dataset. We use Scrublet through Scanpy to identify predicted doublets and remove them before deeper analysis. We then preserve raw counts, normalize expression values, apply log transformation, select highly variable genes, and keep only the most informative features.<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\"no-line-numbers\"><code class=\"no-wrap language-php\">s_genes = [\"MCM5\",\"PCNA\",\"TYMS\",\"FEN1\",\"MCM2\",\"MCM4\",\"RRM1\",\"UNG\",\"GINS2\",\n          \"MCM6\",\"CDCA7\",\"DTL\",\"PRIM1\",\"UHRF1\",\"HELLS\",\"RFC2\",\"NASP\",\n          \"RAD51AP1\",\"GMNN\",\"WDR76\",\"SLBP\",\"CCNE2\",\"UBR7\",\"POLD3\",\"MSH2\",\n          \"ATAD2\",\"RAD51\",\"RRM2\",\"CDC45\",\"CDC6\",\"EXO1\",\"TIPIN\",\"DSCC1\",\n          \"BLM\",\"CASP8AP2\",\"USP1\",\"CLSPN\",\"POLA1\",\"CHAF1B\",\"E2F8\"]\ng2m_genes = [\"HMGB2\",\"CDK1\",\"NUSAP1\",\"UBE2C\",\"BIRC5\",\"TPX2\",\"TOP2A\",\"NDC80\",\n            \"CKS2\",\"NUF2\",\"CKS1B\",\"MKI67\",\"TMPO\",\"CENPF\",\"TACC3\",\"SMC4\",\n            \"CCNB2\",\"CKAP2L\",\"CKAP2\",\"AURKB\",\"BUB1\",\"KIF11\",\"ANP32E\",\n            \"TUBB4B\",\"GTSE1\",\"KIF20B\",\"HJURP\",\"CDCA3\",\"CDC20\",\"TTK\",\n            \"CDC25C\",\"KIF2C\",\"RANGAP1\",\"NCAPD2\",\"DLGAP5\",\"CDCA2\",\"CDCA8\",\n            \"ECT2\",\"KIF23\",\"HMMR\",\"AURKA\",\"PSRC1\",\"ANLN\",\"LBR\",\"CKAP5\",\n            \"CENPE\",\"NEK2\",\"G2E3\",\"CBX5\",\"CENPA\"]\ns_genes   = [g for g in s_genes   if g in adata.var_names]\ng2m_genes = [g for g in g2m_genes if g in adata.var_names]\nsc.tl.score_genes_cell_cycle(adata, s_genes=s_genes, g2m_genes=g2m_genes)\n\n\nsc.pp.regress_out(adata, [\"total_counts\", \"pct_counts_mt\"])\nsc.pp.scale(adata, max_value=10)\n\n\nsc.tl.pca(adata, svd_solver=\"arpack\")\nsc.pl.pca_variance_ratio(adata, log=True, n_pcs=50)\n\n\nsc.pp.neighbors(adata, n_neighbors=10, n_pcs=40)\nsc.tl.umap(adata)\nsc.tl.tsne(adata, n_pcs=40)<\/code><\/pre>\n<\/div>\n<\/div>\n<p>We define S-phase and G2\/M-phase marker genes and retain only those present in the dataset. We score each cell for cell-cycle phase, regress out unwanted variation from total counts and mitochondrial percentage, and scale the data for downstream modeling. We then run PCA, inspect explained variance, construct the neighborhood graph, and generate UMAP and t-SNE embeddings.<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\"no-line-numbers\"><code class=\"no-wrap language-php\">sc.tl.leiden(adata, resolution=0.5, flavor=\"igraph\", n_iterations=2, directed=False)\nsc.pl.umap(adata, color=\"leiden\", legend_loc=\"on data\", title=\"Leiden clusters\")\nsc.pl.tsne(adata, color=\"leiden\", legend_loc=\"on data\", title=\"t-SNE clusters\")\n\n\nsc.tl.rank_genes_groups(adata, \"leiden\", method=\"wilcoxon\")\nsc.pl.rank_genes_groups(adata, n_genes=20, sharey=False)\n\n\nresult   = adata.uns[\"rank_genes_groups\"]\ngroups   = result[\"names\"].dtype.names\ntop_df   = pd.DataFrame({g: result[\"names\"][g][:10] for g in groups})\nprint(\"nTop 10 markers per cluster:n\", top_df)\n\n\nmarker_genes = {\n   \"B-cell\":           [\"CD79A\", \"MS4A1\"],\n   \"CD8 T-cell\":       [\"CD8A\", \"CD8B\"],\n   \"CD4 T-cell\":       [\"IL7R\", \"CD4\"],\n   \"NK\":               [\"GNLY\", \"NKG7\"],\n   \"CD14 Monocyte\":    [\"CD14\", \"LYZ\"],\n   \"FCGR3A Monocyte\":  [\"FCGR3A\", \"MS4A7\"],\n   \"Dendritic\":        [\"FCER1A\", \"CST3\"],\n   \"Megakaryocyte\":    [\"PPBP\"],\n}\nsc.pl.dotplot(adata, marker_genes, groupby=\"leiden\", standard_scale=\"var\")\nsc.pl.stacked_violin(adata, marker_genes, groupby=\"leiden\", swap_axes=True)<\/code><\/pre>\n<\/div>\n<\/div>\n<p>We apply Leiden clustering to group cells based on the neighborhood graph and visualize the clusters on UMAP and t-SNE plots. We perform differential expression analysis using the Wilcoxon test to identify the top marker genes for each cluster. We then use canonical PBMC marker genes to support cell-type annotation through dot plots and stacked violin plots.<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\"no-line-numbers\"><code class=\"no-wrap language-php\">sc.tl.paga(adata, groups=\"leiden\")\nsc.pl.paga(adata, color=\"leiden\", threshold=0.1)\n\n\nsc.tl.umap(adata, init_pos=\"paga\")\nsc.pl.umap(adata, color=\"leiden\", legend_loc=\"on data\")\n\n\nsc.tl.diffmap(adata)\nsc.pp.neighbors(adata, n_neighbors=10, use_rep=\"X_diffmap\")\nadata.uns[\"iroot\"] = np.flatnonzero(adata.obs[\"leiden\"] == adata.obs[\"leiden\"].cat.categories[0])[0]\nsc.tl.dpt(adata)\nsc.pl.umap(adata, color=[\"leiden\", \"dpt_pseudotime\"], legend_loc=\"on data\")\n\n\nifn_genes = [\"ISG15\", \"IFI6\", \"IFIT1\", \"IFIT3\", \"MX1\", \"OAS1\", \"STAT1\", \"IRF7\"]\nifn_genes = [g for g in ifn_genes if g in adata.raw.var_names]\nsc.tl.score_genes(adata, gene_list=ifn_genes, score_name=\"IFN_score\")\nsc.pl.umap(adata, color=\"IFN_score\", cmap=\"viridis\")\n\n\nadata.write(\"pbmc3k_analyzed.h5ad\")\nprint(\"n<img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/17.0.2\/72x72\/2705.png\" alt=\"\u2705\" class=\"wp-smiley\" \/> Analysis complete \u2014 saved to pbmc3k_analyzed.h5ad\")\nprint(adata)<\/code><\/pre>\n<\/div>\n<\/div>\n<p>We run PAGA to model connectivity between Leiden clusters and reinitialize UMAP using the PAGA graph to obtain a clearer trajectory structure. We compute diffusion maps and diffusion pseudotime to explore possible progression patterns across cell states. We also calculate an interferon-response gene-set score, visualize it on UMAP, and save the final analyzed object as an .h5ad file.<\/p>\n<p>In conclusion, we built an end-to-end Scanpy pipeline for single-cell RNA-seq analysis, transforming raw PBMC data into interpretable biological insights. We cleaned and preprocessed the dataset, removed noisy cells and doublets, selected informative genes, and generated meaningful embeddings to visualize cellular structure. We then used Leiden clustering and differential expression analysis to discover marker genes and connect clusters to known immune cell types. By adding PAGA, diffusion pseudotime, and custom gene-set scoring, we extended the workflow beyond basic clustering and showed how Scanpy supports deeper biological interpretation. At last, we had a saved .h5ad object that contains the processed data, annotations, scores, clusters, and visual analysis results, ready for downstream exploration or reporting.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<p>Check out\u00a0the\u00a0<strong><a href=\"https:\/\/github.com\/Marktechpost\/AI-Agents-Projects-Tutorials\/blob\/main\/Data%20Science\/scanpy_pbmc3k_single_cell_rnaseq_analysis_Marktechpost.ipynb\" target=\"_blank\" rel=\"noreferrer noopener\">Full Codes with Notebook here.<\/a>\u00a0<\/strong>Also,\u00a0feel free to follow us on\u00a0<strong><a href=\"https:\/\/x.com\/intent\/follow?screen_name=marktechpost\" target=\"_blank\" rel=\"noreferrer noopener\"><mark>Twitter<\/mark><\/a><\/strong>\u00a0and don\u2019t forget to join our\u00a0<strong><a href=\"https:\/\/www.reddit.com\/r\/machinelearningnews\/\" target=\"_blank\" rel=\"noreferrer noopener\">150k+ ML SubReddit<\/a><\/strong>\u00a0and Subscribe to\u00a0<strong><a href=\"https:\/\/www.aidevsignals.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">our Newsletter<\/a><\/strong>. Wait! are you on telegram?\u00a0<strong><a href=\"https:\/\/t.me\/machinelearningresearchnews\" target=\"_blank\" rel=\"noreferrer noopener\">now you can join us on telegram as well.<\/a><\/strong><\/p>\n<p>Need to partner with us for promoting your GitHub Repo OR Hugging Face Page OR Product Release OR Webinar etc.?\u00a0<strong><a href=\"https:\/\/forms.gle\/MTNLpmJtsFA3VRVd9\" target=\"_blank\" rel=\"noreferrer noopener\"><mark>Connect with us<\/mark><\/a><\/strong><\/p>\n<p>The post <a href=\"https:\/\/www.marktechpost.com\/2026\/05\/08\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/\">How to Build a Single-Cell RNA-seq Analysis Pipeline with Scanpy for PBMC Clustering, Annotation, and Trajectory Discovery<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>In this tutorial, we perform an advanced single-cell RNA-seq analysis workflow using Scanpy on the PBMC-3k benchmark dataset. We start by loading the dataset, inspecting its structure, and applying quality control checks to evaluate gene counts, total counts, mitochondrial content, and ribosomal gene signals. We then filter low-quality cells and genes, detect potential doublets with Scrublet, normalize the data, apply log transformation, and identify highly variable genes for downstream analysis. Also, we score cell-cycle phases, regress out unwanted technical variation, scale the data, and reduce dimensionality using PCA, UMAP, and t-SNE. We also cluster cells with the Leiden algorithm, identify marker genes, annotate cell populations using canonical PBMC markers, explore trajectory structure with PAGA and diffusion pseudotime, calculate a custom interferon-response score, and finally save the fully analyzed AnnData object for future use. Copy CodeCopiedUse a different Browser !pip install -q scanpy leidenalg python-igraph scrublet import scanpy as sc import numpy as np import pandas as pd import matplotlib.pyplot as plt import warnings warnings.filterwarnings(&#8220;ignore&#8221;) sc.settings.verbosity = 3 sc.settings.set_figure_params(dpi=80, facecolor=&#8221;white&#8221;, figsize=(5, 5)) sc.logging.print_header() adata = sc.datasets.pbmc3k() adata.var_names_make_unique() print(adata) adata.var[&#8220;mt&#8221;] = adata.var_names.str.startswith(&#8220;MT-&#8220;) adata.var[&#8220;ribo&#8221;] = adata.var_names.str.startswith((&#8220;RPS&#8221;, &#8220;RPL&#8221;)) sc.pp.calculate_qc_metrics( adata, qc_vars=[&#8220;mt&#8221;, &#8220;ribo&#8221;], percent_top=None, log1p=False, inplace=True ) sc.pl.violin( adata, [&#8220;n_genes_by_counts&#8221;, &#8220;total_counts&#8221;, &#8220;pct_counts_mt&#8221;], jitter=0.4, multi_panel=True, ) sc.pl.scatter(adata, x=&#8221;total_counts&#8221;, y=&#8221;pct_counts_mt&#8221;) sc.pl.scatter(adata, x=&#8221;total_counts&#8221;, y=&#8221;n_genes_by_counts&#8221;) We install the required single-cell analysis libraries and import Scanpy, NumPy, Pandas, Matplotlib, and warning controls. We load the PBMC-3k benchmark dataset, make gene names unique, and inspect the AnnData object structure. We then calculate quality control metrics for mitochondrial and ribosomal genes and visualize count-level quality patterns using violin and scatter plots. Copy CodeCopiedUse a different Browser sc.pp.filter_cells(adata, min_genes=200) sc.pp.filter_genes(adata, min_cells=3) adata = adata[adata.obs.n_genes_by_counts &lt; 2500, :].copy() adata = adata[adata.obs.pct_counts_mt &lt; 5, :].copy() sc.pp.scrublet(adata) print(&#8220;Predicted doublets:&#8221;, int(adata.obs[&#8220;predicted_doublet&#8221;].sum())) adata = adata[~adata.obs[&#8220;predicted_doublet&#8221;], :].copy() adata.layers[&#8220;counts&#8221;] = adata.X.copy() sc.pp.normalize_total(adata, target_sum=1e4) sc.pp.log1p(adata) sc.pp.highly_variable_genes(adata, min_mean=0.0125, max_mean=3, min_disp=0.5) sc.pl.highly_variable_genes(adata) adata.raw = adata adata = adata[:, adata.var.highly_variable].copy() We filter out low-quality cells and rarely detected genes to improve the reliability of the dataset. We use Scrublet through Scanpy to identify predicted doublets and remove them before deeper analysis. We then preserve raw counts, normalize expression values, apply log transformation, select highly variable genes, and keep only the most informative features. Copy CodeCopiedUse a different Browser s_genes = [&#8220;MCM5&#8243;,&#8221;PCNA&#8221;,&#8221;TYMS&#8221;,&#8221;FEN1&#8243;,&#8221;MCM2&#8243;,&#8221;MCM4&#8243;,&#8221;RRM1&#8243;,&#8221;UNG&#8221;,&#8221;GINS2&#8243;, &#8220;MCM6&#8243;,&#8221;CDCA7&#8243;,&#8221;DTL&#8221;,&#8221;PRIM1&#8243;,&#8221;UHRF1&#8243;,&#8221;HELLS&#8221;,&#8221;RFC2&#8243;,&#8221;NASP&#8221;, &#8220;RAD51AP1&#8243;,&#8221;GMNN&#8221;,&#8221;WDR76&#8243;,&#8221;SLBP&#8221;,&#8221;CCNE2&#8243;,&#8221;UBR7&#8243;,&#8221;POLD3&#8243;,&#8221;MSH2&#8243;, &#8220;ATAD2&#8243;,&#8221;RAD51&#8243;,&#8221;RRM2&#8243;,&#8221;CDC45&#8243;,&#8221;CDC6&#8243;,&#8221;EXO1&#8243;,&#8221;TIPIN&#8221;,&#8221;DSCC1&#8243;, &#8220;BLM&#8221;,&#8221;CASP8AP2&#8243;,&#8221;USP1&#8243;,&#8221;CLSPN&#8221;,&#8221;POLA1&#8243;,&#8221;CHAF1B&#8221;,&#8221;E2F8&#8243;] g2m_genes = [&#8220;HMGB2&#8243;,&#8221;CDK1&#8243;,&#8221;NUSAP1&#8243;,&#8221;UBE2C&#8221;,&#8221;BIRC5&#8243;,&#8221;TPX2&#8243;,&#8221;TOP2A&#8221;,&#8221;NDC80&#8243;, &#8220;CKS2&#8243;,&#8221;NUF2&#8243;,&#8221;CKS1B&#8221;,&#8221;MKI67&#8243;,&#8221;TMPO&#8221;,&#8221;CENPF&#8221;,&#8221;TACC3&#8243;,&#8221;SMC4&#8243;, &#8220;CCNB2&#8243;,&#8221;CKAP2L&#8221;,&#8221;CKAP2&#8243;,&#8221;AURKB&#8221;,&#8221;BUB1&#8243;,&#8221;KIF11&#8243;,&#8221;ANP32E&#8221;, &#8220;TUBB4B&#8221;,&#8221;GTSE1&#8243;,&#8221;KIF20B&#8221;,&#8221;HJURP&#8221;,&#8221;CDCA3&#8243;,&#8221;CDC20&#8243;,&#8221;TTK&#8221;, &#8220;CDC25C&#8221;,&#8221;KIF2C&#8221;,&#8221;RANGAP1&#8243;,&#8221;NCAPD2&#8243;,&#8221;DLGAP5&#8243;,&#8221;CDCA2&#8243;,&#8221;CDCA8&#8243;, &#8220;ECT2&#8243;,&#8221;KIF23&#8243;,&#8221;HMMR&#8221;,&#8221;AURKA&#8221;,&#8221;PSRC1&#8243;,&#8221;ANLN&#8221;,&#8221;LBR&#8221;,&#8221;CKAP5&#8243;, &#8220;CENPE&#8221;,&#8221;NEK2&#8243;,&#8221;G2E3&#8243;,&#8221;CBX5&#8243;,&#8221;CENPA&#8221;] s_genes = [g for g in s_genes if g in adata.var_names] g2m_genes = [g for g in g2m_genes if g in adata.var_names] sc.tl.score_genes_cell_cycle(adata, s_genes=s_genes, g2m_genes=g2m_genes) sc.pp.regress_out(adata, [&#8220;total_counts&#8221;, &#8220;pct_counts_mt&#8221;]) sc.pp.scale(adata, max_value=10) sc.tl.pca(adata, svd_solver=&#8221;arpack&#8221;) sc.pl.pca_variance_ratio(adata, log=True, n_pcs=50) sc.pp.neighbors(adata, n_neighbors=10, n_pcs=40) sc.tl.umap(adata) sc.tl.tsne(adata, n_pcs=40) We define S-phase and G2\/M-phase marker genes and retain only those present in the dataset. We score each cell for cell-cycle phase, regress out unwanted variation from total counts and mitochondrial percentage, and scale the data for downstream modeling. We then run PCA, inspect explained variance, construct the neighborhood graph, and generate UMAP and t-SNE embeddings. Copy CodeCopiedUse a different Browser sc.tl.leiden(adata, resolution=0.5, flavor=&#8221;igraph&#8221;, n_iterations=2, directed=False) sc.pl.umap(adata, color=&#8221;leiden&#8221;, legend_loc=&#8221;on data&#8221;, title=&#8221;Leiden clusters&#8221;) sc.pl.tsne(adata, color=&#8221;leiden&#8221;, legend_loc=&#8221;on data&#8221;, title=&#8221;t-SNE clusters&#8221;) sc.tl.rank_genes_groups(adata, &#8220;leiden&#8221;, method=&#8221;wilcoxon&#8221;) sc.pl.rank_genes_groups(adata, n_genes=20, sharey=False) result = adata.uns[&#8220;rank_genes_groups&#8221;] groups = result[&#8220;names&#8221;].dtype.names top_df = pd.DataFrame({g: result[&#8220;names&#8221;][g][:10] for g in groups}) print(&#8220;nTop 10 markers per cluster:n&#8221;, top_df) marker_genes = { &#8220;B-cell&#8221;: [&#8220;CD79A&#8221;, &#8220;MS4A1&#8221;], &#8220;CD8 T-cell&#8221;: [&#8220;CD8A&#8221;, &#8220;CD8B&#8221;], &#8220;CD4 T-cell&#8221;: [&#8220;IL7R&#8221;, &#8220;CD4&#8221;], &#8220;NK&#8221;: [&#8220;GNLY&#8221;, &#8220;NKG7&#8221;], &#8220;CD14 Monocyte&#8221;: [&#8220;CD14&#8221;, &#8220;LYZ&#8221;], &#8220;FCGR3A Monocyte&#8221;: [&#8220;FCGR3A&#8221;, &#8220;MS4A7&#8221;], &#8220;Dendritic&#8221;: [&#8220;FCER1A&#8221;, &#8220;CST3&#8221;], &#8220;Megakaryocyte&#8221;: [&#8220;PPBP&#8221;], } sc.pl.dotplot(adata, marker_genes, groupby=&#8221;leiden&#8221;, standard_scale=&#8221;var&#8221;) sc.pl.stacked_violin(adata, marker_genes, groupby=&#8221;leiden&#8221;, swap_axes=True) We apply Leiden clustering to group cells based on the neighborhood graph and visualize the clusters on UMAP and t-SNE plots. We perform differential expression analysis using the Wilcoxon test to identify the top marker genes for each cluster. We then use canonical PBMC marker genes to support cell-type annotation through dot plots and stacked violin plots. Copy CodeCopiedUse a different Browser sc.tl.paga(adata, groups=&#8221;leiden&#8221;) sc.pl.paga(adata, color=&#8221;leiden&#8221;, threshold=0.1) sc.tl.umap(adata, init_pos=&#8221;paga&#8221;) sc.pl.umap(adata, color=&#8221;leiden&#8221;, legend_loc=&#8221;on data&#8221;) sc.tl.diffmap(adata) sc.pp.neighbors(adata, n_neighbors=10, use_rep=&#8221;X_diffmap&#8221;) adata.uns[&#8220;iroot&#8221;] = np.flatnonzero(adata.obs[&#8220;leiden&#8221;] == adata.obs[&#8220;leiden&#8221;].cat.categories[0])[0] sc.tl.dpt(adata) sc.pl.umap(adata, color=[&#8220;leiden&#8221;, &#8220;dpt_pseudotime&#8221;], legend_loc=&#8221;on data&#8221;) ifn_genes = [&#8220;ISG15&#8221;, &#8220;IFI6&#8221;, &#8220;IFIT1&#8221;, &#8220;IFIT3&#8221;, &#8220;MX1&#8221;, &#8220;OAS1&#8221;, &#8220;STAT1&#8221;, &#8220;IRF7&#8243;] ifn_genes = [g for g in ifn_genes if g in adata.raw.var_names] sc.tl.score_genes(adata, gene_list=ifn_genes, score_name=&#8221;IFN_score&#8221;) sc.pl.umap(adata, color=&#8221;IFN_score&#8221;, cmap=&#8221;viridis&#8221;) adata.write(&#8220;pbmc3k_analyzed.h5ad&#8221;) print(&#8220;n Analysis complete \u2014 saved to pbmc3k_analyzed.h5ad&#8221;) print(adata) We run PAGA to model connectivity between Leiden clusters and reinitialize UMAP using the PAGA graph to obtain a clearer trajectory structure. We compute diffusion maps and diffusion pseudotime to explore possible progression patterns across cell states. We also calculate an interferon-response gene-set score, visualize it on UMAP, and save the final analyzed object as an .h5ad file. In conclusion, we built an end-to-end Scanpy pipeline for single-cell RNA-seq analysis, transforming raw PBMC data into interpretable biological insights. We cleaned and preprocessed the dataset, removed noisy cells and doublets, selected informative genes, and generated meaningful embeddings to visualize cellular structure. We then used Leiden clustering and differential expression analysis to discover marker genes and connect clusters to known immune cell types. By adding PAGA, diffusion pseudotime, and custom gene-set scoring, we extended the workflow beyond basic clustering and showed how Scanpy supports deeper biological interpretation. At last, we had a saved .h5ad object that contains the processed data, annotations, scores, clusters, and visual analysis results, ready for downstream exploration or reporting. Check out\u00a0the\u00a0Full Codes with Notebook here.\u00a0Also,\u00a0feel free to follow us on\u00a0Twitter\u00a0and don\u2019t forget to join our\u00a0150k+ ML SubReddit\u00a0and Subscribe to\u00a0our Newsletter. Wait! are you on telegram?\u00a0now you can join us on telegram as well. Need to partner with us for promoting your GitHub Repo OR Hugging Face Page OR Product Release OR Webinar etc.?\u00a0Connect with us The post How to Build a Single-Cell RNA-seq Analysis Pipeline with Scanpy for PBMC Clustering, Annotation, and Trajectory Discovery appeared first on MarkTechPost.<\/p>","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"pmpro_default_level":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"_pvb_checkbox_block_on_post":false,"footnotes":""},"categories":[52,5,7,1],"tags":[],"class_list":["post-89194","post","type-post","status-publish","format-standard","hentry","category-ai-club","category-committee","category-news","category-uncategorized","pmpro-has-access"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.3 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>How to Build a Single-Cell RNA-seq Analysis Pipeline with Scanpy for PBMC Clustering, Annotation, and Trajectory Discovery - YouZum<\/title>\n<meta name=\"description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/youzum.net\/it\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/\" \/>\n<meta property=\"og:locale\" content=\"it_IT\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How to Build a Single-Cell RNA-seq Analysis Pipeline with Scanpy for PBMC Clustering, Annotation, and Trajectory Discovery - YouZum\" \/>\n<meta property=\"og:description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta property=\"og:url\" content=\"https:\/\/youzum.net\/it\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/\" \/>\n<meta property=\"og:site_name\" content=\"YouZum\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DroneAssociationTH\/\" \/>\n<meta property=\"article:published_time\" content=\"2026-05-09T16:16:20+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/s.w.org\/images\/core\/emoji\/17.0.2\/72x72\/2705.png\" \/>\n<meta name=\"author\" content=\"admin NU\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Scritto da\" \/>\n\t<meta name=\"twitter:data1\" content=\"admin NU\" \/>\n\t<meta name=\"twitter:label2\" content=\"Tempo di lettura stimato\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 minuti\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/\"},\"author\":{\"name\":\"admin NU\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\"},\"headline\":\"How to Build a Single-Cell RNA-seq Analysis Pipeline with Scanpy for PBMC Clustering, Annotation, and Trajectory Discovery\",\"datePublished\":\"2026-05-09T16:16:20+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/\"},\"wordCount\":671,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"image\":{\"@id\":\"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/s.w.org\/images\/core\/emoji\/17.0.2\/72x72\/2705.png\",\"articleSection\":[\"AI\",\"Committee\",\"News\",\"Uncategorized\"],\"inLanguage\":\"it-IT\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/\",\"url\":\"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/\",\"name\":\"How to Build a Single-Cell RNA-seq Analysis Pipeline with Scanpy for PBMC Clustering, Annotation, and Trajectory Discovery - YouZum\",\"isPartOf\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/s.w.org\/images\/core\/emoji\/17.0.2\/72x72\/2705.png\",\"datePublished\":\"2026-05-09T16:16:20+00:00\",\"description\":\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\",\"breadcrumb\":{\"@id\":\"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/#breadcrumb\"},\"inLanguage\":\"it-IT\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"it-IT\",\"@id\":\"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/#primaryimage\",\"url\":\"https:\/\/s.w.org\/images\/core\/emoji\/17.0.2\/72x72\/2705.png\",\"contentUrl\":\"https:\/\/s.w.org\/images\/core\/emoji\/17.0.2\/72x72\/2705.png\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/youzum.net\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"How to Build a Single-Cell RNA-seq Analysis Pipeline with Scanpy for PBMC Clustering, Annotation, and Trajectory Discovery\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/yousum.gpucore.co\/#website\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"name\":\"YouSum\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/yousum.gpucore.co\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"it-IT\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\",\"name\":\"Drone Association Thailand\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"it-IT\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"width\":300,\"height\":300,\"caption\":\"Drone Association Thailand\"},\"image\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/DroneAssociationTH\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\",\"name\":\"admin NU\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"it-IT\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"caption\":\"admin NU\"},\"url\":\"https:\/\/youzum.net\/it\/members\/adminnu\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"How to Build a Single-Cell RNA-seq Analysis Pipeline with Scanpy for PBMC Clustering, Annotation, and Trajectory Discovery - YouZum","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/youzum.net\/it\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/","og_locale":"it_IT","og_type":"article","og_title":"How to Build a Single-Cell RNA-seq Analysis Pipeline with Scanpy for PBMC Clustering, Annotation, and Trajectory Discovery - YouZum","og_description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","og_url":"https:\/\/youzum.net\/it\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/","og_site_name":"YouZum","article_publisher":"https:\/\/www.facebook.com\/DroneAssociationTH\/","article_published_time":"2026-05-09T16:16:20+00:00","og_image":[{"url":"https:\/\/s.w.org\/images\/core\/emoji\/17.0.2\/72x72\/2705.png","type":"","width":"","height":""}],"author":"admin NU","twitter_card":"summary_large_image","twitter_misc":{"Scritto da":"admin NU","Tempo di lettura stimato":"7 minuti"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/#article","isPartOf":{"@id":"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/"},"author":{"name":"admin NU","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c"},"headline":"How to Build a Single-Cell RNA-seq Analysis Pipeline with Scanpy for PBMC Clustering, Annotation, and Trajectory Discovery","datePublished":"2026-05-09T16:16:20+00:00","mainEntityOfPage":{"@id":"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/"},"wordCount":671,"commentCount":0,"publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"image":{"@id":"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/#primaryimage"},"thumbnailUrl":"https:\/\/s.w.org\/images\/core\/emoji\/17.0.2\/72x72\/2705.png","articleSection":["AI","Committee","News","Uncategorized"],"inLanguage":"it-IT","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/","url":"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/","name":"How to Build a Single-Cell RNA-seq Analysis Pipeline with Scanpy for PBMC Clustering, Annotation, and Trajectory Discovery - YouZum","isPartOf":{"@id":"https:\/\/yousum.gpucore.co\/#website"},"primaryImageOfPage":{"@id":"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/#primaryimage"},"image":{"@id":"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/#primaryimage"},"thumbnailUrl":"https:\/\/s.w.org\/images\/core\/emoji\/17.0.2\/72x72\/2705.png","datePublished":"2026-05-09T16:16:20+00:00","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","breadcrumb":{"@id":"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/#breadcrumb"},"inLanguage":"it-IT","potentialAction":[{"@type":"ReadAction","target":["https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/"]}]},{"@type":"ImageObject","inLanguage":"it-IT","@id":"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/#primaryimage","url":"https:\/\/s.w.org\/images\/core\/emoji\/17.0.2\/72x72\/2705.png","contentUrl":"https:\/\/s.w.org\/images\/core\/emoji\/17.0.2\/72x72\/2705.png"},{"@type":"BreadcrumbList","@id":"https:\/\/youzum.net\/how-to-build-a-single-cell-rna-seq-analysis-pipeline-with-scanpy-for-pbmc-clustering-annotation-and-trajectory-discovery\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/youzum.net\/"},{"@type":"ListItem","position":2,"name":"How to Build a Single-Cell RNA-seq Analysis Pipeline with Scanpy for PBMC Clustering, Annotation, and Trajectory Discovery"}]},{"@type":"WebSite","@id":"https:\/\/yousum.gpucore.co\/#website","url":"https:\/\/yousum.gpucore.co\/","name":"YouSum","description":"","publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/yousum.gpucore.co\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"it-IT"},{"@type":"Organization","@id":"https:\/\/yousum.gpucore.co\/#organization","name":"Drone Association Thailand","url":"https:\/\/yousum.gpucore.co\/","logo":{"@type":"ImageObject","inLanguage":"it-IT","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","width":300,"height":300,"caption":"Drone Association Thailand"},"image":{"@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DroneAssociationTH\/"]},{"@type":"Person","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c","name":"admin NU","image":{"@type":"ImageObject","inLanguage":"it-IT","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","caption":"admin NU"},"url":"https:\/\/youzum.net\/it\/members\/adminnu\/"}]}},"rttpg_featured_image_url":null,"rttpg_author":{"display_name":"admin NU","author_link":"https:\/\/youzum.net\/it\/members\/adminnu\/"},"rttpg_comment":0,"rttpg_category":"<a href=\"https:\/\/youzum.net\/it\/category\/ai-club\/\" rel=\"category tag\">AI<\/a> <a href=\"https:\/\/youzum.net\/it\/category\/committee\/\" rel=\"category tag\">Committee<\/a> <a href=\"https:\/\/youzum.net\/it\/category\/news\/\" rel=\"category tag\">News<\/a> <a href=\"https:\/\/youzum.net\/it\/category\/uncategorized\/\" rel=\"category tag\">Uncategorized<\/a>","rttpg_excerpt":"In this tutorial, we perform an advanced single-cell RNA-seq analysis workflow using Scanpy on the PBMC-3k benchmark dataset. We start by loading the dataset, inspecting its structure, and applying quality control checks to evaluate gene counts, total counts, mitochondrial content, and ribosomal gene signals. We then filter low-quality cells and genes, detect potential doublets with&hellip;","_links":{"self":[{"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/posts\/89194","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/comments?post=89194"}],"version-history":[{"count":0,"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/posts\/89194\/revisions"}],"wp:attachment":[{"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/media?parent=89194"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/categories?post=89194"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/tags?post=89194"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}