In the previous vignettes, NICHES was mainly demonstrated with Seurat framework. Here in this vignette, we demonstrate the interoperability of NICHES with other popular scRNA-seq frameworks, namely, Scanpy, Monocle3, and scater.

First, let’s start with Scanpy and use the pbmc3k dataset as a demonstration. Scanpy is python-based and uses AnnData as the primary data strcture. We will demonstrate how to extract the data from Scanpy to run NICHES, convert the NICHES output to AnnData, and use functions from Scanpy to perform downstream analysis.

Interperobility with Scanpy

Load Required Packages

Here besides NICHES, we will also load ALRA for imputation and reticulate for running python.

library(NICHES)

# Load ALRA
devtools::source_url('https://raw.githubusercontent.com/KlugerLab/ALRA/master/alra.R')

# Set up reticulate and the python env
library(reticulate)
# change to your corresponding python path
Sys.setenv(RETICULATE_PYTHON = "/Library/Frameworks/Python.framework/Versions/3.10/bin/python3.10")

We will also need a few python packages, in particular, scanpy.

Load the data and preprocess

Now let’s load the pbmc3k data from scanpy directly. Here we load both the raw and processed pbmc3k. The former will be used to run NICHES after a few processing steps and we will use the latter to add cell annotations.

# load the raw pbmc3k and the processed pbmc3k
adata = sc.datasets.pbmc3k()
#> WARNING: Failed to open the url with default certificates, trying with certifi.
#> 
  0%|          | 0.00/5.58M [00:00<?, ?B/s]
  1%|          | 32.0k/5.58M [00:00<00:28, 203kB/s]
  2%|1         | 88.0k/5.58M [00:00<00:17, 329kB/s]
  3%|3         | 192k/5.58M [00:00<00:20, 278kB/s] 
  7%|6         | 384k/5.58M [00:00<00:09, 547kB/s]
  8%|8         | 480k/5.58M [00:00<00:08, 618kB/s]
 11%|#         | 624k/5.58M [00:01<00:07, 681kB/s]
 12%|#2        | 688k/5.58M [00:01<00:07, 669kB/s]
 16%|#5        | 896k/5.58M [00:01<00:05, 874kB/s]
 21%|##        | 1.17M/5.58M [00:01<00:03, 1.35MB/s]
 23%|##3       | 1.29M/5.58M [00:01<00:03, 1.30MB/s]
 25%|##5       | 1.42M/5.58M [00:01<00:03, 1.25MB/s]
 28%|##8       | 1.57M/5.58M [00:01<00:03, 1.24MB/s]
 31%|###       | 1.71M/5.58M [00:02<00:03, 1.21MB/s]
 33%|###3      | 1.86M/5.58M [00:02<00:03, 1.21MB/s]
 36%|###5      | 2.01M/5.58M [00:02<00:03, 1.21MB/s]
 39%|###8      | 2.16M/5.58M [00:02<00:02, 1.23MB/s]
 41%|####1     | 2.31M/5.58M [00:02<00:02, 1.23MB/s]
 44%|####4     | 2.48M/5.58M [00:02<00:02, 1.25MB/s]
 47%|####7     | 2.62M/5.58M [00:02<00:02, 1.24MB/s]
 50%|####9     | 2.77M/5.58M [00:02<00:02, 1.30MB/s]
 51%|#####1    | 2.88M/5.58M [00:03<00:02, 1.22MB/s]
 53%|#####3    | 2.96M/5.58M [00:03<00:02, 1.10MB/s]
 56%|#####6    | 3.13M/5.58M [00:03<00:02, 1.15MB/s]
 58%|#####7    | 3.23M/5.58M [00:03<00:02, 1.10MB/s]
 60%|######    | 3.35M/5.58M [00:03<00:02, 990kB/s] 
 63%|######3   | 3.52M/5.58M [00:03<00:01, 1.11MB/s]
 66%|######5   | 3.68M/5.58M [00:03<00:02, 896kB/s] 
 69%|######8   | 3.85M/5.58M [00:04<00:01, 998kB/s]
 71%|#######   | 3.95M/5.58M [00:04<00:01, 880kB/s]
 71%|#######1  | 3.99M/5.58M [00:04<00:02, 780kB/s]
 75%|#######5  | 4.21M/5.58M [00:04<00:01, 1.07MB/s]
 77%|#######7  | 4.32M/5.58M [00:04<00:01, 1.03MB/s]
 80%|#######9  | 4.45M/5.58M [00:05<00:01, 618kB/s] 
 82%|########1 | 4.57M/5.58M [00:05<00:01, 678kB/s]
 84%|########4 | 4.70M/5.58M [00:05<00:01, 710kB/s]
 84%|########4 | 4.71M/5.58M [00:05<00:01, 606kB/s]
 88%|########8 | 4.92M/5.58M [00:05<00:00, 890kB/s]
 90%|########9 | 5.02M/5.58M [00:05<00:01, 565kB/s]
 92%|#########1| 5.13M/5.58M [00:06<00:00, 624kB/s]
 94%|#########3| 5.23M/5.58M [00:06<00:00, 676kB/s]
 96%|#########5| 5.34M/5.58M [00:06<00:00, 715kB/s]
 98%|#########7| 5.45M/5.58M [00:06<00:00, 750kB/s]
100%|#########9| 5.56M/5.58M [00:06<00:00, 761kB/s]
100%|##########| 5.58M/5.58M [00:06<00:00, 886kB/s]
adata2 = sc.datasets.pbmc3k_processed()
#> WARNING: Failed to open the url with default certificates, trying with certifi.
#> 
  0%|          | 0.00/23.5M [00:00<?, ?B/s]
  1%|          | 136k/23.5M [00:00<00:18, 1.34MB/s]
  2%|1         | 368k/23.5M [00:00<00:13, 1.80MB/s]
  3%|2         | 640k/23.5M [00:00<00:11, 2.14MB/s]
  4%|4         | 968k/23.5M [00:00<00:09, 2.57MB/s]
  5%|5         | 1.24M/23.5M [00:00<00:09, 2.43MB/s]
  7%|6         | 1.57M/23.5M [00:00<00:08, 2.71MB/s]
  8%|8         | 1.90M/23.5M [00:00<00:07, 2.88MB/s]
 10%|9         | 2.24M/23.5M [00:00<00:07, 3.07MB/s]
 11%|#         | 2.48M/23.5M [00:00<00:07, 2.88MB/s]
 12%|#2        | 2.88M/23.5M [00:01<00:06, 3.25MB/s]
 14%|#3        | 3.26M/23.5M [00:01<00:06, 3.44MB/s]
 15%|#5        | 3.55M/23.5M [00:01<00:06, 3.34MB/s]
 16%|#6        | 3.87M/23.5M [00:01<00:06, 3.28MB/s]
 18%|#7        | 4.21M/23.5M [00:01<00:06, 3.33MB/s]
 19%|#9        | 4.57M/23.5M [00:01<00:05, 3.43MB/s]
 21%|##        | 4.87M/23.5M [00:01<00:05, 3.33MB/s]
 22%|##2       | 5.21M/23.5M [00:01<00:05, 3.41MB/s]
 23%|##3       | 5.49M/23.5M [00:01<00:05, 3.25MB/s]
 25%|##4       | 5.85M/23.5M [00:02<00:05, 3.33MB/s]
 26%|##6       | 6.12M/23.5M [00:02<00:05, 3.14MB/s]
 28%|##7       | 6.48M/23.5M [00:02<00:05, 3.30MB/s]
 29%|##8       | 6.77M/23.5M [00:02<00:05, 3.15MB/s]
 30%|###       | 7.09M/23.5M [00:02<00:05, 3.17MB/s]
 31%|###1      | 7.37M/23.5M [00:02<00:05, 3.04MB/s]
 33%|###2      | 7.71M/23.5M [00:02<00:05, 3.19MB/s]
 34%|###4      | 8.01M/23.5M [00:02<00:05, 3.16MB/s]
 36%|###5      | 8.38M/23.5M [00:02<00:04, 3.38MB/s]
 37%|###7      | 8.73M/23.5M [00:02<00:04, 3.41MB/s]
 39%|###8      | 9.07M/23.5M [00:03<00:04, 3.46MB/s]
 40%|####      | 9.41M/23.5M [00:03<00:04, 3.49MB/s]
 42%|####1     | 9.77M/23.5M [00:03<00:04, 3.53MB/s]
 43%|####2     | 10.1M/23.5M [00:03<00:04, 3.49MB/s]
 44%|####4     | 10.4M/23.5M [00:03<00:04, 3.23MB/s]
 45%|####4     | 10.6M/23.5M [00:03<00:04, 2.77MB/s]
 46%|####5     | 10.8M/23.5M [00:03<00:05, 2.57MB/s]
 47%|####6     | 11.0M/23.5M [00:03<00:05, 2.52MB/s]
 48%|####7     | 11.2M/23.5M [00:03<00:05, 2.27MB/s]
 48%|####8     | 11.4M/23.5M [00:04<00:06, 2.09MB/s]
 49%|####9     | 11.6M/23.5M [00:04<00:05, 2.12MB/s]
 50%|#####     | 11.8M/23.5M [00:04<00:06, 1.98MB/s]
 51%|#####1    | 12.1M/23.5M [00:04<00:05, 2.28MB/s]
 53%|#####2    | 12.4M/23.5M [00:04<00:04, 2.61MB/s]
 54%|#####4    | 12.8M/23.5M [00:04<00:03, 2.93MB/s]
 56%|#####5    | 13.1M/23.5M [00:04<00:03, 3.08MB/s]
 57%|#####7    | 13.4M/23.5M [00:04<00:03, 3.16MB/s]
 59%|#####8    | 13.8M/23.5M [00:04<00:03, 3.31MB/s]
 60%|######    | 14.1M/23.5M [00:04<00:02, 3.39MB/s]
 62%|######1   | 14.5M/23.5M [00:05<00:02, 3.45MB/s]
 63%|######2   | 14.8M/23.5M [00:05<00:02, 3.28MB/s]
 64%|######4   | 15.1M/23.5M [00:05<00:02, 3.34MB/s]
 65%|######5   | 15.4M/23.5M [00:05<00:02, 3.22MB/s]
 67%|######6   | 15.7M/23.5M [00:05<00:02, 2.92MB/s]
 69%|######8   | 16.1M/23.5M [00:05<00:02, 3.38MB/s]
 70%|######9   | 16.4M/23.5M [00:05<00:02, 3.30MB/s]
 71%|#######1  | 16.7M/23.5M [00:05<00:02, 3.09MB/s]
 72%|#######2  | 17.0M/23.5M [00:05<00:02, 3.04MB/s]
 74%|#######3  | 17.3M/23.5M [00:05<00:02, 3.16MB/s]
 75%|#######5  | 17.7M/23.5M [00:06<00:01, 3.16MB/s]
 76%|#######6  | 17.9M/23.5M [00:06<00:01, 2.97MB/s]
 77%|#######7  | 18.2M/23.5M [00:06<00:01, 2.83MB/s]
 79%|#######8  | 18.5M/23.5M [00:06<00:01, 3.08MB/s]
 80%|########  | 18.9M/23.5M [00:06<00:01, 3.22MB/s]
 81%|########1 | 19.1M/23.5M [00:06<00:01, 3.12MB/s]
 83%|########2 | 19.5M/23.5M [00:06<00:01, 3.21MB/s]
 84%|########4 | 19.8M/23.5M [00:06<00:01, 3.06MB/s]
 85%|########5 | 20.0M/23.5M [00:06<00:01, 2.89MB/s]
 86%|########6 | 20.2M/23.5M [00:07<00:01, 2.62MB/s]
 87%|########6 | 20.4M/23.5M [00:07<00:01, 2.26MB/s]
 88%|########7 | 20.6M/23.5M [00:07<00:01, 2.24MB/s]
 88%|########8 | 20.8M/23.5M [00:07<00:01, 2.08MB/s]
 89%|########9 | 20.9M/23.5M [00:07<00:01, 1.92MB/s]
 90%|######### | 21.2M/23.5M [00:07<00:01, 2.01MB/s]
 91%|#########1| 21.5M/23.5M [00:07<00:00, 2.47MB/s]
 93%|#########2| 21.8M/23.5M [00:07<00:00, 2.74MB/s]
 94%|#########4| 22.2M/23.5M [00:07<00:00, 2.97MB/s]
 96%|#########5| 22.5M/23.5M [00:07<00:00, 3.10MB/s]
 97%|#########7| 22.9M/23.5M [00:08<00:00, 3.25MB/s]
 99%|#########8| 23.2M/23.5M [00:08<00:00, 3.33MB/s]
100%|#########9| 23.5M/23.5M [00:08<00:00, 3.12MB/s]
100%|##########| 23.5M/23.5M [00:08<00:00, 2.97MB/s]

Subset to only ‘B cells’,‘CD4 T cells’,‘CD14+ Monocytes’

Here for demonstration, For let’s focus in on signaling within a simple 3-cell system:

Extract the meta data and impute the data to run NICHES

Here we will extract the cell types metadata from anndata and impute the data to run NICHES. In particular, data.imputed will the data matrix to run NICHES.

meta.data.df <- data.frame("louvain" = py$adata$obs$louvain,row.names = py$cell_names)

# alra imputation
imputed.list <- alra(as.matrix(py$adata$X$toarray()))
#> Read matrix with 1966 cells and 13714 genes
#> Chose k=16
#> Getting nonzeros
#> Randomized SVD
#> Find the 0.001000 quantile of each gene
#> Sweep
#> Scaling all except for 393 columns
#> 0.00% of the values became negative in the scaling process and were set to zero
#> The matrix went from 5.89% nonzero to 40.75% nonzero
data.imputed <- t(imputed.list[[3]])

colnames(data.imputed) <- py$cell_names
rownames(data.imputed) <- py$gene_names

Run NICHES

Here we will take the previously imputed data matrix as input to run NICHES and specify output_format to be “raw” for easier conversion to AnnData.

niches_list <- RunNICHES(data.imputed,
                 meta.data.df = meta.data.df,
                 cell_types = "louvain",
                 species = 'human',
                 LR.database = 'fantom5',
                 CellToCell = T,output_format = "raw"
                 )
lr_names <- rownames(niches_list$CellToCell$CellToCellMatrix)

Process the output from NICHES using Scanpy

We will use the Scanpy functions to perform downstream analysis. Here we will scale the data, do PCA and UMAP.

Visualize CellToCell matrix

We can use the visualization function from Scanpy to visualize the data in UMAP.

Interporability with Monocle3 (CellDataSet)

Here we will continue to use this pbmc3k dataset, but we will switch to Monocle3 (CellDataSet). This will be a simple demonstration of how to convert the output from NICHES to CellDataSet and use certain functions from Monocle3 to perform downstream analysis.

Convert the output from NICHES to CellDataSet

Here we use the same output from NICHES from previous run and convert it to CellDataSet.

library(monocle3)
lr_metadata <- data.frame(lr_names)
rownames(lr_metadata) <- lr_names
cds <- new_cell_data_set(niches_list$CellToCell$CellToCellMatrix,
                         cell_metadata = niches_list$CellToCell$metadata,
                         gene_metadata = lr_metadata)

Data Processing

Then we will process the data using Monocle3 and run PCA and UMAP on it.

# use all genes for pca
cds <- preprocess_cds(cds, num_dim = 30,
                      method = "PCA",norm_method = "none",scaling = TRUE,verbose = TRUE)
cds <- reduce_dimension(cds)

Data Visualization

We can visualize the data on UMAP (Cells are colored by VectorTypes).

plot_cells(cds, color_cells_by="VectorType")

Interporability with Scater (SingleCellExperiment)

Here we will continue to use this pbmc3k dataset, but we will switch to Scater (SingleCellExperiment) and utilize certain functions to perform downstream analysis.

Convert the output from NICHES to SingleCellExperiment

library(scater)
sce <- SingleCellExperiment(
    assays = list(data = niches_list$CellToCell$CellToCellMatrix), 
    colData = niches_list$CellToCell$metadata,
    rowData = lr_metadata
)

RunPCA on the ouput

sce <- runPCA(sce,exprs_values="data")

Visualize Ligand-receptor interactions using violin plot

plotExpression(sce,features = "GNAI2—C5AR1",exprs_values = "data",x = "VectorType",colour_by="VectorType")+
  theme(axis.text.x = element_text(angle = 60, vjust = 1, hjust=1))

Visualize the output on UMAP

sce <- runUMAP(sce,exprs_values="data")
plotUMAP(sce, colour_by = "VectorType")