Introduction

In this vignette we describe how to generate a SingleCellExperiment object combining observed values and clustering results for a data set from the DuoClustering2018 package, and how the resulting object can be explored and visualized with the iSEE package (Rue-Albrecht et al. 2018).

Load the necessary packages

suppressPackageStartupMessages({
  library(SingleCellExperiment)
  library(DuoClustering2018)
  library(dplyr)
  library(tidyr)
})

Retrieve a data set

The different ways of retrieving a data set from the package are described in the plot_performance vignette. Here, we will load a data set using the shortcut function provided in the package.

## see ?DuoClustering2018 and browseVignettes('DuoClustering2018') for documentation
## loading from cache
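
The code chunk that produced the messages above does not appear in this rendering. A minimal sketch of what the call may look like is given below. It assumes the Koh data set (the sample names in the column data printed later are consistent with it) and the filteredExpr10 gene filtering; the filtering is an assumption, and the other sce_* shortcut functions in the package are used in the same way.

```r
# Sketch only: the choice of data set and gene filtering is an assumption.
if (requireNamespace("DuoClustering2018", quietly = TRUE)) {
  library(DuoClustering2018)
  # Shortcut function: fetches the object from ExperimentHub on first use
  # and loads it from the local cache afterwards
  dat <- sce_filteredExpr10_Koh()
  print(dat)
}
```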

Read a set of clustering results

For this data set, we also load a set of clustering results obtained using different clustering methods.

## see ?DuoClustering2018 and browseVignettes('DuoClustering2018') for documentation
## loading from cache
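
As for the data set, the chunk that produced these messages does not appear in this rendering. The sketch below shows what the call may look like, assuming the clustering summary for the Koh data set with filteredExpr10 filtering and version suffix v2; both choices are assumptions.

```r
# Sketch only: the data set, filtering and version suffix are assumptions.
if (requireNamespace("DuoClustering2018", quietly = TRUE)) {
  library(DuoClustering2018)
  res <- clustering_summary_filteredExpr10_Koh_v2()
  # Long-format table; the columns used in the next section are
  # cell, method, run, k, resolution and cluster
  print(head(res))
}
```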

Merge data and clustering results

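We add the cluster labels from a single run (run 1) and for a set of imposed numbers of clusters (k = 3, 5 and 9) to the column data of the data set. For each method and k, only the labels obtained with the first listed resolution value (where one applies) are kept, so that every cell has at most one label per method and k.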

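# Keep run 1 and the selected numbers of clusters
res <- res %>% dplyr::filter(run == 1 & k %in% c(3, 5, 9)) %>%
  # Where a resolution is recorded, keep only the first one per method and k,
  # so that each cell has a single label for that combination
  dplyr::group_by(method, k) %>%
  dplyr::filter(is.na(resolution) | resolution == resolution[1]) %>%
  dplyr::ungroup() %>%
  # Combine method and k into one key, e.g. "SC3_3"
  tidyr::unite(col = method_k, method, k, sep = "_", remove = TRUE) %>%
  # Reshape to one row per cell and one column per method/k combination
  dplyr::select(cell, method_k, cluster) %>%
  tidyr::spread(key = method_k, value = cluster)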
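
To make the effect of this pipeline concrete, the following sketch applies the same steps to a small, hypothetical table (the cell names, method names and labels are invented for illustration). Method "B" reaches k = 3 at two different resolutions, which shows why the resolution filter is needed before reshaping.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy results: two cells, two methods, run 1 only.
# Method "B" yields k = 3 at two different resolutions.
toy <- data.frame(
  cell       = c("c1", "c2", "c1", "c2", "c1", "c2"),
  method     = c("A",  "A",  "B",  "B",  "B",  "B"),
  run        = 1,
  k          = 3,
  resolution = c(NA,   NA,   0.5,  0.5,  0.8,  0.8),
  cluster    = c("1",  "2",  "1",  "1",  "2",  "1"),
  stringsAsFactors = FALSE
)

wide <- toy %>%
  dplyr::filter(run == 1 & k %in% c(3, 5, 9)) %>%
  dplyr::group_by(method, k) %>%
  # Without this step, method "B" would contribute two rows per cell
  dplyr::filter(is.na(resolution) | resolution == resolution[1]) %>%
  dplyr::ungroup() %>%
  tidyr::unite(col = method_k, method, k, sep = "_", remove = TRUE) %>%
  dplyr::select(cell, method_k, cluster) %>%
  tidyr::spread(key = method_k, value = cluster)

# One row per cell, one column per method/k combination (A_3, B_3)
print(wide)
```

In current versions of tidyr, spread() is superseded by pivot_wider(names_from = method_k, values_from = cluster). By default the latter orders the new columns by first appearance rather than alphabetically, so the column order may differ from the output shown in this vignette.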

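# Join the cluster labels onto the cell annotations, matching the Run column
# of the column data to the cell column of the clustering results
colData(dat) <- DataFrame(
  as.data.frame(colData(dat)) %>%
    dplyr::left_join(res, by = c("Run" = "cell"))
)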
head(colData(dat))
## DataFrame with 6 rows and 55 columns
##           Run LibraryName     phenoid libsize.drop feature.drop total_features
##   <character> <character> <character>    <logical>    <logical>      <integer>
## 1  SRR3952323      H7hESC      H7hESC        FALSE        FALSE           4895
## 2  SRR3952325      H7hESC      H7hESC        FALSE        FALSE           4887
## 3  SRR3952326      H7hESC      H7hESC        FALSE        FALSE           4888
## 4  SRR3952327      H7hESC      H7hESC        FALSE        FALSE           4879
## 5  SRR3952328      H7hESC      H7hESC        FALSE        FALSE           4873
## 6  SRR3952329      H7hESC      H7hESC        FALSE        FALSE           4893
##   log10_total_features total_counts log10_total_counts
##              <numeric>    <numeric>          <numeric>
## 1              3.68984      2248411            6.35188
## 2              3.68913      2271617            6.35634
## 3              3.68922       584682            5.76692
## 4              3.68842      3191810            6.50404
## 5              3.68789      2190385            6.34052
## 6              3.68966      2187289            6.33991
##   pct_counts_top_50_features pct_counts_top_100_features
##                    <numeric>                   <numeric>
## 1                    18.2790                     25.9754
## 2                    24.6725                     32.2228
## 3                    22.7328                     30.2060
## 4                    20.8674                     29.0039
## 5                    21.2879                     29.4237
## 6                    20.5931                     27.7401
##   pct_counts_top_200_features pct_counts_top_500_features is_cell_control
##                     <numeric>                   <numeric>       <logical>
## 1                     35.5376                     52.4109           FALSE
## 2                     41.5474                     57.9692           FALSE
## 3                     39.4313                     55.2858           FALSE
## 4                     38.7856                     56.0209           FALSE
## 5                     39.3077                     56.6410           FALSE
## 6                     36.7819                     52.7547           FALSE
##   sizeFactor    ascend_3    ascend_5    ascend_9      CIDR_3      CIDR_5
##    <numeric> <character> <character> <character> <character> <character>
## 1   1.889865           1          NA          NA           1           1
## 2   1.810539           1          NA          NA           1           1
## 3   0.486899           1          NA          NA           1           1
## 4   2.562950           1          NA          NA           1           1
## 5   1.848037           1          NA          NA           1           1
## 6   1.897451           1          NA          NA           1           1
##        CIDR_9   FlowSOM_3   FlowSOM_5   FlowSOM_9   monocle_3   monocle_5
##   <character> <character> <character> <character> <character> <character>
## 1           1           2           2           4           3           3
## 2           1           2           2           4           3           3
## 3           1           2           2           4           3           3
## 4           1           2           2           4           3           3
## 5           1           2           2           4           3           3
## 6           1           2           2           4           3           3
##     monocle_9     PCAHC_3     PCAHC_5     PCAHC_9 PCAKmeans_3 PCAKmeans_5
##   <character> <character> <character> <character> <character> <character>
## 1           3           1           1           1           3           1
## 2           3           1           1           1           3           1
## 3           3           1           1           1           3           1
## 4           3           1           1           1           3           1
## 5           3           1           1           1           3           1
## 6           3           1           1           1           3           1
##   PCAKmeans_9 pcaReduce_3 pcaReduce_5 pcaReduce_9   RaceID2_3   RaceID2_5
##   <character> <character> <character> <character> <character> <character>
## 1           4           1           5           5           1           1
## 2           4           1           5           5           2           2
## 3           4           1           5           5           2           2
## 4           4           1           5           5           1           1
## 5           4           1           5           5           1           1
## 6           4           1           5           5           1           2
##     RaceID2_9 RtsneKmeans_3 RtsneKmeans_5 RtsneKmeans_9      SAFE_3      SAFE_5
##   <character>   <character>   <character>   <character> <character> <character>
## 1           1             1             1             9           2           1
## 2           2             1             1             9           2           1
## 3           2             1             1             9           2           1
## 4           1             1             1             9           2           1
## 5           1             1             1             9           2           1
## 6           2             1             1             9           2           1
##        SAFE_9       SC3_3       SC3_5       SC3_9    SC3svm_3    SC3svm_5
##   <character> <character> <character> <character> <character> <character>
## 1           3           1           3           4           3           3
## 2           5           1           3           4           3           3
## 3           3           1           3           4           3           3
## 4           5           1           3           4           3           3
## 5           5           1           3           4           3           3
## 6           5           1           3           4           3           3
##      SC3svm_9    Seurat_9     TSCAN_3     TSCAN_5     TSCAN_9
##   <character> <character> <character> <character> <character>
## 1           3           5           1           1           1
## 2           3           5           1           1           2
## 3           3           5           3           3           2
## 4           3           5           1           1           1
## 5           3           5           2           2           2
## 6           3           5           1           1           1
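
The join used above matches the Run column of the column data against the cell column of the reshaped clustering results. The sketch below, using hypothetical toy tables, illustrates two properties this step relies on: left_join() keeps the rows of the left table in their original order, and cells without a matching clustering result receive NA. With at most one match per cell, the number of rows is unchanged, which is needed for the result to be assigned back as column data.

```r
library(dplyr)

# Hypothetical cell annotations (left table) and cluster labels (right table)
cells <- data.frame(
  Run     = c("c1", "c2", "c3"),
  phenoid = c("a",  "a",  "b"),
  stringsAsFactors = FALSE
)
labels <- data.frame(
  cell = c("c3", "c1"),   # different order, and no entry for "c2"
  A_3  = c("2",  "1"),
  stringsAsFactors = FALSE
)

merged <- dplyr::left_join(cells, labels, by = c("Run" = "cell"))

# Row order and row count follow the left table; "c2" gets NA for A_3
print(merged)
```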

Visualize with iSEE

The resulting SingleCellExperiment can be interactively explored using, e.g., the iSEE package. This can be useful to gain additional understanding of the partitions inferred by the different clustering methods, to visualize these in low-dimensional representations (PCA or t-SNE), and to investigate how well they agree with known or inferred groupings of the cells.

if (require(iSEE)) {
  iSEE(dat)
}
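
Agreement between a clustering and a known grouping can also be checked numerically alongside the interactive exploration. The sketch below cross-tabulates two label vectors with base R. The vectors are hypothetical stand-ins for columns such as phenoid and SC3_3 in colData(dat); the group names other than H7hESC are invented.

```r
# Hypothetical labels standing in for colData(dat)$phenoid and one
# clustering column such as colData(dat)$SC3_3
truth   <- c("H7hESC", "H7hESC", "groupB", "groupB", "groupC", "groupC")
cluster <- c("1",      "1",      "2",      "2",      "2",      "3")

# Rows: known groups, columns: inferred clusters
tab <- table(truth = truth, cluster = cluster)
print(tab)

# Fraction of cells that fall in the largest cluster of their known group
frac_majority <- sum(apply(tab, 1, max)) / sum(tab)
print(frac_majority)
```

A table concentrated in one cell per row indicates that each known group is largely captured by a single cluster; this summary does not penalize clusters that merge several groups, so it is only a rough first check.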

Session info

## R Under development (unstable) (2024-12-20 r87452)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sonoma 14.7.2
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: UTC
## tzcode source: internal
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] tidyr_1.3.1                 dplyr_1.1.4                
##  [3] DuoClustering2018_1.7.1     SingleCellExperiment_1.29.1
##  [5] SummarizedExperiment_1.37.0 Biobase_2.67.0             
##  [7] GenomicRanges_1.59.1        GenomeInfoDb_1.43.2        
##  [9] IRanges_2.41.2              S4Vectors_0.45.2           
## [11] BiocGenerics_0.53.3         generics_0.1.3             
## [13] MatrixGenerics_1.19.0       matrixStats_1.4.1          
## [15] BiocStyle_2.35.0           
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_1.2.1        viridisLite_0.4.2       blob_1.2.4             
##  [4] Biostrings_2.75.3       filelock_1.0.3          viridis_0.6.5          
##  [7] fastmap_1.2.0           BiocFileCache_2.15.0    digest_0.6.37          
## [10] mime_0.12               lifecycle_1.0.4         KEGGREST_1.47.0        
## [13] RSQLite_2.3.9           magrittr_2.0.3          compiler_4.5.0         
## [16] rlang_1.1.4             sass_0.4.9              tools_4.5.0            
## [19] yaml_2.3.10             knitr_1.49              S4Arrays_1.7.1         
## [22] htmlwidgets_1.6.4       bit_4.5.0.1             mclust_6.1.1           
## [25] curl_6.0.1              DelayedArray_0.33.3     plyr_1.8.9             
## [28] abind_1.4-8             withr_3.0.2             purrr_1.0.2            
## [31] desc_1.4.3              grid_4.5.0              ExperimentHub_2.15.0   
## [34] colorspace_2.1-1        ggplot2_3.5.1           scales_1.3.0           
## [37] cli_3.6.3               rmarkdown_2.29          crayon_1.5.3           
## [40] ragg_1.3.3              httr_1.4.7              reshape2_1.4.4         
## [43] DBI_1.2.3               cachem_1.1.0            stringr_1.5.1          
## [46] ggthemes_5.1.0          AnnotationDbi_1.69.0    BiocManager_1.30.25    
## [49] XVector_0.47.1          vctrs_0.6.5             Matrix_1.7-1           
## [52] jsonlite_1.8.9          bookdown_0.41           bit64_4.5.2            
## [55] systemfonts_1.1.0       jquerylib_0.1.4         glue_1.8.0             
## [58] pkgdown_2.1.1.9000      stringi_1.8.4           gtable_0.3.6           
## [61] BiocVersion_3.21.1      UCSC.utils_1.3.0        munsell_0.5.1          
## [64] tibble_3.2.1            pillar_1.10.0           rappdirs_0.3.3         
## [67] htmltools_0.5.8.1       GenomeInfoDbData_1.2.13 R6_2.5.1               
## [70] dbplyr_2.5.0            textshaping_0.4.1       evaluate_1.0.1         
## [73] lattice_0.22-6          AnnotationHub_3.15.0    png_0.1-8              
## [76] memoise_2.0.1           bslib_0.8.0             Rcpp_1.0.13-1          
## [79] gridExtra_2.3           SparseArray_1.7.2       xfun_0.49              
## [82] fs_1.6.5                pkgconfig_2.0.3

References

Rue-Albrecht, K, F Marini, C Soneson, and ATL Lun. 2018. “iSEE: Interactive SummarizedExperiment Explorer.” F1000Research 7: 741.