Single-cell paper notes (part 10): Computational challenges and opportunities in SRT data

Study notes, for reference only; any errors will be corrected.

Authors: Lyla Atta, Jean Fan

Journal: Nature Communications

Year: 2021


Table of contents

  • Computational challenges and opportunities in spatially resolved transcriptomic data analysis
  • abstract
  • New methods for new data
  • Analytical challenges and opportunities
  • Laying a foundation for the future

Computational challenges and opportunities in spatially resolved transcriptomic data analysis

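Spatially resolved transcriptomic data require new computational analysis methods to derive biological insights. Here, we comment on the associated computational challenges and highlight the role of standardized benchmarking metrics and data sharing in spurring innovation.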

abstract

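Advances in single-cell sequencing technologies have enabled high-throughput transcriptomic profiling of individual cells, allowing transcriptionally distinct cell types and cell states to be characterized and discovered. However, current protocols require cells to be dissociated from tissues, thereby losing potentially valuable spatial information that could inform how cell types and cell states are organized within tissues and how this organization ultimately affects phenotype and function. To preserve this spatial information, advances in imaging technologies have enabled high-throughput, in situ, targeted transcriptomic profiling of pre-selected RNAs at molecular and single-cell resolution. In addition, spatially resolved RNA capture and sequencing technologies have enabled untargeted, genome-wide transcriptomic profiling at a resolution of 10-100 µm. The suitability of each spatial transcriptomics technology for a particular biological question currently involves balancing experimental throughput against spatial resolution: current imaging-based technologies generally offer higher spatial resolution but lower experimental throughput, while current sequencing-based technologies generally offer higher experimental throughput but lower spatial resolution. Either way, all of the resulting large-scale spatially resolved transcriptomic data demand new computational methods that take advantage of this spatial information to derive biological insights.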

New methods for new data

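Because these high-throughput spatially resolved transcriptomic technologies are still in their infancy, new computational methods for analyzing the resulting data remain under active development. So far, computational methods based on Gaussian processes, generalized linear models, and spatial autocorrelation analysis have been developed to identify genes whose expression shows significant spatial variability. Some of these methods can also classify distinct patterns of spatial variation, such as linear or periodic gene expression, and identify spatial features such as gene expression hotspots. Identifying spatially variable genes in this way can provide insight into location-specific phenotypes as well as developmental and migration gradients. Spatial information can also improve the identification of putative cell–cell communication networks. With single-cell sequencing data, cell–cell communication inference has relied on identifying coordinated expression of known ligand–receptor pairs. Computational methods that add spatial information from spatially resolved transcriptomic data, using graph convolutional neural networks, optimal transport approaches, and spatial cross-correlation analysis, can narrow the candidates down to spatially colocalized ligand–receptor pairs, which may indicate autocrine or paracrine signaling. In addition, spatially resolved transcriptomic data with co-registered imaging data present additional sources of heterogeneity, such as morphological variation, that can be used for clustering, since differences in morphology may reflect differences in cell state or other functional phenotypes such as cell-cycle position, transformation, or invasiveness. Computational methods that incorporate spatial and morphological information alongside gene expression have been applied to further dissect the heterogeneity of single-cell populations and to identify clusters of single cells that are distinct not only transcriptionally but also morphologically and spatially. Although the above methods can be applied to spatially resolved transcriptomic data at both single-cell and multi-cellular pixel resolution, interpreting trends derived from multi-cellular pixel-resolution data requires accounting for possible confounding from the mixture of different cell types within each pixel.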
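To make the idea of spatial autocorrelation analysis concrete, here is a minimal sketch of Moran's I, one common autocorrelation statistic, computed with binary k-nearest-neighbour weights. The function name `morans_i`, the choice `k=4`, and the synthetic 6x6 grid are assumptions made for illustration; the paper does not prescribe a particular statistic or weighting scheme, and published tools add significance testing on top of a score like this.

```python
import math

def morans_i(coords, values, k=4):
    """Moran's I using binary weights on each spot's k nearest neighbours."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        return 0.0  # constant expression: no spatial structure to measure
    num = 0.0
    weight_sum = 0
    for i in range(n):
        # Rank all other spots by distance to spot i and keep the k closest.
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(coords[i], coords[j]))
        for j in others[:k]:
            num += dev[i] * dev[j]
            weight_sum += 1
    return (n / weight_sum) * (num / denom)

# Synthetic spots on a 6x6 grid (illustrative data, not from the paper).
grid = [(x, y) for x in range(6) for y in range(6)]
gradient_gene = [float(x) for x, _ in grid]                # smooth left-to-right gradient
checkerboard_gene = [float((x + y) % 2) for x, y in grid]  # alternating pattern

i_gradient = morans_i(grid, gradient_gene)
i_checker = morans_i(grid, checkerboard_gene)
print(i_gradient, i_checker)
```

A smoothly varying gene yields a positive score because neighbouring spots have similar expression, while the alternating pattern yields a negative score. Ranking genes by such a score is one simple way to shortlist candidate spatially variable genes.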

Analytical challenges and opportunities

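Data from spatially resolved transcriptomic technologies bring unique analytical challenges and opportunities. For spatially resolved transcriptomic data from in situ imaging-based technologies, independently identified RNA molecules must be aggregated into cells to enable transcriptomic analysis at single-cell resolution. Reliable cell segmentation is therefore needed to fully dissect the heterogeneity of single cells in their spatial context, as well as to probe their morphological features and characterize their intracellular variation. Integrating additional information, such as cellular transcriptional composition and prior knowledge of cell type-specific gene expression, can further improve segmentation performance, particularly for crowded but transcriptionally distinct cells. However, for cells with more complex morphologies, such as neurons, additional computational methods are still needed for reliable segmentation. Beyond ensuring more accurate estimates of gene counts per cell, reliable segmentation opens the door to further downstream computational methods that incorporate subcellular spatial information. For example, by accurately accounting for the subcellular location of RNA counts, such downstream methods can predict future cellular transcriptional states by inferring RNA velocity in situ, or characterize the subcellular spatial heterogeneity of RNA and its functional consequences.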
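The aggregation step described above, turning individually detected RNA molecules into per-cell counts, can be sketched as follows. This assumes cell centroids are already available from some segmentation step and uses nearest-centroid assignment with a distance cutoff, which is a deliberately crude stand-in for real cell segmentation. The function name `assign_molecules`, the `max_radius` parameter, and the example coordinates and gene names are all hypothetical.

```python
import math
from collections import Counter, defaultdict

def assign_molecules(molecules, centroids, max_radius):
    """Assign each (x, y, gene) molecule to the nearest cell centroid.

    Molecules farther than max_radius from every centroid are left unassigned.
    Returns (per-cell gene counts, number of unassigned molecules).
    """
    counts = defaultdict(Counter)
    unassigned = 0
    for x, y, gene in molecules:
        cell_id, dist = min(
            ((cid, math.dist((x, y), c)) for cid, c in centroids.items()),
            key=lambda pair: pair[1],
        )
        if dist <= max_radius:
            counts[cell_id][gene] += 1
        else:
            unassigned += 1
    return counts, unassigned

# Illustrative input: two cells and four detected molecules.
centroids = {"cell_a": (0.0, 0.0), "cell_b": (10.0, 0.0)}
molecules = [
    (1.0, 0.0, "Gad1"),
    (0.5, 0.5, "Gad1"),
    (9.0, 1.0, "Slc17a7"),
    (5.0, 20.0, "Gad1"),  # far from both cells
]
cell_counts, n_unassigned = assign_molecules(molecules, centroids, max_radius=3.0)
print(dict(cell_counts), n_unassigned)
```

Nearest-centroid assignment breaks down for exactly the cases the text highlights: crowded cells and cells with complex morphologies such as neurons, where distance to a centroid is a poor proxy for cell membership.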

Likewise, spatially resolved transcriptomic data from sequencing-based, pixel-resolution, spatially resolved RNA capture technologies present a different set of unique analytical challenges. In particular for technologies with larger pixel sizes, transcripts from multiple cells may be captured in each spatially resolved pixel. As such, each resulting spatially resolved transcriptomic profile may reflect multiple cells of different cell types, thereby hindering the identification of cell-type-specific spatial organizational patterns. To overcome this challenge, several computational methods have been developed to deconvolve cell-type mixtures within each multi-cellular spatially resolved pixel, often by integrating the cell-type-specific transcriptomic profiles derived from a suitable single-cell reference or by applying generative modeling approaches. Although these deconvolution approaches infer the proportional representation of cell types within multi-cellular pixels, additional methods are needed to further dissect the spatial organization of cell types and enable the inference of sub-pixel spatial information.
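As a rough illustration of reference-based deconvolution, the sketch below estimates cell-type proportions for a single pixel by non-negative least squares against reference expression profiles, solved with projected gradient descent. This is a simplified assumption about how such a method could work, not the approach of any specific published tool; the function name `deconvolve`, the three-gene reference profiles, and the cell-type labels are made up for the example.

```python
def deconvolve(pixel, reference, n_iter=2000):
    """Estimate cell-type proportions in one pixel by non-negative least squares.

    pixel: expression vector for the pixel.
    reference: dict mapping cell type -> expression profile (same gene order).
    """
    types = list(reference)
    profiles = [reference[t] for t in types]
    # Gram matrix and right-hand side of the least-squares normal equations.
    gram = [[sum(a * b for a, b in zip(p, q)) for q in profiles] for p in profiles]
    rhs = [sum(a * b for a, b in zip(p, pixel)) for p in profiles]
    # The trace bounds the largest eigenvalue, so 1/trace is a safe step size.
    step = 1.0 / sum(gram[i][i] for i in range(len(types)))
    weights = [0.0] * len(types)
    for _ in range(n_iter):
        grad = [sum(g * w for g, w in zip(row, weights)) - r
                for row, r in zip(gram, rhs)]
        # Gradient step followed by projection onto the non-negative orthant.
        weights = [max(0.0, w - step * g) for w, g in zip(weights, grad)]
    total = sum(weights)
    return {t: (w / total if total > 0 else 0.0) for t, w in zip(types, weights)}

# Hypothetical reference profiles over three genes.
reference = {
    "neuron": [10.0, 0.0, 1.0],
    "astrocyte": [0.0, 8.0, 1.0],
    "microglia": [1.0, 1.0, 9.0],
}
# A pixel constructed as 70% neuron and 30% astrocyte.
pixel = [7.0, 2.4, 1.0]
proportions = deconvolve(pixel, reference)
print(proportions)
```

The output is only a proportion per cell type for the pixel. As the text notes, it says nothing about where within the pixel each cell type sits, which is why sub-pixel inference needs additional methods.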

Still, additional computational methods for analyzing spatially resolved transcriptomic data are needed. Notably, although computational methods have been developed to identify and characterize spatial gene expression patterns, we find that additional methods to systematically characterize and statistically evaluate how such patterns relate to anatomical features of tissues such as blood vessels or organ borders are still needed to understand the relationship between structure and phenotype. Furthermore, current computational methods generally limit spatial analysis to individual tissue sections or multiple contiguous sections from the same sample. To analyze samples collected from different individuals, time points, or perturbations, we anticipate that additional computational methods will be needed to align samples to a common coordinate system, so that differences in spatial gene expression patterns and cellular organization can be compared and characterized.
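One simple building block for a common coordinate system is a rigid alignment between two tissue sections. The sketch below fits the least-squares rotation and translation in 2D from matched landmark pairs. It assumes the landmark correspondences are already known, which is often the hard part in practice, and it ignores scaling and non-rigid tissue deformation. The function name `fit_rigid_2d` and the example landmarks are hypothetical.

```python
import math

def fit_rigid_2d(source, target):
    """Least-squares rotation + translation mapping source landmarks onto target.

    source, target: equal-length lists of matched (x, y) landmarks.
    Returns (rotation angle in radians, function mapping a source point).
    """
    n = len(source)
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    tx = sum(p[0] for p in target) / n
    ty = sum(p[1] for p in target) / n
    dot = cross = 0.0
    for (ax, ay), (bx, by) in zip(source, target):
        # Work with centred coordinates so only the rotation remains.
        ax, ay, bx, by = ax - sx, ay - sy, bx - tx, by - ty
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)  # angle maximising the landmark overlap
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    def transform(point):
        x, y = point[0] - sx, point[1] - sy
        return (cos_t * x - sin_t * y + tx, sin_t * x + cos_t * y + ty)

    return theta, transform

# Illustrative landmarks: the target is the source rotated 30 degrees and shifted.
angle = math.radians(30)
source = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0), (3.0, 2.0)]
target = [(math.cos(angle) * x - math.sin(angle) * y + 5.0,
           math.sin(angle) * x + math.cos(angle) * y - 3.0) for x, y in source]

theta, to_common = fit_rigid_2d(source, target)
print(math.degrees(theta), to_common(source[1]))
```

Once a transform like `to_common` is fitted, every spot or cell coordinate in the source section can be mapped into the target section's frame before expression patterns are compared across samples.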

Laying a foundation for the future

As these spatially resolved transcriptomic technologies become more widely adopted, we anticipate that beyond the development of new computational methods for spatially informed data analysis, such computational methods must be implemented and made accessible as robust and usable software. This is needed to ensure that users can apply these technologies and analyze the resulting data effectively and efficiently. We believe the software developed to preprocess and analyze spatially resolved transcriptomic data should therefore adhere to best practices in open-source software development, such as providing adequate documentation of software functionality and maintaining responsive issue tracking.

Further, support mechanisms need to be made available to promote and incentivize such adherence. Adherence to such best practices will be especially critical to ensure that these technologies and tools are accessible to researchers with more limited computational expertise. Moreover, as these spatially resolved transcriptomic technologies and protocols are further developed to enable data collection from larger tissue sections with more genes and cells across more samples, analytical algorithms and software implementations that are scalable with respect to runtime and memory will also be critical to ensure that these technologies and tools are accessible to researchers with more limited computational resources.

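As current spatially resolved transcriptomic technologies continue to mature, we believe standardized metrics and benchmarks need to be established so that these technologies can be compared, particularly with respect to detection sensitivity, specificity, and capture efficiency. Such standardized metrics and benchmarks are important for understanding which technologies may be better suited to particular biological questions, such as those that require detecting lowly expressed genes or single-nucleotide variants. They will also facilitate the development of computational methods for unified analysis of data from multiple technologies. For spatially resolved transcriptomic data from in situ imaging-based technologies in particular, standardized metrics for reporting spot calling, gene identification, and gene-to-cell assignment have yet to be established. We anticipate that such specific standardized metrics will help reduce the propagation of errors into downstream analyses, where they could lead to incorrect biological interpretations. For example, errors in spot calling, gene identification, and cell segmentation may result in inaccurate cellular gene expression counts and, in turn, misidentification of cell types. One challenge in establishing a set of standardized metrics for in situ imaging-based spatially resolved transcriptomic data is the current lack of a unified preprocessing pipeline. Many current in situ imaging-based spatially resolved transcriptomic studies rely on in-house image preprocessing pipelines. Although unified preprocessing pipelines are being built, further work is needed to improve their ease of use, to provide functionality comparable to existing in-house pipelines while remaining flexible across existing technology platforms, and to demonstrate robustness and reproducibility across different use cases.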

In addition, we find that an accessible and centralized infrastructure is currently still needed for sharing spatially resolved transcriptomic data, in particular from in situ imaging-based technologies. Such an accessible and centralized infrastructure already exists for RNA-sequencing data. Further, it makes readily accessible not only the processed gene counts but also raw sequences, as well as metadata on the machines and organisms used to generate the data and metrics regarding the quality of the data such as base call quality scores. Establishing a similar infrastructure for spatially resolved transcriptomic data from in situ imaging-based technologies may prove to be more complex given the range of protocols and modalities that exist and the sheer size of the raw imaging data as well. However, establishing such an accessible data-sharing infrastructure will be especially important for accelerating the development of computational methods to analyze such spatially resolved transcriptomic data, as it ensures the availability of a wide range of data for method testing and enables the characterization of method performance with respect to data quality. We envision that additional discussion and collaboration from the community will be needed to establish the form of processed data and range of standardized metrics most useful for all invested parties, from those interested in developing new computational methods to those interested in further enhancing the technologies, and those interested in probing deeper into datasets for biological insights.

In conclusion, spatially resolved transcriptomic technologies offer an exciting new way of probing the intricate spatial mechanisms at play within tissue ecosystems. Computational methods are needed to enable the characterization of tissue heterogeneity using the high informational content data obtained from such spatially resolved transcriptomic technologies. Still, there remains a need for targeted perturbation, experimental validation, and investigation of generalizability to validate the insights gained from applying these computational methods. For example, although computational methods have been developed to integrate spatial and morphological information in single-cell clustering, further validation is needed to understand if new cell clusters identified through such integrative approaches represent meaningful functional heterogeneity. Furthermore, investigating the extent to which spatial and morphological characteristics of cells are independent of their gene expression can lend insights into other cell intrinsic and cell extrinsic factors that influence cell phenotype. Likewise, intracellular spatial heterogeneity and its functional consequences remain to be characterized. Ultimately, computational methods for analyzing spatially resolved transcriptomic data offer the potential to identify and characterize the heterogeneity of cells within their spatial contexts and contribute to important fundamental biological insights regarding how tissues are organized in both the healthy and diseased settings.

