A change point is the absolute difference in copy number state between two adjacent segments. For example, if one segment has copy number 4 and the adjacent segment has copy number 1, the change point is 3.
Arguments
- segs_df
dataframe. Sample copy number segments.
- first_seg_correction
string. How to handle the first segment on each chromosome: "ignore" (default), "diploid", or "cn_mode". See Details.
- sample_col
string. Name of column with cell/sample names.
- chrom_col
string. Name of column with chromosome names.
- cn_col
string. Name of column with segment copy number states.
- return
string. "values" (default) or "counts". "values" returns the observed change point values for each cell; "counts" returns how many of those values fall into each predefined category (see Details).
- ...
additional arguments passed on to segs_to_reads.
Details
Each change point is the absolute difference between a segment and its "left" neighbor, moving from base pair 1 to the end of a chromosome. For example, three segments with copy numbers 4, 1, 2 give the change points |1 - 4| = 3 and |2 - 1| = 1.
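The left-neighbor rule above can be sketched in a few lines of Python. This is only an illustration for one chromosome, not the package's implementation; the function name change_points and the plain-list input are assumptions made for the example.

```python
def change_points(copy_numbers):
    """Absolute difference between each segment and its left neighbor.

    copy_numbers: copy number states for one chromosome, ordered from
    base pair 1 to the chromosome end (hypothetical input format).
    The first segment has no left neighbor, so it yields no value here.
    """
    return [abs(right - left)
            for left, right in zip(copy_numbers, copy_numbers[1:])]


# The example from the text: segments 4, 1, 2
print(change_points([4, 1, 2]))  # [3, 1]
```

A chromosome with a single segment returns an empty list, which is why the first segment needs the separate handling described next.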
The first segment on a chromosome has no left neighbor. Drews et al. compared it to a hypothetical diploid segment (copy number 2), so if the first segment on a chromosome is 5, the change point would be 5 - 2 = 3. That works when the base genome is diploid, but less well for other ploidies, or in cases where you don't want to assume a diploid baseline.
first_seg_correction controls how the first segment is handled: "diploid" uses the Drews et al. approach, "cn_mode" compares the first segment to a sample ploidy estimate based on the mode, and "ignore" does not count anything for first segments.
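As a rough sketch of the three options, the Python snippet below computes the first-segment value for one chromosome. It is not the package's code: the function name, the argument names, taking the absolute difference for the first segment, and estimating the mode as the most frequent copy number across the sample's segments (unweighted by segment length) are all assumptions made for illustration.

```python
from statistics import mode


def first_segment_change_point(first_cn, sample_cns, correction="ignore"):
    """Change point for the first segment on a chromosome, or None.

    first_cn:   copy number of the first segment on the chromosome
    sample_cns: copy numbers of all segments in the sample, used only
                for "cn_mode" (assumed: unweighted mode of the states)
    """
    if correction == "ignore":
        return None                 # first segments contribute nothing
    if correction == "diploid":
        baseline = 2                # hypothetical diploid neighbor
    elif correction == "cn_mode":
        baseline = mode(sample_cns)  # assumed ploidy estimate
    else:
        raise ValueError(f"unknown correction: {correction!r}")
    return abs(first_cn - baseline)


sample = [5, 3, 3, 3, 4]
print(first_segment_change_point(5, sample, "diploid"))  # 3
print(first_segment_change_point(5, sample, "cn_mode"))  # 2
print(first_segment_change_point(5, sample, "ignore"))   # None
```

With a sample whose most common state is 3, the same first segment of 5 scores 3 against a diploid baseline but only 2 against the mode, which shows why the choice matters for non-diploid samples.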
Predefined categories for summarizing counts are:
Changepoint = 1, 2, 3, 4, 5+
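The "counts" summary can be pictured as binning the observed values into these five categories. The Python sketch below is illustrative only; the function name, the dictionary output, and the decision to skip values below 1 (which match none of the listed categories) are assumptions, and it expects integer change points.

```python
from collections import Counter

CATEGORIES = ["1", "2", "3", "4", "5+"]


def count_change_points(values):
    """Count integer change point values per predefined category."""
    counts = Counter()
    for value in values:
        if value < 1:
            continue  # assumption: values below 1 fit no category
        counts["5+" if value >= 5 else str(value)] += 1
    # Report every category, including those with zero observations
    return {category: counts[category] for category in CATEGORIES}


print(count_change_points([3, 1, 1, 7, 5]))
# {'1': 2, '2': 0, '3': 1, '4': 0, '5+': 2}
```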