Hi,
I encountered a crash in the SCALPEL pipeline at the step:
isoform_quantification:probability_distribution
The error is:
Error in seq.default(0, read_tab$transcriptomic_distance[1], -BINS) :
wrong sign in 'by' argument
This occurs in compute_prob.R at:
part_neg = c(seq(0,read_tab$transcriptomic_distance[1],-BINS), ...)
From debugging, the issue arises because the pipeline assumes that
transcriptomic_distance values are negative (i.e. upstream of the 3' end).
However, in my dataset (10x Chromium 3' snRNA-seq), many distances are positive.
Example from all_unique_reads.txt:
chr21 44335310 44335372 + 478 ...
chr21 44335399 44335489 + 361 ...
chr21 44335490 44335553 + 297 ...
Here the transcript coordinates are:
transcript: 44335251–44335851 (+ strand)
So these reads lie upstream of the 3' end, but the computed dist_END values are positive.
Because of this, the following call fails:
seq(0, positive_value, -BINS)
which leads to the crash.
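The failure can be reproduced outside the pipeline with a minimal sketch. The bin width `BINS <- 50` is an assumed value for illustration only (I have not checked what SCALPEL actually uses), and 478 is the first distance from the example above:

```r
# Assumed bin width for illustration; not taken from SCALPEL.
BINS <- 50

# Negative endpoint: a negative step walks from 0 down towards -478.
ok <- seq(0, -478, -BINS)

# Positive endpoint: the same negative step points away from the target,
# so seq() stops with an error. Capture the message instead of aborting.
msg <- tryCatch(
  seq(0, 478, -BINS),
  error = function(e) conditionMessage(e)
)

print(ok)
print(msg)
```

The second call fails for any positive first element, whatever the bin width.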
As a temporary workaround, I inverted the sign of dist_END in compute_prob.R:
reads = reads %>%
  dplyr::filter(gene_name %in% gene.tokeep) %>%
  mutate(dist_END = -as.numeric(dist_END))
After this change, the pipeline proceeds normally.
My questions are:
- Should transcriptomic_distance be expected to be negative upstream of the 3' end?
- Is the current sign convention in mapping_filtering.R intended?
- Should compute_prob.R handle both positive and negative distances more robustly (e.g. using min() instead of [1])?
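To make the last question concrete, here is a rough sketch of what I mean by a more robust version. `neg_breaks` is a hypothetical helper of my own and not part of SCALPEL, and whether clamping at 0 is the right behaviour depends on the intended sign convention:

```r
# Hypothetical helper, not SCALPEL code: build the negative-side breaks
# from the smallest distance instead of the first element, and never let
# the endpoint go above 0, so the step direction is always valid.
neg_breaks <- function(distances, bins) {
  lo <- min(distances, 0, na.rm = TRUE)
  seq(0, lo, by = -abs(bins))
}

# All-positive distances (as in my dataset): no error, just the single break 0.
pos_case <- neg_breaks(c(478, 361, 297), 50)

# Negative distances (the convention the pipeline seems to expect).
neg_case <- neg_breaks(c(-361, -478, -297), 50)

print(pos_case)
print(neg_case)
```

This only avoids the crash. If positive distances are actually a sign-convention bug upstream, the resulting breaks would still not be meaningful.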
Thank you for developing SCALPEL.
Best regards