1 change: 1 addition & 0 deletions .github/workflows/document.yaml
@@ -38,6 +38,7 @@ jobs:
shell: Rscript {0}

- name: Commit and push changes
working-directory: RcppTskit
run: |
git config --local user.name "$GITHUB_ACTOR"
git config --local user.email "$GITHUB_ACTOR@users.noreply.github.com"
15 changes: 11 additions & 4 deletions .pre-commit-config.yaml
@@ -28,14 +28,14 @@ repos:
hooks:
- id: air-format
name: air format
entry: RcppTskit/tools/run-local-tool.sh air format .
entry: RcppTskit/tools/run_local_tool.sh air format .
language: system
pass_filenames: false
files: '\.(R|Rmd|rmd|qmd|Qmd)$'

- id: jarl-lint
name: jarl lint
entry: RcppTskit/tools/run-local-tool.sh jarl check .
entry: RcppTskit/tools/run_local_tool.sh jarl check .
language: system
pass_filenames: false
files: '\.(R|Rmd|rmd|qmd|Qmd)$'
@@ -48,6 +48,13 @@ repos:

- id: clang-tidy
name: clang-tidy for RcppTskit
entry: python RcppTskit/tools/clang-tidy.py
language: python
entry: RcppTskit/tools/clang_tidy.py
language: system
files: '\.(c|cc|cpp|cxx|h|hh|hpp|hxx)$'

- id: check-sync-between-cpp-and-hpp
name: check sync between cpp and hpp options and defaults
entry: RcppTskit/tools/check_sync_between_cpp_and_hpp.R
language: system
pass_filenames: false
files: '^(RcppTskit/src/RcppTskit\.cpp|RcppTskit/inst/include/RcppTskit_public\.hpp)$'
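Note on the `files` patterns above: pre-commit treats them as Python regular expressions matched with `re.search` against each staged path. The sketch below is only an illustration of which of the four hooks shown in this hunk would be selected for a given path; the helper name `hooks_triggered` and the example paths are made up for the illustration and are not part of the repository.

```python
import re

# Patterns copied verbatim from the hooks shown in this diff.
# Only these four hooks are modelled; the config may define others.
HOOK_PATTERNS = [
    ("air-format", r"\.(R|Rmd|rmd|qmd|Qmd)$"),
    ("jarl-lint", r"\.(R|Rmd|rmd|qmd|Qmd)$"),
    ("clang-tidy", r"\.(c|cc|cpp|cxx|h|hh|hpp|hxx)$"),
    (
        "check-sync-between-cpp-and-hpp",
        r"^(RcppTskit/src/RcppTskit\.cpp"
        r"|RcppTskit/inst/include/RcppTskit_public\.hpp)$",
    ),
]


def hooks_triggered(path):
    """Return the ids of the hooks whose `files` pattern matches `path`.

    Hypothetical helper: it mimics pre-commit's re.search-based file
    filtering, nothing more.
    """
    return [
        hook_id
        for hook_id, pattern in HOOK_PATTERNS
        if re.search(pattern, path)
    ]


if __name__ == "__main__":
    for example in (
        "RcppTskit/R/example.R",          # hypothetical R file
        "RcppTskit/src/RcppTskit.cpp",    # path named in the config
        "README.md",
    ):
        print(example, "->", hooks_triggered(example))
```

Hooks with `pass_filenames: false` are still selected by this match, but they run once without receiving the matching paths as arguments.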
4 changes: 2 additions & 2 deletions AGENTS.md
@@ -167,7 +167,7 @@ Hook responsibilities:
* `air format .`: format R, Rmd, and qmd files.
* `jarl check .`: lint R, Rmd, and qmd files.
* `clang-format -i --style=file`: format C/C++ sources and headers.
* `python RcppTskit/tools/clang-tidy.py`: run clang-tidy checks for C/C++.
* `python RcppTskit/tools/clang_tidy.py`: run clang-tidy checks for C/C++.
* Standard pre-commit hygiene hooks:
whitespace, line endings, YAML checks,
merge-conflict markers, and large-file checks.
@@ -202,7 +202,7 @@ export CLANG_TIDY="$(brew --prefix llvm)/bin/clang-tidy"
Then you can run the wrapper script directly:

```sh
python RcppTskit/tools/clang-tidy.py RcppTskit/src/RcppTskit.cpp
python RcppTskit/tools/clang_tidy.py RcppTskit/src/RcppTskit.cpp
```

### Coverage with covr
30 changes: 21 additions & 9 deletions README.md
@@ -17,10 +17,12 @@ The `Python` API can be called from `R` via the `reticulate` `R` package to
seamlessly load and analyse a tree sequence, as described at
https://tskit.dev/tutorials/RcppTskit.html.
`RcppTskit` provides `R` access to the `tskit C` API for use cases where the
`reticulate` option is not optimal. For example, for high-performance and
low-level work with tree sequences. Currently, `RcppTskit` provides a limited
number of `R` functions due to the availability of extensive `Python` API and
the `reticulate` option.
`reticulate` option is not optimal.
For example, for high-performance and low-level work with tree sequences.
Currently, `RcppTskit` provides a limited number of functions
due to the availability of extensive `Python` API and the `reticulate` option.
The provided `RcppTskit R` API mirrors the `tskit Python` API,
while the `RcppTskit C++` API mirrors the `tskit C` API.

See more details on the state of the tree sequence ecosystem and aims of
`RcppTskit` in [the introduction vignette](https://highlanderlab.r-universe.dev/articles/RcppTskit/RcppTskit_intro.html) ([source](RcppTskit/vignettes/RcppTskit_intro.qmd)).
@@ -153,9 +155,13 @@ Specifically, we use:
To install the hooks, run:

```
pre-commit install
pre-commit install --install-hooks
pre-commit install --hook-type pre-push
```

Run these once per clone.
This enables automatic checks on `commit` and `push`.

### tskit

If you plan to update `tskit`, follow instructions in `extern/README.md`.
@@ -203,12 +209,18 @@ On Windows, replace `tar.gz` with `zip`.

### Pre-commit run

Before committing your changes, run the `pre-commit` hooks to ensure code quality:
When committing your changes,
`pre-commit` hooks should kick in automatically
to ensure code quality.
Manually, you can run them using:

```
# pre-commit autoupdate # to update the hooks
pre-commit run --all-files
# pre-commit run <hook_id>
pre-commit autoupdate # to update the hooks
pre-commit run # on changed files
pre-commit run --all-files # on all files
pre-commit run <hook_id> # just a specific hook
pre-commit run <hook_id> --all-files # ... on all files
# see also --hook-stage option
```

### Continuous integration
12 changes: 7 additions & 5 deletions RcppTskit/DESCRIPTION
@@ -2,7 +2,7 @@ Type: Package
Package: RcppTskit
Title: 'R' Access to the 'tskit C' API
Version: 0.3.0
Date: 2026-01-27
Date: 2026-03-01
Authors@R: c(
person("Gregor", "Gorjanc", , "gregor.gorjanc@gmail.com", role = c("aut", "cre", "cph"),
comment = c(ORCID = "0000-0001-8008-2787")),
@@ -16,14 +16,16 @@ Description: 'Tskit' enables efficient storage, manipulation, and analysis
described in Jeffrey et al. (2026) <doi:10.48550/arXiv.2602.09649>.
See also <https://tskit.dev> for project news, documentation, and
tutorials. 'Tskit' provides 'Python', 'C', and 'Rust' application
programming interfaces (APIs). The 'Python' API can be called from 'R' via
the 'reticulate' package to load and analyse tree sequences as
programming interfaces (APIs). The 'Python' API can be called from 'R'
via the 'reticulate' package to load and analyse tree sequences as
described at <https://tskit.dev/tutorials/tskitr.html>. 'RcppTskit'
provides 'R' access to the 'tskit C' API for cases where the
'reticulate' option is not optimal; for example, high-performance or
low-level work with tree sequences. Currently, 'RcppTskit' provides a
limited set of 'R' functions because the 'Python' API and 'reticulate'
already covers most needs.
limited set of functions because the 'Python' API and 'reticulate'
already cover most needs. The provided `RcppTskit R` API mirrors the
`tskit Python` API, while the `RcppTskit C++` API mirrors the `tskit
C` API.
License: MIT + file LICENSE
URL: https://github.com/HighlanderLab/RcppTskit
BugReports: https://github.com/HighlanderLab/RcppTskit/issues
53 changes: 34 additions & 19 deletions RcppTskit/NEWS.md
@@ -4,11 +4,11 @@ All notable changes to `RcppTskit` are documented in this file.
The file format is based on [Keep a Changelog](https://keepachangelog.com),
and releases adhere to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.3.0] 2026-MM-DD
## [0.3.0] 2026-03-02

### Added (new features)

- Added the following scalar getters to match tskit C/Python API
- Added the following scalar getters to match `tskit C/Python` API
- `TreeSequence$discrete_genome()` to query whether genome coordinates
are discrete integer values.
- `TreeSequence$has_reference_sequence()` to query whether a tree sequence
@@ -25,41 +25,56 @@ and releases adhere to [Semantic Versioning](https://semver.org/spec/v2.0.0.html
- `TableCollection$has_index()` to query whether edge indexes are present.
- Added a public header and defaults for downstream use of `C++` functions in
`inst/include/RcppTskit_public.hpp`, included by `inst/include/RcppTskit.hpp`.
- Added `TableCollection$build_index()` to build indexes and
`TableCollection$drop_index()` to drop indexes.
- TODO

### Changed

- Renamed low-level external-pointer API names from `*_ptr_*` to `*_xptr_*`
(for example, `ts_ptr_load()` to `ts_xptr_load()`) to make external-pointer
vs standard/raw-pointer semantics explicit.
- Renamed low-level C++ and R API names such that we map onto `tskit C` API,
for example, `ts_ptr_load()` to `rtsk_treeseq_load()`.
This is an internal breaking change, but in a good direction now that the
package is still young and in experimental mode.
- Renamed `TreeSequence` and `TableCollection` external-pointer field and
constructor argument from `pointer` to `xptr`.
- Ensured `TableCollection$tree_sequence()` matches `tskit Python` API:
it now builds indexes on the `TableCollection`, if indexes are not present.
- TODO

### Maintenance

- Turn vignette URLs into hyperlinks and similar cosmetics.
- State mirroring of the `R/Python` APIs and `C++/C` APIs across the package.
- TODO

## [0.2.0] - 2026-02-22

### Added (new features)

- Added TableCollection R6 class alongside `tc_load()` or `TableCollection$new()`,
as well as `dump()`, `tree_sequence()`, and `print()` methods.
- Added `TableCollection` `R6` class alongside `tc_load()` or
`TableCollection$new()`, as well as `dump()`, `tree_sequence()`, and
`print()` methods.

- Added `TreeSequence$dump_tables()` to copy tables into a TableCollection.
- Added `TreeSequence$dump_tables()` to copy tables into a `TableCollection`.

- Added TableCollection and reticulate Python round-trip helpers:
- Added `TableCollection` and reticulate `Python` round-trip helpers:
`TableCollection$r_to_py()` and `tc_py_to_r()`.

- Changed the R API to follow tskit Python API for loading:
- Changed the `R` API to follow `tskit Python` API for loading:
`ts_load()`, `tc_load()`, `TreeSequence$new()`, and `TableCollection$new()`
now use `skip_tables` and `skip_reference_sequence` logical arguments instead
of an integer `options` bitmask.

- Removed user-facing `options` from `TreeSequence$dump()`,
`TreeSequence$dump_tables()`, `TableCollection$dump()`, and
`TableCollection$tree_sequence()` to match R API with the tskit Python API,
while C++ API has the bitwise `options` like the tskit C API.
`TableCollection$tree_sequence()` to match `R` API with the `tskit Python` API,
while `C++` API has the bitwise `options` like the `tskit C` API.

- The bitwise options passed to C++ are now validated.
- The bitwise options passed to `C++` are now validated.
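Note on the entries above: they describe a split where the user-facing API takes logical arguments while the lower-level API keeps a bitwise `options` integer that is validated. A minimal sketch of that pattern follows, written in Python for brevity. The flag names and values, and both function names, are illustrative assumptions; they are not taken from `RcppTskit` or the `tskit C` headers.

```python
# Illustrative flag values only (assumption); the real constants are
# defined in the tskit C headers and are not reproduced here.
SKIP_TABLES = 1 << 0
SKIP_REFERENCE_SEQUENCE = 1 << 1
SUPPORTED_LOAD_FLAGS = SKIP_TABLES | SKIP_REFERENCE_SEQUENCE


def load_options(skip_tables=False, skip_reference_sequence=False):
    """Translate user-facing logical arguments into a bitmask.

    Hypothetical helper showing how a high-level API can hide the
    integer `options` argument from users.
    """
    options = 0
    if skip_tables:
        options |= SKIP_TABLES
    if skip_reference_sequence:
        options |= SKIP_REFERENCE_SEQUENCE
    return options


def validate_options(options, supported=SUPPORTED_LOAD_FLAGS):
    """Reject bitmasks that are negative or set any unsupported bit."""
    if options < 0 or options & ~supported:
        raise ValueError(f"unsupported option bits in {options!r}")
    return options


if __name__ == "__main__":
    opts = load_options(skip_tables=True, skip_reference_sequence=True)
    print(validate_options(opts))
```

Keeping the validation next to the bitmask means the high-level wrapper can never produce an invalid value, while direct callers of the low-level function still get an early error.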

### Changed

- We now specify C++20 standard to go around the CRAN Windows issue,
- We now specify `C++20` standard to go around the CRAN Windows issue,
see #63 for further details.

### Maintenance
@@ -81,21 +96,21 @@ This is the first release.

### Added (new features)

- Initial version of RcppTskit using the tskit C API (1.3.0).
- Initial version of `RcppTskit` using the `tskit C` API (1.3.0).

- TreeSequence R6 class so R code looks Pythonic.
- `TreeSequence R6` class so `R` code looks Pythonic.

- `ts_load()` or `TreeSequence$new()` to load a tree sequence from file into R.
- `ts_load()` or `TreeSequence$new()` to load a tree sequence from file into `R`.

- Methods to summarise a tree sequence and its contents `ts$print()`,
`ts$num_nodes()`, etc.

- Method to save a tree sequence to a file `ts$dump()`.

- Method to push tree sequence between R and reticulate Python
- Method to push tree sequence between `R` and reticulate `Python`
`ts$r_to_py()` and `ts_py_to_r()`.

- Most methods have an underlying (unexported) C++ function that works with
- Most methods have an underlying (unexported) `C++` function that works with
a pointer to tree sequence object, for example, `RcppTskit:::ts_ptr_load()`.

- All implemented functionality is documented and demonstrated with a vignette.