Is it possible to share Rcpp object/pointer between R and Python? #1439
Replies: 7 comments 10 replies
-
Off the top of my head, we should be able to send a pointer a All these are great questions, and it would be really good to have a studied reference example.
-
I'm spitballing, but does A somewhat bare version of what I'm imagining is Armadillo's "advanced" constructors, which let you create an Armadillo matrix from a pointer to auxiliary memory / data: https://arma.sourceforge.net/docs.html#adv_constructors_mat
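
To make the "matrix from a pointer to auxiliary memory" idea concrete, here is a minimal sketch in Python using only the standard `ctypes` module. It is an analogy, not Armadillo itself: the names `owner`, `address` and `view` are made up for illustration, and the consumer is assumed to already know the element type and length.

```python
import ctypes

# Producer side: a buffer of four doubles that owns its memory.
n = 4
owner = (ctypes.c_double * n)(1.0, 2.0, 3.0, 4.0)

# Only the raw address crosses over; the length and element type
# are assumed to be agreed on separately by both sides.
address = ctypes.addressof(owner)

# Consumer side: lay a second array over the same memory. Nothing is copied.
view = (ctypes.c_double * n).from_address(address)

view[0] = 10.0       # write through the view ...
print(owner[0])      # ... and the owner sees the change: 10.0
print(list(view))    # [10.0, 2.0, 3.0, 4.0]
```

The view does not keep the buffer alive, so `owner` has to outlive `view`. That is the same lifetime caveat that applies when an external pointer is handed from one language to another.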
-
As Dirk said, this is exactly what Arrow does, so the answer is yes. You only need to ensure that both sides know the data structure, and then you pass the pointer around. And now that we are at it, I was taking a look and I see this. You are doing too much work here. You can just define a new XPtr type that knows the C function it needs to call to free the object it wraps. This is e.g. what
-
A shared pointer is definitely one way to do this, and quite a few packages do it. Maybe search github.com/cran/ for
-
Here I am compiling pointers to what
-
Ah. And I just learned in the
-
@gregorgorjanc I have a complete, working and (at least I think so) quite interesting demo using external pointers. I wrote a minimal Python package, and I think it mirrors your situation. I was looking around for a small and simple enough

Back to our use case. A nice feature is that all it took in Python for this particular example was a) adding one new constructor taking a string encoding the memory address of the object from this shared C++ class and b) exposing the class (and of course this new constructor) via

I will clean this up a little and make the Python package a repo, and will then blog or write a bit more. That may take me a day or two. In the meantime, a demo (from R) follows.

> library(reticulate)
> use_virtualenv("/opt/venv/stopwatch")
> cc <- import("stopwatch") # access the Python object from R
>
> ## quick demo just from the R side
> sw <- RcppSpdlog::get_stopwatch() # we use a simple C++ struct as example
> Sys.sleep(1.234) # imagine doing some code here
> print(sw) # stopwatch shows elapsed time
1.399661
>
> xptr::is_xptr(sw) # from R, a stopwatch is an external pointer (plus two S3 methods)
[1] TRUE
> xptr::xptr_address(sw) # access where the external pointer points to, format is "0x...."
[1] "0x5bde23be23b0"
>
> sw2 <- xptr::new_xptr(xptr::xptr_address(sw)) # cloned (!!) but that is unclassed, so an external pointer
> attr(sw2, "class") <- c("stopwatch", "externalptr") # class it .. and use it!
> print(sw2) # so `xptr` allows us to clone and use
19.477163
>
> sw3 <- cc$Stopwatch( xptr::xptr_address(sw) ) # and so does the Python object _with an added string ctor_
> print(sw3$elapsed())
datetime.timedelta(seconds=24, microseconds=69720) # different output _format_ as we hit the Python formatter
>
> print(sw) # and it still works on the R side
36.453815
>

We can also run this as one script (and I reduced the sleep to 0.5s):

$ Rscript example.R
0.500946
[1] TRUE
[1] "0x5711cd7e48f0"
0.502682
datetime.timedelta(microseconds=503092)
0.503710
$

This shows the marginal 'cost' of running the R (and Python) interpreters and crossing back and forth. The total cost, including printing to standard output (and of course ignoring the demo 'sleep' of half a second), is 3.7 milliseconds.
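
For readers who want to see the "constructor taking a string encoding the memory address" idea in isolation, here is a rough, self-contained sketch in Python using only `ctypes`. It is not the code of the `stopwatch` package used in the demo (whose source is not shown in this thread). The struct layout and the names `create`, `from_address_string` and `address_string` are hypothetical stand-ins. It only works because both handles live in the same process, which is also the case when reticulate runs Python embedded in the R session.

```python
import ctypes
import time


class Stopwatch(ctypes.Structure):
    """Hypothetical stand-in for the shared struct: a single start timestamp."""

    _fields_ = [("start", ctypes.c_double)]

    @classmethod
    def create(cls):
        # Allocate a new stopwatch and start it now.
        return cls(time.monotonic())

    @classmethod
    def from_address_string(cls, text):
        # Parse a "0x...." string and overlay the struct on that memory.
        # Nothing is copied. The caller must guarantee that the address is
        # valid in this process and that the original object stays alive.
        return cls.from_address(int(text, 16))

    def address_string(self):
        # Address of the underlying memory, formatted in the "0x...." style.
        return hex(ctypes.addressof(self))

    def elapsed(self):
        # Seconds since the shared start timestamp.
        return time.monotonic() - self.start


sw = Stopwatch.create()
addr = sw.address_string()
sw2 = Stopwatch.from_address_string(addr)  # second handle, same memory

time.sleep(0.05)
print(addr)
print(sw.elapsed(), sw2.elapsed())  # both measure from the same start time
```

The second handle is a view, not an owner. It does not extend the lifetime of the original object, and nothing checks that the address string really points at a `Stopwatch`. In a real package those guarantees would have to come from the side that created the object, for example the external pointer and its finalizer on the R side.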
-
I am developing R access to the tskit C API (a library for working with massive genomic datasets by mapping how we relate to each other through our ancestors in the past - https://tskit.dev/). tskit has a very good C API that is further enhanced via a Python API. Both are very complete, well tested, and documented, but the Python API is more popular. I am well aware of reticulate for calling the Python API from R, but that doesn't suit all our needs (calling the tskit C API within loops in R/Rcpp, hence my effort to bring the tskit C API to R/Rcpp).
I have developed an R package, tskitr, at https://github.com/HighlanderLab/tskitr, which includes tskit's C code, and have 1) written some Rcpp functions that call the C API and 2) set up the library build, with instructions, so that other packages can use the tskit C API via Rcpp in their work (the reason I built the package in the first place).
While this all works very well thanks to Rcpp, I don't want to duplicate the work done on the Python API. Most users will still be encouraged to analyse the objects using the Python API, possibly via reticulate.
While we can build tskit objects on the R side (using Rcpp), export them to disk, load them into Python and analyse them in standalone or reticulate Python, I wonder if we could share an Rcpp object/pointer between R and reticulate Python sessions? Doing a bit of LLMing, I surmise that the answer is no due to ABI incompatibility, but I thought I would ask real human experts! In the meantime I have written ts_r_to_py() and ts_py_to_r(), which use a disk write-read to transfer between R and reticulate Python. tskit objects are complex, so I am not sure I can figure out r_to_py()/py_to_r() wrappers, and I only have a pointer in the R session anyway. Looking at the C++ code of the reticulate package, I get lost very quickly.
Thanks!