What is Rcpp

Iterators are the next step up from basic loops, abstracting away the details of the underlying data structure. Iterators have three main operators. We start at x. Notice the type of the iterator: NumericVector::iterator. Each vector type has its own iterator type: LogicalVector::iterator, CharacterVector::iterator, etc. For example, we could again rewrite sum to use the accumulate function, which takes a starting and an ending iterator, and adds up all the values in the vector.
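As a rough illustration of the iterator idea, here is a minimal sketch in plain standard C++. It assumes std::vector<double> in place of NumericVector so that it compiles without R; Rcpp vectors offer the same begin()/end() interface. The function names sum_iter and sum_accumulate are made up for this sketch. The loop uses three iterator operations: ++ to advance, * to read the current value, and != to compare against the end.

```cpp
#include <numeric>
#include <vector>

// Sum the elements with an explicit iterator loop.
// (Hypothetical helper; std::vector stands in for NumericVector.)
double sum_iter(const std::vector<double>& x) {
  double total = 0.0;
  for (std::vector<double>::const_iterator it = x.begin(); it != x.end(); ++it) {
    total += *it;  // dereference the iterator to get the current value
  }
  return total;
}

// Same result with std::accumulate, which takes a starting iterator,
// an ending iterator, and an initial value.
double sum_accumulate(const std::vector<double>& x) {
  return std::accumulate(x.begin(), x.end(), 0.0);
}
```

Passing 0.0 rather than 0 as the initial value matters: accumulate uses the type of that argument for the running total, so an integer 0 would truncate the sum.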

For example, we could write a basic Rcpp version of findInterval that takes two arguments, a vector of values and a vector of breaks, and locates the bin that each x falls into. This shows off a few more advanced iterator features. Read the code below and see if you can figure out how it works. Small note: if we want this function to be as fast as findInterval in R (which uses handwritten C code), we need to compute the calls to.
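One way the bin lookup described above might look is sketched here, again with std::vector standing in for the Rcpp vector types so the code compiles on its own. The name find_interval is hypothetical, and the sketch assumes breaks is sorted in increasing order, which std::upper_bound requires.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// For each value in x, return how many breaks are <= that value,
// i.e. the index of the bin the value falls into (0 = below all breaks).
// Assumes `breaks` is sorted in increasing order.
std::vector<int> find_interval(const std::vector<double>& x,
                               const std::vector<double>& breaks) {
  std::vector<int> out;
  out.reserve(x.size());
  for (double value : x) {
    // Iterator to the first break strictly greater than value.
    auto pos = std::upper_bound(breaks.begin(), breaks.end(), value);
    // The distance from the start converts the iterator into a bin index.
    out.push_back(static_cast<int>(std::distance(breaks.begin(), pos)));
  }
  return out;
}
```

Because upper_bound does a binary search, each lookup costs O(log n) in the number of breaks instead of a linear scan.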

This is easy, but it distracts from this example so it has been omitted. Using standard algorithms also makes the intent of your code more clear, helping to make it more readable and more maintainable.

You may want to try them for your problem. Rcpp knows how to convert from many STL data structures to their R equivalents, so you can return them from your functions without explicitly converting to R data structures. An STL vector is very similar to an R vector, except that it grows efficiently.

You can access individual elements of a vector using the standard [] notation, and you can add a new element to the end of the vector using. If you have some idea in advance how big the vector will be, you can use. The following code implements run length encoding (rle). It produces two vectors of output: a vector of values, and a vector of lengths giving how many times each element is repeated.
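A run length encoding along these lines could be sketched as follows. This is an illustrative version in plain standard C++, not the original listing: the RunLengths struct and the rle function signature are assumptions, and the two output vectors grow with push_back as new runs are found.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type: parallel vectors of run values and run lengths.
struct RunLengths {
  std::vector<double> values;
  std::vector<int> lengths;
};

RunLengths rle(const std::vector<double>& x) {
  RunLengths out;
  if (x.empty()) return out;

  double prev = x[0];
  out.values.push_back(prev);
  out.lengths.push_back(1);
  std::size_t i = 0;  // index of the run currently being extended

  for (std::size_t j = 1; j < x.size(); ++j) {
    if (x[j] == prev) {
      out.lengths[i]++;            // same value: extend the current run
    } else {
      out.values.push_back(x[j]);  // new value: start a new run
      out.lengths.push_back(1);
      ++i;
      prev = x[j];
    }
  }
  return out;
}
```

In an Rcpp function the two vectors could be returned together in a List, since Rcpp converts std::vector to an R vector.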

An alternative implementation would be to replace i with the iterator lengths. You might want to try implementing that yourself. Sets are useful for problems that involve duplicates or unique values (like unique, duplicated, or in). Unordered sets tend to be much faster (because they use a hash table internally rather than a tree), so even if you need an ordered set, you should consider using an unordered set and then sorting the output.

The following function uses an unordered set to implement an equivalent to duplicated for integer vectors. Note the use of seen. A map is similar to a set, but instead of storing presence or absence, it can store additional data. The following example shows how you could use a map to implement table for numeric vectors:
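Both ideas can be sketched in standard C++ as below. These are illustrative versions under stated assumptions: std::vector replaces IntegerVector and NumericVector, and the function names duplicated and table simply mirror the R functions. The variable seen follows the text; the sketch relies on insert() returning a pair whose second member is true only when the value was newly added.

```cpp
#include <map>
#include <unordered_set>
#include <vector>

// Flag each element that has already appeared earlier in x.
std::vector<bool> duplicated(const std::vector<int>& x) {
  std::unordered_set<int> seen;
  std::vector<bool> out;
  out.reserve(x.size());
  for (int value : x) {
    // insert() returns {iterator, bool}; the bool is false if the value
    // was already in the set, which is exactly the "duplicated" case.
    out.push_back(!seen.insert(value).second);
  }
  return out;
}

// Count how many times each distinct value occurs.
std::map<double, int> table(const std::vector<double>& x) {
  std::map<double, int> counts;
  for (double value : x) {
    ++counts[value];  // operator[] creates a missing key with count 0
  }
  return counts;
}
```

std::map keeps its keys sorted, so the counts come back ordered by value, much like the output of table in R.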

The challenge is to predict a model response from three inputs. The basic R version of the predictor looks like: Fixed a bug in Vector that caused random behavior due to the lack of a copy constructor in the Vector template. The trait class that was used to identify if a type is convertible to another had too many false positives on pre gcc 4. These classes have the same functionality as Vector but have a different set of constructors which check that the input SEXP is a matrix.

Row contains a reference to the underlying Vector and exposes a nested iterator type that allows use of STL algorithms on each element of a matrix row. The Vector class gains a row int method that returns a Row instance. Usage examples are available in the runit. R unit test file. The Rcpp::as template function has been reworked to be more generic. It now handles more STL containers, such as deque and list, and the genericity can be used to implement as for more types.

The package RcppArmadillo has examples of this. The RcppExamples package has an example of this. The created call was wrong. CharacterVector gains a random access iterator, begin and end to support STL algorithms; iterator dereferences to a StringProxy. The range based version of wrap is now exposed at the Rcpp:: level with the following interface: Rcpp::wrap(InputIterator first, InputIterator last). This is dispatched internally to the most appropriate implementation using traits.

The methods RObject::asFoo are deprecated and will be removed in the next version. The method RObject::slot can now be used to get or set the associated slot. This is one more example of the proxy pattern. This is yet another example of the proxy pattern. They gain constructors with up to 5 templated arguments. It is now possible to call a function with up to 5 templated arguments candidate for implicit wrap. All vector classes gain a constructor taking a Dimension reference.

The template now attempts to build an object of the requested template parameter T by using the constructor for the type taking a SEXP. This is mostly intended for internal use and is used on all vector classes. Environment now takes advantage of the augmented smartness of as and wrap templates. New R function Rcpp. The class Rcpp::VectorBase was introduced. All vector classes derive from it. The class handles behaviour that is common to all vector types: length, names, etc. Rcpp::ExpressionVector gains a constructor taking a std::string representing some R code to parse.

This also uses the proxy pattern. For example, wrap(bool) returns a LogicalVector. Factored out of RObject. The garbage collection has been improved and is now automatic and hidden. The user need not worry about it at all. This covers anything from vectors, matrices or lists to environments, functions and more. For example, numeric vectors are represented as instances of the Rcpp::NumericVector class, environments are represented as instances of Rcpp::Environment, functions are represented as Rcpp::Function, etc. The Rcpp-introduction vignette (now published as a TAS paper; an earlier introduction was also published as a JSS paper) provides a good entry point to Rcpp, as do the Rcpp website, the Rcpp page and the Rcpp Gallery.

Full documentation is provided by the Rcpp book. The Rcpp-modules vignette details the current set of features of Rcpp-modules. Sugar takes advantage of lazy evaluation and expression templates to achieve great performance while exposing a syntax that is much nicer to use than the equivalent low-level loop code.

The Rcpp-sugar vignette gives an overview of the feature. This is the only Rcpp data structure I use, as Armadillo does not provide one, and in general as long as we do not try to do too much with it, we should be ok. I also tend to specify the return value as a List so that we can stick whatever values we want in it.

This works well and again if you keep it shorter than say 1, entries for what you return you should not hit any snags with weird memory stuff. This will not matter if you overwrite all of the values right off of the bat, but if you are constructing a distribution or something like that, this can spell trouble!

Fortunately Armadillo has the arma::zeros to come to the rescue. It's just one of those things. Let's take a look at a couple of example Rcpp functions I have written for different applications to start getting the hang of looping and other related concepts. Here is a function that I wrote about in a blog post here, which calculates the mutual information of an arbitrary joint distribution.

You can read more about mutual information in that post, or by checking out the Wikipedia page, but what is important is that we have to traverse all the entries of a matrix and calculate some quantity.
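To make the matrix traversal concrete, here is a minimal sketch of a mutual information calculation, using the standard definition I(X;Y) = sum over i,j of p(i,j) * log( p(i,j) / (p(i) * p(j)) ). It is not the function from the blog post: it assumes a nested std::vector instead of an Armadillo matrix, a rectangular input whose entries sum to 1, and natural logarithms, and the name mutual_information is made up here.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Mutual information (in nats) of a joint distribution stored row by row.
// Assumes `joint` is rectangular and its entries sum to 1.
double mutual_information(const std::vector<std::vector<double>>& joint) {
  const std::size_t nrow = joint.size();
  const std::size_t ncol = nrow == 0 ? 0 : joint[0].size();

  // First pass: marginal distributions of the rows and the columns.
  std::vector<double> row_marginal(nrow, 0.0);
  std::vector<double> col_marginal(ncol, 0.0);
  for (std::size_t i = 0; i < nrow; ++i) {
    for (std::size_t j = 0; j < ncol; ++j) {
      row_marginal[i] += joint[i][j];
      col_marginal[j] += joint[i][j];
    }
  }

  // Second pass: accumulate p * log(p / (p_row * p_col)) over all cells.
  double mi = 0.0;
  for (std::size_t i = 0; i < nrow; ++i) {
    for (std::size_t j = 0; j < ncol; ++j) {
      const double p = joint[i][j];
      if (p > 0.0) {  // cells with zero probability contribute nothing
        mi += p * std::log(p / (row_marginal[i] * col_marginal[j]));
      }
    }
  }
  return mi;
}
```

Both marginals are explicitly initialised to zero before accumulating, which sidesteps the uninitialised-memory problem mentioned earlier.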

Another thing to note here is that function returns do not get enclosed in parentheses like they do in R. We can try compiling it and you will see it pop up in the Functions pane of RStudio. Now let's take a look at another function I wrote that finds the unique words in a corpus of documents and counts the number of times each unique word appears. There are a couple of new things in this function. First, we are printing stuff to the R console using Rcpp::Rcout, a newly introduced output stream in Rcpp that makes printing seamless and easy.
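The word-counting idea can be sketched without any R dependencies as follows. This is only an illustration of the approach, not the function from the post: it assumes the corpus arrives already tokenised as a vector of documents, each a vector of words, and the name count_words is hypothetical.

```cpp
#include <map>
#include <string>
#include <vector>

// Count occurrences of every unique word across all documents.
// Assumes each document is already split into individual words.
std::map<std::string, int> count_words(
    const std::vector<std::vector<std::string>>& corpus) {
  std::map<std::string, int> counts;
  for (const std::vector<std::string>& document : corpus) {
    for (const std::string& word : document) {
      ++counts[word];  // a word seen for the first time starts at 0
    }
  }
  return counts;
}
```

The keys of the returned map are the unique words in alphabetical order, and the mapped values are their counts.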

Here is some example code: You can find out more information on Rcout by checking out this tutorial. We need to use the standard vector class because as far as I know, the Armadillo vectors do not support strings. We can also stick these back in an Rcpp::List and then return them to R without any trouble. Here is the example block of code: To do this, you will need to define your own namespace before you can define sub-functions to be used by your other functions.

In the example below, the cdf function would get called in my Rcpp program by mjd::cdf. These are three functions that calculate the erf, pdf, and cdf of draws from a normal distribution. Fortunately, we can make use of the Boost libraries to sample from all sorts of distributions and a whole bunch of other low level stuff that can be really useful.
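A sketch of the namespace pattern is given here. The namespace name mjd and the function names pdf and cdf come from the text; the bodies are assumptions, written for the standard normal distribution and relying on std::erf from <cmath> (C++11) instead of a hand-written erf. The caller upper_tail is a hypothetical example of using the helpers from another function.

```cpp
#include <cmath>

namespace mjd {

// Density of the standard normal distribution at x.
double pdf(double x) {
  const double inv_sqrt_2pi = 0.3989422804014327;  // 1 / sqrt(2 * pi)
  return inv_sqrt_2pi * std::exp(-0.5 * x * x);
}

// Cumulative distribution function of the standard normal at x,
// expressed through the error function.
double cdf(double x) {
  return 0.5 * (1.0 + std::erf(x / std::sqrt(2.0)));
}

}  // namespace mjd

// Hypothetical caller: helpers are reached through the namespace prefix.
double upper_tail(double x) {
  return 1.0 - mjd::cdf(x);
}
```

Keeping the helpers in their own namespace avoids clashes with identically named functions from other libraries.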

Once we have called the library, we can then start writing functions using the Boost features as follows. Now let's look at some common functions we might want to grab from the boost libraries, starting with a random number generator where seed is an integer we pass in from R:
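The seeded-generator idea can be illustrated with the C++11 standard library, which is swapped in here for Boost so the sketch has no external dependencies; std::mt19937 is the standard counterpart of the Mersenne Twister engine that Boost also provides. The function name draw_uniform and its parameters are assumptions, with seed playing the role of the integer passed in from R.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Draw n values from Uniform(0, 1) using a Mersenne Twister engine
// seeded with `seed`. The same seed always yields the same draws.
std::vector<double> draw_uniform(std::size_t n, unsigned int seed) {
  std::mt19937 generator(seed);
  std::uniform_real_distribution<double> unif(0.0, 1.0);

  std::vector<double> draws;
  draws.reserve(n);
  for (std::size_t i = 0; i < n; ++i) {
    draws.push_back(unif(generator));
  }
  return draws;
}
```

Seeding from a value supplied by R keeps simulations reproducible, although the draws will not match those of R's own generator.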

You can access the source file here.


