vspline 1.1.0
Generic C++11 Code for Uniform B-Splines
Classes | |
struct | build_ev |
helper object to create a type-erased vspline::evaluator for a given bspline object. The evaluator is specialized to the spline's degree, so that degree-0 splines are evaluated with nearest neighbour interpolation, degree-1 splines with linear interpolation, and all other splines with general b-spline evaluation. The resulting vspline::evaluator is 'grokked' to erase its type to make it easier to handle on the receiving side: build_ev will always return a vspline::grok_type, not one of the several possible evaluators which it produces initially. Why the type erasure? Because a function can only return one distinct type. With specialization for degree-0, degree-1 and arbitrary spline degrees, there are three distinct types of evaluator to take care of. If they are to be returned as a common type, type erasure is the only way. More... | |
struct | build_safe_ev |
helper object to create a vspline::mapper object with gate types matching a bspline's boundary conditions and extents matching the spline's lower and upper limits. Please note that these limits depend on the boundary conditions and are not always simply 0 and N-1, as they are for, say, mirror boundary conditions. See lower_limit() and upper_limit() in vspline::bspline. More... | |
struct | build_safe_ev< -1, spline_type, rc_type, _vsize, math_ele_type, result_type, gate_types ... > |
at level -1, there are no more axes to deal with, here the recursion ends and the actual mapper object is created. Specializing on the spline's degree (0, 1, or indeterminate), an evaluator is created and chained to the mapper object. The resulting functor is grokked to produce a uniform return type, which is returned to the caller. More... | |
struct | grev_generator |
we need a 'generator functor' to implement grid_eval using the code in wielding.h. This functor precalculates the b-spline evaluation weights corresponding to the coordinates in the grid and stores them in vectorized format, to speed up their use as much as possible. More... | |
struct | grid_eval_functor |
grid_eval_functor is used for 'generalized' grid evaluation, where the functor passed in is not a bspline evaluator. For this general case, we can't precalculate anything to speed things up, all we need to do is pick the grid values and put them together to form arguments for calls to the functor. More... | |
struct | inner_evaluator |
'inner_evaluator' implements evaluation of a uniform b-spline, or some other spline-like construct relying on basis functions which can provide sets of weights for given deltas. While class evaluator (below, after namespace detail ends) provides objects derived from vspline::unary_functor which are meant to be used by user code, here we have a 'workhorse' object to which 'evaluator' delegates. This code 'rolls out' the per-axis weights the basis functor produces to the set of coefficients relevant to the current evaluation locus (the support window). We rely on a few constraints: More... | |
struct | separable_filter |
struct separable_filter is the central object used for 'wielding' filters. The filters themselves are defined as 1D operations, which is sufficient for a separable filter: the 1D operation is applied to each axis in turn. If the data themselves are 1D, this is inefficient if the run of data is very long: we'd end up with a single thread processing the data without vectorization. So for this special case, we use a bit of trickery: long runs of 1D data are folded up, processed as 2D (with multithreading and vectorization) and the result of this operation, which isn't correct everywhere, is 'mended' where it is wrong. If the data are nD, we process them by buffering chunks collinear to the processing axis and applying the 1D filter to these chunks. 'Chunks' isn't quite the right word to use here - what we're buffering are 'bundles' of 1D subarrays, where a bundle holds as many 1D subarrays as a SIMD vector is wide. This makes it possible to process the buffered data with vectorized code. While most of the time the buffering will simply copy data into and out of the buffer, we use a distinct data type for the buffer which makes sure that arithmetic can be performed in floating point and with sufficient precision to do the data justice. With this provision we can safely process arrays of integral type. Such data are 'promoted' to this type when they are buffered and converted to the result type afterwards. Of course there will be quantization errors if the data are converted to an integral result type; it's best to use a real result type. The type for arithmetic operations inside the filter is fixed via stripe_handler_type, which takes a template argument '_math_ele_type'. This way, the arithmetic type is distributed consistently. Also note that an integral target type will receive the data via a simple type conversion and not with saturation arithmetic. If this is an issue, filter to a real-typed target and process separately.
A good way of using integral data is to have integral input and real-typed output. Promoting the integral data to a real type preserves them precisely, and the 'exact' result is then stored in floating point. With such a scheme, raw data (like image data, which are often 8 or 16 bit integers) can be 'sucked in' without need for previous conversion, producing filtered data in, say, float for further processing. More... | |
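The type erasure described for build_ev above can be illustrated with a minimal sketch. It uses std::function as the erased type, standing in for vspline::grok_type; nearest_ev, linear_ev, general_ev and build_ev_sketch are hypothetical names invented for this illustration and are not part of vspline. The 'general' evaluator is only a placeholder which falls back to linear interpolation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-ins for the three evaluator kinds. All of them
// assume 0 <= x <= data->size() - 1.
struct nearest_ev {
    const std::vector<double>* data;
    double operator()(double x) const {
        return (*data)[static_cast<std::size_t>(std::lround(x))];
    }
};

struct linear_ev {
    const std::vector<double>* data;
    double operator()(double x) const {
        std::size_t i = static_cast<std::size_t>(std::floor(x));
        if (i + 1 >= data->size()) return data->back();
        double t = x - static_cast<double>(i);
        return (1.0 - t) * (*data)[i] + t * (*data)[i + 1];
    }
};

// Placeholder for general b-spline evaluation: to keep the sketch
// short it simply delegates to linear interpolation.
struct general_ev {
    const std::vector<double>* data;
    int degree;
    double operator()(double x) const { return linear_ev{data}(x); }
};

// The common, type-erased return type (assumption: std::function
// plays the role which grok_type plays in vspline).
using erased_ev = std::function<double(double)>;

// Three distinct functor types go in, one common type comes out.
erased_ev build_ev_sketch(const std::vector<double>& data, int degree) {
    assert(!data.empty() && degree >= 0);
    if (degree == 0) return nearest_ev{&data};
    if (degree == 1) return linear_ev{&data};
    return general_ev{&data, degree};
}
```

The caller only ever sees erased_ev, whichever branch was taken. The data vector has to outlive the returned functor, because the evaluators hold a pointer to it.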
Functions | |
template<typename source_view_type , typename target_view_type , typename stripe_handler_type > | |
void | present (vspline::atomic< std::ptrdiff_t > *p_tickets, const source_view_type *p_source, target_view_type *p_target, const typename stripe_handler_type::arg_type *p_args, int axis) |
'present' feeds 'raw' data to a filter and returns the filtered data. In order to perform this task with maximum efficiency, the actual code is quite involved. More... | |
template<typename source_view_type , typename target_view_type , typename stripe_handler_type > | |
void | vpresent (vspline::atomic< std::ptrdiff_t > *p_tickets, const std::vector< source_view_type > *p_source, std::vector< target_view_type > *p_target, const typename stripe_handler_type::arg_type *p_args, int axis) |
vpresent is a variant of 'present' processing 'stacks' of arrays. See 'present' for discussion. This variant of 'present' will rarely be used. Having it does no harm but if you study the code, you may safely ignore it unless you are actually using single-axis filtering of stacks of arrays. The code is structurally similar to 'present', with the extra complication of processing stacks instead of single arrays. More... | |
template<typename functor_type > | |
void | transform (std::true_type, grid_spec< functor_type::dim_in, typename functor_type::in_ele_type > &grid, const functor_type &functor, vigra::MultiArrayView< functor_type::dim_in, typename functor_type::out_type > &result, int njobs=vspline::default_njobs, vspline::atomic< bool > *p_cancel=0) |
given the generator functor 'grev_generator' (above), performing the grid_eval becomes trivial: construct the generator and pass it to the wielding code. The signature looks complex, but the types needed can all be inferred by automatic type deduction (ATD) - callers just need to pass a grid_spec object, a b-spline evaluator and a target array to accept the result - plus, optionally, a pointer to a cancellation atomic, which, if given, may be set by the calling code to gracefully abort the calculation at the next convenient occasion. Note that grid_eval is specific to b-spline evaluation and will not work with any evaluator type which is not a 'plain', 'raw' b-spline evaluator. So, even functors gained by calling vspline's factory functions (make_evaluator, make_safe_evaluator) will not work here, since they aren't of type vspline::evaluator (they are in fact grok_type). To use arbitrary functors on a grid, use gen_grid_eval, below. More... | |
template<typename functor_type > | |
void | transform (std::false_type, grid_spec< functor_type::dim_in, typename functor_type::in_ele_type > &grid, const functor_type &functor, vigra::MultiArrayView< functor_type::dim_in, typename functor_type::out_type > &result, int njobs=vspline::default_njobs, vspline::atomic< bool > *p_cancel=0) |
generalized grid evaluation. The production of result values from input values is done by an instance of grid_eval_functor, see above. The template argument, functor_type, has to be a functor (usually this will be a vspline::unary_functor). If the functor's in_type has dim_in components, the grid_spec must also contain dim_in 1D arrays, since the functor's input is put together by picking a value from each of the arrays the grid_spec points to. The result array must have the same number of dimensions. More... | |
void vspline::detail::present | ( | vspline::atomic< std::ptrdiff_t > * | p_tickets, |
const source_view_type * | p_source, | ||
target_view_type * | p_target, | ||
const typename stripe_handler_type::arg_type * | p_args, | ||
int | axis | ||
) |
'present' feeds 'raw' data to a filter and returns the filtered data. In order to perform this task with maximum efficiency, the actual code is quite involved.
we have two variants of the routine, one for 'stacks' of several arrays (vpresent) and one for single arrays (present).
The routine used in vspline is 'present'; 'vpresent' is for special cases. 'present' splits the data into 'bundles' of 1D subarrays collinear to the processing axis. These bundles are fed to the 'handler', which copies them into a buffer, performs the actual filtering, and then writes them back to target memory.
Using 'vpresent', incoming data are taken as std::vectors of source_view_type. The incoming arrays have to have the same extent in every dimension except the processing axis. While the actual process of extracting parts of the data for processing is slightly more involved, it is analogous to first concatenating all source arrays into a single array, stacking along the processing axis. The combined array is then split up into 1D subarrays collinear to the processing axis, and sets of these subarrays are passed to the handler by calling its 'get' method. The set of 1D subarrays is coded as a 'bundle', which describes such a set by a combination of base address and a set of gather/scatter indexes.
Once the data have been accepted by the handler, the handler's operator() is called, which results in the handler filtering the data (or whatever else it might do). Next, the processed data are taken back from the handler by calling its 'put' routine. The put routine also receives a 'bundle' parameter, resulting in the processed data being distributed back into a multidimensional array (or a set of them, like the input).
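The get / operator() / put sequence can be sketched as follows. This is a simplified illustration and not vspline's actual handler: the types bundle and handler_sketch and the function prefix_sum_axis0 are hypothetical, the 'filter' is a mere running sum along the processing axis, and the bundle is modelled as a base pointer plus one start offset per 1D subarray.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 'bundle': base address plus gather/scatter offsets.
struct bundle {
    float* base;                          // start of the array's memory
    std::vector<std::ptrdiff_t> offsets;  // first element of each 1D subarray
    std::ptrdiff_t stride;                // step along the processing axis
    std::size_t length;                   // elements per 1D subarray
};

class handler_sketch {
    std::vector<float> buffer;            // length x width, lanes innermost
    std::size_t width = 0, length = 0;
public:
    // gather the bundle's subarrays into the contiguous buffer
    void get(const bundle& b) {
        width = b.offsets.size();
        length = b.length;
        buffer.resize(length * width);
        for (std::size_t i = 0; i < length; ++i)
            for (std::size_t l = 0; l < width; ++l)
                buffer[i * width + l] =
                    b.base[b.offsets[l] + static_cast<std::ptrdiff_t>(i) * b.stride];
    }
    // stand-in filter: running sum along the processing axis; the inner
    // loop runs over adjacent lanes, which is friendly to vectorization
    void operator()() {
        for (std::size_t i = 1; i < length; ++i)
            for (std::size_t l = 0; l < width; ++l)
                buffer[i * width + l] += buffer[(i - 1) * width + l];
    }
    // scatter the processed data back to the locations the bundle describes
    void put(const bundle& b) const {
        for (std::size_t i = 0; i < length; ++i)
            for (std::size_t l = 0; l < width; ++l)
                b.base[b.offsets[l] + static_cast<std::ptrdiff_t>(i) * b.stride] =
                    buffer[i * width + l];
    }
};

// Process a row-major rows x cols array along axis 0, 'lanes' columns at a time.
void prefix_sum_axis0(std::vector<float>& a, std::size_t rows,
                      std::size_t cols, std::size_t lanes = 4) {
    assert(a.size() == rows * cols && lanes > 0);
    handler_sketch h;
    for (std::size_t c0 = 0; c0 < cols; c0 += lanes) {
        bundle b{a.data(), {}, static_cast<std::ptrdiff_t>(cols), rows};
        for (std::size_t c = c0; c < c0 + lanes && c < cols; ++c)
            b.offsets.push_back(static_cast<std::ptrdiff_t>(c));
        h.get(b);
        h();
        h.put(b);
    }
}
```

In this sketch the same bundle serves for get and put, so the operation is in-place; passing a bundle with a different base address to put would write to a separate target.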
This mechanism sounds complicated, but buffering the data for processing (which oftentimes has to look at the data several times) is usually more efficient than operating on the data in their in-array locations, which are often widely distributed, making the memory access slow.
On top of the memory efficiency gain, there is another aspect: by choosing the bundle size wisely, the buffered data can be processed by vector code. Even if the data aren't explicit SIMD vectors (which is an option), the simple fact that they 'fit' allows the optimizer to autovectorize the code, a technique which I call 'goading': You present the data in vector-friendly guise and thereby lure the optimizer to do the right thing.
Another aspect of buffering is that the buffer can use a specific data type best suited to the arithmetic operation at hand, which may be different from the source and target data types. This is especially useful if incoming data are of an integral type: operating directly on integers would spoil the data, but if the buffer is set up to contain a real type, the data are lifted to it on arrival in the buffer and processed with float maths. A drawback to this method of dealing with integral data is the fact that, when filtering nD data along several axes, intermediate results are stored back to the integral type after processing along each axis, accruing quantization errors with each pass. If this is an issue, for example with high-dimensional data or insufficient dynamic range, please consider moving to a real data type before filtering.
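The quantization effect of storing intermediate results back to an integral type can be made visible with a small sketch. All names here (average_pass, promote, store_integral, two_passes_integral, two_passes_real) are hypothetical, the filter is a trivial two-tap average rather than one of vspline's filters, and the two passes run over the same 1D data to mimic successive per-axis passes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in 1D filter: average each sample with its right neighbour,
// repeating the last sample at the edge. Arithmetic is done in float.
std::vector<float> average_pass(const std::vector<float>& in) {
    assert(!in.empty());
    std::vector<float> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        float right = (i + 1 < in.size()) ? in[i + 1] : in[i];
        out[i] = 0.5f * (in[i] + right);
    }
    return out;
}

// 'Promote' integral data to the real buffer type; this is exact.
std::vector<float> promote(const std::vector<std::uint8_t>& in) {
    return std::vector<float>(in.begin(), in.end());
}

// Store back by plain type conversion: truncation, no saturation.
std::vector<std::uint8_t> store_integral(const std::vector<float>& in) {
    std::vector<std::uint8_t> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = static_cast<std::uint8_t>(in[i]);
    return out;
}

// Two passes with the intermediate result stored as 8-bit integers.
std::vector<std::uint8_t> two_passes_integral(const std::vector<std::uint8_t>& src) {
    std::vector<std::uint8_t> first = store_integral(average_pass(promote(src)));
    return store_integral(average_pass(promote(first)));
}

// Two passes with integral input and real-typed storage throughout.
std::vector<float> two_passes_real(const std::vector<std::uint8_t>& src) {
    return average_pass(average_pass(promote(src)));
}
```

For the input {0, 1, 0, 1}, the integral path yields {0, 0, 0, 1}, while the real-typed path yields {0.5, 0.5, 0.75, 1}: the fractional parts are lost after each pass when the intermediate is stored as an integer.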
Note that this code operates on arrays of fundamentals. The code calling this routine will have element-expanded data which weren't fundamentals in the first place. This expansion helps automatic vectorization, and for explicit vectorization with Vc it is even necessary.
Also note that this routine operates in single-threaded code: It's invoked via vspline::multithread, and each worker thread will perform its own call to 'present'. This is why the first argument is a range (containing the range of the partitioning assigned to the current worker thread) and why the other arguments come in as pointers, where I'd usually pass by reference.
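The pattern of several worker threads each calling one single-threaded routine with pointer arguments might look like the sketch below. It goes by the signature, which lists p_tickets, a pointer to an atomic counter, as the first argument, and assumes that each worker draws 'tickets' from this shared counter until none are left; that reading is inferred from the parameter's type and name and is not confirmed by the text. worker_sketch and run_workers are hypothetical names, std::thread stands in for vspline::multithread, and one ticket stands for one element to be doubled.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Single-threaded worker code. All arguments arrive as pointers, so
// the same call can be handed to every worker thread unchanged.
void worker_sketch(std::atomic<std::ptrdiff_t>* p_tickets,
                   const std::vector<int>* p_source,
                   std::vector<int>* p_target) {
    for (;;) {
        // atomically draw the next ticket; a negative value means
        // that all work has been handed out (assumed protocol)
        std::ptrdiff_t t = --(*p_tickets);
        if (t < 0) break;
        std::size_t i = static_cast<std::size_t>(t);
        (*p_target)[i] = 2 * (*p_source)[i];
    }
}

// Launch nthreads workers which share one ticket counter.
std::vector<int> run_workers(const std::vector<int>& source, unsigned nthreads) {
    assert(nthreads > 0);
    std::vector<int> target(source.size());
    std::atomic<std::ptrdiff_t> tickets(static_cast<std::ptrdiff_t>(source.size()));
    std::vector<std::thread> pool;
    for (unsigned i = 0; i < nthreads; ++i)
        pool.emplace_back(worker_sketch, &tickets, &source, &target);
    for (auto& th : pool) th.join();
    return target;
}
```

Each ticket is handed out exactly once, so no two workers ever write to the same target element, and the result does not depend on how the tickets happen to be distributed among the threads.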
void vspline::detail::transform | ( | std::false_type | , |
grid_spec< functor_type::dim_in, typename functor_type::in_ele_type > & | grid, | ||
const functor_type & | functor, | ||
vigra::MultiArrayView< functor_type::dim_in, typename functor_type::out_type > & | result, | ||
int | njobs = vspline::default_njobs , |
||
vspline::atomic< bool > * | p_cancel = 0 |
||
) |
generalized grid evaluation. The production of result values from input values is done by an instance of grid_eval_functor, see above. The template argument, functor_type, has to be a functor (usually this will be a vspline::unary_functor). If the functor's in_type has dim_in components, the grid_spec must also contain dim_in 1D arrays, since the functor's input is put together by picking a value from each of the arrays the grid_spec points to. The result array must have the same number of dimensions.
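The principle of generalized grid evaluation can be shown with a minimal 2D sketch: one 1D coordinate array per axis, and for every grid point the functor's argument is put together by picking one value from each array. grid_eval_sketch is a hypothetical name; the sketch uses std::vector and std::array instead of grid_spec and vigra::MultiArrayView, and it is single-threaded.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Evaluate functor f at every point of the grid spanned by the per-axis
// coordinate arrays gx and gy. The result is stored row-major, with the
// x index varying fastest.
template <typename F>
std::vector<double> grid_eval_sketch(const std::vector<double>& gx,
                                     const std::vector<double>& gy,
                                     const F& f) {
    assert(!gx.empty() && !gy.empty());
    std::vector<double> result(gx.size() * gy.size());
    for (std::size_t j = 0; j < gy.size(); ++j) {
        for (std::size_t i = 0; i < gx.size(); ++i) {
            // pick one coordinate from each axis' array
            std::array<double, 2> arg{{gx[i], gy[j]}};
            result[j * gx.size() + i] = f(arg);
        }
    }
    return result;
}
```

Nothing is precalculated here, which mirrors the point made for grid_eval_functor: with an arbitrary functor, all that can be done is to assemble the arguments and call it.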
Definition at line 1241 of file transform.h.
void vspline::detail::transform | ( | std::true_type | , |
grid_spec< functor_type::dim_in, typename functor_type::in_ele_type > & | grid, | ||
const functor_type & | functor, | ||
vigra::MultiArrayView< functor_type::dim_in, typename functor_type::out_type > & | result, | ||
int | njobs = vspline::default_njobs , |
||
vspline::atomic< bool > * | p_cancel = 0 |
||
) |
given the generator functor 'grev_generator' (above), performing the grid_eval becomes trivial: construct the generator and pass it to the wielding code. The signature looks complex, but the types needed can all be inferred by automatic type deduction (ATD) - callers just need to pass a grid_spec object, a b-spline evaluator and a target array to accept the result - plus, optionally, a pointer to a cancellation atomic, which, if given, may be set by the calling code to gracefully abort the calculation at the next convenient occasion. Note that grid_eval is specific to b-spline evaluation and will not work with any evaluator type which is not a 'plain', 'raw' b-spline evaluator. So, even functors gained by calling vspline's factory functions (make_evaluator, make_safe_evaluator) will not work here, since they aren't of type vspline::evaluator (they are in fact grok_type). To use arbitrary functors on a grid, use gen_grid_eval, below.
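The two 'transform' overloads differ only in their first parameter, std::true_type versus std::false_type, which is the usual tag-dispatch idiom. How vspline decides which tag to pass is not shown on this page, so the sketch below uses a hypothetical trait, is_raw_ev, together with hypothetical functor types raw_bspline_ev and other_functor; the overloads merely return a string naming the path taken.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical functor types for illustration.
struct raw_bspline_ev {};
struct other_functor {};

// Hypothetical trait: true only for the 'plain' b-spline evaluator.
template <typename T> struct is_raw_ev : std::false_type {};
template <> struct is_raw_ev<raw_bspline_ev> : std::true_type {};

namespace detail_sketch {
// chosen for plain b-spline evaluators: weights can be precalculated
template <typename F>
std::string transform(std::true_type, const F&) {
    return "precalculated weights";
}
// chosen for everything else: generalized grid evaluation
template <typename F>
std::string transform(std::false_type, const F&) {
    return "generalized";
}
}  // namespace detail_sketch

// Front end: the tag's type is fixed at compile time, so overload
// resolution picks the implementation without any run-time test.
template <typename F>
std::string transform_sketch(const F& f) {
    return detail_sketch::transform(typename is_raw_ev<F>::type(), f);
}
```

With this arrangement a caller writes transform_sketch(functor) in both cases, and a type-erased functor, which the trait does not recognize, ends up on the generalized path.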
Definition at line 1117 of file transform.h.
void vspline::detail::vpresent | ( | vspline::atomic< std::ptrdiff_t > * | p_tickets, |
const std::vector< source_view_type > * | p_source, | ||
std::vector< target_view_type > * | p_target, | ||
const typename stripe_handler_type::arg_type * | p_args, | ||
int | axis | ||
) |
vpresent is a variant of 'present' processing 'stacks' of arrays. See 'present' for discussion. This variant of 'present' will rarely be used. Having it does no harm but if you study the code, you may safely ignore it unless you are actually using single-axis filtering of stacks of arrays. The code is structurally similar to 'present', with the extra complication of processing stacks instead of single arrays.