vspline 1.1.0
Generic C++11 Code for Uniform B-Splines
|
►NHWY_NAMESPACE | |
►Chwy_simd_type | |
Cmasked_type | |
Cmchunk_t | Mask type for hwy_simd_type. This is a type which holds a set of masks stored in uint8_t, as the highway mask storing function provides. So this type is memory-backed, just like hwy_simd_type. Template arguments are the corresponding hwy_simd_type's tag type and its lane count. highway is strict about which vectors and masks can interoperate, and only allows 'direct' interoperation if the types involved 'match' in size. Masks pertaining to vectors of differently-sized T aren't directly interoperable because they don't have the same lane count. One requires k masks of one type and k * 2 ^ i of the other. Here, we follow a different paradigm: The top-level objects we're dealing with have a fixed 'vsize', the number of lanes they hold. This should be a power of two. The paradigm is that objects with equal vsize should be interoperable, no matter what lane count the hardware vectors have which are used to implement their functionality. This makes user code simpler: users pick a vsize which they use for a body of code, all vector-like objects use the common vsize, and the implementation of the vector-like objects takes care of 'rolling out' the operations to hardware vectors. At times this produces what I call 'friction' - if the underlying hardware vectors and masks are not directly compatible, code is needed to interoperate them, and this code can at times be slow. So the recommendation for users is to avoid 'friction' by avoiding mixing differently-sized types, but with the given paradigm, this is a matter of performance tuning rather than imposing constraints on code structure. Some of the 'friction' might be mitigated by additional code using highway's up- and down-scaling routines, but for now the code rather uses 'goading' with small loops over the backing memory, relying on the compiler to handle this efficiently |
►Nstd | |
Callocator_traits< vspline::hwy_simd_type< T, N > > | |
Callocator_traits< vspline::simd_type< T, N > > | |
►Nstd14 | |
Cinteger_sequence | |
Cmake_integer_sequence | |
Cmake_integer_sequence< T, 0, Is... > | |
►Nvspline | |
►Ndetail | |
Cbuild_ev | Helper object to create a type-erased vspline::evaluator for a given bspline object. The evaluator is specialized to the spline's degree, so that degree-0 splines are evaluated with nearest neighbour interpolation, degree-1 splines with linear interpolation, and all other splines with general b-spline evaluation. The resulting vspline::evaluator is 'grokked' to erase its type to make it easier to handle on the receiving side: build_ev will always return a vspline::grok_type, not one of the several possible evaluators which it produces initially. Why the type erasure? Because a function can only return one distinct type. With specialization for degree-0, degree-1 and arbitrary spline degrees, there are three distinct types of evaluator to take care of. If they are to be returned as a common type, type erasure is the only way |
Cbuild_safe_ev | Helper object to create a vspline::mapper object with gate types matching a bspline's boundary conditions and extents matching the spline's lower and upper limits. Please note that these limits depend on the boundary conditions and are not always simply 0 and N-1, as they are for, say, mirror boundary conditions. See lower_limit() and upper_limit() in vspline::bspline |
Cbuild_safe_ev< -1, spline_type, rc_type, _vsize, math_ele_type, result_type, gate_types ... > | At level -1, there are no more axes to deal with, here the recursion ends and the actual mapper object is created. Specializing on the spline's degree (0, 1, or indeterminate), an evaluator is created and chained to the mapper object. The resulting functor is grokked to produce a uniform return type, which is returned to the caller |
Cgrev_generator | We need a 'generator functor' to implement grid_eval using the code in wielding.h. This functor precalculates the b-spline evaluation weights corresponding to the coordinates in the grid and stores them in vectorized format, to speed up their use as much as possible |
Cgrid_eval_functor | Grid_eval_functor is used for 'generalized' grid evaluation, where the functor passed in is not a bspline evaluator. For this general case, we can't precalculate anything to speed things up, all we need to do is pick the grid values and put them together to form arguments for calls to the functor |
Cinner_evaluator | 'inner_evaluator' implements evaluation of a uniform b-spline, or some other spline-like construct relying on basis functions which can provide sets of weights for given deltas. While class evaluator (below, after namespace detail ends) provides objects derived from vspline::unary_functor which are meant to be used by user code, here we have a 'workhorse' object to which 'evaluator' delegates. This code 'rolls out' the per-axis weights the basis functor produces to the set of coefficients relevant to the current evaluation locus (the support window). We rely on a few constraints: |
Cseparable_filter | Struct separable_filter is the central object used for 'wielding' filters. The filters themselves are defined as 1D operations, which is sufficient for a separable filter: the 1D operation is applied to each axis in turn. If the data themselves are 1D, this is inefficient if the run of data is very long: we'd end up with a single thread processing the data without vectorization. So for this special case, we use a bit of trickery: long runs of 1D data are folded up, processed as 2D (with multithreading and vectorization) and the result of this operation, which isn't correct everywhere, is 'mended' where it is wrong. If the data are nD, we process them by buffering chunks collinear to the processing axis and applying the 1D filter to these chunks. 'Chunks' isn't quite the right word to use here - what we're buffering are 'bundles' of 1D subarrays, where a bundle holds as many 1D subarrays as a SIMD vector is wide. This makes it possible to process the buffered data with vectorized code. While most of the time the buffering will simply copy data into and out of the buffer, we use a distinct data type for the buffer which makes sure that arithmetic can be performed in floating point and with sufficient precision to do the data justice. With this provision we can safely process arrays of integral type. Such data are 'promoted' to this type when they are buffered and converted to the result type afterwards. Of course there will be quantization errors if the data are converted to an integral result type; it's best to use a real result type. The type for arithmetic operations inside the filter is fixed via stripe_handler_type, which takes a template argument '_math_ele_type'. This way, the arithmetic type is distributed consistently. Also note that an integral target type will receive the data via a simple type conversion and not with saturation arithmetic. If this is an issue, filter to a real-typed target and process separately. 
A good way of using integral data is to have integral input and real-typed output. Promoting the integral data to a real type preserves them precisely, and the 'exact' result is then stored in floating point. With such a scheme, raw data (like image data, which are often 8 or 16 bit integers) can be 'sucked in' without need for previous conversion, producing filtered data in, say, float for further processing |
Callocator_traits | Vspline creates vigra::MultiArrays of vectorized types. As long as the vectorized types are Vc::SimdArray or vspline::simd_type, using std::allocator is fine, but when using other types, using a specific allocator may be necessary. Currently this is never the case, but I have the lookup of allocator type from this traits class in place if it should become necessary |
Callocator_traits< hwy_simd_type< T, N > > | |
Callocator_traits< vc_simd_type< T, N > > | |
Camplify_type | Amplify_type amplifies its input by a factor. If the data are multi-channel, the factor is multi-channel as well and the channels are amplified by the corresponding elements of the factor. I added this class to make work with integer-valued splines more comfortable - if these splines are prefiltered with 'boost', the effect of the boost has to be reversed at some point, and amplify_type does just that when you use 1/boost as the 'factor' |
Cbasis_functor | Basis_functor is an object producing the b-spline basis function value for given arguments, or optionally a derivative of the basis function. While basis_functor can produce single basis function values for single arguments, it can also produce a set of basis function values for a given 'delta'. This set is a unit-spaced sampling of the basis function sampled at n + delta for all n ∈ N. Such samplings are used to evaluate b-splines; they constitute the set of weights which have to be applied to a set of b-spline coefficients to form the weighted sum which is the spline's value at a given position |
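To make the notion of 'a set of weights for a given delta' concrete, here is a minimal standalone sketch for the cubic case. It does not use vspline: the names cubic_weights_sketch and weighted_sum_sketch are made up for illustration, and the convention that delta lies in [0, 1) with the window covering positions -1 to 2 is an assumption of this sketch, not necessarily the convention basis_functor uses.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper, not vspline's basis_functor: the four weights of a
// cubic B-spline for a fractional offset delta in [0, 1). They apply to the
// coefficients at positions -1, 0, 1 and 2 relative to the integral part.
std::array<double, 4> cubic_weights_sketch(double delta)
{
    assert(delta >= 0.0 && delta < 1.0);
    const double d = delta, d2 = d * d, d3 = d2 * d;
    const double r = 1.0 - d;
    std::array<double, 4> w = {{
        r * r * r / 6.0,
        (3.0 * d3 - 6.0 * d2 + 4.0) / 6.0,
        (-3.0 * d3 + 3.0 * d2 + 3.0 * d + 1.0) / 6.0,
        d3 / 6.0
    }};
    return w;
}

// The spline's value is the weighted sum of the coefficients in the window.
double weighted_sum_sketch(const std::array<double, 4>& w,
                           const std::array<double, 4>& c)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < 4; ++i)
        sum += w[i] * c[i];
    return sum;
}
```

The weights sum to one for every delta, so a window of equal coefficients reproduces that value; a window holding -1, 0, 1, 2 yields delta itself.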
Cbf_grok_type | If there are several differently-typed basis functors to be combined in a multi_bf_type object, we can erase their type, just like grok_type does for vspline::unary_functors. Grokking a basis functor may cost a little bit of performance but it makes the code to handle multi_bf_types simple: instead of having to cope with several, potentially differently-typed per-axis functors there is only one type - which may be a bf_grok_type if the need arises to put differently-typed basis functors into the multi_bf_type. With this mechanism, the code to build evaluators can be kept simple (handling only one uniform type of basis functor used for all axes) and still use different basis functors |
Cbracer | Class bracer encodes the entire bracing process. Note that contrary to my initial implementation, class bracer is now used exclusively for populating the frame around a core area of data. It has no code to determine which size a brace/frame should have. This is now determined in class bspline, see especially class bspline's methods get_left_brace_size(), get_right_brace_size() and setup_metrics() |
Cbroadcast | Struct broadcast is a mixin providing an 'eval' method to a functor which can process vectorized arguments. This mixin is inherited by the functor missing that capability, using CRTP. Because here, in the providing class, nothing is known (or, even, knowable) about the functor, we need to pass additional template arguments to establish the usual vspline unary functor frame of reference, namely in_type, out_type, vsize etc. The resulting 'vectorized' eval may not be efficient: it has to build individual 'in_type' values from the vectorized input, process them with the derived functor's eval routine, then insert the resulting out_type in the vectorized output. But it's a quick way of getting vectorized evaluation capability without writing the vector code. This is particularly useful when the functor's unvectorized eval() is complex (like, calling into legacy code or even into opaque binary) and 'proper' vectorization is hard to do. And with a bit of luck, the optimizer 'recognizes' what's going on and produces SIMD code anyway. Note that the derived class needs a using declaration for the vectorized eval overload inherited from this base class - see broadcast_type (below) for an example of using this mixin |
Cbroadcast_type | |
Cbspl_prefilter | Class to provide b-spline prefiltering, using 'iir_filter' above. The actual filter object has to interface with the data handling routine ('present', see filter.h). So this class functions as an adapter, combining the code needed to set up adequate buffers and creation of the actual IIR filter itself. The interface to the data handling routine is provided by inheriting from class buffer_handling |
Cbspline | Class bspline now builds on class bspline_base, adding coefficient storage, while bspline_base provides metadata handling. This separation makes it easier to generate classes which use the same metadata scheme. One concrete example is a class to type-erase a spline (not yet part of the library) which abstracts a spline, hiding the type of its coefficients. Such a class can inherit from bspline_base and initialize the base class in the c'tor from a given bspline object, resulting in a uniform interface to the metadata. Class bspline takes an additional template argument: the value type. This is the type of the coefficients stored in the spline, and it will typically be a single fundamental type or a small aggregate - like a vigra::TinyVector. vspline uses vigra's ExpandElementResult mechanism to inquire for a value type's elementary type and size, which makes it easy to adapt to new value types because the mechanism is traits-based |
Cbspline_base | Struct bspline is the object in vspline holding b-spline coefficients. In a way, the b-spline 'is' its coefficients, since it is totally determined by them - while, of course, the 'actual' spline is an n-dimensional curve. So, even if this is a bit sloppy, I often refer to the coefficients as 'the spline', and have named struct bspline so even if it just holds the coefficients |
Cbspline_evaluator_tag | Tag class used to identify all vspline::evaluator instantiations |
Cbuffer_handling | Buffer_handling provides services needed for interfacing with a buffer of simdized/goading data. The init() routine receives two views: one to a buffer accepting incoming data, and one to a buffer providing results. Currently, all filters used in vspline operate in-place, but the two-argument form leaves room to manoeuvre. get() and put() receive 'bundle' arguments which are used to transfer incoming data to the view defined in in_window, and to transfer result data from the view defined in out_window back to target memory |
Cbundle | Class 'bundle' holds all information needed to access a set of vsize 1D subarrays of an nD array. This is the data structure we use to tell the buffering and unbuffering code which data we want it to put into the buffer or distribute back out. The buffer itself holds the data in compact form, ready for vector code to access them at maximum speed |
Ccallable | Mixin 'callable' is used with CRTP: it serves as additional base to unary functors which are meant to provide operator() and takes the derived class as its first template argument, followed by the argument types and vectorization width, so that the parameter and return type for operator() and - if vsize is greater than 1 - its vectorized overload can be produced. This formulation has the advantage of not having to rely on the 'out_type_of' mechanism I was using before and provides precisely the operator() overload(s) which are appropriate |
Cchain_type | Class chain_type is a helper class to pass one unary functor's result as argument to another one. We rely on T1 and T2 to provide a few of the standard types used in unary functors. Typically, T1 and T2 will both be vspline::unary_functors, but the type requirements could also be fulfilled 'manually' |
Cclamp_gate | Clamp gate clamps out-of-bounds values. clamp_gate takes four arguments: the lower and upper limit of the gate, and the values which are returned if the input is outside the range: 'lfix' if it is below 'lower' and 'ufix' if it is above 'upper' |
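The four-argument behaviour described for clamp_gate can be sketched in a few lines. This is a scalar stand-in written for illustration only: clamp_gate_sketch is a made-up name, and the real class also has to provide vectorized evaluation, which is left out here.

```cpp
#include <cassert>

// Hypothetical scalar stand-in for the behaviour described above;
// this is not vspline's clamp_gate.
template <typename T>
struct clamp_gate_sketch
{
    T lower, upper;   // limits of the gate
    T lfix, ufix;     // values returned below 'lower' and above 'upper'

    clamp_gate_sketch(T lo, T up, T lf, T uf)
        : lower(lo), upper(up), lfix(lf), ufix(uf)
    {
        assert(lower <= upper);
    }

    T operator()(T v) const
    {
        if (v < lower) return lfix;
        if (v > upper) return ufix;
        return v;   // in-range values pass through unchanged
    }
};
```

Passing the limits themselves as lfix and ufix gives ordinary clamping; passing other values lets out-of-range input map to, say, a marker value.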
Cconvolve | Class convolve provides the combination of class fir_filter above with a vector-friendly buffer. Calling code provides information about what should be buffered, the data are sucked into the buffer, filtered, and moved back from there. The operation is orchestrated by the code in filter.h, which is also used to 'wield' the b-spline prefilter. Both operations are sufficiently similar to share the wielding code |
Cdimension_mismatch | Dimension_mismatch is thrown if two arrays which should have the same dimensions have different dimensions |
Cdomain_type | Class domain is a coordinate transformation functor. It provides a handy way to translate an arbitrary range of incoming coordinates to an arbitrary range of outgoing coordinates. This is done with a linear translation function. If the source range is [s0,s1] and the target range is [t0,t1], the translation function s->t is: |
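The formula itself is not reproduced in this listing. A linear map which takes s0 to t0 and s1 to t1 is fully determined, so the following sketch shows that map; treat it as an illustration under that assumption rather than as a quote of vspline's code. The name linear_domain_sketch is made up.

```cpp
#include <cassert>

// Hypothetical scalar sketch of a linear range-to-range translation,
// assuming s0 maps to t0 and s1 maps to t1. Not vspline's domain_type.
struct linear_domain_sketch
{
    double s0, s1, t0, t1;

    linear_domain_sketch(double _s0, double _s1, double _t0, double _t1)
        : s0(_s0), s1(_s1), t0(_t0), t1(_t1)
    {
        assert(s1 != s0);   // the source range must not be empty
    }

    double operator()(double s) const
    {
        return t0 + (s - s0) * (t1 - t0) / (s1 - s0);
    }
};
```

A typical use is feeding normalized coordinates to a spline: with a source range of [0,1] and a target range of [0,M-1], the functor produces spline coordinates.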
Cevaluator | Class evaluator encodes evaluation of a spline-like object. This is a generalization of b-spline evaluation, which is the default if no other basis functions are specified. Technically, a vspline::evaluator is a vspline::unary_functor, which has the specific capability of evaluating a specific spline-like object. This makes it a candidate to be passed to the functions in transform.h, like remap() and transform(), and it also makes it suitable for vspline's functional constructs like chaining, mapping, etc |
Cextrapolator | Struct extrapolator is a helper class providing extrapolated values for a 1D buffer indexed with possibly out-of-range indices. The extrapolated value is returned by value. Boundary conditions PERIODIC, MIRROR, REFLECT, NATURAL and CONSTANT are currently supported. An extrapolator is set up by passing the boundary condition code (see common.h) and a const reference to the 1D data set, coded as a 1D vigra::MultiArrayView. The view has to refer to valid data for the time the extrapolator is in use. Now the extrapolator object can be indexed with arbitrary indices, and it will return extrapolated values. The indexing is done with operator() rather than operator[] to mark the semantic difference. Note how buffers with size 1 are treated specially for some boundary conditions: here we simply return the value at index 0 |
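For three of the boundary conditions, extrapolation boils down to folding an out-of-range index back into the buffer. The sketch below shows one way to do that; the enum, the function name and the exact folding rules are assumptions made for illustration, and NATURAL and CONSTANT are omitted because they need more than index folding. Note the special case for buffers of size 1.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical boundary condition codes; vspline's own codes live in common.h.
enum bc_sketch { PERIODIC_SK, MIRROR_SK, REFLECT_SK };

// Fold an arbitrary index i into [0, n-1] for a buffer of n values.
std::ptrdiff_t fold_index_sketch(std::ptrdiff_t i, std::ptrdiff_t n, bc_sketch bc)
{
    assert(n > 0);
    if (n == 1) return 0;   // size-1 buffers: always use index 0

    // all three cases are periodic in the index, only the period differs
    const std::ptrdiff_t period =
        (bc == PERIODIC_SK) ? n : (bc == MIRROR_SK) ? 2 * (n - 1) : 2 * n;

    std::ptrdiff_t r = i % period;
    if (r < 0) r += period;          // r is now in [0, period - 1]

    if (r < n) return r;             // inside the buffer already
    if (bc == MIRROR_SK) return period - r;       // mirror on the last sample
    return period - 1 - r;           // REFLECT_SK: mirror 'between' samples
}
```

With n = 4, index -1 folds to 3 for PERIODIC_SK, to 1 for MIRROR_SK and to 0 for REFLECT_SK.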
Cfir_filter | Class fir_filter provides the 'solve' routine which convolves a 1D signal with selectable extrapolation. Here, the convolution kernel is applied to the incoming signal and the result is written to the specified output location. Note that this operation can be done in-place, but input and output may also be different. While most of the time this routine will be invoked by class convolve (below), it is also directly used by the specialized code for 1D filtering. Note how we conveniently inherit from the specs class. This also enables us to use an instance of fir_filter or class convolve as specs argument to create further filters with the same arguments |
Cfir_filter_specs | Fir_filter_specs holds the parameters for a filter performing a convolution along a single axis. In vspline, the place where the specifications for a filter are fixed and the place where it is finally created are far apart: the filter is created in the separate worker threads. So this structure serves as a vehicle to transport the arguments. Note the specification of 'headroom': this allows for non-symmetrical and even kernels. When applying the kernel to obtain output[i], the kernel is applied to input [ i - headroom ] , ... , input [ i - headroom + ksize - 1 ] |
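The role of 'headroom' is easiest to see in a loop. The sketch below pairs kernel[j] with input[i - headroom + j], which is one reading of the description above; whether the kernel is traversed in this or in reverse order is an assumption, and so is the zero-padding used for out-of-range input (the real filter uses selectable extrapolation). The function name is made up.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: apply a kernel of ksize taps to a 1D signal so that
// output[i] is formed from input[i - headroom] ... input[i - headroom + ksize - 1].
// Input outside the signal is treated as zero in this sketch.
std::vector<double> apply_kernel_sketch(const std::vector<double>& input,
                                        const std::vector<double>& kernel,
                                        std::ptrdiff_t headroom)
{
    assert(!kernel.empty());
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(input.size());
    const std::ptrdiff_t ksize = static_cast<std::ptrdiff_t>(kernel.size());

    std::vector<double> output(input.size(), 0.0);
    for (std::ptrdiff_t i = 0; i < n; ++i)
    {
        double sum = 0.0;
        for (std::ptrdiff_t j = 0; j < ksize; ++j)
        {
            const std::ptrdiff_t idx = i - headroom + j;
            if (idx >= 0 && idx < n)
                sum += kernel[j] * input[idx];
        }
        output[i] = sum;
    }
    return output;
}
```

A symmetric three-tap kernel would use headroom 1; a two-tap (even) kernel can use headroom 0 or 1, which decides whether the window leans right or left of i.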
Cflip | Flip functor produces its input with component order reversed. This can be used to deal with situations where coordinates in the 'wrong' order have to be fed to a functor expecting the opposite order and should be a fast way of doing so, since the compiler can likely optimize it well. I added this class to provide simple handling of incoming NumPy coordinates, which are normally in reverse order of vigra coordinates |
Cgrok_type | Class grok_type is a helper class wrapping a vspline::unary_functor so that its type becomes opaque - a technique called 'type erasure', here applied to vspline::unary_functors with their specific capability of providing both vectorized and unvectorized operation in one common object |
Cgrok_type< IN, OUT, 1 > | Specialization of grok_type for _vsize == 1. This is the only possible specialization if vectorization is not used. Here we don't use _v_ev but only the unvectorized evaluation |
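Type erasure in general can be sketched with std::function. The wrapper below handles only the unvectorized call and is not vspline's grok_type; grok_sketch, twice_sketch, plus_one_sketch and pick_sketch are made-up names. It shows the practical gain: a function can hand back differently-typed functors under one return type.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Two differently-typed functors with the same call signature.
struct twice_sketch
{
    double operator()(const double& x) const { return x + x; }
};

struct plus_one_sketch
{
    double operator()(const double& x) const { return x + 1.0; }
};

// Hypothetical type-erasing wrapper: whatever functor is passed in, the
// resulting object has the type grok_sketch<IN, OUT>.
template <typename IN, typename OUT>
struct grok_sketch
{
    std::function<OUT(const IN&)> ev;

    template <typename F>
    explicit grok_sketch(F f) : ev(std::move(f))
    {
        assert(static_cast<bool>(ev));   // an empty std::function would throw when called
    }

    OUT operator()(const IN& in) const { return ev(in); }
};

typedef grok_sketch<double, double> dd_grok_sketch;

// Both branches return the same erased type, although they wrap different functors.
dd_grok_sketch pick_sketch(bool doubling)
{
    if (doubling)
        return dd_grok_sketch(twice_sketch());
    return dd_grok_sketch(plus_one_sketch());
}
```

The price is an indirect call through std::function for every evaluation, which the compiler usually can't inline.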
Chomogeneous_mbf_type | Homogeneous_mbf_type can be used for cases where all basis functors are the same. The evaluation code uses operator[] to pick the functor for each axis, so here we merely override operator[] to always yield a const reference to the same basis functor |
Ciir_filter | Class iir_filter implements an n-pole forward/backward recursive filter to be used for b-spline prefiltering. It inherits from the 'specs' class for easy initialization |
Ciir_filter_specs | Structure to hold specifications for an iir_filter object. This set of parameters has to be passed through from the calling code through the multithreading code to the worker threads where the filter objects are finally constructed. Rather than passing the parameters via some variadic mechanism, it's more concise and expressive to contain them in a structure and pass that around. The filter itself inherits its specification type, and if the code knows the handler's type, it can derive the spec type. This way the argument passing can be formalized, allowing for uniform handling of several different filter types with the same code. Here we have the concrete parameter set needed for b-spline prefiltering. We'll pass one set of 'specs' per axis; it contains: |
Cinvalid_scalar | |
►Cmap_functor | Finally we define class mapper which is initialized with a set of gate objects (of arbitrary type) which are applied to each component of an incoming nD coordinate in turn. The trickery with the variadic template argument list is necessary, because we want to be able to combine arbitrary gate types (which have distinct types) to make the mapper as efficient as possible. The only requirement for a gate type is that it has to provide the necessary eval() functions |
C_map | |
C_map< 0, 1, coordinate_type > | |
C_map< 0, dimension, nd_coordinate_type > | |
Cmirror_gate | Mirror gate 'folds' coordinates into the range. From the infinite number of mirror images resulting from mirroring the input on the bounds, the only one inside the range is picked as the result. When using this gate type with splines with MIRROR boundary conditions, if the shape of the core for the axis in question is M, _lower would be passed 0 and _upper M-1. For splines with REFLECT boundary conditions, we'd pass -0.5 to _lower and M-0.5 to upper, since here we mirror 'between bounds' and the defined range is wider |
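Folding a coordinate by mirroring on both bounds can be done without looping over mirror images. The scalar sketch below is an illustration only (mirror_fold_sketch is a made-up name, and vspline's gate also has a vectorized path): mirroring on 'lower' is an absolute value, the pattern repeats every two range widths, and the second half of each repetition is mirrored on 'upper'.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar sketch of a mirror gate: fold v into [lower, upper].
double mirror_fold_sketch(double v, double lower, double upper)
{
    assert(upper > lower);
    const double width = upper - lower;

    // mirroring on 'lower' maps negative offsets to positive ones
    double offset = std::fabs(v - lower);

    // the mirrored pattern repeats every two widths
    offset = std::fmod(offset, 2.0 * width);

    // offsets in the second half are mirrored on 'upper'
    if (offset > width)
        offset = 2.0 * width - offset;

    return lower + offset;
}
```

For a core of shape M = 4, the MIRROR case would call this with bounds 0 and 3, the REFLECT case with bounds -0.5 and 3.5, as described above.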
Cmulti_bf_type | When several basis functors have to be passed to an evaluator, it's okay to pass a container type like a std::array or std::vector. All of these basis functors have to be of the same type, but using the method given above ('grokking' basis functors) it's possible to 'slot in' differently-typed basis functors - the functionality required by the evaluation code is provided by the 'grokked' functors and their inner workings remain opaque |
Cmulti_bf_type< vspline::basis_functor< math_type >, ndim > | For b-spline processing, we use a multi_bf_type of b-spline basis functors (class vspline::basis_functor). We can't use the unspecialized template for this basis functor, because it takes the derivative specification, which is specified per-axis, so we need to 'pick it out' from the derivative specification and pass the per-axis value to the per-axis basis function c'tors. So here, instead of merely using the index_sequence to produce a sequence of basis function c'tor calls and ignoring the indices, we use the indices to pick out the per-axis derivative specs |
Cnot_implemented | For interfaces which need specific implementations we use: |
Cnot_supported | Exception which is thrown if an operation is requested which vspline does not support |
Cnumeric_overflow | Exception which is thrown when assigning an rvalue which is larger than what the lvalue can hold |
Cout_of_bounds | Out_of_bounds is thrown by mapping mode REJECT for out-of-bounds coordinates. This exception is left without a message: it only has a very specific application, and there it may be thrown often, so we don't want anything slowing it down |
Cpass_gate | Class pass_gate passes its input to its output unmodified |
Cperiodic_gate | Periodic mapping also folds the incoming value into the allowed range. The resulting value will be ( N * period ) from the input value and inside the range, period being upper - lower. For splines done with PERIODIC boundary conditions, if the shape of the core for this axis is M, we'd pass 0 to _lower and M to _upper |
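Periodic folding is a remainder operation with a correction for negative values. In this scalar sketch (periodic_fold_sketch is a made-up name) the result lies in the half-open range [lower, upper); treating the upper bound as exclusive is an assumption of the sketch.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar sketch of a periodic gate: the result differs from v
// by a whole number of periods and lies in [lower, upper).
double periodic_fold_sketch(double v, double lower, double upper)
{
    assert(upper > lower);
    const double period = upper - lower;

    // std::fmod keeps the sign of its first argument
    double offset = std::fmod(v - lower, period);
    if (offset < 0.0)
        offset += period;

    return lower + offset;
}
```

For a PERIODIC spline with core shape M = 5, the bounds would be 0 and 5, so coordinate 7 folds to 2 and coordinate -1 folds to 4.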
Creject_gate | Reject_gate throws vspline::out_of_bounds for invalid coordinates |
Cshape_mismatch | Shape mismatch is the exception which is thrown if the shapes of an input array and an output array do not match |
Csimd_allocator | |
Csimd_traits | Traits class simd_traits provides three traits: |
►Csimd_type | Class template simd_type provides a fixed-size container type for small sets of fundamentals which are stored in a C vector. The type offers arithmetic capabilities which are implemented by using loops over the elements in the vector, expecting that the compiler will autovectorize these loops into 'proper' SIMD code. The interface of this type is modelled to be compatible with Vc's SimdArray. Unfortunately, Vc::SimdArray requires additional template arguments, so at times it's difficult to use the two types instead of each other. The interface compatibility does not mean that the arithmetic will produce the same results - this is intended but neither tested nor enforced |
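The idea of a fixed-size container whose arithmetic is written as plain loops can be sketched as follows. simd_sketch is a made-up, heavily reduced stand-in (two operators, no masks), and whether a given compiler turns these loops into SIMD instructions depends on the compiler and its flags.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal stand-in for a loop-based SIMD-like container.
template <typename T, std::size_t N>
struct simd_sketch
{
    T lanes[N];   // the backing 'C vector'

    T& operator[](std::size_t i)             { assert(i < N); return lanes[i]; }
    const T& operator[](std::size_t i) const { assert(i < N); return lanes[i]; }

    // lanes filled with 0, 1, 2, ...
    static simd_sketch iota()
    {
        simd_sketch r;
        for (std::size_t i = 0; i < N; ++i)
            r.lanes[i] = T(i);
        return r;
    }

    // lane-wise addition; a simple loop the compiler may autovectorize
    simd_sketch operator+(const simd_sketch& other) const
    {
        simd_sketch r;
        for (std::size_t i = 0; i < N; ++i)
            r.lanes[i] = lanes[i] + other.lanes[i];
        return r;
    }

    // multiplication of every lane with a scalar
    simd_sketch operator*(T factor) const
    {
        simd_sketch r;
        for (std::size_t i = 0; i < N; ++i)
            r.lanes[i] = lanes[i] * factor;
        return r;
    }
};
```

Code written against such a type reads like scalar code: `a + a * 2.0f` processes all N lanes.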
Cmasked_type | |
Csink_functor | Sink_functor is used for functors without an output - e.g. reductors which are used for analytic purposes on data sets. They use the same system of input types, but omit the output types |
Csink_functor_tag | While 'normal' unary_functors are all derived from unary_functor_tag, sink functors will be derived from sink_functor_tag |
Cunary_functor | Class unary_functor provides a functor object which offers a system of types for concrete unary functors derived from it. If vectorization isn't used, this is trivial, but with vectorization in use, we get vectorized types derived from plain IN and OUT via query of vspline::vector_traits |
Cunary_functor_tag | We derive all vspline::unary_functors from this empty class, to have a common base type for all of them. This enables us to easily check if a type is a vspline::unary_functor without having to wrangle with unary_functor's template arguments |
►Cvc_simd_type | Class template vc_simd_type provides a fixed-size SIMD type. This implementation of vspline::vc_simd_type uses Vc::SimdArray. The 'acrobatics' may seem futile - why inherit privately from Vc::SimdArray, then code a class template which does essentially the same? There are several reasons: first, the wrapper class results in a common interface shared with the other SIMD implementations, second, there are some added members which can't be 'put into' Vc::SimdArray from the outside. And, third, the template signature is uniform, avoiding Vc::SimdArray's two additional template arguments |
Cmasked_type | |
Cvector_traits | With the definition of 'simd_traits', we can proceed to implement 'vector_traits': struct vector_traits is a traits class fixing the types used for vectorized code in vspline. These types go beyond mere vectors of fundamentals: most of the time, the data vspline has to process are not fundamentals, but what I call 'xel' data: pixels, voxels, stereo sound samples, etc. - so, small aggregates of a fundamental type. vector_traits defines how fundamentals and 'xel' data are to be vectorized. With the types defined by vector_traits, a system of type names is introduced which uses a set of patterns: |
Cvector_traits< T, _vsize, typename std::enable_if< vspline::is_element_expandable< T > ::value > ::type > | Specialization of vector_traits for 'element-expandable' types. These types are recognized by vigra's ExpandElementResult mechanism, resulting in the formation of a 'vectorized' version of the type. These data are what I call 'xel' data. As explained above, vectorization is horizontal, so if T is, say, a pixel of three floats, the type generated here will be a TinyVector of three vectors of vsize floats |
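The layout described here, three vectors of vsize floats for a three-float pixel, can be illustrated with standard containers. In this sketch std::array stands in for both TinyVector and the SIMD vector type, and all names (sketch_vsize, pixel_sketch_t, load_pixels_sketch and so on) are made up.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

const std::size_t sketch_vsize = 4;

typedef std::array<float, 3> pixel_sketch_t;               // one 'xel': three floats
typedef std::array<float, sketch_vsize> lanes_sketch_t;    // stand-in for a SIMD vector
typedef std::array<lanes_sketch_t, 3> pixel_v_sketch_t;    // vectorized xel: 3 x vsize floats

// Gather sketch_vsize consecutive pixels into the vectorized form:
// one vector per channel, one lane per pixel.
pixel_v_sketch_t load_pixels_sketch(const std::vector<pixel_sketch_t>& px,
                                    std::size_t start)
{
    assert(start + sketch_vsize <= px.size());
    pixel_v_sketch_t v;
    for (std::size_t ch = 0; ch < 3; ++ch)
        for (std::size_t lane = 0; lane < sketch_vsize; ++lane)
            v[ch][lane] = px[start + lane][ch];
    return v;
}
```

With this layout, an operation on one channel of sketch_vsize pixels touches one contiguous vector, which is what vector code wants.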
Cyield_type | At times we require reading access to an nD array at given coordinates, as a functor which, receiving the coordinates, produces the values from the array. In the scalar case, this is trivial: if the coordinate is integral, we have a simple indexed access, and if it is real, we can use std::round to produce a nearby discrete coordinate. But for the vectorized case we need a bit more effort: We need to translate the access with a vector of coordinates into a gather operation. We start out with a generalized template class 'yield-type': |
Cyield_type< crd_t, data_t, _vsize, typename std::enable_if< ! crd_integral< crd_t > ::value > ::type > | Next, we specialize for real coordinates |
Cyield_type< crd_t, data_t, _vsize, typename std::enable_if< crd_integral< crd_t > ::value > ::type > | |
►Nvspline_threadpool | |
Cthread_pool | |
►Nwielding | |
Ccoupled_aggregator | Aggregator for separate - possibly different - source and target. If source and target are in fact different, the inner functor will read data from source, process them and then write them to target. If source and target are the same, the operation will be in-place, but not explicitly so. vspline uses this style of two-argument functor, and this is the aggregator we use for vspline's array-based transforms. The code in this template will only be used for vectorized operation. If vectorization is not used, only the specialization for vsize == 1 below is used |
Ccoupled_aggregator< 1, ic_type, functor_type > | Specialization for vsz == 1. Here the data are simply processed one by one in a loop, without vectorization |
Cgenerate_aggregator | Generate_aggregator is very similar to indexed_aggregator, but instead of managing and passing a coordinate to the functor, the functor now manages the argument side of the operation: it acts as a generator. To make this possible, the generator has to hold run-time modifiable state and can't be const like the functors used in the other aggregators, where the functors are 'pure' in a functional programming sense. A 'generator' functor to be used with this body of code is expected to behave in a certain fashion: |
Cgenerate_aggregator< 1, ic_type, functor_type > | Specialization for vsz == 1. Here the data are simply processed one by one in a loop, without vectorization |
Cindexed_aggregator | Indexed_aggregator receives the start coordinate and processing axis along with the data to process. This is meant for index-transforms. The coordinate is updated for every call to the 'inner' functor so that the inner functor has the current coordinate as input. The code in this template will only be used for vectorized operation. Without vectorization, only the specialization for vsize == 1 below is used |
Cindexed_aggregator< 1, ic_type, functor_type > | Specialization for vsz == 1. Here the data are simply processed one by one in a loop, without vectorization |
Cindexed_reductor | Indexed_reductor is used for reductions and has no output. The actual reduction is handled by the functor: each thread has its own copy of the functor, which does its own part of the reduction, and 'offloads' its result to some mutex-protected receptacle when it's destructed; see the 'reduce' functions in transform.h for a more detailed explanation and an example of such a functor. indexed_reductor processes discrete coordinates, whereas yield_reductor (the next class down) processes values. This variant works just like an indexed_aggregator, only that it produces no output - at least not for every coordinate fed to the functor. The functor itself does hold state (the reduction) and is also responsible for offloading per-thread results when the worker threads terminate. This class holds a copy of the functor, and each thread has an instance of this class, ensuring that each worker thread can reduce its share of the work load independently |
Cindexed_reductor< 1, ic_type, functor_type > | Specialization for vsz == 1. Here the data are simply processed one by one in a loop, without vectorization |
Cvs_adapter | Vs_adapter wraps a vspline::unary_functor to produce a functor which is compatible with the wielding code. This is necessary, because vspline's unary_functors take 'naked' arguments if the data are 1D, while the wielding code always passes TinyVectors. The operation of this wrapper class should not have a run-time effect; it's simply converting references. The wrapped functor is only used via operator(), so this is what we provide. While it would be nice to simply pass through the unwrapped unary_functor, this would force us to deal with the distinction between data in TinyVectors and 'naked' fundamentals deeper down in the code, and here is a good central place where we can route to uniform access via TinyVectors - possibly with only one element. By inheriting from inner_type, we provide all of inner_type's type system which we don't explicitly override. Rest assured: the reinterpret_cast is safe. If the data are single-channel, the containerized version takes up the same memory as the uncontainerized version of the datum. Multi-channel data are containerized anyway |
Cvs_sink_adapter | Same procedure for a vspline::sink_type |
Cwield | Reimplementation of wield using the new 'neutral' multithread. The workers now all receive the same task to process one line at a time until all lines are processed. This simplifies the code; the wield object directly calls 'multithread' in its operator(). And it improves performance, presumably because tail-end idling is reduced: all active threads have data to process until the last line has been picked up by an aggregator. So tail-end idling is in the order of magnitude of a line's worth, in contrast to half a worker's share of the data in the previous implementation. The current implementation does away with specialized partitioning code (at least for the time being); it looks like the performance is decent throughout, even without exploiting locality by partitioning to tiles |
Cwield< 1, in_type, out_type > | |
Cyield_reductor | Aggregator to reduce arrays. This is like using indexed_reductor with a functor gathering from an array, but due to the use of 'bunch' this class is faster for certain array types, because it can use load/shuffle operations instead of always gathering |
Cyield_reductor< 1, ic_type, functor_type > | Specialization for vsz == 1. Here the data are simply processed one by one in a loop, without vectorization |
Ccalculate_gradient_type | |
Ccalculate_pickup_type | |
Ccolorize | |
Cev_ca_correct | |
Cev_gsm | We build a vspline::unary_functor which calculates the sum of gradient squared magnitudes. Note how the 'compound evaluator' we construct follows a pattern of |
Cev_meta | |
Cev_radial_correction | |
Cgrokkee_type | Grokkee_type is a vspline::unary_functor returning twice its input |
Cimage_base_type | |
Cimage_type | |
Cmandelbrot_functor | |
Cmeta_filter | |
Cmultitest | |
Cmultitest< dim, tuple_type, 0 > | |
Coffset_f | Invoke with: bls <spline degree> <number of iterations> \ [ <frequency cutoff> ] |
Crandom_polynomial | |
Ctest | |
Ctest< 0, T > |