vspline 1.1.0
Generic C++11 Code for Uniform B-Splines
transform.h
/************************************************************************/
/*                                                                      */
/* vspline - a set of generic tools for creation and evaluation */
/* of uniform b-splines */
/*                                                                      */
/* Copyright 2015 - 2023 by Kay F. Jahnke */
/*                                                                      */
/* The git repository for this software is at */
/*                                                                      */
/* https://bitbucket.org/kfj/vspline */
/*                                                                      */
/* Please direct questions, bug reports, and contributions to */
/*                                                                      */
/* kfjahnke+vspline@gmail.com */
/*                                                                      */
/* Permission is hereby granted, free of charge, to any person */
/* obtaining a copy of this software and associated documentation */
/* files (the "Software"), to deal in the Software without */
/* restriction, including without limitation the rights to use, */
/* copy, modify, merge, publish, distribute, sublicense, and/or */
/* sell copies of the Software, and to permit persons to whom the */
/* Software is furnished to do so, subject to the following */
/* conditions: */
/*                                                                      */
/* The above copyright notice and this permission notice shall be */
/* included in all copies or substantial portions of the */
/* Software. */
/*                                                                      */
/* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, */
/* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */
/* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */
/* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */
/* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */
/* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */
/* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */
/* OTHER DEALINGS IN THE SOFTWARE. */
/*                                                                      */
/************************************************************************/

/*! \file transform.h

    \brief set of generic remap, transform and apply functions

    My foremost reason to have efficient B-spline processing is the
    formulation of generic remap-like functions. remap() is a function
    which takes an array of real-valued nD coordinates and an interpolator
    over a source array. Each of the real-valued coordinates is fed
    into the interpolator in turn, yielding a value, which is placed in
    the output array at the same place the coordinate occupies in the
    coordinate array. To put it concisely, if we have

    - c, the coordinate array (or 'warp' array, 'arguments' array)
    - a, the source array (containing 'original' or 'knot point' data)
    - i, the interpolator over a
    - t, the target array (receiving the 'result' of the remap)
    - j, a coordinate into both c and t

    remap defines the operation

    t[j] = i(c[j]) for all j

    Now we widen the concept of remapping to a 'transform' function.
    Instead of limiting the process to the use of an 'interpolator',
    we use an arbitrary unary functor transforming incoming values to
    outgoing values, where the type of the incoming and outgoing values
    is determined by the functor. If the functor actually is an
    interpolator, we have a 'true' remap transforming coordinates
    into values, but this is merely a special case. So here we have:

    - c, an array containing input values
    - f, a unary functor converting input to output values
    - j, a coordinate into c and t
    - t, the target array

    transform performs the operation

    t[j] = f(c[j]) for all j

    remaps/transforms to other-dimensional objects are supported.
    This makes it possible to, for example, remap from a volume to a
    2D image, using a 2D coordinate array containing 3D coordinates
    ('slicing' a volume).

    There is also a variant of this transform function in this file
    which doesn't take an input array. Instead, for every target
    location, the location's discrete coordinates are passed to the
    unary_functor type object. This way, transformation-based remaps
    can be implemented easily: the user code just has to provide a
    suitable functor to yield values for coordinates. This functor
    will internally take the discrete incoming coordinates (into the
    target array) and take it from there, eventually producing values
    of the target array's value_type.

    Here we have:

    - f, a unary functor converting discrete coordinates to output values
    - j, a discrete coordinate into t
    - t, the target array

    'index-based' transform performs the operation

    t[j] = f(j) for all j

    This file also has code to evaluate a b-spline at coordinates in a
    grid, which can be used for scaling, and for separable geometric
    transformations.

    Finally there is a function to restore the original data from a
    b-spline to the precision possible with the given data type and
    degree of the spline. This is done with a separable convolution,
    using a unit-stepped sampling of the basis function as the
    convolution kernel along every axis.

    Let me reiterate the strategy used to perform the transforms and
    remaps in this file. The approach is functional: a 'processing chain'
    is set up and encoded as a functor providing two evaluation functions,
    one for 'single' data and one for vectorized data. This functor is
    applied to the data by 'wielding' code, which partitions the data
    into several jobs to be performed by individual worker threads,
    invokes the worker threads, and, once in a worker thread, feeds the
    data to the functor in turn, using hardware vectorization if possible.
    So while at the user code level a single call to some 'transform'
    or 'remap' routine is issued, passing in arrays of data and functors,
    all the 'wielding' is done automatically, without any need for the
    user to even be aware of it, while still using highly efficient
    vector code with a thread pool, potentially speeding up the operation
    greatly compared to single-threaded, unvectorized operation,
    depending, of course, on the presence of several cores and vector
    units. On my system (Haswell 4-core i5 with AVX2) the speedup is
    about one order of magnitude. The only drawback is the production
    of a hardware-specific binary if vectorization is used. Both Vc use
    and multithreading can easily be activated/deactivated by #define
    switches, providing a clean fallback solution to produce code suitable
    for any target, even simple single-core machines with no vector units.
    Vectorization will be used if possible - either explicit vectorization
    (by defining USE_VC) - or autovectorization by default.
    Defining VSPLINE_SINGLETHREAD will disable multithreading.
    The code accessing multithreading and/or Vc use is #ifdeffed, so if
    these features are disabled, their code 'disappears' and the relevant
    headers are not included, nor do the corresponding libraries have to
    be present.
*/

// TODO: don't multithread or reduce number of jobs for small data sets

#ifndef VSPLINE_TRANSFORM_H
#define VSPLINE_TRANSFORM_H

#include "multithread.h" // vspline's multithreading code
#include "eval.h"        // evaluation of b-splines
#include "poles.h"
#include "convolve.h"

// The bulk of the implementation of vspline's two 'transform' functions
// is now in wielding.h:

#include "wielding.h"

namespace vspline {

/// implementation of two-array transform using wielding::coupled_wield.
///
/// 'array-based' transform takes two template arguments:
///
/// - 'unary_functor_type', which is a class satisfying the interface
///   laid down in unary_functor.h. Typically, this would be a type
///   inheriting from vspline::unary_functor, but any type will do as
///   long as it provides the required typedefs and the relevant
///   eval() routines.
///
/// - the dimensionality of the input and output array
///
/// this overload of transform takes three parameters:
///
/// - a reference to a const unary_functor_type object providing the
///   functionality needed to generate values from arguments.
///
/// - a reference to a const MultiArrayView holding arguments to feed to
///   the unary functor object. It has to have the same shape as the target
///   array and contain data of the unary_functor's 'in_type'.
///
/// - a reference to a MultiArrayView to use as a target. This is where the
///   resulting data are put, so it has to contain data of the unary_functor's
///   'out_type'. It has to have the same shape as the input array.
///
/// transform can be used without template arguments; they will be inferred
/// by ATD from the arguments.
///
/// trailing the argument list, all transform overloads have two further
/// parameters with default values, which are only used for special purposes:
///
/// - njobs gives the number of worker threads to be used when multithreading
///   the transform. The default results in using as many worker threads as
///   there are in the thread pool. This is a generous number, several times
///   the number of physical cores, which benefits performance. Passing '1'
///   here will execute the transform in 'this' thread rather than invoking
///   a single worker thread from the pool.
///
/// - p_cancel, when passed, has to point to a std::atomic<bool> which
///   is initially 'true' but can be (atomically) set to 'false' by calling
///   code. This will result in the transform stopping 'at the next convenient
///   point', usually after having processed a complete line or line segment
///   of data, resulting in an invalid result. While the result can not be
///   used, this is a graceful way out if the calling code finds that
///   continuing with the transform is futile - for instance because the
///   user asks for the whole program to terminate. The cancellation still
///   takes 'a while' to 'trickle down' to all worker threads.

template < typename unary_functor_type ,
           unsigned int dimension >
void transform ( const unary_functor_type & functor ,
                 const vigra::MultiArrayView
                       < dimension ,
                         typename unary_functor_type::in_type
                       > & input ,
                 vigra::MultiArrayView
                       < dimension ,
                         typename unary_functor_type::out_type
                       > & output ,
                 int njobs = vspline::default_njobs ,
                 vspline::atomic < bool > * p_cancel = 0
               )
{
  // check shape compatibility

  if ( output.shape() != input.shape() )
  {
    throw shape_mismatch
      ( "transform: the shapes of the input and output array do not match" ) ;
  }

  // wrap the vspline::unary_functor to be used with wielding code.
  // The wrapper is necessary because the code in wielding.h feeds
  // arguments as TinyVectors, even if the data are 'singular'.
  // The wrapper simply reinterpret_casts any TinyVectors of one
  // element to their corresponding value_type before calling the
  // functor.

  typedef wielding::vs_adapter < unary_functor_type > coupled_functor_type ;
  coupled_functor_type coupled_functor ( functor ) ;

  // we'll cast the pointers to the arrays to these types to be
  // compatible with the wrapped functor above.

  typedef typename coupled_functor_type::in_type src_type ;
  typedef typename coupled_functor_type::out_type trg_type ;

  typedef vigra::MultiArrayView < dimension , src_type > src_view_type ;
  typedef vigra::MultiArrayView < dimension , trg_type > trg_view_type ;

  wielding::coupled_wield < coupled_functor_type , dimension >
    ( coupled_functor ,
      (src_view_type*)(&input) ,
      (trg_view_type*)(&output) ,
      njobs ,
      p_cancel
    ) ;
}

/// reduce function processing by value. sink_functor_type is a functor
/// taking values (which are loaded from an array). The reduce function,
/// by delegation to wielding::value_reduce, performs the feeding part
/// by parcelling the data; the functor receives either single in_type
/// or vectorized data. The functor has to be a construct capable of
/// accumulating data with several threads in parallel. A typical
/// example would be a functor accumulating to some internal structure
/// and 'offloading' to a mutex-protected pooling target in its
/// destructor. The wielding code creates a copy of the functor which
/// is passed here for each worker thread and guarantees that this
/// copy's destructor is executed before the call to reduce returns,
/// so the user can be assured of having collected the partial results
/// of all worker threads in the pooling target. The functor needs a
/// copy constructor to communicate the pooling target to the copies
/// made for the individual threads; see the index-based reduce function
/// below this one for an illustration of the simple and elegant mechanism.
/// Note that sink_functor will normally be a type inheriting from
/// vspline::sink_functor, a close relative of vspline::unary_functor
/// and also defined in unary_functor.h. This inheritance does not
/// produce any functionality, just the standard vspline functor
/// argument type system, which this code relies on. Of course the
/// sink functor can be coded 'manually' as well - inheriting from
/// vspline::sink_functor is merely convenient.

template < typename sink_functor_type ,
           unsigned int dimension >
void reduce ( const sink_functor_type & functor ,
              const vigra::MultiArrayView
                    < dimension ,
                      typename sink_functor_type::in_type
                    > & input ,
              int njobs = vspline::default_njobs ,
              vspline::atomic < bool > * p_cancel = 0
            )
{
  // wrap the vspline::sink_functor to be used with wielding code.
  // The wrapper is necessary because the code in wielding.h feeds
  // all arguments as TinyVectors, even if the data are 'singular'.
  // The wrapper simply reinterpret_casts any TinyVectors of one
  // element to the corresponding value_type before calling the
  // functor. At the same time, this function reinterpret_casts
  // the incoming array to an array of TinyVectors, even if the
  // data in the array are fundamentals. This is because the
  // wielding code expects an array of TinyVectors in every case.

  typedef wielding::vs_sink_adapter
          < sink_functor_type > adapted_functor_type ;
  adapted_functor_type adapted_functor ( functor ) ;

  // we'll cast the pointers to the array to this type to be
  // compatible with the wrapped functor above.

  typedef typename adapted_functor_type::in_type src_type ;

  typedef vigra::MultiArrayView < dimension , src_type > src_view_type ;

  // now delegate to the wielding code

  wielding::value_reduce < adapted_functor_type , dimension >
    ( adapted_functor ,
      (src_view_type*)(&input) ,
      njobs ,
      p_cancel
    ) ;
}

/// reduce function processing by coordinate. sink_functor_type is
/// a functor receiving integral coordinates, with no fixed meaning,
/// but usually pertaining to an array. The reduce function feeds
/// all coordinates encompassed by the shape, either in vectorized
/// or single form. Here's an example of a functor for this function:

/*

template < typename coordinate_type > struct count_items
: public vspline::sink_functor < coordinate_type >
{
  typedef vspline::sink_functor < coordinate_type > base_type ;
  using typename base_type::in_type ;
  using typename base_type::in_v ;
  using base_type::vsize ;

  // reference to a thread-safe 'pooling target'

  std::atomic < long > & collector ;
  long sum ;

  // this c'tor is used to create the initial count_items object and
  // fixes the reference to the 'pooling target'

  count_items ( std::atomic < long > & _collector )
  : collector ( _collector )
  { sum = 0 ; }

  // count_items needs a copy c'tor to propagate the reference to
  // the 'pooling target'. Note that copy assignment won't work.

  count_items ( const count_items & other )
  : collector ( other.collector )
  { sum = 0 ; }

  // finally, two operator() overloads, one for scalar input and
  // one for SIMD input.

  void operator() ( const in_type & crd )
  { sum ++ ; }

  void operator() ( const in_v & crd )
  { sum += vsize ; }

  ~count_items()
  { collector += sum ; }
} ;

*/

/// implementation of index-based transform using wielding::index_wield
///
/// this overload of transform() is very similar to the first one, but
/// instead of picking input from an array, it feeds the discrete
/// coordinates of the successive target locations to the
/// unary_functor_type object.
///
/// This sounds complicated, but is really quite simple. Let's assume you have
/// a 2X3 output array to fill with data. When this array is passed to transform,
/// the functor will be called with every coordinate pair in turn, and the result
/// the functor produces is written to the output array. So for the example given,
/// with 'ev' being the functor, we have this set of operations:
///
/// output [ ( 0 , 0 ) ] = ev ( ( 0 , 0 ) ) ;
///
/// output [ ( 1 , 0 ) ] = ev ( ( 1 , 0 ) ) ;
///
/// output [ ( 2 , 0 ) ] = ev ( ( 2 , 0 ) ) ;
///
/// output [ ( 0 , 1 ) ] = ev ( ( 0 , 1 ) ) ;
///
/// output [ ( 1 , 1 ) ] = ev ( ( 1 , 1 ) ) ;
///
/// output [ ( 2 , 1 ) ] = ev ( ( 2 , 1 ) ) ;
///
/// this transform overload takes one template argument:
///
/// - 'unary_functor_type', which is a class satisfying the interface laid
///   down in unary_functor.h. This is an object which can provide values
///   given *discrete* coordinates, like class evaluator, but generalized
///   to allow for arbitrary ways of achieving its goal. The unary functor's
///   'in_type' determines the number of dimensions of the coordinates - since
///   they are coordinates into the target array, the functor's input type
///   has to have the same number of dimensions as the target. The functor's
///   'out_type' has to be the same as the data type of the target array, since
///   the target array stores the results of calling the functor.
///
/// this transform overload takes two parameters:
///
/// - a reference to a const unary_functor_type object providing the
///   functionality needed to generate values from discrete coordinates
///
/// - a reference to a MultiArrayView to use as a target. This is where the
///   resulting data are put.
///
/// Please note that vspline follows vigra's coordinate handling convention,
/// which puts the fastest-changing index first. In a 2D image processing
/// context, this is the column index, or the x coordinate. C and C++
/// instead put this index last when using multidimensional array access code.
///
/// transform can be used without template arguments; they will be inferred
/// by ATD from the arguments.
///
/// See the remarks on the two trailing parameters given with the two-array
/// overload of transform.

template < class unary_functor_type >
void transform ( const unary_functor_type & functor ,
                 vigra::MultiArrayView
                 < unary_functor_type::dim_in ,
                   typename unary_functor_type::out_type
                 > & output ,
                 int njobs = vspline::default_njobs ,
                 vspline::atomic < bool > * p_cancel = 0 )
{
  enum { dimension = unary_functor_type::dim_in } ;

  // wrap the vspline::unary_functor to be used with wielding code

  typedef wielding::vs_adapter < unary_functor_type > index_functor_type ;
  index_functor_type index_functor ( functor ) ;

  // we'll cast the pointer to the target array to this type to be
  // compatible with the wrapped functor above

  typedef typename index_functor_type::out_type trg_type ;
  typedef vigra::MultiArrayView < dimension , trg_type > trg_view_type ;

  // now delegate to the wielding code

  wielding::index_wield < index_functor_type , dimension >
    ( index_functor ,
      (trg_view_type*)(&output) ,
      njobs ,
      p_cancel ) ;
}

/// while the function above fills an array by calling a functor with
/// the target coordinates, this function accumulates the results by
/// means of a sink_functor. The effect is as if an index-based transform
/// were executed first, filling an intermediate array, followed by a call
/// to reduce taking the intermediate array as input. The combination of
/// both operations 'cuts out' the intermediate. Since this function does
/// not have a target array, the shape has to be communicated explicitly.
/// The functor will be invoked for every possible coordinate within the
/// shape's scope. This is done by the function 'index_reduce' in wielding.h.
/// The type of the argument 'shape' is derived from the sink_functor's
/// 'dim_in' value, to make the code 'respect the singular': if the data
/// are 1D, reduce will accept a single integer as 'shape', whereas nD
/// data require a TinyVector of int.
/// For suitable sink_functors, see the remarks and example given above
/// in the comments for the array-based overload of 'reduce'.

template < typename sink_functor_type >
void reduce ( const sink_functor_type & functor ,
              typename std::conditional
                       < ( sink_functor_type::dim_in == 1 ) ,
                         int ,
                         vigra::TinyVector < int , sink_functor_type::dim_in >
                       > :: type shape ,
              int njobs = vspline::default_njobs ,
              vspline::atomic < bool > * p_cancel = 0
            )
{
  enum { dimension = sink_functor_type::dim_in } ;

  // adapt the functor to be compatible with the wielding code

  typedef wielding::vs_sink_adapter
          < sink_functor_type > adapted_functor_type ;

  adapted_functor_type adapted_functor ( functor ) ;

  // the wielding code expects all shape information as TinyVectors,
  // even if the data are 1D:

  typename sink_functor_type::in_nd_ele_type nd_shape ( shape ) ;

  wielding::index_reduce < adapted_functor_type , dimension >
    ( adapted_functor ,
      nd_shape ,
      njobs ,
      p_cancel
    ) ;
}

/// we code 'apply' as a special variant of 'transform' where the output
/// is also used as input, so the effect is to feed the unary functor
/// each 'output' value in turn, let it process it and store the result
/// back to the same location. While this looks like a rather roundabout
/// way of performing an apply, it has the advantage of using the same
/// type of functor (namely one with const input and writable output),
/// rather than a different functor type which modifies its argument
/// in-place. While, at this level, using such a functor looks like a
/// feasible idea, it would require specialized code 'further down the
/// line' when complex functors are built with vspline's functional
/// programming tools: the 'apply-capable' functors would need to read
/// the output values first and write them back afterwards anyway, resulting
/// in the same sequence of loads and stores as we get with the current
/// 'fake apply' implementation.
/// From the direct delegation to the two-array overload of 'transform',
/// you can see that this is merely syntactic sugar.

template < typename unary_functor_type , // functor to apply
           unsigned int dimension >      // input/output array's dimension
void apply ( const unary_functor_type & ev ,
             vigra::MultiArrayView
             < dimension ,
               typename unary_functor_type::out_type >
             & output ,
             int njobs = vspline::default_njobs ,
             vspline::atomic < bool > * p_cancel = 0
           )
{
  // make sure the functor's input and output type are the same

  static_assert ( std::is_same < typename unary_functor_type::in_type ,
                                 typename unary_functor_type::out_type > :: value ,
                  "apply: functor's input and output type must be the same" ) ;

  // delegate to transform

  transform ( ev , output , output , njobs , p_cancel ) ;
}

/// a type for a set of boundary condition codes, one per axis

template < unsigned int dimension >
using bcv_type = vigra::TinyVector < vspline::bc_code , dimension > ;

/// Implementation of 'classic' remap, which directly takes an array of
/// values and remaps it, internally creating a b-spline of given order
/// just for the purpose. This is used for one-shot remaps where the spline
/// isn't reused, and is specific to b-splines, since the functor used is a
/// b-spline evaluator. The spline defaults to a cubic b-spline with
/// mirroring on the bounds.
///
/// So here we have the 'classic' remap, where the input array holds
/// coordinates and the functor used is actually an interpolator. Since
/// this is merely a special case of using transform(), we delegate to
/// transform() once we have the evaluator.
///
/// The template arguments are chosen to allow the user to call 'remap'
/// without template arguments; the template arguments can be found by ATD
/// by looking at the MultiArrayViews passed in.
///
/// - original_type is the value_type of the array holding the 'original' data
///   over which the interpolation is to be performed
///
/// - result_type is the value_type of the array taking the result of the remap,
///   namely the values produced by the interpolation. These data must have as
///   many channels as original_type.
///
/// - coordinate_type is the type for coordinates at which the interpolation is
///   to be performed. Coordinates must have as many components as the input
///   array has dimensions.
///
/// optionally, remap takes a set of boundary condition values and a spline
/// degree, to allow creation of splines for specific use cases beyond the
/// default. I refrain from extending the argument list further; user code with
/// more specific requirements will have to create an evaluator and use 'transform'.
///
/// Note that remap can be called without template arguments; the types will
/// be inferred by ATD from the arguments passed in.
///
/// See the remarks on the two trailing parameters given with the two-array
/// overload of transform.

template < typename original_type ,    // data type of original data
           typename result_type ,      // data type for interpolated data
           typename coordinate_type ,  // data type for coordinates
           unsigned int cf_dimension , // dimensionality of original data
           unsigned int trg_dimension , // dimensionality of result array
           int bcv_dimension = cf_dimension > // see below. g++ ATD needs this.
void remap ( const vigra::MultiArrayView
                   < cf_dimension , original_type > & input ,
             const vigra::MultiArrayView
                   < trg_dimension , coordinate_type > & coordinates ,
             vigra::MultiArrayView
                   < trg_dimension , result_type > & output ,
             bcv_type < bcv_dimension > bcv
               = bcv_type < bcv_dimension > ( vspline::MIRROR ) ,
             int degree = 3 ,
             int njobs = vspline::default_njobs ,
             vspline::atomic < bool > * p_cancel = 0 )
{
  static_assert ( vigra::ExpandElementResult < original_type > :: size
                  == vigra::ExpandElementResult < result_type > :: size ,
                  "input and output data type must have same nr. of channels" ) ;

  static_assert ( cf_dimension
                  == vigra::ExpandElementResult < coordinate_type > :: size ,
                  "coordinate type must have the same dimension as input array" ) ;

  // this is silly, but when specifying bcv_type < cf_dimension >, the code failed
  // to compile with g++. So I use a separate template argument bcv_dimension
  // and static_assert it's the same as cf_dimension. TODO this sucks...

  static_assert ( cf_dimension
                  == bcv_dimension ,
                  "boundary condition specification needs same dimension as input array" ) ;

  // check shape compatibility

  if ( output.shape() != coordinates.shape() )
  {
    throw shape_mismatch
      ( "the shapes of the coordinate array and the output array must match" ) ;
  }

  // get a suitable type for the b-spline's coefficients

  typedef typename vigra::PromoteTraits < original_type , result_type >
                   :: Promote _cf_type ;

  typedef typename vigra::NumericTraits < _cf_type >
                   :: RealPromote cf_type ;

  // create the bspline object

  typedef vspline::bspline < cf_type , cf_dimension > spline_type ;
  spline_type bsp ( input.shape() , degree , bcv ) ;

  // prefilter, taking data in 'input' as knot point data

  bsp.prefilter ( input ) ;

  // since this is a commodity function, we use a 'safe' evaluator.
  // If maximum performance is needed and the coordinates are known to be
  // in range, user code should create a 'naked' vspline::evaluator and
  // use it with vspline::transform.
  // Note how we pass in 'rc_type', the elementary type of a coordinate.
  // We want to allow the user to pass float or double coordinates.

  typedef typename vigra::ExpandElementResult < coordinate_type > :: type rc_type ;

  auto ev = vspline::make_safe_evaluator < spline_type , rc_type > ( bsp ) ;

  // call transform(), passing in the evaluator,
  // the coordinate array and the target array

  transform ( ev , coordinates , output , njobs , p_cancel ) ;
}

// next we have code for evaluation of b-splines over grids of coordinates.
// This code lends itself to some optimizations, since part of the weight
// generation used in the evaluation process is redundant; by precalculating
// all redundant values and referring to the precalculated values during
// the evaluation, a good deal of time can be saved - provided that the
// data involved are nD.

// TODO: as in separable convolution, it might be profitable here to apply
// weights for one axis to the entire array, then repeat with the other axes
// in turn. Storing, modifying and rereading the array may still
// come out faster than the rather expensive DDA needed to produce the
// value with weighting in all dimensions applied at once, as the code
// below does (by simply applying the weights in the innermost eval
// class evaluator offers). The potential gain ought to increase with
// the dimensionality of the data.

// for evaluation over grids of coordinates, we use a vigra::TinyVector
// of 1D MultiArrayViews holding the component coordinates for each
// axis.
// When default-constructed, this object holds default-constructed
// MultiArrayViews which, when assigned another MultiArrayView,
// will hold another view over the data, rather than copying them.
// Initially I was using a small array of pointers for the purpose,
// but that is potentially unsafe and does not allow passing strided
// data.

template < unsigned int dimension , typename rc_ele_type = float >
using grid_spec_type =
vigra::TinyVector
  < vigra::MultiArrayView < 1 , rc_ele_type > ,
    dimension
  > ;

// the implementation of grid-based evaluation is in namespace detail;
// I assume that this code is not very useful for users and it would
// only clutter the vspline namespace.

namespace detail
{

/// we need a 'generator functor' to implement grid_eval using the code
/// in wielding.h.
/// this functor precalculates the b-spline evaluation weights corresponding
/// to the coordinates in the grid and stores them in vectorized format,
/// to speed up their use as much as possible.

template < typename ET >
struct grev_generator
{
  typedef ET evaluator_type ;

721 static const size_t dimension = evaluator_type::dimension ;
722 static const size_t vsize = evaluator_type::vsize ;
723 static const size_t channels = evaluator_type::channels ;
724
725 typedef typename evaluator_type::inner_type inner_type ;
726 typedef typename evaluator_type::math_ele_type weight_type ;
727
728 typedef typename evaluator_type::out_type out_type ;
729 typedef typename evaluator_type::out_ele_type out_ele_type ;
730 typedef typename evaluator_type::out_nd_ele_type out_nd_ele_type ;
731 typedef typename evaluator_type::out_v out_v ;
732 typedef typename evaluator_type::out_ele_v out_ele_v ;
733 typedef typename evaluator_type::out_nd_ele_v out_nd_ele_v ;
734
735 typedef typename vspline::vector_traits
736 < weight_type , vsize > :: type weight_v ;
737 typedef typename evaluator_type::rc_ele_type rc_type ;
738 typedef vigra::MultiArrayView
739 < dimension , typename evaluator_type::trg_type > target_type ;
740 typedef vigra::TinyVector < size_t , dimension > shape_type ;
741
742 typedef typename vspline::vector_traits
743 < int , vsize > :: type ofs_ele_v ;
744
745 typedef grid_spec < dimension , rc_type > grid_spec_type ;
746
747 const int spline_order ; // order of the b-spline
748 const shape_type shape ; // shape of result array
749 const grid_spec_type & grid ; // the grid coordinates
750 const evaluator_type & itp ; // b-spline evaluator
751 const size_t axis ; // axis to process, other axes will vary
752 const size_t aggregates ; // number of full weight/offset packets
753 const size_t rest ; // number of leftover single coordinates
754 const size_t w_pack_sz ; // number of weight vectors in a packet
755
756 int socket_offset ; // cumulated offset for axes != axis
757
758 // keeps track of ownership of allocated data: it's set to 'this'
759 // when the buffers are allocated in the c'tor. If default copy
760 // construction or default assignment take place, memory_guard
761 // and this don't match anymore and the destructor does not free
762 // the data.
763
764 grev_generator * memory_guard ;
765
766 // dimension X pointers to sets of weights. Each set of weights
767 // consists of spline_order single weights, and there are as many
768 // sets as there are corresponding coordinates in the grid spec.
769
770 weight_type * grid_weight [ dimension ] ;
771
772 // for the dimension 'axis', we keep a reshuffled copy of the
773 // weights, so that we have groups of spline_order weight vectors
774 // followed by single weight sets which don't fit a vectorized
775 // weight pack (the latter might be dropped)
776
777 weight_type * shuffled_weights ;
778
779 // per dimension, one offset per coordinate:
780
781 int * grid_ofs [ dimension ] ;
782
783 // We'll count the vectorized packets up, this state variable
784 // holds the next packet's number
785
786 size_t current_pack ;
787
788 // tail_crd will hold the nD coordinate of the first coordinate
789 // along the grid coordinates for axis which did not fit into
790 // a full packet, so that single-value processing can start there
791 // after peeling
792
793 shape_type tail_crd ;
794
795 // set of weights for a single coordinate
796
797 vigra::MultiArray < 2 , weight_type > weight ;
798
799 // have storage ready for one set of vectorized weights
800
801 using allocator_t
802 = typename vspline::allocator_traits < weight_v > :: type ;
803
804 vigra::MultiArray < 2 , weight_v , allocator_t > vweight ;
805
806 // reset is called when starting another line. The packet number
807 // is reset to zero, tail_crd is set to the first unvectorized
808 // position (starting point after peeling is complete). Then those
809 // components of the coordinate 'crd' which are *not* 'axis' are
810 // broadcast to the corresponding vectors in 'vweight', which
811 // holds the current packet; they will remain constant for the
812 // whole line, whereas the row in vweight corresponding to 'axis'
813 // will receive fresh values for each (vectorized) iteration.
814 // The extra argument _aggregates will usually be the same as
815 // 'aggregates', but a smaller number may be specified here to
816 // process subsets (used for 1D operation, TODO needs testing)
817
818 void reset ( shape_type crd , size_t _aggregates )
819 {
820 // we require starting coordinates to be a multiple of vsize for
821 // the processing axis, so that we can fetch vectorized weights,
822 // which have been stored in batches of vsize, starting at 0.
823 // If we'd allow 'odd' starting coordinates, we'd have to first
824 // process a few single coordinates before switching to vectors,
825 // which would complicate the logic.
826
827 assert ( crd [ axis ] % vsize == 0 ) ;
828
829 current_pack = crd [ axis ] / vsize ;
830 tail_crd [ axis ] = crd [ axis ] + _aggregates * vsize ;
831 socket_offset = 0 ;
832
833 for ( size_t d = 0 ; d < dimension ; d++ )
834 {
835 if ( d != axis )
836 {
837 tail_crd [ d ] = crd [ d ] ;
838 socket_offset += grid_ofs [ d ] [ crd [ d ] ] ;
839 for ( size_t k = 0 ; k < spline_order ; k++ )
840 {
841 vweight [ vigra::Shape2 ( k , d ) ]
842 = weight [ vigra::Shape2 ( k , d ) ]
843 = grid_weight [ d ] [ crd [ d ] * spline_order + k ] ;
844 }
845 }
846 }
847 }
848
849 // fetch_weights copies a set of vectorized weights from
850 // 'shuffled_weights' (where it has been put in vectorized form
851 // to be picked up by fast SIMD loads, rather than having to
852 // deinterleave the data from grid_weight [ axis ])
853
853
854 void fetch_weights()
855 {
856 auto p_wgt = shuffled_weights + current_pack * w_pack_sz ;
857 auto w_line = vweight.bindAt ( 1 , axis ) ;
858 for ( auto & wvr : w_line )
859 {
860 wvr.load ( p_wgt ) ;
861 p_wgt += vsize ;
862 }
863 }
864
865 // the c'tor sets up all invariant data of the generator. The most
866 // important of these are the precalculated weights, which allow for
867 // fast evaluation, avoiding the weight generation which is otherwise
868 // needed for every evaluation locus.
869 // Note that grev_generator objects may be default-copied, retaining their
870 // validity.
871
872 grev_generator ( const grid_spec_type & _grid ,
873 const evaluator_type & _itp ,
874 size_t _axis ,
875 shape_type _shape
876 )
877 : grid ( _grid ) ,
878 itp ( _itp ) ,
879 axis ( _axis ) ,
880 shape ( _shape ) ,
881 aggregates ( _shape [ _axis ] / vsize ) ,
882 rest ( _shape [ _axis ] % vsize ) ,
883 w_pack_sz ( _itp.inner.get_order() * vsize ) ,
884 spline_order ( _itp.inner.get_order() ) ,
885 vweight ( vigra::Shape2 ( _itp.inner.get_order() , dimension ) ) ,
886 weight ( vigra::Shape2 ( _itp.inner.get_order() , dimension ) ) ,
887 memory_guard ( this )
888 {
889 // get some metrics from the b-spline evaluator needed to translate
890 // coordinates to offsets in memory
891
892 vigra::TinyVector < int , dimension >
893 estride ( itp.inner.get_estride() ) ;
894
895 // allocate space for the per-axis weights and offsets
896
897 for ( int d = 0 ; d < dimension ; d++ )
898 {
899 grid_weight[d] = new weight_type [ spline_order * shape [ d ] ] ;
900 grid_ofs[d] = new int [ shape [ d ] ] ;
901 }
902
903 // fill in the weights and offsets, using the interpolator's
904 // split() to split the coordinates received in grid_coordinate,
905 // the interpolator's obtain_weights() method to produce the
906 // weight components, and the strides of the coefficient array
907 // to convert the integral parts of the coordinates into offsets.
908
909 for ( int d = 0 ; d < dimension ; d++ )
910 {
911 int select ; // integral part of coordinate
912 rc_type tune ; // delta, so that select + tune == crd [ d ]
913
914 // split all coordinates received in the grid spec into
915 // integral and remainder part, then use the remainder part
916 // to obtain a set of weights. The integral part of the
917 // coordinates is transformed to an offset in memory.
918
919 for ( int c = 0 ; c < shape [ d ] ; c++ )
920 {
921 itp.inner.split
922 ( grid [ d ] [ c ] , select , tune ) ;
923
924 itp.inner.obtain_weights
925 ( grid_weight [ d ] + spline_order * c , d , tune ) ;
926
927 grid_ofs [ d ] [ c ] = select * estride [ d ] ;
928 }
929 }
930
931 // reshuffle the weights along axis 'axis' to vectorized order
932 // so that the evaluation can pick them up with SIMD loads.
933 // This might be coded more efficiently, but since this
934 // calculation is done only once for a whole grid, it's no big deal.
935
936 shuffled_weights = new weight_type [ spline_order * shape [ axis ] ] ;
937
938 if ( shape [ axis ] >= vsize )
939 {
940 for ( size_t a = 0 ; a < aggregates ; a++ )
941 {
942 auto source = grid_weight[axis] + a * w_pack_sz ;
943 auto target = shuffled_weights + a * w_pack_sz ;
944
945 for ( size_t e = 0 ; e < vsize ; e++ )
946 {
947 for ( size_t k = 0 ; k < spline_order ; k++ )
948 {
949 target [ vsize * k + e ] = source [ e * spline_order + k ] ;
950 }
951 }
952 }
953 }
954
955 // fill up with weights which didn't fit a pack. might be omitted.
956
957 for ( size_t i = aggregates * w_pack_sz ;
958 i < spline_order * shape [ axis ] ;
959 i++ )
960 shuffled_weights [ i ] = grid_weight [ axis ] [ i ] ;
961 }
962
963 // evaluate, using the current packet of weights/offsets. This
964 // blindly trusts that the calling code keeps track of the number of
965 // packets which can be processed during the peeling phase.
966
967 template < typename = std::enable_if < ( vsize > 1 ) > >
968 void eval ( out_v & result )
969 {
970 // pick up the varying weights into the current package
971
972 fetch_weights() ;
973
974 // load current set offsets for 'axis'
975
976 ofs_ele_v select ;
977
978 select.load ( grid_ofs [ axis ] + current_pack * vsize ) ;
979
980 // add offsets for the remaining axes; these are broadcast,
981 // there is only a single offset instead of a whole vector of
982 // offsets needed for 'axis'.
983
984 select += socket_offset ;
985
986 // a bit of type artistry to provide a reference to an nD
987 // result type, even if the data are single-channel:
988 // trg_t is an nD type even for single-channel data, so
989 // vector_traits produces an nD vectorized type.
990 // The inner evaluator which is used for evaluation uses
991 // 'synthetic' data types for all data, and we want to
992 // use it for evaluation, so we have to present 'result'
993 // in proper guise.
994
995 typedef typename inner_type::trg_type trg_t ;
996 typedef typename vspline::vector_traits
997 < trg_t , vsize > :: type trg_v ;
998
999 trg_v & nd_result = reinterpret_cast < trg_v & > ( result ) ;
1000
1001 // finally we perform the evaluation
1002
1003 itp.inner.eval ( select , vweight , nd_result ) ;
1004
1005 // and advance current_pack for the next cycle
1006
1007 current_pack++ ;
1008 }
1009
1010 // here we have the single-value evaluation which is used after
1011 // all full vectorized packs have been processed
1012
1013 void eval ( out_type & result )
1014 {
1015 auto c = tail_crd [ axis ] ;
1016
1017 int select = socket_offset + grid_ofs [ axis ] [ c ] ;
1018
1019 for ( int k = 0 ; k < spline_order ; k++ )
1020 {
1021 weight [ vigra::Shape2 ( k , axis ) ] =
1022 grid_weight [ axis ] [ c * spline_order + k ] ;
1023 }
1024
1025 // move to the 'synthetic' type
1026
1027 typedef typename inner_type::trg_type trg_t ;
1028 trg_t & nd_result = reinterpret_cast < trg_t & > ( result ) ;
1029 itp.inner.eval ( select , weight , nd_result ) ;
1030
1031 // count up coordinate along 'axis'
1032
1033 ++ tail_crd [ axis ] ;
1034 }
1035
1036 // evaluation into a buffer. Here the data for one line are produced
1037 // all at once.
1038
1039 void eval ( vigra::MultiArrayView < 1 , out_ele_v > & vresult ,
1040 vigra::MultiArrayView < 1 , out_type > & leftover )
1041 {
1042 assert ( vresult.shape(0) == aggregates * channels ) ;
1043 assert ( leftover.shape(0) == rest ) ;
1044
1045 // TODO: would be nice to simply have a MultiArrayView of
1046 // aggregates * out_v, but that crashes
1047 // hence the detour via the nD type and storing (and reloading)
1048 // individual vectors
1049
1050 out_v vr ;
1051 out_nd_ele_v & ndvr = reinterpret_cast < out_nd_ele_v & > ( vr ) ;
1052
1053 for ( size_t a = 0 ; a < aggregates ; a++ )
1054 {
1055 eval ( vr ) ;
1056 for ( size_t ch = 0 ; ch < channels ; ch++ )
1057 vresult [ a * channels + ch ] = ndvr [ ch ] ;
1058 }
1059 for ( size_t a = 0 ; a < rest ; a++ )
1060 {
1061 eval ( leftover[a] ) ;
1062 }
1063 }
1064
1065 void eval ( vigra::MultiArrayView < 1 , out_type > & result )
1066 {
1067 assert ( result.shape(0) == shape [ axis ] ) ;
1068
1069 for ( size_t a = 0 ; a < shape [ axis ] ; a++ )
1070 {
1071 eval ( result[a] ) ;
1072 }
1073 }
1074
1075 // TODO: I also want an evaluator which only applies the weights along
1076 // 'axis', and no weights pertaining to the other axes - to obtain data
1077 // which are 'partially evaluated' along 'axis'. If this process is
1078 // repeated for each axis, the result should be the same. This should
1079 // be more efficient if evaluation loci in the grid are 'close' so that
1080 // multiple full evaluations share coefficients, and the advantage
1081 // should increase with dimensionality.
1082
1084 {
1085 // clean up. memory_guard is only set in the c'tor, so copied objects
1086 // will not have matching memory_guard and this. Only the object which
1087 // has initially allocated the memory will free it.
1088
1089 if ( memory_guard == this )
1090 {
1091 for ( int d = 0 ; d < dimension ; d++ )
1092 {
1093 delete[] grid_weight[d] ;
1094 delete[] grid_ofs[d] ;
1095 }
1096 delete[] shuffled_weights ;
1097 }
1098 }
1099} ;
1100
1101/// given the generator functor 'grev_generator' (above), performing
1102/// the grid_eval becomes trivial: construct the generator and pass it
1103/// to the wielding code. The signature looks complex, but the types
1104/// needed can all be inferred by ATD - callers just need to pass a
1105/// grid_spec object, a b-spline evaluator and a target array to accept
1106/// the result - plus, optionally, a pointer to a cancellation atomic,
1107/// which, if given, may be set by the calling code to gracefully abort
1108/// the calculation at the next convenient occasion.
1109/// Note that grid_eval is specific to b-spline evaluation and will not
1110/// work with any evaluator type which is not a 'plain', 'raw' b-spline
1111/// evaluator. So, even functors gained by calling vspline's factory
1112/// functions (make_evaluator, make_safe_evaluator) will *not* work here,
1113/// since they aren't of type vspline::evaluator (they are in fact grok_type).
1114/// To use arbitrary functors on a grid, use gen_grid_eval, below.
1115
1116template < typename functor_type >
1117void transform ( std::true_type , // functor is a b-spline evaluator
1118 grid_spec < functor_type::dim_in ,
1119 typename functor_type::in_ele_type > & grid ,
1120 const functor_type & functor ,
1121 vigra::MultiArrayView < functor_type::dim_in ,
1122 typename functor_type::out_type >
1123 & result ,
1124 int njobs = vspline::default_njobs ,
1125 vspline::atomic < bool > * p_cancel = 0
1126 )
1127{
1128 // create the generator functor
1129
1130 grev_generator < functor_type > gg ( grid ,
1131 functor ,
1132 0 ,
1133 result.shape() ) ;
1134
1135 // and 'wield' it
1136
1137 wielding::generate_wield ( gg , result , njobs , p_cancel ) ;
1138}
1139
1140/// grid_eval_functor is used for 'generalized' grid evaluation, where
1141/// the functor passed in is not a bspline evaluator. For this general
1142/// case, we can't precalculate anything to speed things up, all we
1143/// need to do is pick the grid values and put them together to form
1144/// arguments for calls to the functor.
1145
1146template < typename _inner_type ,
1147 typename ic_type = int >
1148struct grid_eval_functor
1149: public vspline::unary_functor
1150 < typename vspline::canonical_type
1151 < ic_type , _inner_type::dim_in > ,
1152 typename _inner_type::out_type ,
1153 _inner_type::vsize >
1154{
1155 typedef _inner_type inner_type ;
1156
1157 enum { vsize = inner_type::vsize } ;
1158 enum { dimension = inner_type::dim_in } ;
1159
1160 typedef typename vspline::canonical_type
1161 < ic_type , dimension > in_type ;
1162
1163 typedef typename inner_type::out_type out_type ;
1164
1165 typedef typename inner_type::in_ele_type rc_type ;
1166
1167 typedef typename vspline::unary_functor
1168 < in_type , out_type , vsize > base_type ;
1169
1170 static_assert ( std::is_integral < ic_type > :: value ,
1171 "grid_eval_functor: must use integral coordinates" ) ;
1172
1173 const inner_type inner ;
1174
1175 typedef grid_spec < inner_type::dim_in ,
1176 typename inner_type::in_ele_type > grid_spec_t ;
1177
1178 const grid_spec_t grid ;
1179
1180 grid_eval_functor ( const grid_spec_t & _grid ,
1181 const inner_type & _inner )
1182 : grid ( _grid ) ,
1183 inner ( _inner )
1184 { } ;
1185
1186 void eval ( const in_type & c , out_type & result ) const
1187 {
1188 typename inner_type::in_type cc ;
1189
1190 // for uniform access, we use reinterpretations of the coordinates
1191 // as nD types, even if they are only 1D. This is only used to
1192 // fill in 'cc', the coordinate to be fed to 'inner'.
1193
1194 typedef typename base_type::in_nd_ele_type nd_ic_type ;
1195 typedef typename inner_type::in_nd_ele_type nd_rc_type ;
1196
1197 const nd_ic_type & nd_c ( reinterpret_cast < const nd_ic_type & > ( c ) ) ;
1198 nd_rc_type & nd_cc ( reinterpret_cast < nd_rc_type & > ( cc ) ) ;
1199
1200 for ( int d = 0 ; d < dimension ; d++ )
1201 nd_cc [ d ] = grid [ d ] [ nd_c[d] ] ;
1202
1203 inner.eval ( cc , result ) ;
1204 }
1205
1206 template < typename = std::enable_if < ( vsize > 1 ) > >
1207 void eval ( const typename base_type::in_v & c ,
1208 typename base_type::out_v & result ) const
1209 {
1210 typename inner_type::in_v cc ;
1211
1212 typedef typename base_type::in_nd_ele_v nd_ic_v ;
1213 typedef typename inner_type::in_nd_ele_v nd_rc_v ;
1214
1215 const nd_ic_v & nd_c ( reinterpret_cast < const nd_ic_v & > ( c ) ) ;
1216 nd_rc_v & nd_cc ( reinterpret_cast < nd_rc_v & > ( cc ) ) ;
1217
1218 // TODO: we might optimize in two ways:
1219 // if the grid data are contiguous, we can issue a gather,
1220 // and if the coordinates above dimension 0 are equal for all e,
1221 // we can assign a scalar to nd_cc[d] for d > 0.
1222
1223 for ( int d = 0 ; d < dimension ; d++ )
1224 for ( int e = 0 ; e < vsize ; e++ )
1225 nd_cc[d][e] = grid[d][ nd_c[d][e] ] ;
1226
1227 inner.eval ( cc , result ) ;
1228 }
1229} ;
1230
1231/// generalized grid evaluation. The production of result values from
1232/// input values is done by an instance of grid_eval_functor, see above.
1233/// The template argument, ev_type, has to be a functor (usually this
1234/// will be a vspline::unary_functor). If the functor's in_type has
1235/// dim_in components, grid_spec must also contain dim_in 1D arrays,
1236/// since ev's input is put together by picking a value from each
1237/// of the arrays grid_spec points to. The result obviously has to have
1238/// as many dimensions.
1239
1240template < typename functor_type >
1241void transform ( std::false_type , // functor is not a b-spline evaluator
1242 grid_spec < functor_type::dim_in ,
1243 typename functor_type::in_ele_type > & grid ,
1244 const functor_type & functor ,
1245 vigra::MultiArrayView < functor_type::dim_in ,
1246 typename functor_type::out_type >
1247 & result ,
1248 int njobs = vspline::default_njobs ,
1249 vspline::atomic < bool > * p_cancel = 0
1250 )
1251{
1252 grid_eval_functor < functor_type > gev ( grid , functor ) ;
1253
1254 vspline::transform ( gev , result , njobs , p_cancel ) ;
1255}
1256
1257} ; // namespace detail
1258
1259/// This overload of 'transform' takes a vspline::grid_spec as its
1260/// first parameter. This object defines a grid where each grid point
1261/// is made up from components taken from the per-axis grid values.
1262/// Let's say the grid_spec holds
1263///
1264/// { 1 , 2 } and { 3 , 4 },
1265///
1266/// then we get grid points
1267///
1268/// ( 1 , 3 ) , ( 2 , 3 ) , ( 1 , 4 ) , ( 2 , 4 )
1269///
1270/// The grid points are taken as input for the functor, and the results
1271/// are stored in 'result'. So with the simple example above, result
1272/// would have to be a 2D array and after the transform it would contain
1273///
1274/// { { f ( ( 1 , 3 ) ) , f ( ( 2 , 3 ) ) } ,
1275///
1276/// { f ( ( 1 , 4 ) ) , f ( ( 2 , 4 ) ) } }
1277///
1278/// The grid specification obviously takes up less space than the whole
1279/// grid would occupy, and if the functor is a b-spline evaluator,
1280/// additional performance gain is possible by precalculating the
1281/// evaluation weights. The distinction is made here by dispatching to
1282/// two specific overloads in namespace detail, which provide the
1283/// implementation. Note that only functors of type vspline::evaluator
1284/// will qualify as b-spline evaluators. The functors produced by the
1285/// factory functions 'make_evaluator' and 'make_safe_evaluator' will
1286/// *not* qualify; they are of vspline::grok_type and even though they
1287/// have a b-spline evaluator 'somewhere inside' this can't be accessed
1288/// due to the type erasure.
1289///
1290/// See the remarks on the two trailing parameters given with the two-array
1291/// overload of transform.
1292
1293template < typename functor_type >
1294void transform ( grid_spec < functor_type::dim_in ,
1295 typename functor_type::in_ele_type > & grid ,
1296 const functor_type & functor ,
1297 vigra::MultiArrayView < functor_type::dim_in ,
1298 typename functor_type::out_type >
1299 & result ,
1300 int njobs ,
1301 vspline::atomic < bool > * p_cancel ,
1302 std::false_type ) // is not 1D
1303{
1304 // make sure the grid specification has enough coordinates
1305
1306 for ( int d = 0 ; d < functor_type::dim_in ; d++ )
1307 assert ( grid[d].size() >= result.shape ( d ) ) ;
1308
1309 using uses_spline_t = typename
1310 std::is_base_of < bspline_evaluator_tag , functor_type > :: type ;
1311
1312 detail::transform ( uses_spline_t() , grid , functor , result ,
1313 njobs , p_cancel ) ;
1314}
1315
1316/// overload of grid-based transform for 1D grids. Obviously, this is
1317/// a case for using 'transform' directly taking the single set of
1318/// per-axis grid values as input, and this overload is here only for
1319/// completeness' sake - 'normal' user code will use an array-based
1320/// transform straight away.
1321
1322template < typename functor_type >
1323void transform ( grid_spec < 1 ,
1324 typename functor_type::in_ele_type > & grid ,
1325 const functor_type & functor ,
1326 vigra::MultiArrayView < 1 ,
1327 typename functor_type::out_type >
1328 & result ,
1329 int njobs ,
1330 vspline::atomic < bool > * p_cancel ,
1331 std::true_type ) // is 1D
1332{
1333 assert ( grid[0].size() >= result.shape ( 0 ) ) ;
1334 vspline::transform ( functor , grid[0] , result , njobs , p_cancel ) ;
1335}
1336
1337// this is the dispatch routine, differentiating between the 1D and the
1338// nD case. The two variants are above.
1339
1340template < typename functor_type >
1341void transform ( grid_spec < functor_type::dim_in ,
1342 typename functor_type::in_ele_type > & grid ,
1343 const functor_type & functor ,
1344 vigra::MultiArrayView < functor_type::dim_in ,
1345 typename functor_type::out_type >
1346 & result ,
1347 int njobs = vspline::default_njobs ,
1348 vspline::atomic < bool > * p_cancel = 0 )
1349{
1350 static const bool is_1d ( functor_type::dim_in == 1 ) ;
1351
1352 transform ( grid , functor , result , njobs , p_cancel ,
1353 std::integral_constant < bool , is_1d > () ) ;
1354}
1355
1356/// for backward compatibility, we provide 'grid_eval'. This is deprecated
1357/// and will eventually go away, new code should use 'transform' above.
1358
1359template < typename functor_type >
1360void grid_eval ( grid_spec < functor_type::dim_in ,
1361 typename functor_type::in_ele_type > & grid ,
1362 const functor_type & functor ,
1363 vigra::MultiArrayView < functor_type::dim_in ,
1364 typename functor_type::out_type >
1365 & result ,
1366 int njobs = vspline::default_njobs ,
1367 vspline::atomic < bool > * p_cancel = 0 )
1368{
1369 transform ( grid , functor , result , njobs , p_cancel ) ;
1370}
1371
1372/// for backward compatibility, we provide 'gen_grid_eval'. This is deprecated
1373/// and will eventually go away, new code should use 'transform' above.
1374
1375template < typename functor_type >
1376void gen_grid_eval ( grid_spec < functor_type::dim_in ,
1377 typename functor_type::in_ele_type > & grid ,
1378 const functor_type & functor ,
1379 vigra::MultiArrayView < functor_type::dim_in ,
1380 typename functor_type::out_type >
1381 & result ,
1382 int njobs = vspline::default_njobs ,
1383 vspline::atomic < bool > * p_cancel = 0 )
1384{
1385 transform ( grid , functor , result , njobs , p_cancel ) ;
1386}
1387
1388/// restore restores the original data from the b-spline coefficients.
1389/// This is done efficiently using a separable convolution; the kernel
1390/// is simply a unit-spaced sampling of the basis function.
1391/// Since the filter uses internal buffering, using this routine
1392/// in-place is safe - meaning that 'target' may be bspl.core itself.
1393/// math_type, the data type for performing the actual maths on the
1394/// buffered data, and the type the data are converted to when they
1395/// are placed into the buffer, can be chosen, but I could not detect
1396/// any real benefits from using anything but the default, which is to
1397/// leave the data in their 'native' type.
1398///
1399/// an alternative way to restore is running an index-based
1400/// transform with an evaluator for the spline. This is much
1401/// less efficient, but the effect is the same:
1402///
1403/// auto ev = vspline::make_evaluator ( bspl ) ;
1404/// vspline::transform ( ev , target ) ;
1405///
1406/// Note that vsize, the vectorization width, can be passed explicitly.
1407/// If Vc is in use and math_ele_type can be used with hardware
1408/// vectorization, the arithmetic will be done with Vc::SimdArrays
1409/// of the given size. Otherwise 'goading' will be used: the data are
1410/// presented in simd_type of vsize math_ele_type, hoping that the
1411/// compiler may autovectorize the operation.
1412///
1413/// 'math_ele_type', the type used for arithmetic inside the filter,
1414/// defaults to the vigra RealPromote type of value_type's elementary.
1415/// This ensures appropriate treatment of integral-valued splines.
1416
1417// TODO hardcoded default vsize
1418
1419template < unsigned int dimension ,
1420 typename value_type ,
1421 typename math_ele_type
1422 = typename vigra::NumericTraits
1423 < typename vigra::ExpandElementResult
1424 < value_type > :: type
1425 > :: RealPromote ,
1426 size_t vsize
1427 = vspline::vector_traits < math_ele_type > :: size
1428 >
1429void restore ( const vspline::bspline < value_type , dimension > & bspl ,
1430 vigra::MultiArrayView < dimension , value_type > & target )
1431{
1432 if ( target.shape() != bspl.core.shape() )
1433 throw shape_mismatch
1434 ( "restore: spline's core shape and target array shape must match" ) ;
1435
1436 if ( bspl.spline_degree < 2 )
1437 {
1438 // we can handle the degree 0 and 1 cases very efficiently,
1439 // since we needn't apply a filter at all. This is an
1440 // optimization, the filter code would still perform
1441 // correctly without it.
1442
1443 if ( (void*) ( bspl.core.data() ) != (void*) ( target.data() ) )
1444 {
1445 // operation is not in-place, copy data to target
1446 target = bspl.core ;
1447 }
1448 return ;
1449 }
1450
1451 // first assemble the arguments for the filter
1452
1453 int degree = bspl.spline_degree ;
1454 int headroom = degree / 2 ;
1455 int ksize = headroom * 2 + 1 ;
1456
1457 // KFJ 2019-09-18 now using new convenience function 'get_kernel'
1458
1459 vigra::MultiArray < 1 , xlf_type > kernel ( ksize ) ;
1460 get_kernel ( degree , kernel ) ;
1461
1462// xlf_type kernel [ ksize ] ;
1463//
1464// // pick the precomputed basis function values for the kernel.
1465// // Note how the values in precomputed_basis_function_values
1466// // (see poles.h) are provided at half-unit steps, hence the
1467// // index acrobatics.
1468//
1469// for ( int k = - headroom ; k <= headroom ; k++ )
1470// {
1471// int pick = 2 * std::abs ( k ) ;
1472// kernel [ k + headroom ]
1473// = vspline_constants
1474// ::precomputed_basis_function_values [ degree ]
1475// [ pick ] ;
1476// }
1477
1478 // the arguments have to be passed one per axis. While most
1479 // arguments are the same throughout, the boundary conditions
1480 // may be different for each axis.
1481
1482 std::vector < vspline::fir_filter_specs > vspecs ;
1483
1484 for ( int axis = 0 ; axis < dimension ; axis++ )
1485 {
1486 vspecs.push_back
1487 ( vspline::fir_filter_specs
1488 ( bspl.bcv [ axis ] , ksize , headroom , kernel.data() ) ) ;
1489 }
1490
1491 // KFJ 2018-05-08 with the automatic use of vectorization the
1492 // distinction whether math_ele_type is 'vectorizable' or not
1493 // is no longer needed: simdized_type will be a Vc::SimdArray
1494 // if possible, a vspline::simd_type otherwise.
1495
1496 typedef typename vspline::convolve
1497 < vspline::simdized_type ,
1498 math_ele_type ,
1499 vsize
1500 > filter_type ;
1501
1502 // now we have the filter's type, create an instance and
1503 // use it to effect the restoration of the original data.
1504
1505 vspline::filter
1506 < value_type , value_type , dimension , filter_type >
1507 ( bspl.core , target , vspecs ) ;
1508}
1509
1510/// overload of 'restore' writing the result of the operation back to
1511/// the array which is passed in. This looks like an in-place operation,
1512/// but the data are in fact moved to a buffer stripe by stripe, then
1513/// the arithmetic is done on the buffer and finally the buffer is
1514/// written back. This is repeated for each dimension of the array.
1515
1516template < int dimension ,
1517 typename value_type ,
1518 typename math_ele_type
1519 = typename vigra::NumericTraits
1520 < typename vigra::ExpandElementResult
1521 < value_type > :: type
1522 > :: RealPromote ,
1523 size_t vsize
1524 = vspline::vector_traits < math_ele_type > :: size >
1525void restore ( vspline::bspline < value_type , dimension > & bspl )
1526{
1527 restore < dimension , value_type , math_ele_type , vsize >
1528 ( bspl , bspl.core ) ;
1529}
1530
1531} ; // end of namespace vspline
1532
1533#endif // VSPLINE_TRANSFORM_H
vspline creates vigra::MultiArrays of vectorized types. As long as the vectorized types are Vc::SimdA...
Definition: common.h:267
class bspline now builds on class bspline_base, adding coefficient storage, while bspline_base provid...
Definition: bspline.h:499
view_type core
Definition: bspline.h:566
void prefilter(vspline::xlf_type boost=vspline::xlf_type(1), int njobs=vspline::default_njobs)
prefilter converts the knot point data in the 'core' area into b-spline coefficients....
Definition: bspline.h:815
const bcv_type bcv
Definition: bspline.h:212
class convolve provides the combination of class fir_filter above with a vector-friendly buffer....
Definition: convolve.h:387
we need a 'generator functor' to implement grid_eval using the code in wielding.h....
Definition: transform.h:718
vigra::MultiArrayView< dimension, typename evaluator_type::trg_type > target_type
Definition: transform.h:739
vspline::vector_traits< weight_type, vsize >::type weight_v
Definition: transform.h:736
vigra::MultiArray< 2, weight_type > weight
Definition: transform.h:797
static const size_t vsize
Definition: transform.h:722
vspline::vector_traits< int, vsize >::type ofs_ele_v
Definition: transform.h:743
evaluator_type::math_ele_type weight_type
Definition: transform.h:726
grev_generator(const vspline::grid_spec< dimension, rc_type > &_grid, const evaluator_type &_itp, size_t _axis, shape_type _shape)
Definition: transform.h:872
evaluator_type::rc_ele_type rc_type
Definition: transform.h:737
void reset(shape_type crd, size_t _aggregates)
Definition: transform.h:818
typename vspline::allocator_traits< weight_v > ::type allocator_t
Definition: transform.h:802
static const size_t dimension
Definition: transform.h:721
vigra::TinyVector< size_t, dimension > shape_type
Definition: transform.h:740
evaluator_type::out_ele_v out_ele_v
Definition: transform.h:732
evaluator_type::inner_type inner_type
Definition: transform.h:725
vigra::MultiArray< 2, weight_v, allocator_t > vweight
Definition: transform.h:804
void eval(out_v &result)
Definition: transform.h:968
vspline::grid_spec< dimension, rc_type > grid_spec_type
Definition: transform.h:745
evaluator_type::out_ele_type out_ele_type
Definition: transform.h:729
evaluator_type::out_v out_v
Definition: transform.h:731
void eval(out_type &result)
Definition: transform.h:1013
void eval(vigra::MultiArrayView< 1, out_type > &result)
Definition: transform.h:1065
evaluator_type::out_type out_type
Definition: transform.h:728
evaluator_type::out_nd_ele_type out_nd_ele_type
Definition: transform.h:730
void eval(vigra::MultiArrayView< 1, out_ele_v > &vresult, vigra::MultiArrayView< 1, out_type > &leftover)
Definition: transform.h:1039
evaluator_type::out_nd_ele_v out_nd_ele_v
Definition: transform.h:733
const grid_spec_type & grid
Definition: transform.h:749
static const size_t channels
Definition: transform.h:723
const evaluator_type & itp
Definition: transform.h:750
weight_type * grid_weight[dimension]
Definition: transform.h:770
grid_eval_functor is used for 'generalized' grid evaluation, where the functor passed in is not a bsp...
Definition: transform.h:1154
grid_eval_functor(grid_spec_t &_grid, const inner_type &_inner)
Definition: transform.h:1180
void eval(const in_type &c, out_type &result) const
Definition: transform.h:1186
inner_type::in_ele_type rc_type
Definition: transform.h:1165
vspline::canonical_type< ic_type, dimension > in_type
Definition: transform.h:1161
vspline::unary_functor< in_type, out_type, vsize > base_type
Definition: transform.h:1168
grid_spec< inner_type::dim_in, typename inner_type::in_ele_type > grid_spec_t
Definition: transform.h:1176
void eval(const typename base_type::in_v &c, typename base_type::out_v &result) const
Definition: transform.h:1207
inner_type::out_type out_type
Definition: transform.h:1163
fir_filter_specs holds the parameters for a filter performing a convolution along a single axis....
Definition: convolve.h:93
shape mismatch is the exception which is thrown if the shapes of an input array and an output array d...
Definition: common.h:297
class unary_functor provides a functor object which offers a system of types for concrete unary funct...
vigra::TinyVector< in_ele_type, dim_in > in_nd_ele_type
vector_traits< IN, vsize >::nd_ele_v in_nd_ele_v
vector_traits< IN, vsize >::type in_v
vectorized in_type and out_type. vspline::vector_traits supplies these types so that multidimensional...
vector_traits< OUT, vsize >::type out_v
with the definition of 'simd_traits', we can proceed to implement 'vector_traits': struct vector_trai...
Definition: vector.h:344
vs_adapter wraps a vspline::unary_functor to produce a functor which is compatible with the wielding ...
Definition: wielding.h:1960
same procedure for a vspline::sink_type
Definition: wielding.h:2002
Implementation of vspline::transform.