vspline 1.1.0
Generic C++11 Code for Uniform B-Splines
prefilter.h
1/************************************************************************/
2/* */
3/* vspline - a set of generic tools for creation and evaluation */
4/* of uniform b-splines */
5/* */
6/* Copyright 2015 - 2023 by Kay F. Jahnke */
7/* */
8/* The git repository for this software is at */
9/* */
10/* https://bitbucket.org/kfj/vspline */
11/* */
12/* Please direct questions, bug reports, and contributions to */
13/* */
14/* kfjahnke+vspline@gmail.com */
15/* */
16/* Permission is hereby granted, free of charge, to any person */
17/* obtaining a copy of this software and associated documentation */
18/* files (the "Software"), to deal in the Software without */
19/* restriction, including without limitation the rights to use, */
20/* copy, modify, merge, publish, distribute, sublicense, and/or */
21/* sell copies of the Software, and to permit persons to whom the */
22/* Software is furnished to do so, subject to the following */
23/* conditions: */
24/* */
25/* The above copyright notice and this permission notice shall be */
26/* included in all copies or substantial portions of the */
27/* Software. */
28/* */
29/* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */
30/* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */
31/* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */
32/* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */
33/* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */
34/* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */
35/* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */
36/* OTHER DEALINGS IN THE SOFTWARE. */
37/* */
38/************************************************************************/
39
40/*! \file prefilter.h
41
42 \brief Code to create the coefficient array for a b-spline.
43
44 Note: the bulk of the code was factored out to filter.h, while this text still
45 outlines the complete filtering process.
46
47 B-spline coefficients can be generated in two ways (that I know of): the first
48 is by solving a set of equations which encode the constraints of the spline.
49 A good example of how this is done can be found in libeinspline. I term it
50 the 'linear algebra approach'. In this implementation, I have chosen what I
51 call the 'DSP approach'. In a nutshell, the DSP approach looks at the b-spline's
52 reconstruction as a convolution of the coefficients with a specific kernel. This
53 kernel acts as a low-pass filter. To counteract the effect of this filter and
54 obtain the input signal from the convolution of the coefficients, a high-pass
55 filter with the inverse transfer function to the low-pass is used. This high-pass
56 has infinite support, but because its impulse response decays exponentially it can
57 still be evaluated to within the arithmetic precision the CPU offers.
58
59 I recommend [CIT2000] for a formal explanation. At the core of my prefiltering
60 routines there is code from Philippe Thevenaz' code accompanying this paper,
61 with slight modifications translating it to C++ and making it generic.
62 The greater part of this file deals with 'generifying' the process and with
63 employing multithreading and the CPU's vector units to gain speed.
64
65 This code makes heavy use of vigra, which provides handling of multidimensional
66 arrays and efficient handling of aggregate types - to mention only two of its
67 many qualities. Explicit vectorization is done with Vc, which allowed me to code
68 the horizontal vectorization I use in a generic fashion. If Vc is not available,
69 the code falls back to presenting the data so that autovectorization becomes
70 very likely - a technique I call 'goading'.
71
72 In another version of this code I used vigra's BSplineBase class to obtain prefilter
73 poles. This required passing the spline degree/order as a template parameter. Doing it
74 like this allows making the poles static members of the solver, but at the cost of
75 type proliferation. Here I chose not to follow this path and pass the spline order as a
76 parameter to the spline's constructor, thus reducing the number of solver specializations
77 and allowing automated testing with loops over the degree. This variant may be slightly
78 slower. The prefilter poles I use are precalculated externally with gsl/blas and polished
79 in high precision to provide the most precise data possible. This avoids using
80 vigra's polynomial root code, which failed for high degrees when I used it.
81
82 [CIT2000] Philippe Thévenaz, Thierry Blu and Michael Unser: Interpolation Revisited. IEEE Transactions on Medical Imaging, Vol. 19, No. 7, July 2000
83*/
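(Added illustration, not part of the original header.) For the cubic (degree-3) case the 'DSP approach' looks like this: sampled at the knots, the cubic b-spline reconstruction kernel is ( 1 , 4 , 1 ) / 6, so reconstructing the signal from the coefficients is a convolution with transfer function

    B(z) = ( z + 4 + 1/z ) / 6

Prefiltering applies the inverse, 1 / B(z). The roots of z^2 + 4 z + 1 are z1 = sqrt(3) - 2 (about -0.268) and 1 / z1, so the inverse splits into a causal and an anticausal first-order recursion, both with pole z1, plus a constant gain

    lambda = ( 1 - z1 ) * ( 1 - 1 / z1 ) = 6

which is exactly what overall_gain() and solve_gain_inlined() below implement, with the poles taken from poles.h.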
84
85#ifndef VSPLINE_PREFILTER_H
86#define VSPLINE_PREFILTER_H
87
88#include <limits>
89#include "common.h"
90#include "poles.h"
91#include "filter.h"
92
93namespace vspline {
94
95using namespace std ;
96using namespace vigra ;
97
98/// overall_gain is a helper routine:
99/// Simply executing the filtering code by itself will attenuate the signal. Here
100/// we calculate the gain which, pre-applied to the signal, will cancel this effect.
101/// While this code was initially part of the filter's constructor, I took it out
102/// to gain some flexibility by passing in the gain as a parameter.
103///
104/// Note that higher-degree splines need filtering with some poles which are *very*
105/// small numerically. This is a problem: The data get 'squashed', since there are
106/// mathematical operations between attenuated and unattenuated values. So for high
107/// spline degrees, float data aren't suitable, and even doubles and long doubles
108/// suffer from squashing and lose precision.
109///
110 /// Note also how we perform the arithmetic in this routine in the highest precision
111/// available. Calling code will cast the product down to the type it uses for maths.
112
113static xlf_type overall_gain ( const int & nbpoles ,
114 const xlf_type * const pole )
115{
116 xlf_type lambda = 1 ;
117
118 for ( int k = 0 ; k < nbpoles ; k++ )
119
120 lambda *= ( 1 - pole[k] ) * ( 1 - 1 / pole[k] ) ;
121
122 return lambda ;
123}
124
125/// overload of overall_gain taking the spline's degree
126
127static xlf_type overall_gain ( const int & spline_degree )
128{
129 if ( spline_degree < 2 )
130 return 1 ;
131 assert ( spline_degree <= vspline_constants::max_degree ) ;
132 return overall_gain ( spline_degree / 2 ,
133 vspline_constants::precomputed_poles [ spline_degree ] ) ;
134}
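A minimal usage sketch for overall_gain (added here, not part of the original header; it assumes the vspline headers are on the include path): for a cubic spline there is a single pole z1 = sqrt(3) - 2, and the gain comes out as 6:

    #include <cassert>
    #include <cmath>
    #include <vspline/prefilter.h>

    void overall_gain_example()
    {
      // the single cubic-spline pole, in 'xlf_type' (long double) precision
      vspline::xlf_type z1 = std::sqrt ( ( vspline::xlf_type ) 3 ) - 2 ;

      // first overload: number of poles plus pointer to the pole(s)
      assert ( std::abs ( vspline::overall_gain ( 1 , &z1 ) - 6 ) < 1e-12 ) ;

      // second overload: spline degree, using the precomputed poles
      assert ( std::abs ( vspline::overall_gain ( 3 ) - 6 ) < 1e-12 ) ;
    }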
135
136/// structure to hold specifications for an iir_filter object.
137 /// This set of parameters has to be passed from the calling code
138 /// through the multithreading code to the worker threads
139/// where the filter objects are finally constructed. Rather than passing
140/// the parameters via some variadic mechanism, it's more concise and
141/// expressive to contain them in a structure and pass that around.
142/// The filter itself inherits its specification type, and if the code
143/// knows the handler's type, it can derive the spec type. This way the
144/// argument passing can be formalized, allowing for uniform handling of
145/// several different filter types with the same code. Here we have the
146/// concrete parameter set needed for b-spline prefiltering. We'll pass
147/// one set of 'specs' per axis; it contains:
148/// - the boundary condition for this axis
149/// - the number of filter poles (see poles.h)
150/// - a pointer to npoles poles
151/// - the acceptable tolerance
152
153// TODO: KFJ 2018-03-21 added another member 'boost' to the filter specs.
154// This value is used as a factor on 'gain', resulting in the signal
155// being amplified by this factor at no additional computational cost,
156// which might be desirable when pulling integral signals up to the
157// maximal dynamic range. But beware: there are some corner cases with
158// splines holding integral data which may cause wrong results
159// if 'boost' is too large. Have a look at int_spline.cc and also the
160// comments above _process_1d in filter.h
161
162struct iir_filter_specs
163{
164 vspline::bc_code bc ;
165 int npoles ;
166 const xlf_type * pole ;
167 xlf_type tolerance ;
168 xlf_type boost ;
169
170 iir_filter_specs ( vspline::bc_code _bc ,
171 int _npoles ,
172 const xlf_type * _pole ,
173 xlf_type _tolerance ,
174 xlf_type _boost = xlf_type ( 1 )
175 )
176 : bc ( _bc ) ,
177 npoles ( _npoles ) ,
178 pole ( _pole ) ,
179 tolerance ( _tolerance ) ,
180 boost ( _boost )
181 { } ;
182} ;
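An illustrative sketch (added here, not part of the original header) of setting up one set of specs per axis, as 'prefilter' further down does internally - here for a 2D degree-3 spline with MIRROR boundary conditions along axis 0 and PERIODIC along axis 1:

    vspline::iir_filter_specs spec_0
      ( vspline::MIRROR ,                              // boundary condition for axis 0
        3 / 2 ,                                        // number of poles for degree 3
        vspline_constants::precomputed_poles [ 3 ] ,   // the pole(s) for degree 3
        0.000001 ) ;                                   // tolerance; 'boost' defaults to 1

    vspline::iir_filter_specs spec_1
      ( vspline::PERIODIC , 3 / 2 ,
        vspline_constants::precomputed_poles [ 3 ] , 0.000001 ) ;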
183
184/// class iir_filter implements an n-pole forward/backward recursive filter
185/// to be used for b-spline prefiltering. It inherits from the 'specs'
186/// class for easy initialization.
187
188template < typename in_type ,
189 typename out_type = in_type ,
190 typename _math_type = out_type >
191class iir_filter
192: public iir_filter_specs
193{
194 typedef _math_type math_type ;
195
196 typedef vigra::MultiArrayView < 1 , in_type > in_buffer_type ;
197 typedef vigra::MultiArrayView < 1 , out_type > out_buffer_type ;
198
199 /// typedef the fully qualified type for brevity, to make the typedefs below
200 /// more legible
201
202 typedef iir_filter < in_type , out_type , math_type > filter_type ;
203
204 xlf_type gain ;
205 std::vector < int > horizon ;
206
207 // we handle the polymorphism internally, working with method pointers.
208 // this saves us having to set up a base class with virtual member functions
209 // and inheriting from it.
210
211 typedef void ( filter_type::*p_solve ) ( const in_buffer_type & input ,
212 out_buffer_type & output ) const ;
213 typedef math_type ( filter_type::*p_icc ) ( const in_buffer_type & buffer , int k ) const ;
214 typedef math_type ( filter_type::*p_iccx ) ( const out_buffer_type & buffer , int k ) const ;
215 typedef math_type ( filter_type::*p_iacc ) ( const out_buffer_type & buffer , int k ) const ;
216
217 // these are the method pointers used:
218
219 p_solve _p_solve ; ///< pointer to the solve method
220 p_icc _p_icc ; ///< pointer to calculation of initial causal coefficient (from in_)
221 p_iccx _p_iccx ; ///< pointer to calculation of initial causal coefficient (from out_)
222 p_iacc _p_iacc ; ///< pointer to calculation of initial anticausal coefficient
223
224public:
225
226 // this filter runs over the data several times and stores the result
227 // of each run back to be picked up by the next run. This has certain
228 // implications: if out_type is an integral type, using it to store
229 // intermediates will produce quantization errors with every run.
230 // this flag signals to the wielding code in filter.h that intermediates
231 // need to be stored, so it can avoid the problem by providing a buffer
232 // in a 'better' type as output ('output' is used to store intermediates)
233 // and converting the data back to the 'real' output afterwards.
234
235 static const bool is_single_pass { false } ;
236
237 /// calling code may have to set up buffers with additional
238 /// space around the actual data to allow filtering code to
239 /// 'run up' to the data, shedding margin effects in the
240 /// process. For an IIR filter, this is theoretically
241 /// infinite , but since we usually work to a specified precision,
242 /// we can pass 'horizon' - horizon[0] containing the largest
243 /// of the horizon values.
244
245 int get_support_width ( ) const
246 {
247 if ( npoles )
248 return horizon [ 0 ] ;
249
250 // TODO quick fix. I think this case never occurs, since the filtering
251 // code is avoided for npoles < 1
252
253 return 64 ;
254 }
255
256 /// solve() takes two buffers, one to the input data and one to the output space.
257 /// The containers must have the same size. It's safe to use solve() in-place.
258
259 void solve ( const in_buffer_type & input , out_buffer_type & output )
260 {
261 assert ( input.size ( ) == output.size ( ) ) ;
262 ( this->*_p_solve ) ( input , output ) ;
263 }
264
265 /// for in-place operation we use the same filter routine.
266
267 void solve ( out_buffer_type & data )
268 {
269 ( this->*_p_solve ) ( data , data ) ;
270 }
271
272// I use adapted versions of P. Thevenaz' code to calculate the initial causal and
273// anticausal coefficients for the filter. The code is changed just a little to work
274// with an iterator instead of a C vector.
275
276private:
277
278/// The code for mirrored BCs is adapted from P. Thevenaz' code, the other routines are my
279/// own doing, with aid from a digest of spline formulae I received from P. Thevenaz and which
280/// were helpful to verify the code against a trusted source.
281///
282/// note how, in the routines to find the initial causal coefficient, there are two different
283/// cases: first the 'accelerated loop', which is used when the theoretically infinite sum of
284/// terms has reached sufficient precision , and the 'full loop', which implements the mathematically
285 /// precise representation of the limit of the infinite sum for an infinite number of terms,
286 /// which can be evaluated in closed form because the absolute value of all poles is < 1 and
287 ///
288 ///        lim      n
289 ///       n->inf   sum  a * q^k  =  a / ( 1 - q )
290 ///                k=0
291///
292/// first are mirror BCs. This is mirroring 'on bounds',
293/// f ( -x ) == f ( x ) and f ( n-1 - x ) == f (n-1 + x)
294///
295/// note how mirror BCs are equivalent to requiring the first derivative to be zero in the
296 /// linear algebra approach. Obviously with mirrored data this has to be the case: the location
297/// where mirroring occurs is always an extremum. So this case covers 'FLAT' BCs as well
298///
299/// the initial causal coefficient routines are templated by buffer type, because depending
300/// on the circumstances, they may be used either on the input or the output.
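(Added note, not in the original file.) For mirror BCs the causal filter is seeded with the infinite sum over the mirror-extended signal,

    c+(0) = sum ( n >= 0 )  z^n * c(n)        with  c(-n) == c(n)

and because |z| < 1 the 'accelerated loop' may truncate this sum after horizon[k] terms without exceeding the given tolerance, while the 'full loop' folds the mirrored repetition into the closed form coded below.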
301
302// TODO format to vspline standard
303
304/// we use accessor classes to access the input and output buffers.
305/// To access an input buffer (which remains constant), we use
306/// 'as_math_type' which simply provides the ith element cast to
307/// math_type. This makes for legible, concise code. We return
308/// const math_type from operator[] to make sure X[..] won't be
309/// accidentally assigned to.
310
311template < typename buffer_type >
312struct as_math_type
313{
314 const buffer_type & c ;
315
316 as_math_type ( const buffer_type & _c )
317 : c ( _c )
318 { } ;
319
320 const math_type operator[] ( int i ) const
321 {
322 return math_type ( c [ i ] ) ;
323 }
324} ;
325
326/// the second helper class, as_target, is meant for output
327/// buffers. Here we need to read as well as write. Writing is
328/// rare, so I use a method 'store' in preference to doing artistry
329/// with a proxy. We return const math_type from operator[] to make
330/// sure X[..] won't be accidentally assigned to.
331
332template < typename buffer_type >
333struct as_target
334{
335 buffer_type & x ;
336
337 as_target ( buffer_type & _x )
338 : x ( _x )
339 { } ;
340
341 const math_type operator[] ( int i ) const
342 {
343 return math_type ( x [ i ] ) ;
344 }
345
346 void store ( const math_type & v , const int & i )
347 {
348 x [ i ] = typename buffer_type::value_type ( v ) ;
349 }
350} ;
351
352template < class buffer_type >
353math_type icc_mirror ( const buffer_type & _c , int k ) const
354{
355 int M = _c.size ( ) ;
356 as_math_type < buffer_type > c ( _c ) ;
357
358 math_type z = math_type ( pole[k] ) ;
359 math_type zn , z2n , iz ;
360 math_type Sum ;
361 int n ;
362
363 if ( horizon[k] < M )
364 {
365 /* accelerated loop */
366 zn = z ;
367 Sum = c[0] ;
368 for ( n = 1 ; n < horizon[k] ; n++ )
369 {
370 Sum += zn * c[n] ;
371 zn *= z ;
372 }
373 }
374 else
375 {
376 /* full loop */
377 zn = z ;
378 iz = math_type ( 1.0 ) / z ;
379 z2n = math_type ( pow ( xlf_type ( pole[k] ) , xlf_type ( M - 1 ) ) ) ;
380 Sum = c[0] + z2n * c[M - 1] ;
381 z2n *= z2n * iz ;
382 for ( n = 1 ; n <= M - 2 ; n++ )
383 {
384 Sum += ( zn + z2n ) * c[n] ;
385 zn *= z ;
386 z2n *= iz ;
387 }
388 Sum /= ( math_type ( 1.0 ) - zn * zn ) ;
389 }
390 return ( Sum ) ;
391}
392
393/// the initial anticausal coefficient routines are always called with the output buffer,
394/// so they needn't be templated like the icc routines.
395///
396 /// I still haven't understood the 'magic' which allows calculating the initial anticausal
397/// coefficient from just two results of the causal filter, but I assume it's some exploitation
398/// of the symmetry of the data. This code is adapted from P. Thevenaz'.
399
400math_type iacc_mirror ( const out_buffer_type & _c , int k ) const
401{
402 int M = _c.size ( ) ;
403 as_math_type < out_buffer_type > c ( _c ) ;
404
405 math_type z = math_type ( pole[k] ) ;
406
407 return ( math_type ( z / ( z * z - math_type ( 1.0 ) ) ) * ( c [ M - 1 ] + z * c [ M - 2 ] ) ) ;
408}
409
410/// next are 'antimirrored' BCs. This is the same as 'natural' BCs: the signal is
411/// extrapolated via point mirroring at the ends, resulting in point-symmetry at the ends,
412/// which is equivalent to the second derivative being zero, the constraint used in
413/// the linear algebra approach to calculate 'natural' BCs:
414///
415/// f ( x ) - f ( 0 ) == f ( 0 ) - f ( -x ) ;
416/// f ( x+n-1 ) - f ( n-1 ) == f ( n-1 ) - f (n-1-x)
417
418template < class buffer_type >
419math_type icc_natural ( const buffer_type & _c , int k ) const
420{
421 int M = _c.size ( ) ;
422 as_math_type < buffer_type > c ( _c ) ;
423
424 math_type z = math_type ( pole[k] ) ;
425 math_type zn , z2n , iz ;
426 math_type Sum , c02 ;
427 int n ;
428
429 // f ( x ) - f ( 0 ) == f ( 0 ) - f (-x)
430 // f ( -x ) == 2 * f ( 0 ) - f (x)
431
432 if ( horizon[k] < M )
433 {
434 c02 = c[0] + c[0] ;
435 zn = z ;
436 Sum = c[0] ;
437 for ( n = 1 ; n < horizon[k] ; n++ )
438 {
439 Sum += zn * ( c02 - c[n] ) ;
440 zn *= z ;
441 }
442 return ( Sum ) ;
443 }
444 else {
445 zn = z ;
446 iz = math_type ( 1.0 ) / z ;
447 z2n = math_type ( pow ( xlf_type ( pole[k] ) , xlf_type ( M - 1 )) ) ;
448 Sum = math_type ( ( math_type ( 1.0 ) + z ) / ( math_type ( 1.0 ) - z ) )
449 * ( c[0] - z2n * c[M - 1] ) ;
450 z2n *= z2n * iz ; // z2n == z^2M-3
451 for ( n = 1 ; n <= M - 2 ; n++ )
452 {
453 Sum -= ( zn - z2n ) * c[n] ;
454 zn *= z ;
455 z2n *= iz ;
456 }
457 return ( Sum / ( math_type ( 1.0 ) - zn * zn )) ;
458 }
459}
460
461 /// I still haven't understood the 'magic' which allows calculating the initial anticausal
462/// coefficient from just two results of the causal filter, but I assume it's some exploitation
463/// of the symmetry of the data. This code is adapted from P. Thevenaz' formula.
464
465math_type iacc_natural ( const out_buffer_type & _c , int k ) const
466{
467 int M = _c.size ( ) ;
468 as_math_type < out_buffer_type > c ( _c ) ;
469
470 math_type z = math_type ( pole[k] ) ;
471
472 return - math_type ( z / ( ( math_type ( 1.0 ) - z ) * ( math_type ( 1.0 ) - z ) ) ) * ( c [ M - 1 ] - z * c [ M - 2 ] ) ;
473}
474
475/// next are reflective BCs. This is mirroring 'between bounds':
476///
477/// f ( -1 - x ) == f ( x ) and f ( n + x ) == f (n-1 - x)
478///
479/// I took Thevenaz' routine for mirrored data as a template and adapted it.
480/// 'reflective' BCs have some nice properties which make them more suited than mirror BCs in
481/// some situations:
482/// - the artificial discontinuity is 'pushed out' half a unit spacing
483/// - the extrapolated data are just as long as the source data
484/// - they play well with even splines
485
486template < class buffer_type >
487math_type icc_reflect ( const buffer_type & _c , int k ) const
488{
489 int M = _c.size ( ) ;
490 as_math_type < buffer_type > c ( _c ) ;
491
492 math_type z = math_type ( pole[k] ) ;
493 math_type zn , z2n , iz ;
494 math_type Sum ;
495 int n ;
496
497 if ( horizon[k] < M )
498 {
499 zn = z ;
500 Sum = c[0] ;
501 for ( n = 0 ; n < horizon[k] ; n++ )
502 {
503 Sum += zn * c[n] ;
504 zn *= z ;
505 }
506 return ( Sum ) ;
507 }
508 else
509 {
510 zn = z ;
511 iz = math_type ( 1.0 ) / z ;
512 z2n = math_type ( pow ( xlf_type ( pole[k] ) , xlf_type ( 2 * M )) ) ;
513 Sum = 0 ;
514 for ( n = 0 ; n < M - 1 ; n++ )
515 {
516 Sum += ( zn + z2n ) * c[n] ;
517 zn *= z ;
518 z2n *= iz ;
519 }
520 Sum += ( zn + z2n ) * c[n] ;
521 return c[0] + Sum / ( math_type ( 1.0 ) - zn * zn ) ;
522 }
523}
524
525 /// I still haven't understood the 'magic' which allows calculating the initial anticausal
526/// coefficient from just one result of the causal filter, but I assume it's some exploitation
527/// of the symmetry of the data. I have to thank P. Thevenaz for his formula which let me code:
528
529math_type iacc_reflect ( const out_buffer_type & _c , int k ) const
530{
531 int M = _c.size ( ) ;
532 as_math_type < out_buffer_type > c ( _c ) ;
533
534 math_type z = math_type ( pole[k] ) ;
535
536 return c[M - 1] / ( math_type ( 1.0 ) - math_type ( 1.0 ) / z ) ;
537}
538
539 /// next are periodic BCs: f ( x ) == f ( x + N )
540///
541/// Implementing this is more straightforward than implementing the various mirrored types.
542/// The mirrored types are, in fact, also periodic, but with a period twice as large, since they
543/// repeat only after the first reflection. So especially the code for the full loop is more complex
544/// for mirrored types. The down side here is the lack of symmetry to exploit, which made me code
545/// a loop for the initial anticausal coefficient as well.
546
547template < class buffer_type >
548math_type icc_periodic ( const buffer_type & _c , int k ) const
549{
550 int M = _c.size ( ) ;
551 as_math_type < buffer_type > c ( _c ) ;
552
553 math_type z = math_type ( pole[k] ) ;
554 math_type zn ;
555 math_type Sum ;
556 int n ;
557
558 if ( horizon[k] < M )
559 {
560 zn = z ;
561 Sum = c[0] ;
562 for ( n = M - 1 ; n > ( M - horizon[k] ) ; n-- )
563 {
564 Sum += zn * c[n] ;
565 zn *= z ;
566 }
567 }
568 else
569 {
570 zn = z ;
571 Sum = c[0] ;
572 for ( n = M - 1 ; n > 0 ; n-- )
573 {
574 Sum += zn * c[n] ;
575 zn *= z ;
576 }
577 Sum /= ( math_type ( 1.0 ) - zn ) ;
578 }
579 return Sum ;
580}
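(Added note, not in the original file.) With the periodic extension c(-j) == c(M-j), the full loop above evaluates the closed form

    c+(0) = ( sum ( j = 0 .. M-1 )  z^j * c(-j mod M) ) / ( 1 - z^M )

which is the geometric resummation of the infinite causal series sum ( j >= 0 ) z^j * c(-j).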
581
582math_type iacc_periodic ( const out_buffer_type & _c , int k ) const
583{
584 int M = _c.size ( ) ;
585 as_math_type < out_buffer_type > c ( _c ) ;
586
587 math_type z = math_type ( pole[k] ) ;
588 math_type zn ;
589 math_type Sum ;
590
591 if ( horizon[k] < M )
592 {
593 zn = z ;
594 Sum = c[M-1] * z ;
595 for ( int n = 0 ; n < horizon[k] ; n++ )
596 {
597 zn *= z ;
598 Sum += zn * c[n] ;
599 }
600 Sum = -Sum ;
601 }
602 else
603 {
604 zn = z ;
605 Sum = c[M-1] ;
606 for ( int n = 0 ; n < M - 1 ; n++ )
607 {
608 Sum += zn * c[n] ;
609 zn *= z ;
610 }
611 Sum = z * Sum / ( zn - math_type ( 1.0 ) ) ;
612 }
613 return Sum ;
614}
615
616/// guess the initial coefficient. This tries to minimize the effect
617/// of starting out with a hard discontinuity as it occurs with zero-padding,
618/// while at the same time requiring little arithmetic effort
619///
620 /// for the forward filter, we guess an extrapolation of the signal to the left repeating
621 /// c[0] indefinitely; the geometric series sums to c[0] / ( 1 - pole[k] ) , cheap to compute:
622
623template < class buffer_type >
624math_type icc_guess ( const buffer_type & _c , int k ) const
625{
626 as_math_type < buffer_type > c ( _c ) ;
627
628 return c[0] * math_type ( 1.0 / ( 1.0 - pole[k] ) ) ;
629}
630
631// for the backward filter , we assume mirror BC, which is also cheap to compute:
632
633math_type iacc_guess ( const out_buffer_type & c , int k ) const
634{
635 return iacc_mirror ( c , k ) ;
636}
637
638template < class buffer_type >
639math_type icc_identity ( const buffer_type & _c , int k ) const
640{
641 as_math_type < buffer_type > c ( _c ) ;
642
643 return c[0] ;
644}
645
646math_type iacc_identity ( const out_buffer_type & _c , int k ) const
647{
648 int M = _c.size ( ) ;
649 as_math_type < out_buffer_type > c ( _c ) ;
650
651 return c[M-1] ;
652}
653
654/// now we come to the solving, or prefiltering code itself.
655/// The code is adapted from P. Thevenaz' code.
656///
657/// I use a 'carry' element, 'X', to carry the result of the recursion
658/// from one iteration to the next instead of using the direct implementation
659/// of the recursion formula, which would read the previous value of the
660/// recursion from memory by accessing x[n-1], or x[n+1], respectively.
661
662void solve_gain_inlined ( const in_buffer_type & _c ,
663 out_buffer_type & _x ) const
664{
665 int M = _c.size ( ) ;
666 assert ( _x.size ( ) == M ) ;
667 as_math_type < in_buffer_type > c ( _c ) ;
668 as_target < out_buffer_type > x ( _x ) ;
669
670 if ( M == 1 )
671 {
672 x.store ( c[0] , 0 ) ;
673 return ;
674 }
675
676 assert ( M > 1 ) ;
677
678 // use a buffer of one math_type for the recursion (see below)
679
680 math_type X ;
681 math_type p = math_type ( pole[0] ) ;
682
683 // use first filter pole, applying overall gain in the process
684 // of consuming the input.
685 // Note that the application of the gain is performed during the processing
686 // of the first (maybe the only) pole of the filter, instead of running a separate
687 // loop over the input to apply it before processing starts.
688
689 // note how the gain is applied to the initial causal coefficient. This is
690 // equivalent to first applying the gain to the input and then calculating
691 // the initial causal coefficient from the processed input.
692
693 X = math_type ( gain ) * ( this->*_p_icc ) ( _c , 0 ) ;
694 x.store ( X , 0 ) ;
695
696 /* causal recursion */
697 // the gain is applied to each input value as it is consumed
698
699 for ( int n = 1 ; n < M ; n++ )
700 {
701 // KFJ 2019-02-12 tentative use of fma
702#ifdef USE_FMA
703 math_type cc = math_type ( gain ) * c[n] ;
704 X = fma ( X , p , cc ) ;
705#else
706 X = math_type ( gain ) * c[n] + p * X ;
707#endif
708 x.store ( X , n ) ;
709 }
710
711 // now the input is used up and won't be looked at any more; all subsequent
712 // processing operates on the output.
713
714 /* anticausal initialization */
715
716 X = ( this->*_p_iacc ) ( _x , 0 ) ;
717 x.store ( X , M - 1 ) ;
718
719 /* anticausal recursion */
720 for ( int n = M - 2 ; 0 <= n ; n-- )
721 {
722 X = p * ( X - x[n] ) ;
723 x.store ( X , n ) ;
724 }
725
726 // for the remaining poles, if any, don't apply the gain
727 // and process the result from applying the first pole
728
729 for ( int k = 1 ; k < npoles ; k++ )
730 {
731 p = math_type ( pole[k] ) ;
732 /* causal initialization */
733 X = ( this->*_p_iccx ) ( _x , k ) ;
734 x.store ( X , 0 ) ;
735
736 /* causal recursion */
737 for ( int n = 1 ; n < M ; n++ )
738 {
739 // KFJ 2019-02-12 tentative use of fma
740#ifdef USE_FMA
741 math_type xx = x[n] ;
742 X = fma ( X , p , xx ) ;
743#else
744 X = x[n] + p * X ;
745#endif
746 x.store ( X , n ) ;
747 }
748
749 /* anticausal initialization */
750 X = ( this->*_p_iacc ) ( _x , k ) ;
751 x.store ( X , M - 1 ) ;
752
753 /* anticausal recursion */
754 for ( int n = M - 2 ; 0 <= n ; n-- )
755 {
756 X = p * ( X - x[n] ) ;
757 x.store ( X , n ) ;
758 }
759 }
760}
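(Added summary, not in the original file.) For each pole z the loops above implement the usual forward/backward recursion pair, with the gain folded into the first causal pass of the first pole:

    causal :      x(n) = gain * c(n) + z * x(n-1)      for n = 1 .. M-1
    anticausal :  x(n) = z * ( x(n+1) - x(n) )         for n = M-2 .. 0

x(0) and x(M-1) are seeded by the icc/iacc routines selected for the boundary condition; for the remaining poles the causal pass reads the intermediate result and applies no further gain.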
761
762/// solve_identity is used for spline degrees 0 and 1. In this case
763/// there are no poles to apply, but if the operation is not in-place
764/// and/or there is a 'boost' factor which is different from 1, the
765/// data are copied and/or amplified with 'boost'.
766
767void solve_identity ( const in_buffer_type & _c ,
768 out_buffer_type & _x ) const
769{
770 int M = _c.size ( ) ;
771 assert ( _x.size ( ) == M ) ;
772 as_math_type < in_buffer_type > c ( _c ) ;
773 as_target < out_buffer_type > x ( _x ) ;
774
775 if ( boost == xlf_type ( 1 ) )
776 {
777 // boost is 1, check if operation is not in-place
778 if ( ( void* ) ( _c.data ( ) ) != ( void* ) ( _x.data ( ) ) )
779 {
780 // operation is not in-place, copy input to output
781 for ( int n = 0 ; n < M ; n++ )
782 {
783 x.store ( c[n] , n ) ;
784 }
785 }
786 }
787 else
788 {
789 // we have a boost factor, so we apply it.
790 math_type factor = math_type ( boost ) ;
791
792 for ( int n = 0 ; n < M ; n++ )
793 {
794 x.store ( factor * c[n] , n ) ;
795 }
796 }
797}
798
799/// The last bit of work left is the constructor. This simply passes
800/// the specs to the base class constructor, as iir_filter inherits
801/// from the specs type.
802
803public:
804
805 iir_filter ( const iir_filter_specs & specs )
806 : iir_filter_specs ( specs )
807{
808 // TODO we have a problem if the gain is getting very large, as it happens
809 // for high spline degrees. The iir_filter attenuates the signal to next-to-nothing,
810 // then it's amplified back to the previous amplitude. This degrades the signal,
811 // most noticeably when the numeric type is lo-fi, since there are operations involving
812 // both the attenuated and unattenuated data ('squashing').
813
814 if ( npoles < 1 )
815 {
816 // zero poles means there's nothing to do but possibly
817 // copying the input to the output, which solve_identity
818 // will do if the operation isn't in-place.
819 _p_solve = & filter_type::solve_identity ;
820 return ;
821 }
822
823 // calculate the horizon for each pole; this is the number of iterations
824 // the filter must perform on a unit pulse to decay below 'tolerance'
825
826 // If tolerance is 0 (or negative) we set 'horizon' to MAX_INT. This
827 // will have the effect of making it larger than M, or at least so
828 // large that there won't be a difference between the accelerated and
829 // the full loop. We might use a smaller value which still guarantees
830 // the complete decay.
831
832 for ( int i = 0 ; i < npoles ; i++ )
833 {
834 if ( tolerance > 0 )
835 horizon.push_back ( ceil ( log ( tolerance )
836 / log ( std::abs ( pole[i] ) ) ) ) ;
837 else
838 horizon.push_back ( INT_MAX ) ; // TODO quick fix, think about it
839 }
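  // (added worked example, not in the original source) with the cubic-spline
  // pole z1 = sqrt(3) - 2, i.e. |z1| ~ 0.268, and a tolerance of 1e-6, this
  // yields horizon = ceil ( log ( 1e-6 ) / log ( 0.268 ) ) = ceil ( ~10.5 ) = 11,
  // so eleven terms of the series suffice for that precision.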
840
841 // contrary to my initial implementation I use per-axis gain instead of
842 // accumulating the gain for all axes. This may perform slightly worse, but
843 // is more stable numerically and simplifies the code.
844
845 gain = boost * vspline::overall_gain ( npoles , pole ) ;
846 _p_solve = & filter_type::solve_gain_inlined ;
847
848 // while the forward/backward IIR iir_filter in the solve_... routines is the same for all
849 // boundary conditions, the calculation of the initial causal and anticausal coefficients
850 // depends on the boundary conditions and is handled by a call through a method pointer
851 // in the solve_... routines. Here we fix these method pointers:
852
853 if ( bc == MIRROR )
854 {
855 _p_icc = & filter_type::icc_mirror<in_buffer_type> ;
856 _p_iccx = & filter_type::icc_mirror<out_buffer_type> ;
857 _p_iacc = & filter_type::iacc_mirror ;
858 }
859 else if ( bc == NATURAL )
860 {
861 _p_icc = & filter_type::icc_natural<in_buffer_type> ;
862 _p_iccx = & filter_type::icc_natural<out_buffer_type> ;
863 _p_iacc = & filter_type::iacc_natural ;
864 }
865 else if ( bc == PERIODIC )
866 {
867 _p_icc = & filter_type::icc_periodic<in_buffer_type> ;
868 _p_iccx = & filter_type::icc_periodic<out_buffer_type> ;
869 _p_iacc = & filter_type::iacc_periodic ;
870 }
871 else if ( bc == REFLECT )
872 {
873 _p_icc = & filter_type::icc_reflect<in_buffer_type> ;
874 _p_iccx = & filter_type::icc_reflect<out_buffer_type> ;
875 _p_iacc = & filter_type::iacc_reflect ;
876 }
877 else if ( bc == ZEROPAD )
878 {
879 _p_icc = & filter_type::icc_identity<in_buffer_type> ;
880 _p_iccx = & filter_type::icc_identity<out_buffer_type> ;
881 _p_iacc = & filter_type::iacc_identity ;
882 }
883 else if ( bc == GUESS )
884 {
885 _p_icc = & filter_type::icc_guess<in_buffer_type> ;
886 _p_iccx = & filter_type::icc_guess<out_buffer_type> ;
887 _p_iacc = & filter_type::iacc_guess ;
888 }
889 else
890 {
891 throw not_supported ( "boundary condition not supported by vspline::filter" ) ;
892 }
893}
894
895} ; // end of class iir_filter
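A minimal usage sketch for the class above (added here, not part of the original header; it assumes the vspline and vigra headers are on the include path): prefiltering a short 1D float signal in place with cubic-spline poles, MIRROR boundary conditions and double maths:

    #include <vigra/multi_array.hxx>
    #include <vspline/prefilter.h>

    void prefilter_1d_example()
    {
      vigra::MultiArray < 1 , float > data ( vigra::Shape1 ( 100 ) ) ;
      // ... fill 'data' with the sampled signal ...

      vspline::iir_filter_specs specs
        ( vspline::MIRROR ,                              // boundary condition
          3 / 2 ,                                        // number of poles for degree 3
          vspline_constants::precomputed_poles [ 3 ] ,   // the pole(s) for degree 3
          0.000001 ) ;                                   // tolerance

      vspline::iir_filter < float , float , double > filt ( specs ) ;

      vigra::MultiArrayView < 1 , float > view ( data ) ;
      filt.solve ( view ) ;    // in-place: 'data' now holds the coefficients
    }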
896
897/// class to provide b-spline prefiltering, using 'iir_filter' above.
898/// The actual filter object has to interface with the data handling
899/// routine ('present', see filter.h). So this class functions as an
900 /// adapter, combining the code needed to set up adequate buffers
901 /// with the creation of the actual IIR filter itself.
902/// The interface to the data handling routine is provided by
903/// inheriting from class buffer_handling
904
905// KFJ 2019-04-16 added default for _vsize template argument
906
907template < template < typename , size_t > class _vtype ,
908 typename _math_ele_type ,
909 size_t _vsize =
910 vspline::vector_traits < _math_ele_type > :: size >
911struct bspl_prefilter
912: public buffer_handling < _vtype , _math_ele_type , _vsize > ,
913 public vspline::iir_filter < _vtype < _math_ele_type , _vsize > >
914{
915 // provide this type for queries
916
917 typedef _math_ele_type math_ele_type ;
918
919 // we'll use a few types from the buffer_handling type
920
921 typedef buffer_handling < _vtype , _math_ele_type , _vsize > buffer_handling_type ;
922
923 using typename buffer_handling_type::vtype ;
926
927 // instances of class bspl_prefilter hold the buffer:
928
929 using allocator_t
930 = typename vspline::allocator_traits < vtype > :: type ;
931
932 vigra::MultiArray < 1 , vtype , allocator_t > buffer ;
933
934 // the filter's 'solve' routine has the workhorse code to filter
935 // the data inside the buffer:
936
937 typedef _vtype < _math_ele_type , _vsize > simdized_math_type ;
938 typedef vspline::iir_filter < simdized_math_type > filter_type ;
939 using filter_type::solve ;
940
941 // by defining arg_type, we allow code to infer what type of
942 // argument initializer the filter takes
943
944 typedef iir_filter_specs arg_type ;
945
946 // the constructor invokes the filter's constructor,
947 // sets up the buffer and initializes the buffer_handling
948 // component to use the whole buffer to accept incoming and
949 // provide outgoing data.
950
951 bspl_prefilter ( const iir_filter_specs & specs , size_t size )
952 : filter_type ( specs ) ,
953 buffer ( size )
954 {
955 // operate in-place and use the whole buffer to receive and
956 // deliver data
957
958 init ( buffer , buffer ) ;
959 } ;
960
961 // operator() simply delegates to the filter's 'solve' routine,
962 // which filters the data in the buffer.
963
964 void operator() ( )
965 {
966 solve ( buffer , buffer ) ;
967 }
968
969 // factory function to provide a filter with the same set of
971 // parameters, but possibly different data types. This is used
971 // for processing of 1D data, where the normal buffering mechanism
972 // may be sidestepped
973
974 template < typename in_type ,
975 typename out_type = in_type ,
976 typename math_type = out_type >
977 static vspline::iir_filter < in_type , out_type , math_type >
978 get_raw_filter ( const iir_filter_specs & specs )
979 {
980 return vspline::iir_filter < in_type , out_type , math_type >
981 ( specs ) ;
982 }
983
984} ;
985
986/// amplify is used to copy input to output, optionally applying
987/// 'boost' in the process. If the operation is in-place and 'boost'
988 /// is 1, 'amplify' returns immediately.
989
990template < unsigned int dimension ,
991 typename in_value_type ,
992 typename out_value_type ,
993 typename math_ele_type >
994void amplify ( const vigra::MultiArrayView
995 < dimension , in_value_type > & input ,
996 vigra::MultiArrayView
997 < dimension , out_value_type > & output ,
998 math_ele_type boost = 1 ,
999 int njobs = vspline::default_njobs
1000 )
1001{
1002 // if the operation is in-place and boost is 1,
1003 // there is nothing to do.
1004
1005 if ( (void*) ( input.data() ) == (void*) ( output.data() )
1006 && boost == math_ele_type ( 1 ) )
1007 return ;
1008
1009 assert ( input.size() == output.size() ) ;
1010
1011 // set up variables to orchestrate the batchwise processing
1012 // of the data by multithread's payload routine
1013
1014 std::ptrdiff_t batch_size = 1024 ;
1015 std::ptrdiff_t total_size = input.size() ;
1016 vspline::atomic < std::ptrdiff_t > tickets ( total_size ) ;
1017
1018 // the payload routine will process batches of up to batch_size
1019 // and continue to do so until the input is exhausted
1020
1021 std::function < void() > payload =
1022 [&]()
1023 {
1024 // start and end index for the batches to be processed;
1025 // these are fetched from 'tickets' in the caller, above
1026
1027 std::ptrdiff_t lo , hi ;
1028
1029 while ( vspline::fetch_range_ascending
1030 ( tickets , batch_size , total_size , lo , hi ) )
1031 {
1032 if ( boost == math_ele_type ( 1 ) )
1033 {
1034 while ( lo < hi )
1035 {
1036 output[lo] = out_value_type ( input[lo] ) ;
1037 ++lo ;
1038 }
1039 }
1040 else
1041 {
1042 while ( lo < hi )
1043 {
1044 output[lo] = out_value_type ( input[lo] * boost ) ;
1045 ++lo ;
1046 }
1047 }
1048 }
1049 } ;
1050
1051 // launch njobs workers executing the payload routine
1052
1053 vspline::multithread ( payload , njobs ) ;
1054}
1055
1056/// 'prefilter' handles b-spline prefiltering for the whole range of
1057/// acceptable input and output. It combines two bodies of code to
1058/// achieve this goal:
1059/// - the b-spline filtering code above
1060/// - 'wielding' code in filter.h, which is not specific to b-splines.
1061///
1062/// Note that vsize , the vectorization width, can be passed explicitly.
1063/// If Vc is in use and math_ele_type can be used with hardware
1064/// vectorization, the arithmetic will be done with Vc::SimdArrays
1065/// of the given size. Otherwise 'goading' will be used: the data are
1066/// presented in TinyVectors of vsize math_ele_type, hoping that the
1067/// compiler may autovectorize the operation.
1068
1069// KFJ 2018-12-20 added default for math_ele_type, static_cast to
1070// int for bcv's dimension, default for 'tolerance' - so now the
1071// prototype matches that of the functions in general_filter.h
1072
1073template < unsigned int dimension ,
1074 typename in_value_type ,
1075 typename out_value_type ,
1076 typename math_ele_type =
1077 vspline::common_math_ele_type
1078 < in_value_type , out_value_type > ,
1079 size_t vsize =
1080 vspline::vector_traits < math_ele_type > :: size
1081 >
1082void prefilter ( const
1083 vigra::MultiArrayView
1084 < dimension ,
1085 in_value_type > & input ,
1086 vigra::MultiArrayView
1087 < dimension ,
1088 out_value_type > & output ,
1089 vigra::TinyVector
1090 < bc_code ,
1091 static_cast < int > ( dimension ) > bcv ,
1092 int degree ,
1093 xlf_type tolerance
1094 = std::numeric_limits < math_ele_type > :: epsilon(),
1095 xlf_type boost = xlf_type ( 1 ) ,
1096 int njobs = default_njobs )
1097{
1098 if ( degree <= 1 )
1099 {
1100 // if degree is <= 1, there is no filter to apply, but we may need
1101 // to apply 'boost' and/or copy input to output. We use 'amplify'
1102 // for the purpose, which multithreads the operation (if it is at
1103 // all necessary). I found this is (slightly) faster than doing the
1104 // job in a single thread - the process is mainly memory-bound, so
1105 // the gain is moderate.
1106
1107 amplify < dimension , in_value_type , out_value_type , math_ele_type >
1108 ( input , output , math_ele_type ( boost ) ) ;
1109
1110 return ;
1111 }
1112
1113 std::vector < vspline::iir_filter_specs > vspecs ;
1114
1115 // package the arguments to the filter; one set of arguments
1116 // per axis of the data
1117
1118 auto poles = vspline_constants::precomputed_poles [ degree ] ;
1119
1120 for ( int axis = 0 ; axis < dimension ; axis++ )
1121 {
1122 vspecs.push_back
1123 ( vspline::iir_filter_specs
1124 ( bcv [ axis ] , degree / 2 , poles , tolerance , 1 ) ) ;
1125 }
1126
1127 // 'boost' is only applied to dimension 0, since it is meant to
1128 // affect the whole data set just once, not once per axis.
1129
1130 vspecs [ 0 ] . boost = boost ;
1131
1132 // KFJ 2018-05-08 with the automatic use of vectorization the
1133 // distinction whether math_ele_type is 'vectorizable' or not
1134 // is no longer needed: simdized_type will be a Vc::SimdArray
1135 // if possible, a vspline::simd_type otherwise.
1136
1137 typedef typename vspline::bspl_prefilter
1138 < vspline::simdized_type ,
1139 math_ele_type ,
1140 vsize
1141 > filter_type ;
1142
1143 // now call the 'wielding' code in filter.h
1144
1145 vspline::filter
1146 < in_value_type , out_value_type , dimension , filter_type >
1147 ( input , output , vspecs ) ;
1148}
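A minimal usage sketch for 'prefilter' (added here, not part of the original header; it assumes the vspline and vigra headers are on the include path): computing the coefficient array of a degree-3 spline over a 2D float image, with MIRROR boundary conditions along both axes and all other arguments left at their defaults:

    #include <vigra/multi_array.hxx>
    #include <vspline/prefilter.h>

    void prefilter_2d_example()
    {
      vigra::MultiArray < 2 , float > image ( vigra::Shape2 ( 640 , 480 ) ) ;
      vigra::MultiArray < 2 , float > coeffs ( vigra::Shape2 ( 640 , 480 ) ) ;
      // ... fill 'image' ...

      // one boundary condition per axis
      vigra::TinyVector < vspline::bc_code , 2 > bcv ( vspline::MIRROR ) ;

      vigra::MultiArrayView < 2 , float > in_view ( image ) , out_view ( coeffs ) ;
      vspline::prefilter ( in_view , out_view , bcv , 3 ) ;
    }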
1149
1150} ; // namespace vspline
1151
1152#endif // VSPLINE_PREFILTER_H