vspline 1.1.0
Generic C++11 Code for Uniform B-Splines
simd_type.h File Reference

SIMD type using small loops.

#include <iostream>

Classes

struct  vspline::simd_type< _value_type, _vsize >
 class template simd_type provides a fixed-size container type for small sets of fundamentals which are stored in a plain C array. The type offers arithmetic capabilities which are implemented by looping over the elements of the array, expecting that the compiler will autovectorize these loops into 'proper' SIMD code. The interface of this type is modelled to be compatible with Vc's SimdArray. Unfortunately, Vc::SimdArray requires additional template arguments, so the two types are not always interchangeable. Interface compatibility does not guarantee that the arithmetic produces the same results: that is the intent, but it is neither tested nor enforced.
 
struct  vspline::simd_type< _value_type, _vsize >::masked_type
 
struct  std::allocator_traits< vspline::simd_type< T, N > >
 

Namespaces

namespace  vspline
 

Macros

#define BUILD_FROM_CONTAINER(SIZE_TYPE, VSZ)
 
#define BROADCAST_STD_FUNC(FUNC)
 
#define BROADCAST_STD_FUNC2(FUNC)
 
#define BROADCAST_STD_FUNC3(FUNC)
 
#define INTEGRAL_ONLY
 
#define BOOL_ONLY
 
#define OPEQ_FUNC(OPFUNC, OPEQ, CONSTRAINT)
 
#define C_PROMOTE(A, B)
 
#define OP_FUNC(OPFUNC, OP, CONSTRAINT)
 
#define OP_FUNC(OPFUNC, OP, CONSTRAINT)
 
#define COMPARE_FUNC(OPFUNC, OP)
 
#define OPEQ_FUNC(OPFUNC, OPEQ, CONSTRAINT)
 
#define CLAMP(FNAME, REL)
 

Typedefs

template<typename T >
using vspline::is_scalar = typename std::integral_constant< bool, std::is_fundamental< T > ::value||std::is_same< T, bool > ::value > ::type
 

Functions

template<typename T , std::size_t vsize>
bool vspline::any_of (simd_type< T, vsize > arg)
 
template<typename T , std::size_t vsize>
bool vspline::all_of (simd_type< T, vsize > arg)
 
template<typename T , std::size_t vsize>
bool vspline::none_of (simd_type< T, vsize > arg)
 

Detailed Description

SIMD type using small loops.

SIMD type derived from std::simd.

vspline can use Vc for explicit vectorization, and at the time of this writing, this is usually the best option. But Vc is not available everywhere, or its use may be unwanted. To help with such situations, vspline defines its own 'SIMD' type, which is implemented as a simple C vector and small loops operating on it. If these constructs are compiled with compilers capable of autovectorization (and with the relevant flags activating use of SIMD instruction sets like AVX) the resulting code will often be 'proper' SIMD code, because the small loops are presented so that the compiler can easily recognize them as candidates for loop vectorization. I call this technique 'goading': by presenting the data flow in a deliberately vector-friendly format, the compiler is more likely to 'get it'.
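To make the idea of 'goading' concrete, here is a minimal sketch. The type goaded_vec and the function sum_lane are hypothetical stand-ins invented for this illustration; they are not vspline::simd_type and omit almost all of its interface. The point is only the shape of the loop: a compile-time trip count, no branches, and plain array storage.

```cpp
#include <cstddef>

// Hypothetical stand-in for illustration only, not vspline::simd_type:
// a fixed-size aggregate whose arithmetic is a short loop with a
// compile-time trip count.
template < typename T , std::size_t N >
struct goaded_vec
{
  T data [ N ] ;

  T & operator[] ( std::size_t i ) { return data [ i ] ; }
  const T & operator[] ( std::size_t i ) const { return data [ i ] ; }

  // element-wise addition: fixed length, no branches in the loop body
  friend goaded_vec operator+ ( goaded_vec a , const goaded_vec & b )
  {
    for ( std::size_t i = 0 ; i < N ; i++ )
      a [ i ] += b [ i ] ;
    return a ;
  }
} ;

// adds two 8-float vectors and returns one lane of the sum
inline float sum_lane ( std::size_t lane )
{
  goaded_vec < float , 8 > a , b ;
  for ( std::size_t i = 0 ; i < 8 ; i++ )
  {
    a [ i ] = float ( i ) ;   // 0, 1, 2, ...
    b [ i ] = 10.0f ;
  }
  goaded_vec < float , 8 > c = a + b ;
  return c [ lane ] ;
}
```

Whether the loop in operator+ ends up as SIMD instructions depends on the compiler, the optimization level and the target flags (for example -O3 together with -mavx2 on clang++ or g++); the source code only makes vectorization easy, it does not force it.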

class template simd_type is designed to provide an interface similar to Vc::SimdArray, to be able to use it as a drop-in replacement. It aims to provide those SIMD capabilities which are actually used by vspline and is not a complete replacement for Vc::SimdArray.

Wherever possible, the code is as simple as possible, avoiding frills and trickery which might keep the compiler from recognizing potentially auto-vectorizable constructs. The resulting code is - in my limited experience - often not too far from explicit SIMD code. Some constructs do actually produce binary which is on par with code using Vc, namely code which does not use gather, scatter or masked operations. So b-spline prefiltering, restoration of original data, and general filtering are very fast, while code involving b-spline evaluation shows a speed penalty, since vectorized b-spline evaluation (as coded in vspline) relies massively on gather operations of a kind which seem not to be auto-vectorized into binary gather commands - this is my guess, I have not investigated the binary closely.

The code presented here adds some memory access functions which are not present in Vc::SimdArray, namely strided load/store operations and load/store using functors.

Note that I use clang++ most of the time, and the code has evolved to produce fast binary with clang++. Your mileage will vary with other compilers.

Class vspline::simd_type is actually quite similar to vigra::TinyVector which also stores in a plain C array and provides arithmetic. But that type is quite complex, using CRTP with a base class, explicitly coding loop unrolling, catering for deficient compilers and using vigra's sophisticated type promotion mechanism. vspline::simd_type on the other hand is stripped down to the bare essentials, to make the code as simple as possible, in the hope that 'goading' will indeed work. It replaces vspline's previous SIMD type, vspline::simd_tv, which was derived from vigra::TinyVector.

One word of warning: the lack of type promotion requires you to pick a value_type of sufficient precision and capacity for the intended operation. In other words: you won't get an int when multiplying two shorts.
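The following sketch illustrates the warning. The names narrow_vec, multiply and first_lane_of_product are hypothetical and exist only for this example; the real type has a much richer interface. The result of each lane is stored back into the value_type, so a product which does not fit is narrowed.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for illustration only, not vspline::simd_type.
template < typename T , std::size_t N >
struct narrow_vec
{
  T data [ N ] ;
} ;

// element-wise product; the result has the same value_type as the operands
template < typename T , std::size_t N >
narrow_vec < T , N > multiply ( const narrow_vec < T , N > & a ,
                                const narrow_vec < T , N > & b )
{
  narrow_vec < T , N > result ;
  for ( std::size_t i = 0 ; i < N ; i++ )
    result.data [ i ] = T ( a.data [ i ] * b.data [ i ] ) ;  // narrowed to T
  return result ;
}

// fills two vectors with x and y and returns lane 0 of their product
template < typename T >
long first_lane_of_product ( T x , T y )
{
  narrow_vec < T , 4 > a , b ;
  for ( std::size_t i = 0 ; i < 4 ; i++ )
  {
    a.data [ i ] = x ;
    b.data [ i ] = y ;
  }
  return long ( multiply ( a , b ) . data [ 0 ] ) ;
}
```

With std::int32_t as value_type, 300 * 300 yields 90000 in every lane. With std::int16_t the same product cannot be represented, so the stored value differs from 90000; picking a wider value_type up front is the remedy.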

Note also that this type is intended for horizontal vectorization, and you'll get the best results when picking a vector size which is a small-ish power of two - preferably at least the number of values of the given value_type which a register of the intended vector ISA will contain.

vspline uses TinyVectors of SIMD data types, but their operations are coded with loops over the TinyVector's elements throughout vspline's code base. In vspline's opt directory, you can find 'xel_of_vector.h', which can provide overloads for all operator functions involving TinyVectors of vspline::simd_type - or, more generally, small aggregates of vector data. Please see this header's comments for more detailed information.

Note also that throughout vspline, there is almost no explicit use of vspline::simd_type. vspline picks appropriate SIMD data types with mechanisms 'one level up', coded in vector.h. vector.h checks if use of Vc is possible and whether Vc can vectorize a given type, and produces a 'simdized type', which you mustn't confuse with a simd_type.

To use this header, an implementation of std::simd has to be installed, and the -std=c++17 option is needed as well. It has been tried with clang++ and g++; you'll need a recent version.

Definition in file simd_type.h.

Macro Definition Documentation

◆ BOOL_ONLY

#define BOOL_ONLY
Value:
static_assert ( std::is_same < value_type , bool > :: value , \
"this operation is only allowed for booleans" ) ;

Definition at line 544 of file simd_type.h.

◆ BROADCAST_STD_FUNC

#define BROADCAST_STD_FUNC (   FUNC)
Value:
friend simd_type FUNC ( simd_type arg ) \
{ \
simd_type result ; \
for ( size_type i = 0 ; i < vsize ; i++ ) \
result [ i ] = std::FUNC ( arg [ i ] ) ; \
return result ; \
}

Definition at line 470 of file simd_type.h.
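As a sketch of how this macro is meant to be used, the code below repeats the macro body inside a stripped-down class template. The namespace sketch, the minimal members of this simd_type and the helper demo_sqrt are assumptions made for the example; only the macro body is taken from the listing above.

```cpp
#include <cmath>
#include <cstddef>

namespace sketch   // hypothetical namespace, not part of vspline
{
// macro body as in the listing above
#define BROADCAST_STD_FUNC(FUNC) \
  friend simd_type FUNC ( simd_type arg ) \
  { \
    simd_type result ; \
    for ( size_type i = 0 ; i < vsize ; i++ ) \
      result [ i ] = std::FUNC ( arg [ i ] ) ; \
    return result ; \
  }

// stripped-down stand-in providing just what the macro refers to
template < typename T , std::size_t N >
struct simd_type
{
  typedef std::size_t size_type ;
  static const size_type vsize = N ;
  T data [ N ] ;

  T & operator[] ( size_type i ) { return data [ i ] ; }
  const T & operator[] ( size_type i ) const { return data [ i ] ; }

  BROADCAST_STD_FUNC(sqrt)   // defines friend simd_type sqrt ( simd_type )
  BROADCAST_STD_FUNC(abs)    // defines friend simd_type abs ( simd_type )
} ;

#undef BROADCAST_STD_FUNC

// fills a vector with x and returns lane 0 of its element-wise square root
inline float demo_sqrt ( float x )
{
  simd_type < float , 4 > v ;
  for ( std::size_t i = 0 ; i < 4 ; i++ )
    v [ i ] = x ;
  return sqrt ( v ) [ 0 ] ;  // the friend is found by argument-dependent lookup
}
}
```

Because the generated functions are friends defined inside the class, calling code writes sqrt ( v ) without qualification, much as it would for a fundamental.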

◆ BROADCAST_STD_FUNC2

#define BROADCAST_STD_FUNC2 (   FUNC)
Value:
friend simd_type FUNC ( simd_type arg1 , \
simd_type arg2 ) \
{ \
simd_type result ; \
for ( size_type i = 0 ; i < vsize ; i++ ) \
result [ i ] = std::FUNC ( arg1 [ i ] , arg2 [ i ] ) ; \
return result ; \
}

Definition at line 497 of file simd_type.h.

◆ BROADCAST_STD_FUNC3

#define BROADCAST_STD_FUNC3 (   FUNC)
Value:
friend simd_type FUNC ( simd_type arg1 , \
simd_type arg2 , \
simd_type arg3 ) \
{ \
simd_type result ; \
for ( size_type i = 0 ; i < vsize ; i++ ) \
result [ i ] = FUNC ( arg1 [ i ] , arg2 [ i ] , arg3[i] ) ; \
return result ; \
}

Definition at line 515 of file simd_type.h.

◆ BUILD_FROM_CONTAINER

#define BUILD_FROM_CONTAINER (   SIZE_TYPE,
  VSZ 
)
Value:
template < typename U , template < typename , SIZE_TYPE > class V > \
simd_type & operator= ( const V < U , VSZ > & rhs ) \
{ \
for ( size_type i = 0 ; i < vsize ; i++ ) \
(*this) [ i ] = rhs [ i ] ; \
return *this ; \
} \
template < typename U , template < typename , SIZE_TYPE > class V > \
simd_type ( const V < U , VSZ > & ini ) \
{ \
*this = ini ; \
}

Definition at line 247 of file simd_type.h.

◆ C_PROMOTE

#define C_PROMOTE (   A,
  B 
)
Value:
typename std::conditional \
< std::is_same < A , B > :: value , \
A , \
decltype ( std::declval < A > () \
+ std::declval < B > () ) \
> :: type

Definition at line 591 of file simd_type.h.
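The effect of this macro can be checked at compile time. In the sketch below, the alias promote_t is a hypothetical helper introduced only to make the macro easy to test; the macro body follows the listing above. If both types are equal, the type is kept as it is (so two shorts stay short, in line with the lack of type promotion); otherwise the type of A + B under the usual C++ arithmetic conversions is used.

```cpp
#include <type_traits>
#include <utility>

// macro body as in the listing above
#define C_PROMOTE(A,B) \
typename std::conditional \
  < std::is_same < A , B > :: value , \
    A , \
    decltype ( std::declval < A > () \
               + std::declval < B > () ) \
  > :: type

// hypothetical helper alias, only for this illustration
template < typename A , typename B >
using promote_t = C_PROMOTE ( A , B ) ;

// equal types are passed through unchanged, even where C++ would promote
static_assert ( std::is_same < promote_t < short , short > , short > :: value ,
                "short and short stay short" ) ;

// differing types follow the usual arithmetic conversions
static_assert ( std::is_same < promote_t < short , int > , int > :: value ,
                "short and int give int" ) ;
static_assert ( std::is_same < promote_t < int , float > , float > :: value ,
                "int and float give float" ) ;
static_assert ( std::is_same < promote_t < float , double > , double > :: value ,
                "float and double give double" ) ;
```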

◆ CLAMP

#define CLAMP (   FNAME,
  REL 
)
Value:
simd_type FNAME ( simd_type threshold ) const \
{ \
simd_type result ( threshold ) ; \
for ( std::size_t i = 0 ; i < vsize ; i++ ) \
{ \
if ( (*this) [ i ] REL threshold [ i ] ) \
result [ i ] = (*this) [ i ] ; \
} \
return result ; \
}

Definition at line 873 of file simd_type.h.
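A sketch of what the generated functions do, under two assumptions which this page does not confirm: that the macro is instantiated with names like at_least and at_most and the relations > and <, and that the comparison is meant per lane. The namespace sketch, the stripped-down simd_type and the helper clamp_lane are hypothetical.

```cpp
#include <cstddef>

namespace sketch   // hypothetical namespace, not part of vspline
{
template < typename T , std::size_t N >
struct simd_type
{
  static const std::size_t vsize = N ;
  T data [ N ] ;

  T & operator[] ( std::size_t i ) { return data [ i ] ; }
  const T & operator[] ( std::size_t i ) const { return data [ i ] ; }

// per-lane reading of the macro: start from the threshold and keep
// the original value wherever it satisfies the relation
#define CLAMP(FNAME,REL) \
  simd_type FNAME ( simd_type threshold ) const \
  { \
    simd_type result ( threshold ) ; \
    for ( std::size_t i = 0 ; i < vsize ; i++ ) \
    { \
      if ( (*this) [ i ] REL threshold [ i ] ) \
        result [ i ] = (*this) [ i ] ; \
    } \
    return result ; \
  }

  CLAMP(at_least,>)   // assumed instantiation: per-lane maximum
  CLAMP(at_most,<)    // assumed instantiation: per-lane minimum

#undef CLAMP
} ;

// clamps value to the range [lo, hi] in all lanes and returns lane 0
inline int clamp_lane ( int value , int lo , int hi )
{
  simd_type < int , 4 > v , vlo , vhi ;
  for ( std::size_t i = 0 ; i < 4 ; i++ )
  {
    v [ i ] = value ;
    vlo [ i ] = lo ;
    vhi [ i ] = hi ;
  }
  return v . at_least ( vlo ) . at_most ( vhi ) [ 0 ] ;
}
}
```

Chaining the two calls clamps each lane to a closed interval without any explicit mask handling in the calling code.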

◆ COMPARE_FUNC

#define COMPARE_FUNC (   OPFUNC,
  OP 
)

Definition at line 707 of file simd_type.h.

◆ INTEGRAL_ONLY

#define INTEGRAL_ONLY
Value:
static_assert ( std::is_integral < value_type > :: value , \
"this operation is only allowed for integral types" ) ;

Definition at line 540 of file simd_type.h.

◆ OP_FUNC [1/2]

#define OP_FUNC (   OPFUNC,
  OP,
  CONSTRAINT 
)

Definition at line 663 of file simd_type.h.

◆ OP_FUNC [2/2]

#define OP_FUNC (   OPFUNC,
  OP,
  CONSTRAINT 
)
Value:
simd_type OPFUNC() const \
{ \
simd_type help ; \
for ( size_type i = 0 ; i < vsize ; i++ ) \
help [ i ] = OP (*this) [ i ] ; \
return help ; \
}

Definition at line 663 of file simd_type.h.

◆ OPEQ_FUNC [1/2]

#define OPEQ_FUNC (   OPFUNC,
  OPEQ,
  CONSTRAINT 
)
Value:
simd_type & OPFUNC ( value_type rhs ) \
{ \
CONSTRAINT \
for ( size_type i = 0 ; i < vsize ; i++ ) \
(*this) [ i ] OPEQ rhs ; \
return *this ; \
} \
simd_type & OPFUNC ( simd_type rhs ) \
{ \
CONSTRAINT \
for ( size_type i = 0 ; i < vsize ; i++ ) \
(*this) [ i ] OPEQ rhs [ i ] ; \
return *this ; \
}

Definition at line 786 of file simd_type.h.
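The CONSTRAINT argument is where INTEGRAL_ONLY and BOOL_ONLY come in. The sketch below repeats both macro bodies inside a stripped-down class template; the namespace sketch, the minimal members and the helper demo_lane are assumptions for this example, and the two instantiations shown are merely plausible ones. Since the static_assert sits inside a member function of a class template, it only fires when that operator is actually used with an unsuitable value_type.

```cpp
#include <cstddef>
#include <type_traits>

namespace sketch   // hypothetical namespace, not part of vspline
{
#define INTEGRAL_ONLY \
  static_assert ( std::is_integral < value_type > :: value , \
                  "this operation is only allowed for integral types" ) ;

#define OPEQ_FUNC(OPFUNC,OPEQ,CONSTRAINT) \
  simd_type & OPFUNC ( value_type rhs ) \
  { \
    CONSTRAINT \
    for ( size_type i = 0 ; i < vsize ; i++ ) \
      (*this) [ i ] OPEQ rhs ; \
    return *this ; \
  } \
  simd_type & OPFUNC ( simd_type rhs ) \
  { \
    CONSTRAINT \
    for ( size_type i = 0 ; i < vsize ; i++ ) \
      (*this) [ i ] OPEQ rhs [ i ] ; \
    return *this ; \
  }

// stripped-down stand-in providing just what the macros refer to
template < typename T , std::size_t N >
struct simd_type
{
  typedef T value_type ;
  typedef std::size_t size_type ;
  static const size_type vsize = N ;
  T data [ N ] ;

  T & operator[] ( size_type i ) { return data [ i ] ; }
  const T & operator[] ( size_type i ) const { return data [ i ] ; }

  OPEQ_FUNC(operator+=,+=,)               // empty constraint: any value_type
  OPEQ_FUNC(operator%=,%=,INTEGRAL_ONLY)  // integral value_type only
} ;

#undef OPEQ_FUNC
#undef INTEGRAL_ONLY

// starts from 7, 8, 9, 10, applies %= 4 and += 1, returns one lane
inline int demo_lane ( std::size_t lane )
{
  simd_type < int , 4 > v ;
  for ( std::size_t i = 0 ; i < 4 ; i++ )
    v [ i ] = int ( i ) + 7 ;
  v %= 4 ;   // lanes are now 3, 0, 1, 2
  v += 1 ;   // lanes are now 4, 1, 2, 3
  return v [ lane ] ;
}
}
```

A simd_type < float , 4 > built from this sketch would still offer operator+=, while any attempt to call operator%= on it would stop compilation with the message given in INTEGRAL_ONLY.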

◆ OPEQ_FUNC [2/2]

#define OPEQ_FUNC (   OPFUNC,
  OPEQ,
  CONSTRAINT 
)
Value:
simd_type & OPFUNC ( value_type rhs ) \
{ \
CONSTRAINT \
for ( size_type i = 0 ; i < vsize ; i++ ) \
{ \
if ( whether [ i ] ) \
whither [ i ] OPEQ rhs ; \
} \
return whither ; \
} \
simd_type & OPFUNC ( simd_type rhs ) \
{ \
CONSTRAINT \
for ( size_type i = 0 ; i < vsize ; i++ ) \
{ \
if ( whether [ i ] ) \
whither [ i ] OPEQ rhs [ i ] ; \
} \
return whither ; \
}

Definition at line 786 of file simd_type.h.
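This second variant operates on a mask ('whether') and a target ('whither'), presumably inside masked_type. The sketch below shows the idea with one hand-written operator instead of the macro. The types vec and masked, the constructor and the helper masked_add_lane are hypothetical; how vspline actually creates a masked_type is not shown on this page.

```cpp
#include <cstddef>

namespace sketch   // hypothetical namespace, not part of vspline
{
// minimal fixed-size vector, stand-in for the real SIMD type
template < typename T , std::size_t N >
struct vec
{
  T data [ N ] ;

  T & operator[] ( std::size_t i ) { return data [ i ] ; }
  const T & operator[] ( std::size_t i ) const { return data [ i ] ; }
} ;

// pairs a mask with a reference to the vector to be modified
template < typename T , std::size_t N >
struct masked
{
  vec < bool , N > whether ;   // which lanes to modify
  vec < T , N > & whither ;    // where the result goes

  masked ( const vec < bool , N > & mask , vec < T , N > & target )
  : whether ( mask ) , whither ( target )
  { }

  // same shape as the macro-generated code: modify only lanes where
  // the mask is true, then return the target
  vec < T , N > & operator+= ( T rhs )
  {
    for ( std::size_t i = 0 ; i < N ; i++ )
    {
      if ( whether [ i ] )
        whither [ i ] += rhs ;
    }
    return whither ;
  }
} ;

// starts from 1, 2, 3, 4, adds 10 where the value exceeds 2
inline int masked_add_lane ( std::size_t lane )
{
  vec < int , 4 > v ;
  vec < bool , 4 > mask ;
  for ( std::size_t i = 0 ; i < 4 ; i++ )
  {
    v [ i ] = int ( i ) + 1 ;
    mask [ i ] = v [ i ] > 2 ;
  }
  masked < int , 4 > ( mask , v ) += 10 ;   // lanes are now 1, 2, 13, 14
  return v [ lane ] ;
}
}
```

The branch inside the loop is the kind of construct the detailed description singles out as harder for autovectorization than plain element-wise arithmetic.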