/************************************************************************/
/*                                                                      */
/*    vspline - a set of generic tools for creation and evaluation      */
/*              of uniform b-splines                                    */
/*                                                                      */
/*            Copyright 2015 - 2018 by Kay F. Jahnke                    */
/*                                                                      */
/*    The git repository for this software is at                        */
/*                                                                      */
/*    https://bitbucket.org/kfj/vspline                                 */
/*                                                                      */
/*    Please direct questions, bug reports, and contributions to        */
/*                                                                      */
/*    kfjahnke+vspline@gmail.com                                        */
/*                                                                      */
/*    Permission is hereby granted, free of charge, to any person       */
/*    obtaining a copy of this software and associated documentation    */
/*    files (the "Software"), to deal in the Software without           */
/*    restriction, including without limitation the rights to use,      */
/*    copy, modify, merge, publish, distribute, sublicense, and/or      */
/*    sell copies of the Software, and to permit persons to whom the    */
/*    Software is furnished to do so, subject to the following          */
/*    conditions:                                                       */
/*                                                                      */
/*    The above copyright notice and this permission notice shall be    */
/*    included in all copies or substantial portions of the             */
/*    Software.                                                         */
/*                                                                      */
/*    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND    */
/*    EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES   */
/*    OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND          */
/*    NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT       */
/*    HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,      */
/*    WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING      */
/*    FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR     */
/*    OTHER DEALINGS IN THE SOFTWARE.                                   */
/*                                                                      */
/************************************************************************/

/*! \file unary_functor.h

    \brief interface definition for unary functors

    vspline's evaluation and remapping code relies on a unary functor template
    which is used as the base for vspline::evaluator and also constitutes the
    type of object accepted by most of the functions in transform.h.

    This template produces functors which are meant to yield a single output
    for a single input, where both the input and output types may be single
    types or vigra::TinyVectors, and their elementary types may be vectorized.
    The functors are expected to provide methods named eval() which are capable
    of performing the required functionality. These eval routines take both
    their input and output by reference - the input is taken by const &, and the
    output as plain &. The result type of the eval routines is void. While
    such unary functors can be hand-coded, the class template 'unary_functor'
    provides services to create such functors in a uniform way, with a specific
    system of associated types and some convenience code. Using unary_functor
    is meant to facilitate the creation of the unary functors used in vspline.
    
    Using unary_functor generates objects which can be easily combined into
    more complex unary functors; a typical use would be to 'chain' two
    unary_functors, see class template 'chain_type' below, which also provides
    an example for the use of unary_functor.
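    
    The eval convention and the idea of chaining can be tried out without
    vspline. What follows is only a minimal sketch: 'twice', 'to_int',
    'toy_chain' and 'twice_then_to_int' are hypothetical stand-ins which are
    not part of vspline, and vectorization is left out altogether.

```cpp
// hypothetical functor: doubles its input. Input is taken by const &,
// output is written to a plain &, the return type is void.

struct twice
{
  typedef float in_type ;
  typedef float out_type ;

  void eval ( const in_type & in , out_type & out ) const
  {
    out = in + in ;
  }
} ;

// hypothetical functor: truncates a float to an int

struct to_int
{
  typedef float in_type ;
  typedef int out_type ;

  void eval ( const in_type & in , out_type & out ) const
  {
    out = static_cast < int > ( in ) ;
  }
} ;

// toy version of a chain: the first functor's result is fed to the second

template < typename T1 , typename T2 >
struct toy_chain
{
  typedef typename T1::in_type in_type ;
  typedef typename T2::out_type out_type ;

  const T1 t1 ;
  const T2 t2 ;

  toy_chain ( const T1 & _t1 , const T2 & _t2 )
  : t1 ( _t1 ) , t2 ( _t2 )
  { }

  void eval ( const in_type & in , out_type & out ) const
  {
    typename T1::out_type intermediate ;
    t1.eval ( in , intermediate ) ;  // first functor writes intermediate
    t2.eval ( intermediate , out ) ; // second functor consumes it
  }
} ;

// helper for demonstration: doubles x, then truncates the result

inline int twice_then_to_int ( float x )
{
  twice a ;
  to_int b ;
  toy_chain < twice , to_int > c ( a , b ) ;
  int result ;
  c.eval ( x , result ) ;
  return result ;
}
```

    The sketch only works if the first functor's out_type can serve as the
    second functor's in_type; class template chain_type below checks this
    with a static_assert.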
    
    class unary_functor takes three template arguments:
    
    - the argument type, IN
    
    - the result type, OUT
    
    - the number of fundamentals (float, int etc.) in a vector, _vsize
    
    The vectorized argument and result type are deduced from IN, OUT and
    _vsize by querying vspline::vector_traits. When using Vc (-DUSE_VC),
    these types will be Vc::SimdArrays if the elementary type can be used
    to form a SimdArray. Otherwise vspline provides a fallback type emulating
    vectorization: vspline::simd_tv. This fallback type emulates just enough
    of SimdArray's capabilities to function as a replacement inside vspline's
    body of code.
    
    So where is eval() or operator()? Not in class unary_functor. The actual
    functionality is provided by the derived class. There is deliberately no
    code concerning evaluation in class unary_functor. My initial implementation
    had pure virtual functions to define the interface for evaluation, but this
    required explicitly providing the overloads in the derived class. Simply
    omitting any reference to evaluation allows the derived class to accomplish
    evaluation with a template if the code is syntactically the same for vectorized
    and unvectorized operation. To users of concrete functors inheriting from
    unary_functor this makes no difference. The only drawback is that it's not
    possible to perform evaluation via a base class pointer or reference. But
    this is best avoided anyway because it degrades performance. If the need arises
    to have several unary_functors with the same template signature share a common
    type, there's a mechanism to make the internals opaque by 'grokking'.
    Grokking provides a wrapper around a unary_functor which hides its type;
    vspline::grok_type directly inherits from unary_functor and the only template
    arguments are IN, OUT and _vsize. This hurts performance a little - just as
    calling via a base class pointer/reference would, but the code is outside
    class unary_functor and therefore only activated when needed.
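    
    The principle behind grokking can be sketched with std::function alone.
    This is a simplified illustration under stated assumptions, not the
    actual grok_type: 'toy_grok', 'add_one', 'flip_sign' and 'apply_all' are
    hypothetical names, and only unvectorized evaluation is covered.

```cpp
#include <functional>
#include <vector>

// toy type erasure: whatever the grokkee's type, the wrapper's type only
// depends on IN and OUT

template < typename IN , typename OUT >
struct toy_grok
{
  std::function < void ( const IN & , OUT & ) > _ev ;

  // capture a copy of the grokkee in a lambda and store the lambda

  template < class grokkee_type >
  toy_grok ( grokkee_type grokkee )
  : _ev ( [ grokkee ] ( const IN & in , OUT & out )
          { grokkee.eval ( in , out ) ; } )
  { }

  void eval ( const IN & in , OUT & out ) const
  {
    _ev ( in , out ) ;
  }
} ;

// two differently typed hypothetical functors

struct add_one
{
  void eval ( const float & in , float & out ) const { out = in + 1.0f ; }
} ;

struct flip_sign
{
  void eval ( const float & in , float & out ) const { out = - in ; }
} ;

// both functors are held in one container via the common wrapper type
// and applied in sequence

inline float apply_all ( float x )
{
  std::vector < toy_grok < float , float > > v ;
  v.push_back ( toy_grok < float , float > ( add_one() ) ) ;
  v.push_back ( toy_grok < float , float > ( flip_sign() ) ) ;

  for ( const auto & f : v )
  {
    float y ;
    f.eval ( x , y ) ;
    x = y ;
  }
  return x ;
}
```

    Every call goes through a std::function, which is where the small
    performance penalty mentioned above comes from.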
    
    Class vspline::evaluator is itself coded as a vspline::unary_functor and can
    serve as another example for the use of the code in this file.
    
    Before the introduction of vspline::simd_tv, vectorization was done with
    Vc or not at all. Now vspline::vector_traits will produce Vc types if
    possible and vspline::simd_tv otherwise. This breaks code relying on
    the fallback to scalar without Vc, and it also breaks code that assumes that
    Vc is the sole method of vectorization.
    
    Extant code written for use with Vc should function as before as long as
    USE_VC is defined. It may be possible now to use such code even without Vc.
    This depends on how much of Vc::SimdArray's functionality is used. If such
    code runs without Vc, it may still not perform well and possibly even worse
    than scalar code.
*/

#ifndef VSPLINE_UNARY_FUNCTOR_H
#define VSPLINE_UNARY_FUNCTOR_H

#include <functional>
#include "common.h"

namespace vspline {

/// we derive all vspline::unary_functors from this empty class, to have
/// a common base type for all of them. This enables us to easily check if
/// a type is a vspline::unary_functor without having to wrangle with
/// unary_functor's template arguments.
 
template < size_t _vsize >
struct unary_functor_tag { } ;

/// class unary_functor provides a functor object which offers a system
/// of types for concrete unary functors derived from it. If vectorization
/// isn't used, this is trivial, but with vectorization in use, we get
/// vectorized types derived from plain IN and OUT via query of
/// vspline::vector_traits.
///
/// class unary_functor itself does not provide operator(), this is left to
/// the concrete functors inheriting from unary_functor. It is expected
/// that the derived classes provide evaluation capability, either as a
/// method template or as (overloaded) method(s) 'eval'. eval is to be coded
/// as taking its first argument as a const&, and writing its result to
/// its second argument, which it receives by reference. eval's return type
/// is void. Inside vspline, classes derived from unary_functor do provide
/// operator(), so instances of these objects can be called with function
/// call syntax as well.
///
/// Why not lay down an interface with a pure virtual function eval()
/// which derived classes would need to override? Suppose you had, in
/// unary_functor,
///
/// virtual void eval ( const in_type & in , out_type & out ) = 0 ;
///
/// Then, in a derived class, you'd have to provide an override with this
/// signature. Initially, this seems reasonable enough, but if you want to
/// implement eval() as a member function template in the derived class, you
/// still would have to provide the override (calling an instantiated version
/// of your template), because your template won't be recognized as a viable
/// way to override the pure virtual base class member function. Since
/// providing eval as a template is common (oftentimes vectorized and
/// unvectorized code are the same) I've decided against having virtual eval
/// routines, to avoid the need for explicitly overriding them in derived
/// classes which provide eval() as a template.
///
/// How about providing operator() in unary_functor? We might add the derived
/// class to the template argument list and use unary_functor with CRTP. I've
/// decided against this and instead provide callability as a mixin to be
/// used as needed. This keeps the complexity of unary_functor-derived objects
/// low, adding the extra capability only where it's deemed appropriate. For
/// the mixin, see class 'callable' further down.
///
/// With no virtual member functions, class unary_functor becomes very simple,
/// which is desirable from a design standpoint, and also makes unary_functors
/// smaller, avoiding the creation of the virtual function table.
///
/// The type system used in unary_functor is taken from vspline::vector_traits,
/// additionally prefixing the types with in_ and out_, for input and output
/// types. The other elements of the type names are the same as in
/// vector_traits.

template < typename IN ,    // argument or input type
           typename OUT ,   // result type
           size_t _vsize = vspline::vector_traits < IN > :: size
         >
struct unary_functor
: public unary_functor_tag < _vsize >
{
  // number of fundamentals in simdized data. If vsize is 1, the vectorized
  // types will 'collapse' to the unvectorized types.

  enum { vsize = _vsize } ;

  // number of dimensions. This may well be different for IN and OUT.

  enum { dim_in = vspline::vector_traits < IN > :: dimension } ;
  enum { dim_out = vspline::vector_traits < OUT > :: dimension } ;
  
  // typedefs for incoming (argument) and outgoing (result) type. These two types
  // are non-vectorized types, like vigra::TinyVector < float , 2 >. Since such types
  // consist of elements of the same type, the corresponding vectorized type can be
  // easily automatically determined.
  
  typedef IN in_type ;
  typedef OUT out_type ;
  
  // elementary types of same. we rely on vspline::vector_traits to provide
  // these types.
  
  typedef typename vspline::vector_traits < IN > :: ele_type in_ele_type ;
  typedef typename vspline::vector_traits < OUT > :: ele_type out_ele_type ;
  
  // 'synthetic' types for input and output. These are always TinyVectors,
  // possibly of only one element, of the elementary type of in_type/out_type.
  // On top of providing a uniform container type (the TinyVector) the
  // synthetic type is also 'unaware' of any specific meaning the 'true'
  // input/output type may have, and arithmetic operations on the synthetic
  // types won't clash with arithmetic defined for the 'true' types.

  typedef vigra::TinyVector < in_ele_type , dim_in > in_nd_ele_type ;
  typedef vigra::TinyVector < out_ele_type , dim_out > out_nd_ele_type ;
  
  // for vectorized operation, we need a few extra typedefs. I use a _v
  // suffix instead of the _type suffix above to indicate vectorized types.
  // If vsize is 1, the _v types simply collapse to the unvectorized
  // types. Having them does no harm, but it's not safe to assume that,
  // for example, in_v and in_type are in fact different types.

  /// simdized types of the elementary types of in_type and out_type;
  /// the latter is used for coefficients and results. These are fixed via
  /// the traits class vector_traits (in vector.h). Note how we derive
  /// these types using vsize from the template argument, not what
  /// vspline::vector_traits deems appropriate for ele_type - though
  /// both numbers will be the same in most cases.
  
  typedef typename vector_traits < IN , vsize > :: ele_v in_ele_v ;
  typedef typename vector_traits < OUT , vsize > :: ele_v out_ele_v ;
  
  // 'synthetic' types for simdized input and output. These are always
  // TinyVectors, possibly of only one element, of the simdized input
  // and output type.

  typedef typename vector_traits < IN , vsize > :: nd_ele_v in_nd_ele_v ;
  typedef typename vector_traits < OUT , vsize > :: nd_ele_v out_nd_ele_v ;
  
  /// vectorized in_type and out_type. vspline::vector_traits supplies these
  /// types so that multidimensional/multichannel data come as vigra::TinyVectors,
  /// while 'singular' data won't be made into TinyVectors of one element.
  
  typedef typename vector_traits < IN , vsize > :: type in_v ;
  typedef typename vector_traits < OUT , vsize > :: type out_v ;

  /// vsize wide vector of ints, used for gather/scatter indexes
  
  typedef typename vector_traits < int , vsize > :: ele_v ic_v ;

} ;

// KFJ 2018-07-14
// To deal with an issue with cppyy, which has trouble processing
// templated operator() into an overloaded callable, I introduce this
// mixin, which specifically provides two distinct operator() overloads.
// This is also a better way to introduce the callable quality, since on
// the side of the derived class it requires only inheriting from the
// mixin, rather than the verbose templated operator() I used previously.
// This is still experimental.

/// mixin 'callable' is used with CRTP: it serves as additional base to
/// unary functors which are meant to provide operator() and takes the
/// derived class as its first template argument, followed by the
/// argument types and vectorization width, so that the parameter and
/// return type for operator() and - if vsize is greater than 1 - its
/// vectorized overload can be produced.
/// This formulation has the advantage of not having to rely on the
/// 'out_type_of' mechanism I was using before and provides precisely
/// the operator() overload(s) which are appropriate.

template < class derived ,
           typename IN , typename OUT , size_t vsize >
struct callable
{
  // using a cl_ prefix here for the vectorized types to avoid a name
  // clash with class unary_functor, which also defines in_v, out_v
  
  typedef typename vector_traits < IN , vsize > :: type cl_in_v ;
  typedef typename vector_traits < OUT , vsize > :: type cl_out_v ;
  
  OUT operator() ( const IN & in )
  {
    auto self = static_cast < derived * > ( this ) ;
    OUT out ;
    self->eval ( in , out ) ;
    return out ;
  }
  
  template < typename = std::enable_if < ( vsize > 1 ) > >
  cl_out_v operator() ( const cl_in_v & in )
  {
    auto self = static_cast < derived * > ( this ) ;
    cl_out_v out ;
    self->eval ( in , out ) ;
    return out ;
  }
} ;
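
/* The mixin idea can be tried out in isolation. The following is a
   minimal sketch, not the code used above: 'toy_callable', 'square' and
   'square_of' are hypothetical names, only the unvectorized overload is
   shown, and - unlike in 'callable' - operator() is declared const here.

```cpp
// CRTP mixin: the base knows the derived class' type via a template
// argument and can therefore call the derived class' eval without any
// virtual function

template < class derived , typename IN , typename OUT >
struct toy_callable
{
  OUT operator() ( const IN & in ) const
  {
    auto self = static_cast < const derived * > ( this ) ;
    OUT out ;
    self->eval ( in , out ) ;
    return out ;
  }
} ;

// the derived class only provides eval; callability is inherited

struct square
: public toy_callable < square , double , double >
{
  void eval ( const double & in , double & out ) const
  {
    out = in * in ;
  }
} ;

// helper for demonstration: uses function call syntax on the functor

inline double square_of ( double x )
{
  square sq ;
  return sq ( x ) ;
}
```

   The derived class passes itself as the first template argument, which
   is what the inheritance lists of chain_type and grok_type do as well.
*/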

/// class chain_type is a helper class to pass one unary functor's result
/// as argument to another one. We rely on T1 and T2 to provide a few of the
/// standard types used in unary functors. Typically, T1 and T2 will both be
/// vspline::unary_functors, but the type requirements could also be fulfilled
/// manually.
///
/// Note how callability is introduced via the mixin 'vspline::callable'.
/// The inheritance definition looks confusing; the template arg list reads as:
/// 'the derived class, followed by the arguments needed to determine the
/// call signature(s)'. See vspline::callable above.

template < typename T1 ,
           typename T2 >
struct chain_type
: public vspline::unary_functor < typename T1::in_type ,
                                  typename T2::out_type ,
                                  T1::vsize > ,
  public vspline::callable
         < chain_type < T1 , T2 > ,
           typename T1::in_type ,
           typename T2::out_type ,
           T1::vsize
         >
{
  // definition of vsize and base_type
  
  enum { vsize = T1::vsize } ;
  
  typedef vspline::unary_functor < typename T1::in_type ,
                                   typename T2::out_type ,
                                   vsize > base_type ;

  using typename base_type::in_type ;
  using typename base_type::out_type ;
  using typename base_type::in_v ;
  using typename base_type::out_v ;

  // we require both functors to share the same vectorization width
          
  static_assert ( T1::vsize == T2::vsize ,
                  "can only chain unary_functors with the same vector width" ) ;

  static_assert ( std::is_same < typename T1::out_type , typename T2::in_type > :: value ,
                  "chain: output of first functor must match input of second functor" ) ;

  typedef typename T1::out_type intermediate_type ;
  typedef typename T1::out_v intermediate_v ;
  
  // hold the two functors by value

  const T1 t1 ;
  const T2 t2 ;
  
  // the constructor initializes them

  chain_type ( const T1 & _t1 , const T2 & _t2 )
  : t1 ( _t1 ) ,
    t2 ( _t2 )
    { } ;

  // eval comes as two overloads, one for unvectorized and one for
  // vectorized arguments. The intermediate value is typed as T1's
  // out_type or out_v, respectively.

  void eval ( const in_type & argument ,
                    out_type & result ) const
  {
    intermediate_type intermediate ;
    t1.eval ( argument , intermediate ) ; // evaluate first functor into intermediate
    t2.eval ( intermediate , result ) ;   // feed it as input to second functor
  }

  template < typename = std::enable_if < ( vsize > 1 ) > >
  void eval ( const in_v & argument ,
                    out_v & result ) const
  {
    intermediate_v intermediate ;
    t1.eval ( argument , intermediate ) ; // evaluate first functor into intermediate
    t2.eval ( intermediate , result ) ;   // feed it as input to second functor
  }
  
} ;

/// chain is a factory function yielding the result of chaining
/// two unary_functors.

template < class T1 , class T2 >
vspline::chain_type < T1 , T2 >
chain ( const T1 & t1 , const T2 & t2 )
{
  return vspline::chain_type < T1 , T2 > ( t1 , t2 ) ;
}

/// using operator overloading, we can exploit operator+'s semantics
/// to chain several unary functors. We need to specifically enable
/// this for types derived from unary_functor_tag to avoid a catch-all
/// situation.

template < typename T1 ,
           typename T2 ,
           typename =
             typename std::enable_if
             <    std::is_base_of
                  < vspline::unary_functor_tag < T2::vsize > ,
                    T1
                  > :: value
               && std::is_base_of
                  < vspline::unary_functor_tag < T1::vsize > ,
                    T2
                  > :: value
             > :: type
         >
vspline::chain_type < T1 , T2 >
operator+ ( const T1 & t1 , const T2 & t2 )
{
  return vspline::chain ( t1 , t2 ) ;
}
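
/* One way to restrict an operator template to types sharing a common tag
   base can be sketched as follows. This is an illustration under stated
   assumptions rather than vspline code: 'toy_tag', 'tagged_a', 'tagged_b'
   and 'demo_plus' are hypothetical names, and the operator merely combines
   two ints instead of producing a chain.

```cpp
#include <type_traits>

// empty tag class serving as common base

struct toy_tag { } ;

struct tagged_a : public toy_tag { int v = 1 ; } ;
struct tagged_b : public toy_tag { int v = 2 ; } ;

// the nested ::type only exists if the condition holds, so for operand
// types not derived from toy_tag this template drops out of overload
// resolution instead of acting as a catch-all

template < typename T1 ,
           typename T2 ,
           typename =
             typename std::enable_if
             <    std::is_base_of < toy_tag , T1 > :: value
               && std::is_base_of < toy_tag , T2 > :: value
             > :: type
         >
int operator+ ( const T1 & t1 , const T2 & t2 )
{
  return t1.v * 10 + t2.v ;
}

// helper for demonstration: both operands are tagged, so the operator
// template above is selected

inline int demo_plus()
{
  return tagged_a() + tagged_b() ;
}
```

   Note that the condition only takes effect through the nested ::type;
   naming std::enable_if < ... > without it always yields a valid type.
*/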

// /// sometimes, vectorized code for a vspline::unary_functor is not at hand
// /// for some specific evaluation. class broadcast_type can broadcast unvectorized
// /// evaluation code, so that vectorized data can be processed with this code,
// /// albeit less efficiently.
// 
// template < class inner_type , size_t _vsize >
// struct broadcast_type
// : public vspline::unary_functor < typename inner_type::in_type ,
//                                   typename inner_type::out_type ,
//                                   _vsize >
// {
//   // definition of in_type, out_type, vsize and base_type
//   
//   typedef typename inner_type::in_type in_type ;
//   typedef typename inner_type::out_type out_type ;
//   typedef typename inner_type::in_ele_type in_ele_type ;
//   typedef typename inner_type::out_ele_type out_ele_type ;
//   enum { dim_in = inner_type::dim_in } ;
//   enum { dim_out = inner_type::dim_out } ;
//   enum { vsize = _vsize } ;
//   
//   typedef vspline::unary_functor
//           < in_type , out_type , vsize > base_type ;
//           
//   using typename base_type::in_v ;
//   using typename base_type::out_v ;
//   
//   const inner_type inner ;
//   
//   broadcast_type ( const inner_type & _inner )
//   : inner ( _inner )
//   { } ;
//   
//   /// single-value evaluation simply delegates to inner
// 
//   void eval ( const in_type & in , out_type & out ) const
//   {
//     inner.eval ( in , out ) ;
//   }
//   
//   // vector_traits for in_type and out_type
//   
//   typedef typename vspline::vector_traits < in_type , _vsize > iv_traits ;
//   typedef typename vspline::vector_traits < out_type , _vsize > ov_traits ;
//   
//   // now we implement the actual broadcast
// 
//   /// vectorized evaluation takes an 'in_v' as its argument and stores
//   /// its output to an 'out_v'. We specifically enable this routine
//   /// for cases where 'vsize', the vectorization width, is greater than
//   /// one. This is necessary, since with Vc enabled, vsize may still
//   /// be one - for example if the data type requires it - and not
//   /// using the enable_if statement would produce a second 'eval'
//   /// overload with identical signature to the one above.
//   
//   template < typename = std::enable_if < ( vsize > 1 ) > >
//   void eval ( const in_v & in ,
//                     out_v & out ) const
//   {
//     // we want TinyVectors even if there is only one channel.
//     // this way we can iterate over the channels.
// 
//     typedef typename iv_traits::nd_ele_v iv_type ;
//     typedef typename ov_traits::nd_ele_v ov_type ;
//     
//     // we reinterpret input and output as nD types. in_v/out_v are
//     // plain SIMD types if in_type/out_type is fundamental; here we
//     // want a TinyVector of one element in this case.
//     
//     const iv_type & iv ( reinterpret_cast < const iv_type & > ( in ) ) ;
//     ov_type & ov ( reinterpret_cast < ov_type & > ( out ) ) ;
//     
//     // now the broadcast:
//     
//     // we use the equivalent reinterpretation to un-simdized
//     // input/output data
//     
//     typename iv_traits::nd_ele_type i ;
//     typename ov_traits::nd_ele_type o ;
//     
//     in_type  & iref ( reinterpret_cast < in_type & >  ( i ) ) ;
//     out_type & oref ( reinterpret_cast < out_type & > ( o ) ) ;
//     
//     for ( int e = 0 ; e < _vsize ; e++ )
//     {
//       // extract the eth input value from the simdized input
//       
//       for ( int d = 0 ; d < iv_traits::dimension ; d++ )
//         i[d] = iv[d][e] ;
//       
//       // process it with eval, passing the eval-compatible references
//       
//       inner.eval ( iref , oref ) ;
//       
//       // now distribute eval's result to the SIMD output
//       
//       for ( int d = 0 ; d < ov_traits::dimension ; d++ )
//         ov[d][e] = o[d] ;
//     }
//   }
// 
// } ;
// 
// /// type of a std::function for unvectorized evaluation:
// 
// template < class IN , class OUT >
// using eval_type = std::function < void ( const IN & , OUT & ) > ;
// 
// /// helper class hold_type holds a single-element evaluation function
// 
// template < class IN , class OUT >
// struct hold_type
// : public vspline::unary_functor < IN , OUT , 1 >
// {
//   const eval_type < IN , OUT > eval ;
//   
//   hold_type ( eval_type < IN , OUT > _eval )
//   : eval ( _eval )
//   { } ;
// } ;
// 
// /// factory function to create a broadcast_type from another vspline::unary_functor
// /// This will pick the other functor's unvectorized eval routine and broadcast it,
// /// The vectorized eval routine of the other functor (if present) is ignored.
// 
// template < class inner_type , size_t _vsize >
// broadcast_type < inner_type , _vsize > broadcast ( const inner_type & inner )
// {
//   return broadcast_type < inner_type , _vsize > ( inner ) ;
// }
// 
// /// factory function to create a broadcast_type from a std::function
// /// implementing the unvectorized evaluation.
// /// to broadcast a single-value evaluation function, we package it
// /// in a hold_type, which broadcast can handle.
// 
// template < class IN , class OUT , size_t _vsize >
// broadcast_type < hold_type < IN , OUT > , _vsize >
// broadcast ( eval_type < IN , OUT > _eval )
// {
//   return broadcast_type < hold_type < IN , OUT > , _vsize >
//     ( hold_type < IN , OUT > ( _eval ) ) ;
// }

/// eval_wrap is a helper function template, wrapping an 'ordinary'
/// function which returns some value given some input in a void function
/// taking input as const reference and writing output to a reference,
/// which is the signature used for evaluation in vspline::unary_functors.

template < class IN , class OUT >
std::function < void ( const IN& , OUT& ) >
eval_wrap ( std::function < OUT ( const IN& ) > f )
{
  return [f] ( const IN& in , OUT& out ) { out = f ( in ) ; } ;
}
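
/* A short usage sketch for eval_wrap. To keep the example free-standing,
   it repeats eval_wrap's definition outside namespace vspline; 'plus_ten'
   and 'demo_eval_wrap' are hypothetical names introduced for illustration.

```cpp
#include <functional>

// copy of eval_wrap as defined above, repeated so the sketch stands alone

template < class IN , class OUT >
std::function < void ( const IN& , OUT& ) >
eval_wrap ( std::function < OUT ( const IN& ) > f )
{
  return [f] ( const IN& in , OUT& out ) { out = f ( in ) ; } ;
}

inline int demo_eval_wrap ( int x )
{
  // an 'ordinary' function returning its result, held in a std::function
  // so that IN and OUT can be deduced by eval_wrap

  std::function < int ( const int & ) > plus_ten
    = [] ( const int & v ) { return v + 10 ; } ;

  // ev has the signature void ( const int & , int & )

  auto ev = eval_wrap ( plus_ten ) ;

  int result = 0 ;
  ev ( x , result ) ;
  return result ;
}
```

   Passing a lambda directly would not let the compiler deduce IN and OUT,
   which is why the sketch first stores it in a std::function.
*/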

/// class grok_type is a helper class wrapping a vspline::unary_functor
/// so that its type becomes opaque - a technique called 'type erasure',
/// here applied to vspline::unary_functors with their specific
/// capability of providing both vectorized and unvectorized operation
/// in one common object.
///
/// While 'grokking' a unary_functor may degrade performance slightly,
/// the resulting type is less complex, and when working on complex
/// constructs involving several unary_functors, it can be helpful to
/// wrap the whole bunch into a grok_type for some time to make compiler
/// messages more palatable. I even suspect that the resulting functor,
/// which simply delegates to two std::functions, may optimize better at
/// times than a more complex functor in the 'grokkee'.
///
/// Performance aside, 'grokking' a vspline::unary_functor produces a
/// simple, consistent type that can hold *any* unary_functor with the
/// given input and output type(s), so it allows holding and using a
/// variety of (intrinsically differently typed) functors at runtime
/// via a common handle which is a vspline::unary_functor itself and
/// can be passed to the transform-type routines. With unary_functors
/// being first-class, copyable objects, this also makes it possible
/// to pass around unary_functors between different TUs where user
/// code can provide new functors at will which can simply be used
/// without having to recompile to make their type known, at the cost
/// of a call through a std::function.
///
/// grok_type also provides a convenient way to introduce functors into
/// vspline. Since the functionality is implemented with std::functions,
/// we allow direct initialization of these std::functions on top of
/// 'grokking' the capabilities of another unary_functor via lambda
/// expressions. 'Ordinary' functions can also be grokked.
///
/// For grok_type objects where _vsize is greater than 1, constructor
/// overloads taking only a single function are sketched further down,
/// but they are currently commented out. These constructors broadcast
/// the unvectorized function to process vectorized data, providing a
/// quick way to produce code which runs with vector data, albeit less
/// efficiently than true vector code.
///
/// Finally, for convenience, grok_type also provides operator(),
/// to use the grok_type object with function call syntax, and it
/// also provides the common 'eval' routine(s), just like any other
/// unary_functor.

template < typename IN ,    // argument or input type
           typename OUT ,   // result type
           size_t _vsize = vspline::vector_traits < IN > :: size
         >
struct grok_type
: public vspline::unary_functor < IN , OUT , _vsize > ,
  public vspline::callable < grok_type < IN , OUT , _vsize > ,
                             IN , OUT , _vsize >
{
  typedef vspline::unary_functor < IN , OUT , _vsize > base_type ;

  enum { vsize = _vsize } ;
  using typename base_type::in_type ;
  using typename base_type::out_type ;
  using typename base_type::in_v ;
  using typename base_type::out_v ;
  
  typedef std::function < void ( const in_type & , out_type & ) > eval_type ;
  typedef std::function < out_type ( const in_type & ) > call_type ;

  eval_type _ev ;

  // given these types, we can define the types for the std::function
  // we will use to wrap the grokkee's evaluation code in.

  typedef std::function < void ( const in_v & , out_v & ) > v_eval_type ;
  
  // this is the class member holding the std::function:

  v_eval_type _v_ev ;

  // we also define a std::function type using 'normal' call/return syntax

  typedef std::function < out_v ( const in_v & ) > v_call_type ;

  /// we provide a default constructor so we can create an empty
  /// grok_type and assign to it later. Calling the empty grok_type's
  /// eval will result in an exception (std::bad_function_call).

  grok_type() { } ;
  
  /// direct initialization of the internal evaluation functions
  /// this overload, with two arguments, specifies the unvectorized
  /// and the vectorized evaluation function explicitly.
  
  grok_type ( const eval_type & fev ,
              const v_eval_type & vfev )
  : _ev ( fev ) ,
    _v_ev ( vfev )
  { } ;
  
  /// constructor taking a call_type and a v_call_type,
  /// initializing the two std::functions _ev and _v_ev
  /// with wrappers around these arguments which provide
  /// the 'standard' vspline evaluation functor signature

  grok_type ( call_type f , v_call_type vf )
  : _ev ( eval_wrap ( f ) )
  , _v_ev ( eval_wrap ( vf ) )
  { } ;
    
  /// constructor from 'grokkee' using lambda expressions
  /// to initialize the std::functions _ev and _v_ev.
  /// we enable this if grokkee_type is a vspline::unary_functor

  template < class grokkee_type ,
             typename std::enable_if
              < std::is_base_of
                < vspline::unary_functor_tag < vsize > ,
                  grokkee_type
                > :: value ,
                int
              > :: type = 0
            >
  grok_type ( grokkee_type grokkee )
  : _ev ( [ grokkee ] ( const IN & in , OUT & out )
            { grokkee.eval ( in , out ) ; } )
  , _v_ev ( [ grokkee ] ( const in_v & in , out_v & out )
            { grokkee.eval ( in , out ) ; } )
  { } ;
    
//   /// constructor taking only an unvectorized evaluation function.
//   /// this function is broadcast, providing evaluation of SIMD types
//   /// with non-vector code, which is less efficient.
// 
//   grok_type ( const eval_type & fev )
//   : _ev ( fev ) ,
//     _v_ev ( [ fev ] ( const in_v & in , out_v & out )
//             { vspline::broadcast < IN , OUT , vsize > (fev)
//                        .eval ( in , out ) ; } )
//   { } ;
//   
//   /// constructor taking only one call_type, which is also broadcast,
//   /// since the call_type std::function is wrapped to provide a
//   /// std::function with vspline's standard evaluation functor signature
//   /// and the result is fed to the single-argument functor above.
// 
//   grok_type ( const call_type & f )
//   : grok_type ( eval_wrap ( f ) )
//   { } ;
  
  /// unvectorized evaluation. This is delegated to _ev.

  void eval ( const IN & i , OUT & o ) const
  {
    _ev ( i , o ) ;
  }
  
  /// vectorized evaluation function template
  /// the eval overload above will catch calls with (in_type, out_type)
  /// while this overload will catch vectorized evaluations.

  template < typename = std::enable_if < ( vsize > 1 ) > >
  void eval ( const in_v & i , out_v & o ) const
  {
    _v_ev ( i , o ) ;
  }
  
} ;
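
/* The two eval overloads above are told apart by argument type alone,
   and the commented-out constructor describes 'broadcasting' an
   unvectorized function to vector arguments. The following standalone
   sketch illustrates both ideas under simplifying assumptions: it does
   not use vspline, std::array stands in for the SIMD vector type, and
   the names broadcast_sketch and broadcast_sketch_usage are made up
   for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical stand-in, not vspline code: std::array plays the role
// of the SIMD vector type.
template < typename T , std::size_t vsize >
struct broadcast_sketch
{
  typedef std::array < T , vsize > vec_type ;
  typedef std::function < void ( const T & , T & ) > eval_type ;

  eval_type _ev ;

  explicit broadcast_sketch ( const eval_type & fev )
  : _ev ( fev )
  { }

  // unvectorized evaluation, delegated to _ev
  void eval ( const T & in , T & out ) const
  {
    _ev ( in , out ) ;
  }

  // overload for vector arguments: _ev is applied lane by lane
  void eval ( const vec_type & in , vec_type & out ) const
  {
    for ( std::size_t e = 0 ; e < vsize ; e++ )
      _ev ( in [ e ] , out [ e ] ) ;
  }
} ;

// usage: both overloads are reached through the same name
inline void broadcast_sketch_usage()
{
  broadcast_sketch < float , 4 > sq
    ( [] ( const float & x , float & y ) { y = x * x ; } ) ;

  float r = 0.0f ;
  sq.eval ( 3.0f , r ) ;
  assert ( r == 9.0f ) ;

  std::array < float , 4 > in = { { 1.0f , 2.0f , 3.0f , 4.0f } } ;
  std::array < float , 4 > out = { { 0.0f , 0.0f , 0.0f , 0.0f } } ;
  sq.eval ( in , out ) ;
  assert ( out [ 3 ] == 16.0f ) ;
}
```

   Applying the scalar function lane by lane gives correct results for
   vector arguments, but it does not use SIMD instructions, which is
   why the comment above calls this approach less efficient.
*/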

/// specialization of grok_type for _vsize == 1
/// this is the only possible specialization if vectorization is not used.
/// here we don't use _v_ev but only the unvectorized evaluation.

template < typename IN ,    // argument or input type
           typename OUT     // result type
         >
struct grok_type < IN , OUT , 1 >
: public vspline::unary_functor < IN , OUT , 1 > ,
  public vspline::callable < grok_type < IN , OUT , 1 > ,
                             IN , OUT , 1 >
{
  typedef vspline::unary_functor < IN , OUT , 1 > base_type ;

  enum { vsize = 1 } ;
  using typename base_type::in_type ;
  using typename base_type::out_type ;
  using typename base_type::in_v ;
  using typename base_type::out_v ;
  
  typedef std::function < void ( const in_type & , out_type & ) > eval_type ;
  typedef std::function < out_type ( const in_type & ) > call_type ;

  eval_type _ev ;

  grok_type() { } ;
  
  template < class grokkee_type ,
             typename std::enable_if
              < std::is_base_of
                < vspline::unary_functor_tag < 1 > ,
                  grokkee_type
                > :: value ,
                int
              > :: type = 0
           >
  grok_type ( grokkee_type grokkee )
  : _ev ( [ grokkee ] ( const IN & in , OUT & out )
            { grokkee.eval ( in , out ) ; } )
  { } ;
  
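  /// constructor taking an unvectorized evaluation function.

  grok_type ( const eval_type & fev )
  : _ev ( fev )
  { } ;
  
  /// constructor taking a call_type. The call_type std::function is
  /// wrapped by eval_wrap to provide a std::function with vspline's
  /// standard evaluation functor signature.

  grok_type ( call_type f )
  : _ev ( eval_wrap ( f ) )
  { } ;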
    
  void eval ( const IN & i , OUT & o ) const
  {
    _ev ( i , o ) ;
  }
  
} ;

/// grok() is the corresponding factory function, wrapping grokkee
/// in a vspline::grok_type.

template < class grokkee_type >
vspline::grok_type < typename grokkee_type::in_type ,
                     typename grokkee_type::out_type ,
                     grokkee_type::vsize >
grok ( const grokkee_type & grokkee )
{
  return vspline::grok_type < typename grokkee_type::in_type ,
                              typename grokkee_type::out_type ,
                              grokkee_type::vsize >
                  ( grokkee ) ;
}
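
/* The combination of grok_type and grok() can be pictured with a
   standalone sketch of the underlying technique: the wrapped functor
   is captured in a std::function, so that only its argument and
   result types remain part of the wrapper's type. The sketch is
   scalar-only and does not use vspline; erased_functor, erase,
   times_two, plus_one and erase_usage are hypothetical names chosen
   for this illustration.

```cpp
#include <cassert>
#include <functional>

// Simplified, scalar-only analogue of grok_type; not vspline code.
template < typename IN , typename OUT >
struct erased_functor
{
  typedef IN in_type ;
  typedef OUT out_type ;
  typedef std::function < void ( const IN & , OUT & ) > eval_type ;

  eval_type _ev ;

  // the functor is captured by value; only its eval signature survives
  template < class grokkee_type >
  explicit erased_functor ( grokkee_type grokkee )
  : _ev ( [ grokkee ] ( const IN & in , OUT & out )
            { grokkee.eval ( in , out ) ; } )
  { }

  void eval ( const IN & in , OUT & out ) const
  {
    _ev ( in , out ) ;
  }
} ;

// factory in the manner of grok(): the template arguments are taken
// from the wrapped functor's in_type and out_type
template < class grokkee_type >
erased_functor < typename grokkee_type::in_type ,
                 typename grokkee_type::out_type >
erase ( const grokkee_type & grokkee )
{
  return erased_functor < typename grokkee_type::in_type ,
                          typename grokkee_type::out_type >
           ( grokkee ) ;
}

// two unrelated hypothetical functors with the same in_type and out_type
struct times_two
{
  typedef float in_type ;
  typedef float out_type ;
  void eval ( const float & in , float & out ) const { out = in * 2.0f ; }
} ;

struct plus_one
{
  typedef float in_type ;
  typedef float out_type ;
  void eval ( const float & in , float & out ) const { out = in + 1.0f ; }
} ;

inline void erase_usage()
{
  // f has one fixed type, whichever functor it currently holds
  erased_functor < float , float > f = erase ( times_two() ) ;
  float r = 0.0f ;
  f.eval ( 1.5f , r ) ;
  assert ( r == 3.0f ) ;

  f = erase ( plus_one() ) ;
  f.eval ( 1.5f , r ) ;
  assert ( r == 2.5f ) ;
}
```

   The price of the uniform type is an indirect call through the
   std::function for every evaluation.
*/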

/// amplify_type amplifies its input with a factor. If the data are
/// multi-channel, the factor is multi-channel as well and the channels
/// are amplified by the corresponding elements of the factor.
/// I added this class to make working with integer-valued splines more
/// comfortable - if these splines are prefiltered with 'boost', the
/// effect of the boost has to be reversed at some point, and amplify_type
/// does just that when you use 1/boost as the 'factor'.

template < class _in_type ,
           class _out_type = _in_type ,
           class _math_type = _in_type ,
           size_t _vsize = vspline::vector_traits < _in_type > :: vsize >
struct amplify_type
: public vspline::unary_functor < _in_type , _out_type , _vsize > ,
  public vspline::callable
         < amplify_type < _in_type , _out_type , _math_type , _vsize > ,
           _in_type ,
           _out_type ,
           _vsize
         >
{
  typedef typename
    vspline::unary_functor < _in_type , _out_type , _vsize > base_type ;
  
  enum { vsize = _vsize } ;
  enum { dimension = base_type::dim_in } ;
  
  // TODO: might assert common dimensionality
  
  using typename base_type::in_type ;
  using typename base_type::out_type ;
  using typename base_type::in_v ;
  using typename base_type::out_v ;
  using typename base_type::in_nd_ele_v ;
  using typename base_type::out_nd_ele_v ;

  typedef _math_type math_type ;
  
  typedef typename vigra::ExpandElementResult < math_type > :: type
    math_ele_type ;
    
  typedef vigra::TinyVector < math_ele_type , dimension > math_nd_ele_type ;
  
  typedef typename vspline::vector_traits < math_ele_type , vsize > :: type
    math_ele_v ;
  
  const math_type factor ;
  
  // constructors initialize factor. If dimension is greater than 1,
  // we have two constructors, one taking a TinyVector, one taking
  // a single value for all dimensions.

  template < typename = std::enable_if < ( dimension > 1 ) > >
  amplify_type ( const math_type & _factor )
  : factor ( _factor )
  { } ;
  
  amplify_type ( const math_ele_type & _factor )
  : factor ( _factor )
  { } ;
  
  void eval ( const in_type & in , out_type & out ) const
  {
    out = out_type ( math_type ( in ) * factor ) ;
  }
  
  template < typename = std::enable_if < ( vsize > 1 ) > >
  void eval ( const in_v & in , out_v & out ) const
  {
    // we take a view to the arguments as TinyVectors, even if
    // the data are 'singular'

    const in_nd_ele_v & _in
      = reinterpret_cast < in_nd_ele_v const & > ( in ) ;
      
    const math_nd_ele_type & _factor
      = reinterpret_cast < math_nd_ele_type const & > ( factor ) ;
    
    out_nd_ele_v & _out
      = reinterpret_cast < out_nd_ele_v & > ( out ) ;
    
    // and perform the application of the factor element-wise

    for ( int i = 0 ; i < dimension ; i++ )
      vspline::assign ( _out[i] , math_ele_v ( _in[i] ) * _factor[i] ) ;
  }
  
} ;
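
/* What amplify_type computes can be shown with a scalar-only sketch
   that does not depend on vspline or vigra: every channel is converted
   to the math type, multiplied with its factor and converted to the
   output type. std::array stands in for vigra::TinyVector, and
   amplify_sketch, amplify_sketch_usage and the boost value of 16 are
   assumptions made for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Simplified, scalar-only analogue of amplify_type; not vspline code.
// std::array stands in for vigra::TinyVector.
template < typename in_ele , typename out_ele ,
           typename math_ele , std::size_t N >
struct amplify_sketch
{
  std::array < math_ele , N > factor ;

  // one factor per channel
  explicit amplify_sketch ( const std::array < math_ele , N > & _factor )
  : factor ( _factor )
  { }

  // a single factor used for all channels
  explicit amplify_sketch ( const math_ele & _factor )
  {
    factor.fill ( _factor ) ;
  }

  // convert to math_ele, multiply, convert to the output type
  void eval ( const std::array < in_ele , N > & in ,
              std::array < out_ele , N > & out ) const
  {
    for ( std::size_t i = 0 ; i < N ; i++ )
      out [ i ] = out_ele ( math_ele ( in [ i ] ) * factor [ i ] ) ;
  }
} ;

inline void amplify_sketch_usage()
{
  // values assumed to carry a boost of 16; 1/16 is exact in float
  const float boost = 16.0f ;
  amplify_sketch < short , float , float , 2 > unboost ( 1.0f / boost ) ;

  std::array < short , 2 > in = { { 160 , 320 } } ;
  std::array < float , 2 > out = { { 0.0f , 0.0f } } ;
  unboost.eval ( in , out ) ;
  assert ( out [ 0 ] == 10.0f && out [ 1 ] == 20.0f ) ;
}
```

   Doing the multiplication in a floating point math type avoids the
   truncation which 1/boost would suffer as an integer.
*/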

/// flip functor produces its input with component order reversed.
/// This can be used to deal with situations where coordinates in
/// the 'wrong' order have to be fed to a functor expecting the opposite
/// order and should be a fast way of doing so, since the compiler can
/// likely optimize it well.
/// I added this class to provide simple handling of incoming NumPy
/// coordinates, which are normally in reverse order of vigra coordinates.

template < typename _in_type ,
           size_t _vsize = vspline::vector_traits < _in_type > :: vsize >
struct flip
: public vspline::unary_functor < _in_type , _in_type , _vsize > ,
  public vspline::callable
         < flip < _in_type , _vsize > ,
           _in_type ,
           _in_type ,
           _vsize
         >
{
  typedef typename vspline::unary_functor
                     < _in_type , _in_type , _vsize > base_type ;
                     
  enum { vsize = _vsize } ;
  enum { dimension = base_type::dim_in } ;
  
  using typename base_type::in_type ;
  using typename base_type::out_type ;
  using typename base_type::in_v ;
  using typename base_type::out_v ;
  using typename base_type::in_nd_ele_type ;
  using typename base_type::out_nd_ele_type ;
  using typename base_type::in_nd_ele_v ;
  using typename base_type::out_nd_ele_v ;
  
  void eval ( const in_type & in_ , out_type & out ) const
  {
    // we need a copy of 'in_' in case in_ and out are the same object
    
    in_type in ( in_ ) ;
    
    // we take a view to the arguments as TinyVectors, even if
    // the data are 'singular'

    const in_nd_ele_type & _in
      = reinterpret_cast < in_nd_ele_type const & > ( in ) ;
      
    out_nd_ele_type & _out
      = reinterpret_cast < out_nd_ele_type & > ( out ) ;
      
    for ( int e = 0 ; e < dimension ; e++ )
      _out [ e ] = _in [ dimension - e - 1 ] ;
  }
  
  template < typename = std::enable_if < ( vsize > 1 ) > >
  void eval ( const in_v & in_ , out_v & out ) const
  {
    // we need a copy of 'in_' in case in_ and out are the same object
    
    in_v in ( in_ ) ;
    
    // we take a view to the arguments as TinyVectors, even if
    // the data are 'singular'

    const in_nd_ele_v & _in
      = reinterpret_cast < in_nd_ele_v const & > ( in ) ;
      
    out_nd_ele_v & _out
      = reinterpret_cast < out_nd_ele_v & > ( out ) ;
    
    for ( int e = 0 ; e < dimension ; e++ )
      vspline::assign ( _out [ e ] , _in [ dimension - e - 1 ] ) ;
  }
  
} ;
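
/* The reversal performed by flip, including the copy which makes it
   safe to pass the same object as input and output, can be shown with
   a scalar-only sketch which does not use vspline. std::array stands
   in for vigra::TinyVector; flip_sketch and flip_sketch_usage are
   hypothetical names chosen for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Simplified, scalar-only analogue of flip; not vspline code.
template < typename T , std::size_t N >
void flip_sketch ( const std::array < T , N > & in_ ,
                   std::array < T , N > & out )
{
  // copy first, so that the call is safe if in_ and out are the same object
  const std::array < T , N > in ( in_ ) ;

  for ( std::size_t e = 0 ; e < N ; e++ )
    out [ e ] = in [ N - e - 1 ] ;
}

inline void flip_sketch_usage()
{
  // a ( z , y , x ) coordinate becomes ( x , y , z )
  std::array < int , 3 > zyx = { { 30 , 20 , 10 } } ;
  std::array < int , 3 > xyz = { { 0 , 0 , 0 } } ;
  flip_sketch ( zyx , xyz ) ;
  assert ( xyz [ 0 ] == 10 && xyz [ 1 ] == 20 && xyz [ 2 ] == 30 ) ;

  // flipping in place works because of the copy
  flip_sketch ( xyz , xyz ) ;
  assert ( xyz [ 0 ] == 30 && xyz [ 2 ] == 10 ) ;
}
```

   Without the copy, an in-place call would overwrite components before
   they have been read, and the result would no longer be the reversed
   input.
*/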

} ; // end of namespace vspline

#endif // VSPLINE_UNARY_FUNCTOR_H

