Template Metaprogramming Patterns

martin-adams-522797-unsplash

SFINAE – Substitution Failure is Not An Error

Template meta programming is fun. Lot of simple algorithms can be done at compile time and in very powerful way. The code uses a template meta-programming pattern called as SFINAE – Substitution failure is not an error.

Following are three examples for common problems like Fibonacci, Factorial and GCD calculations using SFINAE.

#include <iostream>

// find fibonacci using template meta-programming
template <long long N>
struct fibonacci{
  static constexpr long long value = fibonacci<N-1>::value + fibonacci<N-2>::value;
};

template<>
struct fibonacci<1>{
  static constexpr long long value = 1;
};

template<>
struct fibonacci<0>{
  static constexpr long long value = 0;
};

// end code for fibonacci
// -------------------------------------------------------------------------------------------
// find gcd using template meta programming
template<unsigned M, unsigned N>
struct gcd{
  static constexpr auto value = gcd<N,M%N>::value;
};

template<unsigned M>
struct gcd<M,0>{
  static_assert(M!=0);
  static constexpr auto value = M;
};
// -------------------------------------------------------------------------------------------
// factorial using template meta-programming
template<long long N>
struct factorial{
  static constexpr long long value = N*(factorial<N-1>::value);
};

template<>
struct factorial<0>{
  static constexpr long long value = 1;
};

int main(){
  std::cout << gcd<15,5>::value << '\n'; // 5
  std::cout << fibonacci<80>::value << '\n'; // 23416728348467685
  std::cout << factorial<20>::value << '\n'; // 2432902008176640000
}

More information about SINAF

 

CRTP – Curiously Recurring Template Pattern

The CRTP pattern was discovered  by an accident. There is another name for this pattern,  “Static Polymorphism”.The base class is generally templatized class and derived class is subclassed from base class with type as templatized class. The reason this is called as CRTP is because a method from derived class is called from base class statically.  The definition is rather complicated but the code should clear it.

#include <iostream>

struct Exchange{
  std::string name;
};

template <typename T>
struct Order{
  int transact(Exchange * exchange, size_t quantity){
    return static_cast<T *>(this)->transact(quantity);
  }
};

struct BuyOrder: Order<BuyOrder>{
  int transact(Exchange * exchange, size_t quantity){
    buy();
    return quantity;
  }

  void buy(){
    // code for buy goes here..
  }
};

struct SellOrder: Order<SellOrder>{
  int transact(Exchange * exchange, size_t quantity){
    sell();
    return quantity;
  }

  void sell(){
    // code for sell goes here..
  }
};

int main(){
  Exchange e;
  BuyOrder bo;
  bo.transact(e, 100);
  SellOrder so;
  so.transact(e, 100);
}

 

Modern c++ coding Thread pool

First attempt is to code the thread pool is using simple conditional variables and mutex. Once the multi-producer, multi-consumer thread pool is working as expected, its easy to convert into lock free buffer.

The ​​Threadpool class is a struct with following member variables.

  1. struct Work with following member variables
    1. _work to store the lambda/function pointer
    2. _args_ary to store the array
    3. is_processed boolean flag.
  2. std::array of a fixed size to hold the work.
  3. static const variable to indicate the max_size of work array.
  4. head_idx and tail_idx to guide the next index.
  5. a condition variable cv
  6. mutex m

There are 2 main methods for enqueuing and popping the items out and working on it. Since its a circular queue, next index calculated by adding one to current index and taking remainder from max_size. i.e.

next_idx = (current_index + 1 ) % max_size

Here is an example of SPSC – single producer single consumer locked queue. The queue is bounded and blocking. Means, if either the producer or the consumer is slow, it is going to block using the condition variable and mutex.

#include <iostream>
#include <thread>
#include <array>
#include <functional>
#include <vector>
#include <random>
#include <chrono>

using func_type = std::function<int(int,int)>;
using args_ary = std::array<int,2>;

struct ThreadPool{
  // Enqueue function to get the function and arguments to the function.
  // consumer can invoke function with args passed on to it. 
  bool enqueue(func_type fn, args_ary args){
    size_t next_idx = (head_idx + 1) % max_size;

    std::unique_lock<std::mutex> lk(mtx);
    cv.wait(lk,[&](){
      return work[next_idx].processed();
    });

    Work w{fn, args};
    work[next_idx] = w;
    head_idx = next_idx;
    cv.notify_all();
    return true;
  }

  // process or pop function to pop the latest record and process it.
  bool process(){
    size_t next_idx = (tail_idx + 1) % max_size;
    std::unique_lock<std::mutex> lk(mtx);
    cv.wait(lk,[&](){
      return !work[next_idx].processed();
    });

    Work w = work[next_idx];
    work[next_idx].set_processed();

    if(!w.invalid()){
      int output = w.process();
      std::cout << output << '\n';
    }
    tail_idx = next_idx;
    cv.notify_all();
    return true;
}

private:
  struct Work{
    private:
      func_type _work;
      args_ary _args;
      bool is_processed = false;

    public:
      Work():is_processed(true){};
      Work(func_type work, args_ary args):_work{work},_args{args},is_processed(false){};
      bool processed(){
        return is_processed;
      }
      int process(){
        return _work(_args[0], _args[1]);
      }
      bool invalid(){
        return _work == nullptr;
      }
      void set_processed(){
        is_processed = true;
      }
   };

   static const size_t max_size = 1000;
   std::array<Work,max_size> work;
   size_t head_idx = 0, tail_idx = 0;
   std::condition_variable cv;
  std::mutex mtx;
};

int random_num(){
  std::mt19937 rng;
  rng.seed(std::random_device()());
  std::uniform_int_distribution<std::mt19937::result_type> dist(1,10000); // distribution in range [1,10000]
  return dist(rng);
}

int main(){
  ThreadPool tp;

  std::thread producer1([&](){
    int count = 0;
    while(true){
      std::function<int(int,int)> fn;
      if (count % 3 == 0){
        fn = [](int a, int b){ std::cout << "Adding \t\t" << a << '\t'<< b << '\t'; return a+b;};
      }else if(count % 3 == 1){
        fn = [](int a, int b){ std::cout << "Subtracting \t" << a << '\t'<< b << '\t'; return a-b;};
      }else{
        fn = [](int a, int b){std::cout << "Multiplying \t" << a << '\t'<< b << '\t'; return a*b;};
      }
      tp.enqueue(fn, args_ary{random_num(),random_num()});
      std::this_thread::sleep_for(std::chrono::milliseconds(10));
      ++count;
     }
  });

  std::thread consumer1([&](){
    while(true){
      tp.process();
    }
  });

  if(producer1.joinable()){
    producer1.join();
  }

  if(consumer1.joinable()){
    consumer1.join();
  }
  return 0;
}

Here is a complete working example for MPMC queue.

#include <iostream>
#include <thread>
#include <array>
#include <functional>
#include <vector>
#include <random>
#include <chrono>

using func_type = std::function<int(int,int)>;
using args_ary = std::array<int,2>;

struct ThreadPool{
  ThreadPool(){
    for(int i = 0; i < max_size; ++i){
      work[i] = Work();
    }
  }

  bool enqueue(func_type fn, args_ary args){
    size_t next_idx = (head_idx + 1) % max_size;
    while(work[next_idx].processed()){
      std::unique_lock<std::mutex> lk(mtx);
      cv.wait(lk,[&](){
        return work[next_idx].processed();
      });

      Work w = work[next_idx];
      if(w.invalid() || w.processed()){
        work[next_idx].set_work(fn);
        work[next_idx].set_args(args);
        work[next_idx].reset();
        cv.notify_all();
        break;
      }
   }
   head_idx = next_idx;
   return true;
}

  bool process(int remainder, int total_threads){
    size_t next_idx = (tail_idx + 1) % max_size;

    std::unique_lock<std::mutex> lk(mtx);
    cv.wait(lk,[&](){
      return work[next_idx].unprocessed();
    });

    Work w = work[next_idx];
    work[next_idx].set_processed();

    if(!w.invalid()){
      int output = w.process();
      std::cout << output << '\n';
    }
    tail_idx = next_idx;
    cv.notify_all();
    return true;
  }

  private:
    struct Work{
    private:
      func_type _work;
      args_ary _args;
      bool is_processed = false;

    public:
      Work():is_processed(true){}; 
      int process(){
        return _work(_args[0], _args[1]);
      }
    
    bool invalid(){
      return _work == nullptr;
    }
    
    void set_processed(){
      is_processed = true;
    }

    void set_work(func_type work){
      _work = work;
    }

    void set_args(args_ary args){
      _args = args;
    }
    void reset(){
      is_processed = false;
    }

    bool unprocessed(){
      return is_processed == false;
    }

    bool processed(){
      return is_processed == true;
    }
};

  static const size_t max_size = 1000;
  std::array<Work,max_size> work;
  size_t head_idx = 0, tail_idx = 0;
  std::condition_variable cv;
  std::mutex mtx;
};

int random_num(){
  std::mt19937 rng;
  rng.seed(std::random_device()());
  std::uniform_int_distribution<std::mt19937::result_type> dist(1,10000); // distribution in range [1,10000]
  return dist(rng);
}

int main(){
  ThreadPool tp;
  auto adder = [](int a, int b){ std::cout << "Adding \t\t" << a << '\t'<< b << '\t'; return a+b;};
  auto subtractor = [](int a, int b){ std::cout << "Subtracting \t" << a << '\t'<< b << '\t'; return a-b;};
  auto multiplicator = [](int a, int b){std::cout << "Multiplying \t" << a << '\t'<< b << '\t'; return a*b;};

  std::thread producer1([&](){
    while(true){
      tp.enqueue(adder, args_ary{random_num(),random_num()});
      //std::this_thread::sleep_for(std::chrono::milliseconds(10));
    } 
  });

  std::thread producer2([&](){
    while(true){
      tp.enqueue(subtractor, args_ary{random_num(),random_num()}); 
      //std::this_thread::sleep_for(std::chrono::milliseconds(10));
  } 
});

std::thread producer3([&](){
  while(true){
    tp.enqueue(multiplicator, args_ary{random_num(),random_num()}); 
      //std::this_thread::sleep_for(std::chrono::milliseconds(10));
  } 
});

std::thread consumer1([&](){
  while(true){
    tp.process(0, 3);
  }
});

std::thread consumer2([&](){
  while(true){
    tp.process(1, 3);
  }
});

std::thread consumer3([&](){
  while(true){
    tp.process(2, 3);
  }
});


  if(producer1.joinable()){
    producer1.join();
  }

  if(producer2.joinable()){
    producer2.join();
  }

  if(producer3.joinable()){
    producer3.join();
  }

  if(consumer1.joinable()){
    consumer1.join();
  }

  if(consumer2.joinable()){
    consumer2.join();
  }
  if(consumer3.joinable()){
    consumer3.join();
  }

  return 0;
}

Output

Adding      9316 8539 17855
Subtracting 3529 1274 2255
Multiplying 7821 2404 18801684
Adding      9455 8300 17755
Subtracting 9882 5696 4186
Multiplying 202 9795 1978590
Adding      5794 7979 13773
Subtracting 4331 576 3755
Multiplying 963 4548 4379724
Adding      7238 9714 16952
Subtracting 9868 5185 4683
Multiplying 9139 4941 45155799
Adding      6842 7031 13873
Subtracting 3043 2312 731

Lockfree code

A Single producer, single consumer Lockfree queue can be written as follows

#include <iostream>
#include <thread>
#include <array>
#include <functional>
#include <vector>
#include <random>
#include <chrono>

using func_type = std::function<int(int,int)>;
using args_ary = std::array<int,2>;

struct ThreadPool{
  // Constructor
  ThreadPool(){
    for(int i = 0; i < max_size; ++i){
      work[i] = Work();
    }
  }

  bool enqueue(func_type fn, args_ary args){
    size_t current_head_idx = head_idx.load(std::memory_order_acquire);
    std::atomic<size_t> next_idx = (current_head_idx + 1) % max_size;

    Work w = work[next_idx];
    if(w.invalid() || w.processed()){
      work[next_idx].set_work(fn);
      work[next_idx].set_args(args);
      work[next_idx].reset();
      head_idx.store(next_idx, std::memory_order_release);
    }
    return true;
  }

  bool process(){
    size_t current_tail_idx = tail_idx.load(std::memory_order_acquire);
    std::atomic<size_t> next_idx = (current_tail_idx + 1) % max_size;

    if(!work[next_idx].processed()){
      int output = work[next_idx].process();
      std::cout << output << '\n';
      work[next_idx].set_processed();
      tail_idx.store(next_idx);
     }
    return true;
}

  private:
    struct Work{
      private:
        func_type _work;
        args_ary _args;
        bool is_processed = false;

      public:
        Work():is_processed(true){};
        Work(func_type work, args_ary args):_work{work},_args{args},is_processed(false){};
        bool processed(){
          return is_processed == true;
        }

       int process(){
         return _work(_args[0], _args[1]);
       }

       bool invalid(){
         return _work == nullptr;
       }

       void set_processed(){ 
         is_processed = true;
       }

       void set_work(func_type work){
         _work = work;
       }

       void set_args(args_ary args){
         _args = args;
       }
       
       void reset(){
         is_processed = false;
       }
    };

    static const size_t max_size = 1000;
    std::array<Work,max_size> work;
    std::atomic<size_t> head_idx{0}, tail_idx{0};
};

int random_num(){
  std::mt19937 rng;
  rng.seed(std::random_device()());
  std::uniform_int_distribution<std::mt19937::result_type> dist(1,10000); // distribution in range [1,10000]
  return dist(rng);
}

int main(){
  ThreadPool tp;

  std::thread producer1([&](){
    int count = 0;
    while(true){
      std::function<int(int,int)> fn;
      if (count % 3 == 0){
        fn = [](int a, int b){ 
               std::cout << "Adding \t\t" << a << '\t'<< b << '\t'; 
               return a+b;
              };
      }else if(count % 3 == 1){
        fn = [](int a, int b){ 
               std::cout << "Subtracting \t" << a << '\t'<< b << '\t'; 
               return a-b;};
      }else{
        fn = [](int a, int b){
              std::cout << "Multiplying \t" << a << '\t'<< b << '\t'; 
              return a*b;};
        }
       tp.enqueue(fn, args_ary{random_num(),random_num()});
       ++count;
    }
});

  std::thread consumer1([&](){
    while(true){
     tp.process();
    }
  });

  if(producer1.joinable()){
    producer1.join();
  }

  if(consumer1.joinable()){
    consumer1.join();
  }
  return 0;
}

Vector back inserter example + algorithm

The problem is pretty simple: there is a list of time ranges for meetings and need to show the meetings as continuous block in calendar.

70940aec3ea953621db1b8faf619a5ca_2

As per modern c++ I have used std::back_inserter_iterator for vector instead of passing vector by value or by reference. This approach is much cleaner, modern and saves a lot of copying of data compared to old style.

// i/p: [(0, 1), (3, 5), (4, 8), (10, 12), (9, 10)]
// o/p: [(0, 1), (3, 8), (9, 12)]

#include <iostream>
#include <vector>
#include <algorithm>

struct Meeting; // define meeting

using it = std::vector<Meeting>::iterator;
// use an back inserter iterator to pass as function argument instead
// of passing vector by reference or creating vector on heap
using bi = std::back_insert_iterator<std::vector<Meeting>>;

struct Meeting{
  int start_time, end_time;
  // default constructor
  Meeting() = default;
 // parameterized constructor
  Meeting(int start_time, int end_time):start_time{start_time},end_time{end_date}{}
  
  // override < operator used for sorting.
  bool operator <(const Meeting &other) const{
    return start_date < other.start_date;
  }

static void merge(it start, it finish, bi merged_meeting_bi){
  int start_time = start->start_time;
  int end_time = start->end_time;
  while(start < finish){
    start++;
    if(start->start_time > end_time){
      merged_meeting_bi = {start_time, end_time}; // create meeting using parameterized constructor
      start_time = start->start_time; 
      end_time = start->end_time;
    }else{
      end_time = std::max(start->end_time, end_time);
     }
   }
  }
};

int main(){
  std::vector<Meeting> meetings{{0, 1}, {3, 5}, {4, 8}, {10, 12}, {9, 10}};
  std::vector<Meeting> merged_meetings;
  auto bi = std::back_insert_iterator<std::vector<Meeting>>(merged_meetings);
  auto start = meetings.begin();
  auto finish = meetings.end();
  // sort by first time
  std::sort(start, finish);
  
for(const auto &meeting : meetings){
std::cout << meeting.start_time << ' ' << meeting.end_time << '\n';
}
Meeting::merge(start, finish, bi);

for(const auto & meeting: merged_meetings){
std::cout << meeting.start_time << ' ' << meeting.end_time << '\n';
}
return 0;
}

Iterators – look and say sequence algorithm

Look and say sequence is fun little exercise.. here is the code

#include <iostream>
#include <vector>
#include <string>
#include <numeric>

std::string looknsay(int num){
  // initialize vector with vector of ints.
  // add default seed value to it [1].
  std::vector<std::vector<int>> look_n_say_seq{{1}};
  // iterate over until the number n is reached.
  for(int n = 0; n <= num; n++){
    auto latest_sequence = look_n_say_seq.back();
    std::vector<int> next_sequence;
   // keep a pair of iterators to move across the latest sequence from vector.
    auto i = latest_sequence.begin();
    auto j = latest_sequence.begin();
    while(i < latest_sequence.end()){
      for(;*i==*j && j < latest_sequence.end();++j){}
      // distance will give the count
      next_sequence.push_back(std::distance(i, j));
      // *i will give the actual number to iterate on
      next_sequence.push_back(*i);
      i = j;
    }
    // insert newly created sequence into the vector.
    look_n_say_seq.push_back(next_sequence);
  }
  auto latest_sequence = look_n_say_seq.back();
 // accumulate - join vector of ints to make string and return
  return std::accumulate(std::next(latest_sequence.begin()), 
                         latest_sequence.end(),
                          std::to_string(*latest_sequence.begin()),
                          [](std::string s, int i){return s += std::to_string(i);});
}
int main(){
  std::cout << looknsay(0) << '\n';
  std::cout << looknsay(1) << '\n'; // 11
  std::cout << looknsay(2) << '\n'; // 21
  std::cout << looknsay(3) << '\n'; // 1211
  std::cout << looknsay(4) << '\n'; // 111221
  std::cout << looknsay(5) << '\n'; // 312211
}

std::queue operations

Iteration:

Iteration is very basic and important operation of any container type like std::vector, std::deque or std::list.

std::queue<int> q;
for(int i = 0; i <= 10; i++){
  q.push(i);
}
for(int i = 0; i <= 10; i++){
  std::cout << q.pop() << '\n';
}

Shuffle Queue:

Shuffling with extra space (using another deque)can be done as follows:

#include <iostream>
#include <queue>
#include <algorithm>
#include <random>
#include <deque>

int main(){
  std::queue<int> q;
  for(int i = 0; i <= 10; i++){
    q.push(i);
  }
 
 std::deque<int> d;
 for(int i = 0; i <= 10; i++){
  d.push_back(q.pop());
 }

  unsigned seed = std::chrono::system_clock::now().time_since_epoch().count();
  std::shuffle (d.begin(), d.end(), std::default_random_engine(seed));


  std::queue<int> empty;
  std::swap(q, empty);

  std::for_each(d.begin(), d.end(), [&q](const auto & i){q.push(i);});
  std::for_each(d.begin(), d.end(), [&q](const auto & i){std::cout << i << '\n';});
  
  for((int i = 0; i <= 10; i++){
   std::cout << q.pop() << '\n';
  }
  
}

Linked list using smart pointers

mike-alonzo-3347-unsplash

I have observed and code several versions of linked lists and came with my own implementation of linked list. This linked list uses smart pointers and supports operations like insert after, insert before, reverse and print etc for the list.

#include <iostream>
#include <memory>

template <typename T>
struct Node{
  T value;
  bool head = false;
  std::shared_ptr<Node<T>> next;

  Node(): head(true), value(-1){}; // head with -1 value
  Node(T value): head(false), value(value){}; // node

  std::shared_ptr<Node<T>> add_next(T value){
  next = std::make_shared<Node<T>>(value);
  return next;
  }
};

template<typename T>
struct List{
  std::shared_ptr<Node<T>> head;
  std::shared_ptr<Node<T>> tail;

  List(){
   head = std::make_shared<Node<T>>();
   tail = head;
  }

  void add_node(T value){
    tail->add_next(value);
    tail = tail->next; 
  }

 void print(){
   std::cout << std::string(100, '-') << '\n';
   auto current = head;
   while(current != nullptr){
     std::cout << current->value << '\t';
     current = current->next;
   }
   std::cout << '\n' << std::string(100, '-') << '\n';
 }

void reverse(){
  auto current = head;
  std::shared_ptr<Node<T>> next, prev;
  while(current != nullptr){
    next = current->next;
    current->next = prev;
    prev = current;
    current = next;
   }
   head = prev;
 }

void insert_after(T after_value, T value){
  auto current = head;
  while(current != nullptr){
   if(current->value == after_value){
     auto temp_next = current->next;
     auto node = create_node(value);
     current->next = node;
     node->next = temp_next;
     break;
   }
   current = current->next;
  }
 }

void insert_before(T before_value, T value){
  auto current = head;
  std::shared_ptr<Node<T>> prev;
  while(current != nullptr){
   if(current->value == before_value){
     auto node = create_node(value);
     node->next = current;
     prev->next = node;
   }
   prev = current;
   current = current->next;
  }
 }

 std::shared_ptr<Node<T>> create_node(T value){
   return std::make_shared<Node<T>>(value); 
  }
};

int main(){
 List<int> l; // initialize the linked list

 for(int i = 1; i <= 10; i++){
   l.add_node(i);
 }
  l.print();
  l.reverse();
  l.print();
  l.reverse();
  l.insert_after(1, 2);
  l.print(); 
  l.insert_before(8, 18);
  l.print(); 
  return 0;
}

Output:

g++ --std=c++17 linked_list.cpp -g -o linked_list && ./linked_list
-----------------------------------------------------------------------------------------------------
-1 1 2 3 4 5 6 7 8 9 10
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10 9 8 7 6 5 4 3 2 1 -1
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-1 1 2 2 3 4 5 6 7 8 9 10
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-1 1 2 2 3 4 5 6 7 18 8 9 10
----------------------------------------------------------------------------------------------------

reverse in O(log n) time

The basic framework for divide and conquer algorithm can be used for any operations like reverse, checking if array is sorted etc. The possible implementation of reversing a string/vector is given as follows which takes O(n) time where n is size of collection string/vector.

template<class BidirIt>
void reverse(BidirIt first, BidirIt last)
{
 while ((first != last) && (first != --last)) {
 std::iter_swap(first++, last);
 }
}

Better algorithm for this can be given as follows:

#include <iostream>
#include <algorithm>
#include <string>

template <typename T>
void reverse_rotate(T first, T last){
 if((first == last) || std::next(first) == last) return;
 T middle = first;
 std::advance(middle, std::distance(first, last)/2);
 reverse_rotate(first, middle);
 reverse_rotate(middle, last);
 std::rotate(first, middle, last);
}

int main(){
 std::string s = "This is super cool";
 reverse_rotate(s.begin(), s.end());
 std::cout << s << '\n';
}