Our website uses cookies to enhance your browsing experience.
Accept
to the top
close form

Fill out the form in 2 simple steps below:

Your contact information:

Step 1
Congratulations! This is your promo code!

Desired license type:

Step 2
Team license
Enterprise license
** By clicking this button you agree to our Privacy Policy statement
close form
Request our prices
New License
License Renewal
--Select currency--
USD
EUR
* By clicking this button you agree to our Privacy Policy statement

close form
Free PVS‑Studio license for Microsoft MVP specialists
* By clicking this button you agree to our Privacy Policy statement

close form
To get the licence for your open-source project, please fill out this form
* By clicking this button you agree to our Privacy Policy statement

close form
I am interested to try it on the platforms:
* By clicking this button you agree to our Privacy Policy statement

close form
check circle
Message submitted.

Your message has been sent. We will email you at


If you haven't received our response, please do the following:
check your Spam/Junk folder and click the "Not Spam" button for our message.
This way, you won't miss messages from our team in the future.

>
>
>
The concept of smart pointer static_ptr…

The concept of smart pointer static_ptr<T> in C++

Aug 30 2022
Author:

We have several smart pointers in C++ – 'std::unique_ptr', 'std::shared_ptr', 'std::weak_ptr'.

There are also non-standard smart pointers, for example in boost: intrusive_ptr, local_shared_ptr.

0983_smartpointer/image1.png

We published and translated this article with the copyright holder's permission. The author is Evgeny Shulgin, (email – izaronplatz@gmail.com). The article was originally published on Habr.

In this article we are discussing a new smart pointer type – static_ptr. It is most similar to std::unique_ptr without dynamic allocations.

0983_smartpointer/image2.png

'std::unique_ptr<T>'

std::unique_ptr<T> is a pointer that wraps a non-smart T* pointer. Every C++ developer has probably used this class.

The most popular reason to use this pointer is dynamic polymorphism.

If at a compiling stage we don't "know" the class of the object we will create in a certain execution point, so we won't know the value by which we should increment the stack pointer. Therefore, we can't create this object on the stack — we can only create it in the heap.

Suppose we have a polymorphic class IEngine and its children TSteamEngine, TRocketEngine, TEtherEngine. The object of "some IEngine child known at runtime" is std::unique_ptr<IEngine> in most cases. So, the memory for the object is allocated in the heap.

0983_smartpointer/image3.png

Figure 1. std::unique_ptr<IEngine> with the objects of different size

Allocation of small objects

The heap allocations are for large objects (std::vector with lots of elements, etc.), while the stack is better for small objects.

On Linux, to get the stack size for a process, you can run the following method:

ulimit -s

It displays a low size by default. My systems had 8192 KiB = 8 MiB. While the heap memory enables you to consume gigabytes.

The allocation of too many small objects causes memory fragmentation and affects the CPU cache. You can use memory pool to prevent this. I recommend you to read a nice article about this.

Objects on the stack

How can we create an object similar to std::unique_ptr but completely on the stack?

The C++ library contains std::aligned_storage, which reserves raw memory on the stack. We can use this memory and placement new to make an object of the required class T. But don't forget to ensure that the memory size is not less than sizeof(T).

So, with the overhead of only a few unused bytes on the stack, we can create objects of the derived class.

'sp::static_ptr<T>'

I had intended to create a stack-only version of std::unique_ptr<T>, so I looked for ready-made implementations. The idea seemed to be staring me in the face.

After I thought up the words like stack_ptr, static_ptr, etc. and searched them on GitHub, I finally found a reasonable implementation in the ceph project, in ceph/static_ptr.h. I also discovered some useful ideas there. However, this class is not commonly used in the project, and the implementation has some significant blunders.

The implementation may look as follows: there is a buffer for an object (in the form of std::aligned_storage); and some data that allows us to properly handle the object: to call the destructor of the exact type that static_ptr currently contains.

0983_smartpointer/image4.png

Figure 2. sp::static_ptr<IEngine> with objects of different size (32-byte buffer)

The implementation: how challenging could be 'move'

In this chapter we discuss the step-by-step implementation and lots of its nuances.

I decided to put the static_ptr class inside namespace sp (from static pointer).

Implementations of containers, smart pointers and other things are generally some of the hardest programs on C++, because you should consider things that nobody checks in projects.

Suppose we want to call a move constructor to move bytes from one memory region to another. We could write this as follows:

template <typename T>
struct move_constructer
{
  static void call(T *lhs, T *rhs)
  {
    new (lhs) T(std::move(*rhs));
  }
};

// call `move_constructer<T>::call(dst, src);

But what to do, if the T class does not contain a move constructor?

If there is a chance that the T type has a move assignment operator, we can use it. Otherwise, we have to "break" the compilation.

The newer the C++ standard is, the easier it is to write code for these things. We get the following code (compiled in C++17):

template <typename T>
struct move_constructer
{
  static void call(T *lhs, T *rhs)
  {
    if constexpr (std::is_move_constructible_v<T>)
    {
      new (lhs) T(std::move(*rhs));
    }
    else if constexpr (   std::is_default_constructible_v<T>
                       && std::is_move_assignable_v<T>)
    {
      new (lhs) T();
      *lhs = std::move(*rhs);
    }
    else
    {
      []<bool flag = false>()
      { 
        static_assert(flag, "move constructor disabled");
      }();
    }
  }
};

(on the 10th line a static_assert compilation "break" occurs with a hack)

However, it is better to use the noexcept specifier when it is possible. In C++20, we get such code as simple as possible for now:

template <typename T>
struct move_constructer
{
  static void call(T *lhs, T *rhs)
    noexcept (std::is_nothrow_move_constructible_v<T>)
    requires (std::is_move_constructible_v<T>)
  {
    new (lhs) T(std::move(*rhs));
  }

  static void call(T *lhs, T *rhs)
    noexcept (   std::is_nothrow_default_constructible_v<T>
              && std::is_nothrow_move_assignable_v<T>)
    requires (  !std::is_move_constructible_v<T>
              && std::is_default_constructible_v<T>
              && std::is_move_assignable_v<T>)
  {
    new (lhs) T();
    *lhs = std::move(*rhs);
  }

We can create the move_assigner structure in a similar way. We could also make copy_constructer and copy_assigner, but our implementation doesn't require them. In static_ptr, the copy constructor and copy assignment operator will be deleted (as in unique_ptr).

Quick and dirty implementation of 'std::type_info'

Although static_ptr can store any object, it is better to "know" the exact type of the object static_ptr contains. For example, this would help us call the destructor of this particular object and do other things.

Here's what I've come up with after a few tries— we need to use the ops structure:

struct ops
{
  using binary_func = void(*)(void *dst, void *src);
  using unary_func = void(*)(void *dst);

  binary_func move_construct_func;
  binary_func move_assign_func;
  unary_func destruct_func;
};

And a couple of auxiliary functions to cast void* to T*...

template <typename T, typename Functor>
void call_typed_func(void *dst, void *src)
{
  Functor::call(static_cast<T*>(dst), static_cast<T*>(src));
}

template <typename T>
void destruct_func(void *dst)
{
  static_cast<T*>(dst)->~T();
}

And now we can set each T type to have our own copy of ops:

template <typename T>
static constexpr ops ops_for
{
  .move_construct_func = &call_typed_func<T, move_constructer<T>>,
  .move_assign_func = &call_typed_func<T, move_assigner<T>>,
  .destruct_func = &destruct_func<T>,
};

using ops_ptr = const ops *;

Now static_ptr stores a reference to the ops_for<T>, where T is the class of the object. And static_ptr contains this object.

The "I like to move it, move it" implementation

We can't copy static_ptr - we can only move it to another static_ptr. To choose a move we need to determine the type of both static_ptr objects.

  • If both static_ptr are empty (dst_ops = src_ops = nullptr) — no-op.
  • If both static_ptr contain the same type of objects (dst_ops = src_ops) — use move assign and destroy the object in src.
  • If static_ptr contain different types (dst_ops != src_ops) — destroy the object in dst, use move construct, destroy object in src, assign dst_ops to src_ops.

Here's the implementation of the following method:

// moving objects using ops
static void move_construct(void *dst_buf, ops_ptr &dst_ops,
                           void *src_buf, ops_ptr &src_ops)
{
  if (!src_ops && !dst_ops)
  {
    // both object are nullptr_t, do nothing
    return;
  }
  else if (src_ops == dst_ops)
  {
    // objects have the same type, make move
    (*src_ops->move_assign_func)(dst_buf, src_buf);
    (*src_ops->destruct_func)(src_buf);
    src_ops = nullptr;
  }
  else
  {
    // objects have different type
    // delete the old object
    if (dst_ops)
    {
      (*dst_ops->destruct_func)(dst_buf);
      dst_ops = nullptr;
    }
    // construct the new object
    if (src_ops)
    {
      (*src_ops->move_construct_func)(dst_buf, src_buf);
      (*src_ops->destruct_func)(src_buf);
    }
    dst_ops = src_ops;
    src_ops = nullptr;
  }
}

The buffer size and alignment implementation

Now we need to determine the default buffer size and the alignment, because std::aligned_storage requires these two values.

Obviously, the alignment of the derived class may surpass the alignment of the base class. Therefore, the alignment should be as maximum as possible. The std::max_align_t type helps us to do the following:

static constexpr std::size_t align = alignof(std::max_align_t);

My systems set it to 16, but some non-standard values are also possible.

And the memory from the malloc heap is also aligned to the maximum possible value by default.

The default buffer size can be set to 16 bytes or to sizeof(T), so we need to choose the one whose value will be larger.

template <typename T>
struct static_ptr_traits
{
  static constexpr std::size_t buffer_size =
    std::max(static_cast<std::size_t>(16), sizeof(T));
};

Obviously, we should add a specialization for our custom type, so objects of all derived classes could be stored. It's better to define a macro. It allows us to write code faster. We can create the macro to specify the buffer size for some class:

#define STATIC_PTR_BUFFER_SIZE(Tp, size)               \
namespace sp                                           \
{                                                      \
  template<> struct static_ptr_traits<Tp>              \
  {                                                    \
    static constexpr std::size_t buffer_size = size;   \
  };                                                   \
}

// example:
STATIC_PTR_BUFFER_SIZE(IEngine, 1024)

However, this is not enough for the target size to be "inherited" by all children. We can add one more macro using the std::is_base_of class template:

#define STATIC_PTR_INHERITED_BUFFER_SIZE(Tp, size)        \
namespace sp                                              \
{                                                         \
  template <typename T> requires std::is_base_of_v<Tp, T> \
  struct static_ptr_traits<T>                             \
  {                                                       \
    static constexpr std::size_t buffer_size = size;      \
  };                                                      \
}

// example:
STATIC_PTR_INHERITED_BUFFER_SIZE(IEngine, 1024)

The 'sp::static_ptr<T>' implementation

Now we implement the class. It contains only two fields — a reference to ops and the buffer for an object:

template <typename Base> requires(!std::is_void_v<Base>)
class static_ptr
{
private:
    static constexpr std::size_t buffer_size =
      static_ptr_traits<Base>::buffer_size;
    
    static constexpr std::size_t align = alignof(std::max_align_t);

    // Struct for calling object's operators
    // equals to `nullptr` when `buf_` contains no object
    // equals to `ops_for<T>` when `buf_` contains a `T` object
    ops_ptr ops_;

    // Storage for underlying `T` object
    // this is mutable so that `operator*` and `get()` can
    // be marked const
    mutable std::aligned_storage_t<buffer_size, align> buf_;

    // ...

First, we implement the reset function, which deletes the object. This function is commonly used:

    // destruct the underlying object
    void reset() noexcept(std::is_nothrow_destructible_v<Base>)
    {
      if (ops_)
      {
        (ops_->destruct_func)(&buf_);
        ops_ = nullptr;
      }
    }

Next, we implement basic constructors in the same way as std::unique_ptr:

    // operators, ctors, dtor
    static_ptr() noexcept : ops_ { nullptr } {}

    static_ptr(std::nullptr_t) noexcept : ops_ { nullptr } {}

    static_ptr& operator=(std::nullptr_t)
      noexcept(std::is_nothrow_destructible_v<Base>)
    {
      reset();
      return *this;
    }

Now we implement move constructor and move assignment operator:

    static_ptr(static_ptr &&rhs) : ops_ {  nullptr  }
    {
      move_construct(&buf_, ops_, &rhs.buf_, rhs.ops_);
    }

    static_ptr& operator=(static_ptr &&rhs)
    {
      move_construct(&buf_, ops_, &rhs.buf_, rhs.ops_);
      return *this;
    }

It is better, if we can accept static_ptr of other types. The other type should fit in the buffer and should be inherited from the current type:

  template <typename Derived>
  struct derived_class_check
  {
    static constexpr bool ok = sizeof(Derived) <= buffer_size
                            && std::is_base_of_v<Base, Derived>;
  };

We have to declare all instantiations as a 'friend' class:

  // support static_ptr's conversions of different types
  template <typename T> friend class static_ptr;

Then we need to rewrite the previous two functions as follows:

  template <typename Derived = Base>
  static_ptr(static_ptr<Derived> &&rhs)
    requires(derived_class_check<Derived>::ok)
      : ops_ { nullptr }
  {
    move_construct(&buf_, ops_, &rhs.buf_, rhs.ops_);
  }

  template <typename Derived = Base>
  static_ptr& operator=(static_ptr<Derived> &&rhs)
    requires(derived_class_check<Derived>::ok)
  {
    move_construct(&buf_, ops_, &rhs.buf_, rhs.ops_);
    return *this;
  }

The copy constructor is deleted:

  static_ptr(const static_ptr &) = delete;

  static_ptr& operator=(const static_ptr &) = delete;

The destructor destroys the object in the buffer:

  ~static_ptr()
  {
    reset();
  }

To create an object in the buffer in-place, we implement the emplace function. The existing object will be destructed, a new one will be constructed in the buffer, and the pointer to ops will be updated:

  // in-place (re)initialization
  template <typename Derived = Base, typename ...Args>
  Derived& emplace(Args &&...args)
    noexcept(std::is_nothrow_constructible_v<Derived, Args...>)
    requires(derived_class_check<Derived>::ok)
  {
    reset();
    Derived* derived = new (&buf_) Derived(std::forward<Args>(args)...);
    ops_ = &ops_for<Derived>;
    return *derived;
  }

Then we implement assessors functions in the same way as those that the std::unique_ptr contains:

  // accessors
  Base* get() noexcept
  {
    return ops_ ? reinterpret_cast<Base*>(&buf_) : nullptr;
  }

  const Base* get() const noexcept
  {
    return ops_ ? reinterpret_cast<const Base*>(&buf_) : nullptr;
  }

  Base& operator*() noexcept { return *get(); }
  const Base& operator*() const noexcept { return *get(); }

  Base* operator&() noexcept { return get(); }
  const Base* operator&() const noexcept { return get(); }

  Base* operator->() noexcept { return get(); }
  const Base* operator->() const noexcept { return get(); }

  operator bool() const noexcept { return ops_; }
};

And finally we implement the sp::make_static function similar to the std::make_unique and std::make_shared functions:

template <typename T, class ...Args>
static static_ptr<T> make_static(Args &&...args)
{
  static_ptr<T> ptr;
  ptr.emplace(std::forward<Args>(args)...);
  return ptr;
}

Code is available on GitHub!

How to use sp::static_ptr<T>

It's easier than you would think! I have written unit tests that have the lifetime of the inside objects of static_ptr.

This test also contains typical scenarios for static_ptr and behavior of the objects inside static_ptr objects.

Benchmark

For benchmarks I used the google/benchmark library. You can find the code in the repository.

I described two scenarios, each of them checks the std::unique_ptr and sp::static_ptr class templates:

  • Creating a smart pointer and calling a non-static member function of an object.
  • Iterating over a vector of 128 smart pointers, each one calling a non-static member function of an object.

In the first scenario, the sp::static_ptr should benefit from no dynamic allocation. In the second scenario, the sp::static_ptr should benefit from memory locality. Although, it is obvious that compilers are smart and can fix "bad" scenarios, depending on the optimization flags.

Let's run benchmark in Debug:

***WARNING*** Library was built as DEBUG. Timings may be affected.
--------------------------------------------------------------------------------
Benchmark                           Time               CPU            Iterations
--------------------------------------------------------------------------------
SingleUniquePointer               207 ns            207 ns               3244590
SingleStaticPointer              39.1 ns           39.1 ns              17474886
IteratingOverUniquePointers      3368 ns           3367 ns                204196
IteratingOverStaticPointers      1716 ns           1716 ns                397344
--------------------------------------------------------------------------------

And now, in Release:

--------------------------------------------------------------------------------
Benchmark                           Time               CPU            Iterations
--------------------------------------------------------------------------------
SingleUniquePointer              14.5 ns           14.5 ns              47421573
SingleStaticPointer              3.57 ns           3.57 ns             197401957
IteratingOverUniquePointers       198 ns            198 ns               3573888
IteratingOverStaticPointers       195 ns            195 ns               3627462
--------------------------------------------------------------------------------

So, the sp::static_ptr implementation that is a stack-only analog of std::unique_ptr gains in performance.



Comments (0)

Next comments next comments
close comment form