User Defined Literals

Units

In my post, Strong Typing or Naked Primitives, I created a Size struct whose constructor takes two arguments, a Width object and a Height object. Here are the definitions:

struct Size final
{
public:
    Size(const Width& w, const Height& h) 
        : m_width(w), m_height(h) {}
    Width getWidth() const noexcept 
        { return m_width; }
    Height getHeight() const noexcept 
        { return m_height;}
private:
    Width m_width;
    Height m_height;
};

class Width
{
public:
    explicit Width(const uint32_t width) 
        : m_width(width) {}
    uint32_t getWidth() const noexcept 
        { return m_width; }
    // const, so the conversion also works on a const Width
    operator uint32_t() const noexcept { return m_width; }
private:
    uint32_t m_width;
};

class Height
{
public:
    explicit Height(const uint32_t height) 
        : m_height(height) {}
    uint32_t getHeight() const noexcept 
        { return m_height; }
    // const, so the conversion also works on a const Height
    operator uint32_t() const noexcept { return m_height; }
private:
    uint32_t m_height;
};

The arguments in the Width and Height constructors are integers, but what are the units that these integers specify? If you guessed pixels, you win a prize! Well, not really.

When documenting Width and Height, I would mention that the constructor arguments are in units of pixels. Even if you saw

Size size(Width(400), Height(300));

somewhere in source code you had to maintain, you would probably assume that 400 and 300 are in units of pixels. But what about Size size(Width(5), Height(3))? Could you tell without viewing either the documentation or the declaration of Height and Width if the units are pixels? Perhaps the units are inches instead.

I could add a second constructor to Width and Height that takes a double and converts a value in inches to pixels. But wait a minute: most of the world uses metric units, not Imperial or US customary units. What if I want to input values in centimetres? I can’t simply add another constructor that takes a double as the number of centimetres, because there is already a constructor that takes a double argument.

One potential solution would be to have different classes for each unit, such as WidthInPixels, WidthInInches, WidthInCentimeters, and so forth. So then the Size struct would have constructors like this:

Size(const WidthInPixels& wp, const HeightInPixels& hp);
Size(const WidthInPixels& wp, const HeightInInches& hin);
Size(const WidthInInches& win, const HeightInPixels& hp);
Size(const WidthInInches& win, const HeightInInches& hin);

and so forth. The number of constructors for Size goes up as the square of the number of different units. This would quickly get out of control.

A second alternative is to use a tag argument to indicate what the units are:

class Pixels{};
class Inches{};
class Centimeters{};
...
Width(Pixels, const uint32_t pixels) {...};
Width(Inches, const double inches) {...};
Width(Centimeters, const double cm) {...};

With tagged constructors there is no longer a constructor that takes a single value, so even if the Width and Height classes also had an assignment operator (operator=), you could not write this:

Width width = 400;
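A minimal sketch of the tag approach might look like the following. It assumes values are stored as pixels at 96 pixels per inch; the constants kPixelsPerInch and kCentimetersPerInch are names I made up for illustration, and fractional pixels are simply truncated.

```cpp
#include <cstdint>

// Empty tag types: they carry no data and exist only to select a constructor.
class Pixels {};
class Inches {};
class Centimeters {};

// Assumptions for this sketch: 96 pixels per inch, 2.54 cm per inch.
constexpr double kPixelsPerInch = 96.0;
constexpr double kCentimetersPerInch = 2.54;

class Width
{
public:
    Width(Pixels, const uint32_t pixels)
        : m_width(pixels) {}
    Width(Inches, const double inches)
        : m_width(static_cast<uint32_t>(inches * kPixelsPerInch)) {}
    Width(Centimeters, const double cm)
        : m_width(static_cast<uint32_t>(cm / kCentimetersPerInch * kPixelsPerInch)) {}
    uint32_t getWidth() const noexcept { return m_width; }
private:
    uint32_t m_width;   // always stored in pixels
};
```

A caller would then write Width(Inches{}, 2.0) or Width(Pixels{}, 400), so the unit is visible at the call site, at the cost of an extra argument everywhere.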

A third alternative might be to use templates, but that gets even more complicated when you want to return the values in specific units.

None of these alternatives can be viewed as ideal solutions.

User-Defined Literals

For built-in or common variable types, C++ uses a number of literal prefixes and suffixes to specify the type of a value. For example:

auto i = 5UL;         // unsigned long
auto j = 3LL;         // long long
auto k = 'x';         // char
auto l = L'x';        // wchar_t
auto m = "one"s;      // std::string
auto n = 6.0 + 14.3i; // std::complex<double>
auto o = u32"one";    // UTF-32 encoded string

Wouldn’t it be great if you could have single constructor definitions for Width and Height and specify the units with the values? Something like this:

Height h(500pixels);
Height(3.2inches);
Height(12.1cm);

C++11 introduced user-defined literals and C++14 extended them. Here is the way to define user-defined literals:

ReturnType operator "" _Suffix (Parameters) { /* do something */ };

There are a number of rules and restrictions:

  1. ReturnType can be anything including void.
  2. Suffix must be preceded by an underscore. Suffixes that do not begin with an underscore are reserved for the literal operators supplied by the standard library.
  3. For C++11, there must be whitespace between the "" and the underscore. C++14 removed that restriction.
  4. For C++11, the first character of Suffix must be lower case. C++14 removed that restriction. If the Suffix starts with an upper-case letter, there must be no whitespace between "" and the underscore, because an underscore followed by an upper-case letter is otherwise a reserved identifier.
  5. For C++11, the Suffix cannot be a reserved word. C++14 removed that restriction. Again, there must be no whitespace between "" and the underscore.
  6. Parameters must be built-in types (like integers, floating-point values, character strings, and so forth).
  7. Integer parameters must be specified as unsigned long long and floating-point parameters as long double to ensure that all number types are accepted.
  8. User-defined literal definitions should be placed inside a namespace.
  9. Wherever possible, user-defined literals should be marked as constexpr.
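To make the parameter rules concrete, here is a small sketch of three common parameter forms. The demo namespace and the _twice, _half and _len suffixes are hypothetical names chosen only for illustration.

```cpp
#include <cstddef>

namespace demo {

// Integer literals arrive as unsigned long long.
constexpr unsigned long long operator""_twice(unsigned long long n)
{
    return n * 2;
}

// Floating-point literals arrive as long double.
constexpr long double operator""_half(long double x)
{
    return x / 2.0L;
}

// String literals arrive as a pointer plus a length (terminator not counted).
constexpr std::size_t operator""_len(const char*, std::size_t length)
{
    return length;
}

}  // namespace demo

using namespace demo;
static_assert(21_twice == 42, "integer form");
static_assert(3.0_half == 1.5L, "floating-point form");
static_assert("pixels"_len == 6, "string form");
```

Note that each suffix only accepts the kind of literal it was declared for: with these definitions, 2.5_twice does not compile because _twice has no floating-point overload.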

Assuming we want to store the values as pixel values, and that there are 96 pixels per inch (MS Windows), here is how we could define literals and use them in our code:

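#include <cassert>
#include <cstdint>

namespace units {
// Integer literals such as 400_pix: the value is already in pixels.
constexpr uint32_t operator "" _pix(unsigned long long pixels)
{
    return static_cast<uint32_t>(pixels);
}

// Floating-point literals such as 3.0_in: convert at 96 pixels per inch.
// The two-statement body needs C++14; C++11 allows only a single return.
constexpr uint32_t operator "" _in(long double inches)
{
    assert(inches >= 0.0L);
    return static_cast<uint32_t>(inches * 96.0L);
}
}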

using namespace units;
Size size(Width(400_pix), Height(100_pix + 3.0_in));
Size size2(Width(350), Height(200));
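The centimetre case raised in the Units section could follow the same pattern. The sketch below is an assumption rather than part of the definitions above: the _cm suffix and the two constants are hypothetical names, it keeps the figure of 96 pixels per inch, and it truncates fractional pixels.

```cpp
#include <cstdint>

namespace units {

// Assumptions for this sketch: values are stored as pixels, at 96 pixels per inch.
constexpr long double kPixelsPerInch = 96.0L;
constexpr long double kCentimetersPerInch = 2.54L;

// Hypothetical centimetre suffix: centimetres -> inches -> pixels.
constexpr uint32_t operator""_cm(long double centimeters)
{
    return static_cast<uint32_t>(centimeters / kCentimetersPerInch * kPixelsPerInch);
}

}  // namespace units
```

With only the long double overload defined, the literal must contain a decimal point: 12.0_cm compiles, while 12_cm would need an additional unsigned long long overload.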

Notes:

  1. Because the user-defined literals are defined in a namespace above (units namespace), the using namespace units; line is required. You cannot preface the user-defined literal with the namespace name. For example:
    Size size(Width(400units::_pix), Height(100units::_pix + 3.0units::_in));

    will not compile.

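  2. Width and Height still accept uint32_t values as constructor arguments. This is useful when the pixel values come from other variables. Because the constructors are explicit, the values still have to be wrapped in Width and Height. For example:
    wxSize wxS = ...;
    Size size(Width(wxS.GetWidth()), Height(wxS.GetHeight()));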
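On note 1: if you would rather not pull in the whole namespace, there are two spellings that do compile. The sketch below repeats the _pix definition so that it stands alone; the two function names are hypothetical.

```cpp
#include <cstdint>

namespace units {
constexpr uint32_t operator""_pix(unsigned long long pixels)
{
    return static_cast<uint32_t>(pixels);
}
}  // namespace units

// Option 1: a using-declaration imports just the one literal operator.
inline uint32_t widthWithUsingDeclaration()
{
    using units::operator""_pix;
    return 400_pix;
}

// Option 2: call the operator as an ordinary function by its qualified name.
inline uint32_t widthWithQualifiedCall()
{
    return units::operator""_pix(400);
}
```

The second form is verbose enough that it defeats the purpose of a literal, but it shows that a literal operator is just a function with an unusual name.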

Pros

  1. Values can be specified in different units.

Cons

  1. Use of user-defined literals is limited to literal (constant) values; you cannot use them with variables. Because the operands are always constants, I marked the definitions above constexpr. For example, the following will not compile, because width_pix is read as the name of an undeclared variable:
    uint32_t width = 200;
    Size size(Width(width_pix), Height(4.0_in));

    though the following does compile and provides the desired result:

    uint32_t width = 200_pix;
    uint32_t height = 4.0_in;
    Size size(Width(width), Height(height));
  2. User-defined literals are limited to use as suffixes; they cannot be used as prefixes. That is:
    uint32_t width = _pix200;
    uint32_t height = _pix(200);

    Neither will compile.
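For the first limitation, one possible workaround is an ordinary conversion function for values that are only known at run time, with the literal operator forwarding to it. This is a sketch only: inchesToPixels is a hypothetical helper, and it keeps the figure of 96 pixels per inch.

```cpp
#include <cstdint>

namespace units {

// Hypothetical helper for values held in variables.
// Assumes 96 pixels per inch; fractional pixels are truncated.
constexpr uint32_t inchesToPixels(long double inches)
{
    return static_cast<uint32_t>(inches * 96.0L);
}

// The literal operator forwards to the helper, so both paths agree.
constexpr uint32_t operator""_in(long double inches)
{
    return inchesToPixels(inches);
}

}  // namespace units
```

A literal is then written as 4.0_in, and a variable as units::inchesToPixels(heightInInches).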

This post has just scratched the surface of user-defined literals.
See the references included in Additional Information, below, for more information on user-defined literals, their limitations, and more examples.

Additional Information

  1. User-defined literals
  2. Modern C++ Features – User-Defined Literals
  3. User defined literals – Part 1, Part 2, Part 3
  4. User defined literals
  5. User-Defined Literals (C++)

Strong Typing or Naked Primitives

Update 1: This post has been updated as the result of comments by legalize. Deletions are indicated by strikethrough and additions by text in blue.

Update 2: Added reference 6.

Is C++ Strongly or Weakly Typed?

There are a number of definitions of strong and weak typing. If you are interested, you can look them up using your favourite search engine. You can also see some of the references below. I am not going to add my definitions; I will just say that I think C++ is both strongly and weakly typed, and the programmer can do much to turn those weakly typed parts into strongly typed parts. That is the topic of this post.

Note: There is nothing earth shaking in this post. You will find a number of similar posts on the Internet, with the only differences being the examples. I have written this to help noobs, and to provide background information for future posts.

There is no var data type in C++ like there is in some languages, where the type is simply what appears most appropriate at that point in the code. C++ does have auto, but the type is determined at the time the variable is defined and cannot be implicitly changed. Types can be coerced or converted (cast) into other types (e.g. an integer into a floating-point number, an integer into a pointer, a double into a float, and so forth). Where these conversions are written as explicit casts, they theoretically do not violate strong typing; many of them also happen implicitly, though, and that can cause problems.

Common Variable Types as Arguments

One place where problems occur is in the use of common variable types as arguments. This and the following two posts will look at this problem and at potential solutions.

Look at the example, below.

Example (Weakly Typed Integer Arguments)

I have been creating a C++ library for Vulkan. One of the lower-level classes that I need is Size, a class that encapsulates the width and height of an object. So let’s look at the first iteration for this class (actually a struct):

struct Size final
{
public:
    Size(const uint32_t w, const uint32_t h) 
        : m_width(w), m_height(h) {}
    uint32_t getWidth() const noexcept 
        { return m_width; }
    uint32_t getHeight() const noexcept 
        { return m_height;}
private:
    uint32_t m_width;
    uint32_t m_height;
};

How would this be used? Like this:

Size size(400, 300);

So what is wrong with this? Look at this line of code in six months. Is 400 the width or the height?

The constructor takes two integers (uint32_t values) as input. That’s fine; everyone knows that width is specified before height, right? Well maybe in your world, but there is no such guarantee in mine. If by chance or mistake, the user of this struct specifies the height before the width, then that is just plain wrong. The program will compile, and the error may or may not be caught at runtime.
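To see the problem in action, here is a sketch that repeats the struct and adds two hypothetical helper functions, makeIntended and makeSwapped. Both compile cleanly, and nothing tells you that the second one is wrong.

```cpp
#include <cstdint>

struct Size final
{
    Size(const uint32_t w, const uint32_t h)
        : m_width(w), m_height(h) {}
    uint32_t getWidth() const noexcept { return m_width; }
    uint32_t getHeight() const noexcept { return m_height; }
private:
    uint32_t m_width;
    uint32_t m_height;
};

// Intended: 400 wide by 300 high.
inline Size makeIntended() { return Size(400, 300); }

// Arguments swapped by mistake: still compiles without any diagnostic.
inline Size makeSwapped() { return Size(300, 400); }
```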

Example (Strongly Typed Classes as Arguments)

Let’s fix this. To do so, we have to change the argument types in the constructor to indicate that one is a width and the other is a height. Let’s use Width and Height as the argument types:

struct Size final
{
public:
    Size(const Width& w, const Height& h) 
        : m_width(w), m_height(h) {}
    Width getWidth() const noexcept 
        { return m_width; }
    Height getHeight() const noexcept 
        { return m_height;}
private:
    Width m_width;
    Height m_height;
};

and here are the definitions for Width and Height:

class Width
{
public:
    explicit Width(const uint32_t width) 
        : m_width(width) {}
    uint32_t getWidth() const noexcept 
        { return m_width; }
    // const, so the conversion also works on a const Width
    operator uint32_t() const noexcept { return m_width; }
private:
    uint32_t m_width;
};

class Height
{
public:
    explicit Height(const uint32_t height) 
        : m_height(height) {}
    uint32_t getHeight() const noexcept 
        { return m_height; }
    // const, so the conversion also works on a const Height
    operator uint32_t() const noexcept { return m_height; }
private:
    uint32_t m_height;
};

We create a Size object as follows:

Size size(Width(400), Height(300));

Now there is no confusion; the width is 400 units and the height is 300 units, whatever the units are. If the programmer specifies Height before Width, the compiler will catch this and the code will not compile.
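One way to confirm this without writing code that fails to build is to ask the compiler with std::is_constructible. The sketch below repeats compact versions of the three classes so that it stands alone; the static_assert messages are my own wording.

```cpp
#include <cstdint>
#include <type_traits>

class Width
{
public:
    explicit Width(const uint32_t width) : m_width(width) {}
    uint32_t getWidth() const noexcept { return m_width; }
    operator uint32_t() const noexcept { return m_width; }
private:
    uint32_t m_width;
};

class Height
{
public:
    explicit Height(const uint32_t height) : m_height(height) {}
    uint32_t getHeight() const noexcept { return m_height; }
    operator uint32_t() const noexcept { return m_height; }
private:
    uint32_t m_height;
};

struct Size final
{
    Size(const Width& w, const Height& h) : m_width(w), m_height(h) {}
    Width getWidth() const noexcept { return m_width; }
    Height getHeight() const noexcept { return m_height; }
private:
    Width m_width;
    Height m_height;
};

static_assert(std::is_constructible<Size, Width, Height>::value,
              "width then height is accepted");
static_assert(!std::is_constructible<Size, Height, Width>::value,
              "height then width is rejected");
static_assert(!std::is_constructible<Size, uint32_t, uint32_t>::value,
              "bare integers are rejected because the constructors are explicit");
```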

Note that, instead, I could add a second constructor to Size that takes Height and then Width as arguments. The program will then compile, and there will still be no confusion as to what the arguments represent.
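A sketch of that second constructor, assuming it simply delegates to the first one (C++11 delegating constructors). Width and Height are repeated in compact form so that the example stands alone.

```cpp
#include <cstdint>

class Width
{
public:
    explicit Width(const uint32_t width) : m_width(width) {}
    uint32_t getWidth() const noexcept { return m_width; }
private:
    uint32_t m_width;
};

class Height
{
public:
    explicit Height(const uint32_t height) : m_height(height) {}
    uint32_t getHeight() const noexcept { return m_height; }
private:
    uint32_t m_height;
};

struct Size final
{
    Size(const Width& w, const Height& h)
        : m_width(w), m_height(h) {}
    // Second overload: same members, opposite argument order.
    // It delegates to the first constructor.
    Size(const Height& h, const Width& w)
        : Size(w, h) {}
    Width getWidth() const noexcept { return m_width; }
    Height getHeight() const noexcept { return m_height; }
private:
    Width m_width;
    Height m_height;
};
```

Both Size(Width(400), Height(300)) and Size(Height(300), Width(400)) then produce the same object, because the types, not the positions, decide which value goes where.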

Conclusions

  1. C++ is both strongly and weakly typed. Using the common variable types as arguments to functions and methods can still cause a number of problems.
  2. By creating classes for weakly typed values, it is possible to make them strongly typed. Replacing these arguments with classes helps both the compiler and the programmer to ensure that arguments to functions and methods are both correct and in the correct order.

References

  1. Strong and Weak Typing
  2. Is C Strongly Typed?
  3. Is C++ Considered Weakly Typed? Why?
  4. Use Stronger Types!
  5. C++ strongly typed typedef
  6. Strong types for strong interfaces