Strong Typing or Naked Primitives

Update 1: This post has been updated as the result of comments by legalize. Deletions are indicated by strikethrough and additions by text in blue.

Update 2: Added reference 6.

Is C++ Strongly or Weakly Typed?

There are a number of definitions of strong and weak typing. If you are interested, you can look them up using your favourite search engine. You can also see some of the references below. I am not going to add my definitions; I will just say that I think C++ is both strongly and weakly typed, and the programmer can do much to turn those weakly typed parts into strongly typed parts. That is the topic of this post.

Note: There is nothing earth-shaking in this post. You will find a number of similar posts on the Internet, with the only differences being the examples. I have written this to help noobs, and to provide background information for future posts.

There is no var data type in C++ like there is in some languages, where the type is simply what appears most appropriate at that point in the code. C++ does have auto, but the type is determined at the time the variable is defined and cannot be implicitly changed. Types can be coerced or converted (cast) into other types (e.g. an integer into a floating-point number, an integer into a pointer, a double into a float, and so forth). Some of these conversions require an explicit cast (an integer into a pointer, for example), so theoretically they do not violate strong typing. Others, such as an integer into a double or a double into a float, happen implicitly. Either kind can cause problems.
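Here is a minimal sketch of those two points. The function names are made up for illustration; nothing here comes from my library. It shows that auto settles on a type once, at the definition, and it contrasts a conversion written out with a cast against one the compiler performs without being asked.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper, for illustration only.
inline double halfOf(const int pixels)
{
    auto count = pixels;   // deduced as int at the definition
    static_assert(std::is_same<decltype(count), int>::value,
                  "auto fixes the type once, at the definition");

    // Explicit conversion: the cast documents the intent.
    return static_cast<double>(count) / 2.0;
}

// Hypothetical helper showing a conversion that needs no cast.
inline int truncatedHalf(const int pixels)
{
    double half = pixels;            // implicit int -> double
    half /= 2.0;
    return static_cast<int>(half);   // explicit double -> int, truncates
}

// Small usage check for the two helpers above.
inline void demoConversions()
{
    assert(halfOf(5) == 2.5);
    assert(truncatedHalf(5) == 2);
}
```

The implicit line compiles without a murmur, which is exactly why such conversions are easy to overlook.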

Common Variable Types as Arguments

One place where problems occur is in the use of common variable types as arguments. This and the following two posts will look at this problem and at potential solutions.

Look at the example below.

Example (Integer Arguments)

I have been creating a C++ library for Vulkan. One of the lower-level classes that I need is Size, a class that encapsulates the width and height of an object. So let’s look at the first iteration for this class (actually a struct):

#include <cstdint>

// First iteration: both dimensions are plain uint32_t values.
struct Size final
{
public:
    Size(const uint32_t w, const uint32_t h)
        : m_width(w), m_height(h) {}
    uint32_t getWidth() const noexcept
        { return m_width; }
    uint32_t getHeight() const noexcept
        { return m_height; }
private:
    uint32_t m_width;
    uint32_t m_height;
};

How would this be used? Like this:

Size size(400, 300);

So what is wrong with this? Look at this line of code in six months. Is 400 the width or the height?

The constructor takes two integers (uint32_t values) as input. That’s fine; everyone knows that width is specified before height, right? Well maybe in your world, but there is no such guarantee in mine. If by chance or mistake, the user of this struct specifies the height before the width, then that is just plain wrong. The program will compile, and the error may or may not be caught at runtime.
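To make the failure concrete, here is a small sketch. The struct is repeated so the snippet compiles on its own, and makeWindowSize is a hypothetical caller who believes the height comes first.

```cpp
#include <cassert>
#include <cstdint>

// Same shape as the Size struct above, repeated so this compiles alone.
struct Size final
{
    Size(const uint32_t w, const uint32_t h) : m_width(w), m_height(h) {}
    uint32_t getWidth() const noexcept { return m_width; }
    uint32_t getHeight() const noexcept { return m_height; }
private:
    uint32_t m_width;
    uint32_t m_height;
};

// Hypothetical caller who passes the height first by mistake.
inline Size makeWindowSize()
{
    const uint32_t height = 300;
    const uint32_t width = 400;
    return Size(height, width);   // compiles cleanly, yet is transposed
}

// Shows what the mistaken call actually stored.
inline void demoTransposition()
{
    const Size size = makeWindowSize();
    assert(size.getWidth() == 300);    // the "width" is really the height
    assert(size.getHeight() == 400);   // and vice versa
}
```

The compiler has nothing to object to here, because both arguments have the same type.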

Example (Classes as Arguments)

Let’s fix this. To do so, we have to change the argument types in the constructor to indicate that one is a width and the other is a height. Let’s use Width and Height as the argument types:

struct Size final
{
public:
    Size(const Width& w, const Height& h) 
        : m_width(w), m_height(h) {}
    Width getWidth() const noexcept 
        { return m_width; }
    Height getHeight() const noexcept 
        { return m_height;}
private:
    Width m_width;
    Height m_height;
};

and here are the definitions for Width and Height:

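class Width
{
public:
    explicit Width(const uint32_t width)
        : m_width(width) {}
    uint32_t getWidth() const noexcept
        { return m_width; }
    // Implicit conversion back to the raw value; const so that it
    // also works on const objects and const references.
    operator uint32_t() const noexcept { return m_width; }
private:
    uint32_t m_width;
};

class Height
{
public:
    explicit Height(const uint32_t height)
        : m_height(height) {}
    uint32_t getHeight() const noexcept
        { return m_height; }
    // Implicit conversion back to the raw value; const so that it
    // also works on const objects and const references.
    operator uint32_t() const noexcept { return m_height; }
private:
    uint32_t m_height;
};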

We create a Size object as follows:

Size size(Width(400), Height(300));

Now there is no confusion; the width is 400 units and the height is 300 units, whatever the units are. If the programmer specifies Height before Width, the compiler will catch this and the code will not compile.
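One way to see the compiler doing this work, without writing code that fails to build, is to ask it with std::is_constructible. This is only a sketch: Width and Height are trimmed here (the conversion operators are left out) so that the snippet stands alone.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Trimmed versions of the classes above, repeated so this compiles alone.
class Width
{
public:
    explicit Width(const uint32_t width) : m_width(width) {}
    uint32_t getWidth() const noexcept { return m_width; }
private:
    uint32_t m_width;
};

class Height
{
public:
    explicit Height(const uint32_t height) : m_height(height) {}
    uint32_t getHeight() const noexcept { return m_height; }
private:
    uint32_t m_height;
};

struct Size final
{
    Size(const Width& w, const Height& h) : m_width(w), m_height(h) {}
    Width getWidth() const noexcept { return m_width; }
    Height getHeight() const noexcept { return m_height; }
private:
    Width m_width;
    Height m_height;
};

// The intended order is accepted.
static_assert(std::is_constructible<Size, Width, Height>::value,
              "Size(Width, Height) must compile");
// Swapped arguments are rejected at compile time.
static_assert(!std::is_constructible<Size, Height, Width>::value,
              "Size(Height, Width) must not compile");
// Naked integers are rejected too, because the constructors are explicit.
static_assert(!std::is_constructible<Size, uint32_t, uint32_t>::value,
              "Size(400, 300) must not compile");

// Usage check: the values end up where the types say they should.
inline void demoTypedSize()
{
    const Size size(Width(400), Height(300));
    assert(size.getWidth().getWidth() == 400);
    assert(size.getHeight().getHeight() == 300);
}
```

The explicit keyword on the two constructors matters here; without it, Size(400, 300) would compile again and the ambiguity would return.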

Note that, instead, I could add a second constructor to Size that takes Height and then Width as arguments. The program will then compile, and there will still be no confusion as to what the arguments represent.
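A sketch of what that second constructor might look like, again with trimmed copies of Width and Height so that it compiles alone. Forwarding to the first constructor is my choice for this example, not a requirement.

```cpp
#include <cassert>
#include <cstdint>

// Trimmed versions of the classes above, repeated so this compiles alone.
class Width
{
public:
    explicit Width(const uint32_t width) : m_width(width) {}
    uint32_t getWidth() const noexcept { return m_width; }
private:
    uint32_t m_width;
};

class Height
{
public:
    explicit Height(const uint32_t height) : m_height(height) {}
    uint32_t getHeight() const noexcept { return m_height; }
private:
    uint32_t m_height;
};

struct Size final
{
    Size(const Width& w, const Height& h) : m_width(w), m_height(h) {}
    // Second overload: arguments in the other order, forwarded to the first.
    Size(const Height& h, const Width& w) : Size(w, h) {}
    Width getWidth() const noexcept { return m_width; }
    Height getHeight() const noexcept { return m_height; }
private:
    Width m_width;
    Height m_height;
};

// Both argument orders produce the same object.
inline void demoEitherOrder()
{
    const Size a(Width(400), Height(300));
    const Size b(Height(300), Width(400));
    assert(a.getWidth().getWidth() == b.getWidth().getWidth());
    assert(a.getHeight().getHeight() == b.getHeight().getHeight());
}
```

Overload resolution picks the right constructor from the argument types alone, so the caller can no longer get the order wrong.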

Conclusions

  1. C++ is both strongly and weakly typed. Using the common variable types as arguments to functions and methods can still cause a number of problems.
  2. By creating classes for weakly typed values, it is possible to make them strongly typed. Replacing these arguments with classes helps both the compiler and the programmer to ensure that arguments to functions and methods are both correct and in the correct order.
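Width and Height above are nearly identical, so one possible refinement is a single wrapper template distinguished by a tag type. This is only a sketch: NamedValue, WidthTag, HeightTag and area are hypothetical names that do not appear in my library.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical generic wrapper: the Tag parameter exists only to make
// each instantiation a distinct type.
template <typename T, typename Tag>
class NamedValue
{
public:
    explicit NamedValue(const T& value) : m_value(value) {}
    const T& get() const noexcept { return m_value; }
private:
    T m_value;
};

struct WidthTag {};
struct HeightTag {};
using Width = NamedValue<uint32_t, WidthTag>;
using Height = NamedValue<uint32_t, HeightTag>;

// Same underlying type, yet the compiler keeps the two apart.
static_assert(!std::is_same<Width, Height>::value,
              "Width and Height are distinct types");
static_assert(!std::is_convertible<Width, Height>::value,
              "a Width cannot silently become a Height");

// Hypothetical function whose arguments cannot be transposed.
inline uint32_t area(const Width& w, const Height& h)
{
    return w.get() * h.get();
}

// Usage check for the sketch above.
inline void demoNamedValue()
{
    assert(area(Width(400), Height(300)) == 120000u);
}
```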

References

  1. Strong and Weak Typing
  2. Is C Strongly Typed?
  3. Is C++ Considered Weakly Typed? Why?
  4. Use Stronger Types!
  5. C++ strongly typed typedef
  6. Strong types for strong interfaces

5 thoughts on “Strong Typing or Naked Primitives”

  1. I wouldn’t exactly refer to these problems as weakly typed. The types in question are strongly typed — they are ints. However int is not the natural type of the problem domain here, which are the two dimensions width and height. Introducing domain specific types for these does solve the problem and a good programming exercise is to practice some problem with the “No Naked Primitives” constraint imposed.

    To explore the concept of embracing the natural types coming from your problem domain, I would recommend reading Eric Evans’s great book “Domain Driven Design”

    • I have made a number of updates based on your comments.
      Note, however, new reference #6, which supports the idea that interfaces that use built-in types (common variable types) are not strongly typed.

      • I’m going to go along with wikipedia’s idea of “strong” and “weak” typing:
        https://en.wikipedia.org/wiki/Strong_and_weak_typing

        In this sense, C++ is strongly typed. WP also says that the terms “strongly typed” and “weakly typed” are not well defined. I agree. This is why Reference 6 uses the term “strong typing” for what I would call “domain specific types”. It doesn’t make him wrong and make me right, it is simply going back to WP’s point that these terms are not well defined. To the extent that they are well-defined, C++ has always been considered strongly typed. You get a compile error if you don’t pass in a type that matches the declared type or has an implicit conversion from the actual type used to the declared type.

        Languages that are considered weakly typed, like most scripting languages, do not give you these sorts of errors up front. You don’t find out until you execute your code, which is why I consider extensive unit testing mandatory for writing anything of significant size in a scripting language like PHP, JavaScript, Python, etc. You don’t want to find these problems during deployment, but during development.

        I agree with the author of Reference 6 that directly encapsulating constraints and concepts from your problem domain into your types can relieve you of many headaches. On my current team we’ve often had the discussion around creating custom types to represent specific kinds of strings in our application so that it is clear when we are using arbitrary strings and when we are using specific concepts from our problem domain that happen to be encoded as strings in a certain format.

        In a certain sense you can think of an enum as the simplest way of differentiating between all possible integer values and a specific set of integer values that have specific meaning. I find that string values of a specific form are one of the most frequent kinds of values that should be encapsulated into a domain specific type instead of just a naked string. The path type in the proposed C++ File System library is a good example of something that is ordinarily used as a string in most applications but has been encapsulated into a class to reflect the constraints and usages in the problem domain (files, directories, etc.).

  2. Pingback: User Defined Literals – Using C++

  3. Pingback: More on Naked Primitives | Using C++
