AKA Zig casts, and when to use them
Today we'll just talk about integer casting, but Zig also provides explicit casting support for floats, bools, enums, pointers, and more. For the full list of explicit casts, see https://ziglang.org/documentation/master/#Explicit-Casts
There are two major problems with integer casting in C: you don't always know when it happens, and you don't always know what it does. That's a bold claim, but what I mean is that a purely textual analysis of a C program (e.g., via {% c-line %}grep{% c-line-end %} or {% c-line %}ctrl+f{% c-line-end %}) cannot find all of the integer casts, because some conversions are implicit. And for the casts it does find, nothing in the text distinguishes truncation, reinterpretation, or extension.
For an example of the former problem, try compiling and running this program:
{% c-block language="c" %}
#include <stdlib.h>

int main(void) {
    if (sizeof(int) < -1) abort();
    return 0;
}
{% c-block-end %}
You won't even get a warning unless {% c-line %}-Wextra{% c-line-end %} or {% c-line %}-Weverything{% c-line-end %} is turned on. What's happening is that {% c-line %}-1{% c-line-end %} is implicitly converted to the unsigned type of the {% c-line %}sizeof{% c-line-end %} result ({% c-line %}size_t{% c-line-end %}), which turns it into the largest value that type can hold. The comparison is therefore true, and the program aborts.
There are many other ways implicit conversions can cause trouble.
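As one illustration of those other ways, here is a minimal C sketch of two more implicit conversions that compile without any cast appearing in the source. The function names ({% c-line %}store_in_byte{% c-line-end %} and {% c-line %}less_than{% c-line-end %}) are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Implicit narrowing: the int is silently converted to uint8_t on
   assignment, so only the low 8 bits survive. */
uint8_t store_in_byte(int value) {
    uint8_t byte = value; /* no cast in sight */
    return byte;
}

/* Mixed-sign comparison: `a` is converted to unsigned before the
   comparison, so a negative `a` becomes a very large number. */
bool less_than(int a, unsigned b) {
    return a < b;
}
```

Calling {% c-line %}store_in_byte(300){% c-line-end %} yields 44, and {% c-line %}less_than(-1, 1){% c-line-end %} yields false. Neither function contains anything a text search for casts would match.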
Zig solves these problems by 1) eliminating implicit conversions unless they are guaranteed to be safe (for example, assigning a {% c-line %}u8{% c-line-end %} value to a {% c-line %}u16{% c-line-end %} variable cannot fail or lose data) and 2) giving each casting operator its own built-in function, making it easy to audit their use with simple text-based tools.
So, let's review those built-ins and when to use them.
@as()
{% c-line %}@as(){% c-line-end %} is only allowed when the casting operation is unambiguous and safe, meaning no information will be lost and the target type is guaranteed to be able to hold every value of the source type.
When to use: casting a compile-time integer for use in type inference:
{% c-block language="zig" %}
var x = 5; // not allowed - should x be signed? unsigned? what size?
var x = @as(u8, 5); // type inference allows the compiler to determine that x is type u8
{% c-block-end %}
When to use: casting an unsigned int to a larger signed int:
{% c-block language="zig" %}
var x : u8 = 5;
var y = @as(i32, x);
{% c-block-end %}
When to use: casting an int to a larger-size int of the same sign:
{% c-block language="zig" %}
var x : u8 = 5;
var y = @as(u32, x);
{% c-block-end %}
@truncate()
{% c-line %}@truncate(){% c-line-end %} is used to explicitly cast to a smaller-size integer with the same signedness, by removing the most-significant bits. It's the equivalent of an explicit C cast from a larger to a smaller integer of the same sign.
{% c-block language="zig" %}
var x = @as(u16, 513); // x in binary: 0000001000000001
var y = @truncate(u8, x); // y in binary: 00000001
{% c-block-end %}
Warning: You can call {% c-line %}@truncate(){% c-line-end %} on signed integers, but you need to make sure that's really what you want to do - since {% c-line %}@truncate{% c-line-end %} always removes the most significant bits, the resulting value may or may not be negative, regardless of whether the original value was negative.
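To make the warning concrete, here is a small C model of dropping the high byte of a signed 16-bit value and reading the remaining byte as a signed 8-bit number. {% c-line %}truncate_to_8_bits{% c-line-end %} is a hypothetical helper, not a Zig or libc function. It uses explicit arithmetic instead of a C cast so that the result does not depend on implementation-defined conversions.

```c
#include <stdint.h>

/* Keep the low 8 bits of a 16-bit signed value and interpret them as a
   two's-complement 8-bit number. */
int truncate_to_8_bits(int16_t value) {
    unsigned low = (unsigned)value & 0xFFu;          /* drop the high byte */
    return low >= 128u ? (int)low - 256 : (int)low;  /* sign comes from bit 7 */
}
```

With this model, 384 (binary 0000000110000000) becomes -128, so a positive value turns negative, while -129 (binary 1111111101111111) becomes 127, so a negative value turns positive. Small values such as 5 or -1 come through unchanged.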
@bitCast()
{% c-line %}@bitCast{% c-line-end %} is used to cast between types that are the same size, preserving the bitpattern. In other words, the data in memory does not change, but its interpretation may. In the context of integer casting, this is relevant when casting between signed and unsigned types, if the source value is outside the bounds of the target type. For example, given the {% c-line %}u8{% c-line-end %} value {% c-line %}180{% c-line-end %}, the bitpattern is {% c-line %}10110100{% c-line-end %}. If we use {% c-line %}@bitCast{% c-line-end %} to convert this to an {% c-line %}i8{% c-line-end %} (assuming two's complement representation), the value becomes {% c-line %}-76{% c-line-end %}. However, if the original value were e.g. {% c-line %}100{% c-line-end %}, it would still be {% c-line %}100{% c-line-end %} after {% c-line %}@bitCast{% c-line-end %}, since {% c-line %}100{% c-line-end %} has the same bitwise representation for {% c-line %}i8{% c-line-end %} and {% c-line %}u8{% c-line-end %}.
{% c-block language="zig" %}
var x = @as(u8, 180); // x in binary: 10110100 (value is 180)
var y = @bitCast(i8, x); // y in binary: 10110100 (value is -76)
{% c-block-end %}
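For comparison, one well-defined way to get the same effect in C is to copy the byte with {% c-line %}memcpy{% c-line-end %} rather than cast it. This is only a sketch, and {% c-line %}reinterpret_u8_as_i8{% c-line-end %} is a hypothetical name. It relies on {% c-line %}int8_t{% c-line-end %} being two's complement with no padding bits, which the C standard requires wherever that type exists.

```c
#include <stdint.h>
#include <string.h>

/* Copy the single byte unchanged; only the type used to read it differs. */
int8_t reinterpret_u8_as_i8(uint8_t value) {
    int8_t result;
    memcpy(&result, &value, sizeof result);
    return result;
}
```

Passing 180 returns -76, and passing 100 returns 100, matching the two cases described above.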
@intCast()
Finally, we have {% c-line %}@intCast{% c-line-end %}. In some ways {% c-line %}@intCast{% c-line-end %} is the dual of {% c-line %}@bitCast{% c-line-end %}: it casts between integer types at runtime, preserving the value but not necessarily the bitpattern. For example, {% c-line %}@intCast{% c-line-end %} could be used to convert an {% c-line %}i32{% c-line-end %} value of {% c-line %}3{% c-line-end %} to a {% c-line %}u8{% c-line-end %}. Since it's not always possible to do this (for example, a {% c-line %}u16{% c-line-end %} value of {% c-line %}0xFFFF{% c-line-end %} cannot be stored in a {% c-line %}u8{% c-line-end %}), {% c-line %}@intCast{% c-line-end %} can invoke safety-checked undefined behavior. In other words, if runtime safety is active (either due to the build mode or {% c-line %}@setRuntimeSafety(true){% c-line-end %}), then calling {% c-line %}@intCast{% c-line-end %} with a value that does not fit in the target type will panic the program rather than leaving an unexpected value in the destination. If runtime safety is turned off, you get undefined behavior.
{% c-block language="zig" %}
// zig build-exe test.zig && ./test
pub fn main() void {
    var x = @as(i16, 180);
    var y = @intCast(u8, x); // this is fine: 180 fits in a u8
    var z = @intCast(i8, y); // this will crash: 180 does not fit in an i8
}
{% c-block-end %}
{% c-block language="console" %}
➜ ~ zig run test.zig
thread 14280091 panic: integer cast truncated bits
/Users/ehaas/test.zig:4:13: 0x1018bf5bc in main (test)
var z = @intCast(i8, y);
^
/Users/ehaas/source/zig/build/lib/zig/std/start.zig:410:22: 0x1018c177c in std.start.callMain (test)
root.main();
^
/Users/ehaas/source/zig/build/lib/zig/std/start.zig:362:12: 0x1018bf767 in std.start.callMainWithArgs (test)
return @call(.{ .modifier = .always_inline }, callMain, .{});
^
/Users/ehaas/source/zig/build/lib/zig/std/start.zig:332:12: 0x1018bf6a5 in std.start.main (test)
return @call(.{ .modifier = .always_inline }, callMainWithArgs, .{ @intCast(usize, c_argc), c_argv, envp });
^
???:?:?: 0x7fff2032e620 in ??? (???)
???:?:?: 0x0 in ??? (???)
[1] 79460 abort zig run test.zig
{% c-block-end %}
Thus, when calling {% c-line %}@intCast{% c-line-end %} you should always check that the target type can hold the source value, unless you're certain the value will fit.
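As a rough picture of what such a check involves, here is the same pair of conversions written as explicit range checks in C. The helper names are hypothetical, and unlike {% c-line %}@intCast{% c-line-end %} in a safe build, which panics, this sketch reports failure by returning false and leaves the destination untouched.

```c
#include <stdbool.h>
#include <stdint.h>

/* Store `value` in `*out` only if it fits in a uint8_t. */
bool checked_i16_to_u8(int16_t value, uint8_t *out) {
    if (value < 0 || value > UINT8_MAX) return false;
    *out = (uint8_t)value;
    return true;
}

/* Store `value` in `*out` only if it fits in an int8_t. */
bool checked_u8_to_i8(uint8_t value, int8_t *out) {
    if (value > INT8_MAX) return false;
    *out = (int8_t)value;
    return true;
}
```

With the value 180 from the example above, {% c-line %}checked_i16_to_u8{% c-line-end %} succeeds and {% c-line %}checked_u8_to_i8{% c-line-end %} fails, which mirrors the cast that was fine and the cast that panicked.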
What if you're not sure whether the result will fit, and you don't want to check each time you add or multiply? In that case you'll want to see our next article, Preventing integer overflow in Zig.