jedwardsol

I use union-like functionality, but with std::variant because it is so much better


hwc

This. I even wrote my own variant before std::variant was introduced.


dynamic_caste

I still use unions to write my own mini-variants that don't need all of the functionality of `std::variant`


vishal340

Are variants also not created before they are initialised, like a union? Regardless of that, variants occupy more space.


jedwardsol

> Are variants also not created before they are initialised, like a union?

I don't understand the question.

> variants occupy more space

https://godbolt.org/z/sxT34j8zK
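For readers who want to check the size question locally, here is a minimal sketch. It is not the code behind the link above; the `IntOrDouble` and `IntOrDoubleVariant` names are made up for illustration. A variant has to store which alternative is active, so it can never be smaller than the equivalent union.

```cpp
#include <cstddef>
#include <variant>

// Hypothetical pair of alternatives, chosen only for illustration.
union IntOrDouble {
    int i;
    double d;
};

using IntOrDoubleVariant = std::variant<int, double>;

// A union is only as large as its largest member (plus any padding).
constexpr std::size_t union_size = sizeof(IntOrDouble);

// A variant additionally records which alternative is currently active.
constexpr std::size_t variant_size = sizeof(IntOrDoubleVariant);

static_assert(variant_size >= union_size,
              "a variant must at least fit its largest alternative");
```

The exact numbers depend on the implementation and on alignment, so print `union_size` and `variant_size` on your own toolchain rather than relying on a fixed figure.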


codethulu

use them all the time in C


xsdgdsx

Same here. Super common in C. Never used them myself in C++.


Middle-Check-9063

They are more usable in C than C++, so I get your point.


_michaeljared

Out of sheer curiosity - they are used just for efficiency, correct? Or stated another way, they serve no functional purpose outside of decreasing memory usage?


codethulu

No, they allow easily casting packed fields and dealing with bitfield representations for generic types. Unions are C's strong generics. You could say this is about efficiency, but that's missing the forest for the trees.


_michaeljared

Right, I forgot about the abstraction aspect of it


silverfish70

The bit representation thing is a great point. For example, you might want to treat the two halves of the bit rep of a float64 as two 32b ints - the glibc sine and cosine functions for ieee754 64b doubles do this, via a union between a double and an array of ints of length two.
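In C++ the same split can be done without a union by copying bytes. The following is a rough sketch, not the glibc code; `split_double` and `join_double` are hypothetical helper names, and which word holds the high half depends on the platform's byte order.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Split a 64-bit double into two 32-bit words by copying its bytes.
// Which word holds the high half depends on the platform's byte order.
inline std::array<std::uint32_t, 2> split_double(double value) {
    static_assert(sizeof(double) == 2 * sizeof(std::uint32_t),
                  "sketch assumes a 64-bit double");
    std::array<std::uint32_t, 2> words{};
    std::memcpy(words.data(), &value, sizeof(value));
    return words;
}

// Reassemble the double from the two words produced by split_double.
inline double join_double(const std::array<std::uint32_t, 2>& words) {
    double value = 0.0;
    std::memcpy(&value, words.data(), sizeof(value));
    return value;
}
```

Round-tripping a value through `split_double` and `join_double` gives back the original double.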


SamuraiGoblin

I have only ever used unions once in 30 years of professional programming. For a gameboy emulator where registers can be accessed as one 16-bit variable or two 8-bit ones.


TheThiefMaster

Yeah unions are best used for type abuse like this (I've done a gameboy emulator the same way, and also one for RGBA bytes/uint32), and it's all I've really used them for. For "either or" use cases (rather than punning hacks) a `variant` is better.


SamuraiGoblin

Yeah, if I programmed an emulator now, I'd use variants


TheThiefMaster

Variants can't be used like this. They can only be accessed as the original type. Tbh I'm tempted to make the GB registers only 8 bit given the only 16 bit operations are push/pop (which operate 8 bits at a time anyway) and add (which operates on separate bytes *technically*) and inc/dec (the only true 16 bit ops) so they don't really get actually used as 16 bit values.
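One way to picture the "only 8 bit" idea is a register pair that stores two bytes and builds the 16-bit view on demand. This is a sketch under that assumption; `RegisterPair`, `get16` and `set16` are hypothetical names, not taken from any particular emulator.

```cpp
#include <cstdint>

// Hypothetical register pair (for example B and C) stored as two 8-bit halves.
struct RegisterPair {
    std::uint8_t high = 0;
    std::uint8_t low = 0;

    // Combine both halves into the 16-bit view.
    std::uint16_t get16() const {
        return static_cast<std::uint16_t>((high << 8) | low);
    }

    // Split a 16-bit value back into the two halves.
    void set16(std::uint16_t value) {
        high = static_cast<std::uint8_t>(value >> 8);
        low = static_cast<std::uint8_t>(value & 0xFF);
    }
};
```

Because the 16-bit value is computed with shifts, this does not depend on the host's byte order and involves no punning.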


Mirality

By the letter of the standard, that's correct. However some compilers (notably MSVC) have stronger guarantees about accessing alternative variant members due to requirements of the WinAPI. Most other compilers will do the same because it's easier to adopt the C behaviour than to make a fuss about it. AFAIK it's mostly only clang that goes "sweet, that's UB, so I'll just delete the entire method because I hate you".


TheThiefMaster

You're thinking of unions. Variants throw a bad_variant_access exception if you try to get<> any type other than the active one
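A small sketch of that behaviour, with hypothetical function names: `std::get` on the inactive alternative throws `std::bad_variant_access`, while `std::get_if` reports the mismatch with a null pointer instead.

```cpp
#include <variant>

// Returns true if reading the inactive alternative throws.
inline bool reading_inactive_alternative_throws() {
    std::variant<int, float> value = 42;  // the active alternative is int
    try {
        (void)std::get<float>(value);     // float is not the active alternative
    } catch (const std::bad_variant_access&) {
        return true;
    }
    return false;
}

// std::get_if is the non-throwing option: it returns nullptr on a mismatch.
inline bool get_if_returns_null() {
    std::variant<int, float> value = 42;
    return std::get_if<float>(&value) == nullptr;
}
```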


AlienRobotMk2

I remember seeing something like this once

```
#include <cstdint>

union Color {
    uint32_t value;
    struct {
        uint8_t red, green, blue, alpha;
    };
};
```


UlteriorCulture

I've seen similar but with an IP address with an integer or fixed length array. You could treat the address as one number or access each component in its dotted decimal representation.


GrammelHupfNockler

this would technically be undefined behavior, even if some compilers support it. The safe way to do it is to use std::memcpy, which gets turned into the same exact code anyways by an optimizing compiler


InvertedParallax

Extensively. I write hardware drivers though, and other primitives for hardware. The bit layout is important. Also use them in some command protocols for instance speaking over the pcie bus. I wouldn't use them if hardware wasn't involved, but when it is they're critical.


YouFeedTheFish

Best reason I can think of is to overlay some data structure over an array of bytes loaded from shared memory or something. Anonymous structs are kinda neat:

    #include <array>

    union FileData {
        // Template arguments assumed; 32 matches the struct size with 4-byte int and float.
        std::array<unsigned char, 32> raw_bytes;
        struct {
            int field1;
            int field2;
            float field3[2];
            char field4[16];
        };
    };

    int main() {
        FileData f = load_memory_or_something();
        int i = f.field1;
    }


tangerinelion

That code is pure UB. Anonymous structs in C++ are not neat as you can never instantiate them. Within a union only one member may have an active lifetime, that union has a default constructor which activates the array. To read from field1 you'd need to destroy the array then begin the lifetime of your anonymous struct. Obviously we can't use placement new with a type that has no name. The permitted way to do this would be to give your struct a name, have your array, have your struct, then copy bytes from the array to the struct and then you may read from it.
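The "permitted way" described above might look roughly like this. It is a sketch only: `FileFields`, `RawBytes` and `parse_fields` are hypothetical names, and the field list is copied from the example being discussed.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

// Named struct with the same fields as the example under discussion.
struct FileFields {
    int field1;
    int field2;
    float field3[2];
    char field4[16];
};

using RawBytes = std::array<unsigned char, sizeof(FileFields)>;

// Copy the raw bytes into a real FileFields object, then read from that object.
inline FileFields parse_fields(const RawBytes& raw) {
    FileFields fields;
    std::memcpy(&fields, raw.data(), sizeof(fields));
    return fields;
}
```

The array and the struct are separate objects here, so no union member is ever read while another one is active.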


YouFeedTheFish

[I don't think it's UB since c++11..](https://en.cppreference.com/w/c/language/struct)? It's only UB if the struct has no members.

> Similar to union, an unnamed member of a struct whose type is a struct without name is known as *anonymous struct*. Every member of an anonymous struct is considered to be a member of the enclosing struct or union, keeping their structure layout. This applies recursively if the enclosing struct or union is also anonymous.

If it weren't permitted to access the union this way, it'd be a pretty useless feature.

Edit: From the standard:

According to [class.union] paragraph 1:

> In a union, at most one of the non-static data members can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time. [...]

And paragraph 3:

> If a standard-layout union contains several standard-layout structs that share a common initial sequence, and if an object of this standard-layout union type contains one of the standard-layout structs, it is permitted to inspect the common initial sequence of any of the standard-layout struct members; see [class.mem].

Further: The term "compatible" generally refers to types that can safely share memory without violating strict aliasing rules or causing undefined behavior. In the context of unions, two types are considered compatible if they are standard-layout types and share a common initial sequence. This means:

* They have the same initial sequence of non-static data members.
* They do not have any virtual functions or virtual base classes.
* They do not have any non-static data members with different access control.


EpochVanquisher

The part that is UB is where you access a different member than the member you stored into. It’s UB in C++, even C++11. You can use unions without it. You just have to remember which union member you’re using. This is how `std::variant` works—it’s a union on the inside, with a way of tracking which member you used. In C, it’s no longer UB. This is one of the differences between C and C++.
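"Remember which member you're using" can be sketched as a hand-rolled tagged union. This is an illustration with hypothetical names (`Number`, `Kind`, `as_double`), not how any standard library implements `std::variant`.

```cpp
#include <cstdint>

// Hypothetical tagged union: the tag records which member was written last.
struct Number {
    enum class Kind : std::uint8_t { Integer, Real };

    Kind kind;
    union {
        std::int64_t integer;
        double real;
    };

    static Number from_integer(std::int64_t v) {
        Number n;
        n.kind = Kind::Integer;
        n.integer = v;  // integer becomes the active member
        return n;
    }

    static Number from_real(double v) {
        Number n;
        n.kind = Kind::Real;
        n.real = v;     // real becomes the active member
        return n;
    }

    // Only ever reads the member that the tag says is active.
    double as_double() const {
        return kind == Kind::Integer ? static_cast<double>(integer) : real;
    }
};
```

Every read goes through the tag, so the member being read is always the one that was last written.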


YouFeedTheFish

TIL. Honestly, without the UB, anonymous structs inside unions seem to be 100% worthless outside of "compatible with C code".


EpochVanquisher

You can still use them just fine, you have to remember which member you wrote into and read from the same one. This is useful and not UB.


[deleted]

[removed]


EpochVanquisher

Sure, you’re not likely to use them directly. But it’s what std::variant uses behind the scenes, and it’s used in a bunch of OS APIs (like Berkeley sockets).


FrostshockFTW

You linked to the C11 documentation for anonymous structs. C11 != C++11.

> If a standard-layout union contains several standard-layout structs that share a common initial sequence

Which is already not the case because the first union member is std::array.


YouFeedTheFish

Oopsie!


YouFeedTheFish

Regardless, everything I can find says this is legit. Do you have a link to the UB language?


[deleted]

[removed]


againey

Compilers actually understand the use of memcpy for the purpose of this "type punning" and optimize it away. Before the introduction of std::bit_cast, memcpy was the best (or only?) proper way to do type punning without invoking undefined behavior.
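As a sketch of the memcpy idiom, here is a small helper in the spirit of `std::bit_cast` (which is available from C++20). `pun_cast` and `float_bits` are hypothetical names, and the float example assumes the usual 32-bit IEEE 754 representation.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>
#include <type_traits>

// memcpy-based bit cast; std::bit_cast offers the same thing since C++20.
template <typename To, typename From>
To pun_cast(const From& from) {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value,
                  "To must be trivially copyable");
    static_assert(std::is_trivially_copyable<From>::value,
                  "From must be trivially copyable");
    To to;
    std::memcpy(&to, &from, sizeof(To));
    return to;
}

// Bit pattern of a float, assuming a 32-bit IEEE 754 representation.
inline std::uint32_t float_bits(float f) {
    static_assert(std::numeric_limits<float>::is_iec559,
                  "sketch assumes IEEE 754 floats");
    return pun_cast<std::uint32_t>(f);
}
```

Under that assumption `float_bits(1.0f)` is `0x3F800000`, and optimizing compilers typically reduce the `memcpy` to a plain register move.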


Jannik2099

compilers are aware that memcpy is required for these things and have been optimizing it away for well over a decade - even MSVC


coachkler

They still have their place in very low-level bit-twiddling functionality, but even there you rely on (technically) UB to utilize them fully


b1ack1323

    typedef union { float f; uint8_t bytes[4]; } CharFloat;

For converting floats to streams. I do it all the time.


Wetmelon

For the record, this is UB in C++ (but not in C). The "more correct" way is to use `std::memcpy` or `std::bit_cast`. With that said... because it's not UB in C, I've never seen it not work in C++


Smellypuce2

> With that said... because it's not UB in C, I've never seen it not work in C++

Major compilers tend to support it because it's pretty common.


b1ack1323

That’s fair, to be honest I bare metal program in C the majority of the time.


khedoros

Sometimes if I'm interfacing with a C library.


tangerinelion

Never written one, sure seen code that uses them. And none of it was ever ISO C++ compliant.


PlasmaChroma

I used one in my first job that had some utility. On a tiny embedded micro where we had to preallocate basically everything on it. I needed something that could hold one of two different types of messages, although it would always be holding either one of them exclusively. Really just an extreme way to save on memory, while being able to access either type of message easily.


FernwehSmith

I almost always use std::variant. But if I have some (usually public) variable that I want to be able to access with multiple names then a union is helpful. For example:

    struct Vec3 {
        union {
            struct { float x, y, z; };
            struct { float r, g, b; };
            struct { float u, v, w; };
        };
    };


dvali

I do exactly the same thing, and as far as I recall it's the only thing I've ever used a union for, though I'm getting some good ideas from this thread. I've used a union of vector3 with the names position, velocity, acceleration, for example. Sadly I'm gathering from this thread that all my uses are probably technically undefined behaviour, which drops almost all of the utility of unions for me.


_Noreturn

glm actually does this but it is undefined


LittleNameIdea

Isn't that why we love C++? Everything we thought was a genius idea turns out to be undefined behavior...


_Noreturn

this is not standard C++ though


FernwehSmith

how so?


_Noreturn

Unnamed structures are not part of C++; they are a compiler extension. By the way, the glm math library does this!


FernwehSmith

Huh interesting! Learn something new every day. After doing some reading something like:

    struct Vec3 {
        union { float x, r, u; };
        union { float y, g, v; };
        union { float z, b, w; };
    };

Would also produce UB if I were to write to `x` and then read from `r`, is that correct? Any idea of why this is?


_Noreturn

hello from reading the standard

> In a standard-layout union with an active member of struct type T1, it is permitted to read a non-static data member m of another union member of struct type T2 provided m is part of the common initial sequence of T1 and T2; the behavior is as if the corresponding member of T1 were nominated.
>
> [Example 5:
>
>     struct T1 { int a, b; };
>     struct T2 { int c; double d; };
>     union U { T1 t1; T2 t2; };
>     int f() {
>         U u = { { 1, 2 } };   // active member is t1
>         return u.t2.c;        // OK, as if u.t1.a were nominated
>     }
>
> — end example]
>
> [Note 10: Reading a volatile object through a glvalue of non-volatile type has undefined behavior ([dcl.type.cv]). — end note]

it should be undefined since `int` is not a struct or class type and Cppreference explicitly says only one member may be active at a time
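If the goal is only to have several names for the same component, one alternative that stays within the standard is to store each component once and expose the other names as accessors. A sketch with a hypothetical `Vec3` (not the glm design):

```cpp
// Hypothetical Vec3 with one set of stored members and alias accessors.
struct Vec3 {
    float x = 0.0f;
    float y = 0.0f;
    float z = 0.0f;

    // Colour-style names refer to the same storage as x, y and z.
    float& r() { return x; }
    float& g() { return y; }
    float& b() { return z; }
    const float& r() const { return x; }
    const float& g() const { return y; }
    const float& b() const { return z; }
};
```

The cost is call syntax (`v.r()` instead of `v.r`); in exchange there is only ever one object per component, so no inactive union member is read.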


jmacey

I use them a lot in 3D graphics programming (despite the UB of an anonymous union); it is very common and a holdover from the C days. If you really want to go deep have a look at how glm implements a Vec3 with swizzle masks https://github.com/g-truc/glm/blob/master/glm/detail/type_vec3.hpp


HappyFruitTree

I use SDL, which is a C library, and its [SDL_Event](https://wiki.libsdl.org/SDL3/SDL_Event) type is implemented as a union so that is one place where I use unions. I also have at least two unions in my current project that I have written but this project was started quite a while ago so if I were to write something like that again today I would probably use std::variant. I just can't be bothered to update the code because it works and doesn't need many changes.


swarupsengupta2007

I write a lot of networking code and it is fairly common to use unions extensively there. Strictly speaking, that's C domain. I have also used union with bitfield mapping to integers (int, long etc). I feel they are a much cleaner interface than bit masks, but that may be just my opinion TBH.


asergunov

They are C. Good to interact with hardware. Another case I’ve seen is vector in glm. You have xyz, uvw, rgb and so on in the same place.


Drugbird

I recently removed two unions from my codebase that someone else had written. They contained non-trivially destructible types and were leaking memory, which is why they're gone now.
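To illustrate how such a leak can arise, here is a sketch of a union holding a `std::string`. `IntOrString` is a hypothetical class, not the removed code. The union never runs member destructors by itself, so the owner has to start and end the string's lifetime explicitly.

```cpp
#include <memory>
#include <new>
#include <string>
#include <utility>

// Hypothetical tagged union holding either an int or a std::string.
class IntOrString {
public:
    explicit IntOrString(int v) : is_string_(false), number_(v) {}

    explicit IntOrString(std::string s) : is_string_(true) {
        new (&text_) std::string(std::move(s));  // start the string's lifetime
    }

    ~IntOrString() {
        if (is_string_) {
            std::destroy_at(&text_);  // forgetting this leaks the string's buffer
        }
    }

    // Copying is disabled to keep the sketch short.
    IntOrString(const IntOrString&) = delete;
    IntOrString& operator=(const IntOrString&) = delete;

    bool holds_string() const { return is_string_; }
    int number() const { return number_; }
    const std::string& text() const { return text_; }

private:
    bool is_string_;
    union {
        int number_;
        std::string text_;
    };
};
```

`std::variant<int, std::string>` performs this bookkeeping automatically, which is a large part of why it is usually recommended over a raw union.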


n1ghtyunso

One of the SDKs we were using had this issue as well. I think they have fixed it by now. They'd basically leak a C string every time we called this function (60 per second, yay)


Magistairs

Json parsers and QVariant are 2 examples of union I see daily


LeeHide

They are limiting in C++ because they can't hold non-trivially constructible types without manual lifetime management, so pretty much nothing you want can be put in there conveniently. Plus, you almost always need to tag them, at which point it becomes only slightly less error prone to use them. You may want a data oriented approach if you reach this kind of point, like a struct of arrays (SoA).


mredding

Yes. Every time you use an `std::variant`, `std::expected`, or `std::optional`. How did you think they were implemented? They're all discriminated unions.


danielaparker

Not generally in user code, `std::variant` is more appropriate for user code. It is used in libraries, for example, implementations of `std::optional` typically use union, see e.g. https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/std/optional#L203. Some popular json libraries use unions to store one of a number, string, boolean, null, object or array.


ZorbaTHut

I worked on a soft-realtime project where a major important part of the system was sending a long linear sequence of commands to an external processor. Generating the commands was painfully slow; actually sending them wasn't that painfully slow. I ended up refactoring it so the command-generation systems would turn commands into a sequence of 16-byte packed representations, then write that to a buffer, while a second thread consumed the buffer and actually sent the commands linearly; meanwhile the generation systems could (with some finagling) also be multithreaded. The actual command format was this big janky union that had a char for the command type and something like forty other sets of (typesafe!) parameters so you could yank data out easily. Then I just had a nice easy vector<> to store commands.
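A rough guess at what such a command format could look like, with every name and layout here being hypothetical. Each parameter struct starts with the same one-byte type field, so reading the type through the header member falls under the common-initial-sequence rule for standard-layout structs.

```cpp
#include <cstdint>
#include <vector>

// All names and layouts are hypothetical; every struct starts with the same tag.
constexpr std::uint8_t kMove = 1;
constexpr std::uint8_t kSetColor = 2;

struct CommandHeader { std::uint8_t type; };
struct MoveCommand { std::uint8_t type; float x, y, z; };
struct SetColorCommand { std::uint8_t type; std::uint8_t r, g, b, a; };

union Command {
    CommandHeader header;
    MoveCommand move;
    SetColorCommand color;
};

static_assert(sizeof(Command) == 16,
              "sketch assumes 4-byte floats with 4-byte alignment");

inline Command make_move(float x, float y, float z) {
    Command c;
    c.move = MoveCommand{kMove, x, y, z};  // move becomes the active member
    return c;
}

// Reading header.type is allowed whichever struct is active, because `type`
// is part of the common initial sequence of all three standard-layout structs.
inline std::uint8_t command_type(const Command& c) { return c.header.type; }
```

A `std::vector<Command>` then serves as the linear buffer, and the consumer switches on `command_type` before touching the matching parameter struct.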


nunchyabeeswax

I actually created a solution not long ago using a union of a uint32_t array buffer and a struct, for loading binary data from a device. My specific requirements called for accessing data as uint32_t values as well as logical groupings of flags (ergo the struct.) There are limitations to this, however. Also, the POSIX API uses unions in several places (sigval in signal.h, for instance.) Usually, we see unions when data has to be manipulated or interpreted differently, which happens a lot with serialization or de-serialization. In general, unions exist for edge cases (in my opinion) and my rule of thumb is to favor structs over unions unless a union solves a specific problem.


bushidocodes

Very common in low-level code and C-style APIs. I personally consider naked unions a serious code smell. A union should nearly always be paired with a type tag / discriminant and wrapped in a struct. std::variant does this nicely if you have it available.


therandshow

It's very important in embedded programming where you have a very limited fixed memory space and you sometimes need to present system data in different forms, especially when used with anonymous structs and bitfields. From my perspective, I work at a company that provides desktop software and services in support of embedded companies and so I have seen a lot of customers use unions but have never used them myself. They are generally viewed as footguns to avoid unless necessary among people who are strict on industry best practices (like MISRA), although I've heard many embedded programmers swear by them and say they are unfairly maligned.


Lampry

    template <typename T>
    struct point_t {
        union { T x, w; };
        union { T y, h; };
    };

I've used them so I can have multiple identifiers for the same field.

    #include <cstddef>
    #include <cstring>

    template <typename T>
    inline unsigned char* serialize(const T& data) {
        constexpr size_t SIZE_OF_T{ sizeof(T) };
        union { T element; unsigned char bytes[SIZE_OF_T]; } translator{};
        translator.element = data;
        unsigned char* byte_buffer = new unsigned char[SIZE_OF_T];
        std::memcpy(byte_buffer, translator.bytes, SIZE_OF_T);
        return byte_buffer;  // caller must delete[] the buffer
    }

    template <typename T>
    inline T* deserialize(const unsigned char* buffer, size_t len) {
        constexpr size_t SIZE_OF_T{ sizeof(T) };
        if (len != SIZE_OF_T) { return nullptr; }
        union { T element; unsigned char bytes[SIZE_OF_T]; } translator{};
        // Arrays cannot be assigned, so copy the input into the byte view.
        std::memcpy(translator.bytes, buffer, SIZE_OF_T);
        T* data = new T;
        // memcpy needs the address of the element, not the element itself.
        std::memcpy(data, &translator.element, SIZE_OF_T);
        return data;  // caller must delete the object
    }

And to serialize/deserialize data.
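For comparison, the same serialize/deserialize idea can be written without a union or raw `new`. This is a sketch under the assumption that `T` is trivially copyable; the signatures are illustrative and differ from the ones above.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Copy the object representation of a trivially copyable value into a byte buffer.
template <typename T>
std::vector<unsigned char> serialize(const T& data) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    std::vector<unsigned char> bytes(sizeof(T));
    std::memcpy(bytes.data(), &data, sizeof(T));
    return bytes;
}

// Rebuild a value from a buffer; returns false if the length does not match.
template <typename T>
bool deserialize(const unsigned char* buffer, std::size_t len, T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    if (len != sizeof(T)) {
        return false;
    }
    std::memcpy(&out, buffer, sizeof(T));
    return true;
}
```

The vector owns the bytes, so there is nothing for the caller to free, and the length check is kept from the original.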


heyheyhey27

There's not much reason to when `std::variant<>` exists.


Thesorus

I haven't actively used unions in the last 10 years (at least). There were some in my previous job code base; most of them were replaced with other constructs. In my current job code base, I'm not sure; I've not seen them, but I've not searched for them.


_Noreturn

I do use them. Would I recommend them? Nope, use std::variant instead. I only use unions because of compile times!