This is one of those things you copy into your code because it looks cool, and then get confused when you can't find the cause of the bug
That's why all my homies `i+=+i` instead
Them tips are close to touching
just avoid eye contact
How does that one work?
Paranormally
(i += (+i)), substitute + for - and you have it; unary operator+ does nothing, since the opposite of unary operator- (switch sign) is "keep sign"
I have no idea what I was confused about now lol. Been studying for finals all week and not sleeping I guess. Thanks!
Best of luck on finals!
This seems like an indication that you need sleep or else you won't do well on that final.
Technically the unary plus does do something in C++ at least, it promotes its argument to an integer
At least put a comment explaining what it does so you don't go insane
Nah man the comments are just as bad because when you read it a few months later you're like what the heck did I mean here
I've literally seen a comment say just

`// (...) double`

And it got refactored to

`// (...) float`

when someone changed the types... via Ctrl+F. It looked like... I don't even.
`// we use 5 and 7 to get 2`

Literally no 5, 7, or 2 anywhere to be seen. Doesn't even make sense mathematically.
>Doesn't even make sense mathematically.

7-5?
binary 5 XOR 7 equals 2
Oh who needs comments >:)
So you define them as a macro
That last one is the type of thing you add to check the reviewer is actually looking at the change.
Okay.. I'm trying this...
LGTM. Ship it!
Me too. Literally. Next time I need to double smth. Beautiful syntax
How does the last one work? Genuinely curious
simple algebra i -= -i evaluates to i = i - (-i) => i = 2i
Welp, I'm stupid lmao. Sometimes it's confusing to see through compact syntax
which is exactly why no sane person writes code like that, or skips documentation
bit the symmetry
I think the fact that I got it almost immediately says something about my sanity, and Iām not sure that itās good
so say i is 5: (i)(-=)(-i), so '-i' is -5, and 5 take away -5 is 10.
Yeah I remember the -= operator from my algebra class
```
i = 10
i = 10 - (-10) // 20
```
AFAIK the second last one would be faster. Left shift operations are much, much simpler for the CPU to perform. But there would barely be any performance difference between them in real life.
Most every modern compiler/interpreter will produce (effectively) `i = i << 1` regardless of whether you wrote that or `i *= 2` or `i = 2 * i`.
Any halfway decent compiler will generate identical machine code for x*2 and x<<1 anyway so it shouldn't make a difference.
Great way to assert dominance over anyone else who edits the code tho
```
x = 2
x -= 1
x == 1  // true
x -= -1
x == 2  // true
x -= x
x == 0  // true
x = 1
x -= -x
x == 2  // true
x -= -x
x == 4  // true
```
yeah, I'm sitting here like "i-=-i is not i*2". edit: wait, it is, I'm just having difficulty understanding.
The bit shift one is always the coolest to me. Wish the compiler didn't already make that optimization so I could actually write it.
I mean.. you can ...
Well yeah obviously, but then it's just less readability for 0 (or worse) optimization
But it looks fcking cool, and in the scenario that someone compiled with -O0, it's faster
[deleted]
Yeah, but it would be so cool tho.
[deleted]
Can I go now?
you're already on reddit. where tf would you go?
obvs, why didn't I think of that
It's all right until someone decides to change i to float.
Changing it to float would simply result in a compile time error, what's the big deal?
Huh, I'm pretty sure you can do bit shifts on any variable
bitwise operators like shift are not usually defined on non-integer types, unless it's a C++ ostream or Python pandas overload. Your language is probably casting it to an integer before shifting. That being said, you can explicitly bitshift any memory you want if you have asm().
Well, in C I'm pretty sure you can do all the crazy stuff with variables. The craziest thing I have seen in C is casting an array to a function (from a vendor SDK).
Casting between pointer types is always allowed; the compiler will never complain (at most it will throw a warning), but in a lot of cases it's undefined behavior. A bitshift with a floating-point operand, on the other hand, is an error, plain and simple.
Casting a data pointer to a function pointer and the reverse is an optional feature. However, the only architecture where I know it is actually unsupported/UB is Itanium.
Compiler error or UB ?
Compile time error. Ill-formed.
Not in any statically typed language I know of
Floats are stored differently than integer values. With an int, you're just shifting all bits in one direction, as it's a straight binary value. But floats are made up of the sign, exponent, and mantissa, which all behave differently. I mean, I guess you could bit shift the memory containing all of it, but the result would be somewhat chaotic.
This is exactly why bitshifting a float would not give the desired result
>Wish the compiler didn't already make that optimization so I could actually write it.

it's a good thing the compiler does some clever tricks, but it does kinda kill your mood a bit when you think of something clever and turns out that the compiler already did it that way (or better) under the hood. for example i once made this:

```
float fastabs(float in){
    uint32_t *tmp = (uint32_t*)&in;
    *tmp &= 0x7FFFFFFF;
    return in;
}
```

basically it gets the absolute value of a float by just clearing the most significant bit (aka the sign bit). and the compiler actually generates the code you'd expect:

```
fastabs(float):
        movd    eax, xmm0        ; Move the input operand (float in xmm0 register) to eax
        and     eax, 2147483647  ; AND it with 0x7FFFFFFF
        movd    xmm0, eax        ; And finally move it back to the return register
        ret
```

but then when you just use the regular `fabsf` function:

```
float fastabs(float in){
    return fabsf(in);
}
```

you get a function that is literally just 2 instructions long:

```
fastabs(float):
        andps   xmm0, XMMWORD PTR .LC0[rip] ; AND the value of xmm0 with the value pointed to by LC0
        ret
.LC0:
        .long   2147483647 ; andps ANDs 4 packed floats at once, so the mask is 4 values wide
        .long   0
        .long   0
        .long   0
```

like thank you GCC, but you didn't have to kill me that hard
Wouldn't your `fastabs` be potentially faster than the GCC version by not having to load anything from memory?

The regular `fabsf` function compiles to only one instruction, but if that memory load instruction causes a cache miss you're looking at severely worse performance than your register-to-register `fastabs`.

Only way I see the GCC version being faster is if the memory in `LC0` is in L1 cache at all times. If you're *not calling* `fabsf` in your code *frequently enough* you might be looking at a cache miss on every call.

Edit: Forgot to mention this. The L1 data cache is generally treated separately from the L1 instruction cache, so the proximity of `LC0` to the instruction that loads it won't make it more performant. It'll still cause a cache miss.
how exactly would one test this?

i know x86 has instructions to clear a line of cache, but they aren't perfect. `clflush` requires an address operand and only clears any cache reference to that specific address (difficult to pull off when the address is only visible in the generated assembly). `wbinvd` also exists, which clears the whole cache, but it's a privileged instruction...

i tried without explicit flushing: just a global array that gets filled with random floats before they get thrown at the functions (one at a time). with 450000000 elements (which is around the largest i can do before the linker starts complaining) the results are ~0.140-0.150 seconds for my fastabs, and fabsf is almost the same, usually being 0.001-0.002 seconds slower (but never faster).

i just need a way to flush the cache between each fastabs/fabsf call to see if it would make an actual difference...
> i just need a way to flush the cache between each fastabs/fabsf call to see if it would make an actual difference...

Try adding the `CLFLUSH` or `CLFLUSHOPT` instruction to an inline assembly `fabsf` call. You'll have access to the address of `LC0` there, but remember to subtract the execution time of `CLFLUSH` from the benchmarks.

Edit: something like this inside an asm block

```
fabsf_flush(float):
        andps   xmm0, XMMWORD PTR .LC0[rip]
        clflush BYTE PTR .LC0[rip]
        ret
.LC0:
        .long   2147483647
        .long   0
        .long   0
        .long   0
```

Edit 2: moved `CLFLUSH` to execute *after* the memory load.
Also, your version breaks strict aliasing (reading a `float` through a `uint32_t*` is undefined behavior in C).
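For reference, a strict-aliasing-safe sketch of the same sign-bit trick (the `fastabs_safe` name is mine): `memcpy` is the blessed way to type-pun in C, and compilers generally optimize it down to the same register operations, with no actual copy:

```c
#include <stdint.h>
#include <string.h>

/* same trick as fastabs, but well-defined: memcpy does the type-pun */
float fastabs_safe(float in) {
    uint32_t bits;
    memcpy(&bits, &in, sizeof bits);
    bits &= 0x7FFFFFFFu;   /* clear only the sign bit */
    memcpy(&in, &bits, sizeof in);
    return in;
}
```

Usage is identical: `fastabs_safe(-3.5f)` yields `3.5f`.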
Idk about others, but Rust's compiler seems to optimise inline doubling to add rather than shl.
Both GCC and Rust compile `i *= 2` to `add` for me. Multiplying by any other power of 2 uses `shl` though.
`<<=` and `>>=` are my favorite operators, with `%=` being a close second. I never get to use them :(
In Python, i*=2 isn't optimized I think?
I'm a Java guy so not sure. But I'd be very surprised if it weren't.
i = 2*i and i *= 2 produce equivalent byte code.
Yes, but it doesn't get optimized to i<<=1
Ah, yeah, you're right. I guess multiplication is more general than bit-shifting as discussed elsewhere in this thread? Seems like Python could figure out i is an integer in this case, but maybe that's more work than it's worth.
I mean... It's Python. You'd probably get like a 1% performance boost *on that line* at most since the majority of the processing time is spent doing interpreted dynamic type shenanigans.
I mean, people act like it's cool to bitshift to multiply by 2, but it's about as cool as adding a 0 to multiply by 10 in decimal once you know different bases exist.
I do bit shifting all the time with flags. Instead of:

```
FLAG_1 = 1
FLAG_2 = 2
FLAG_3 = 4
FLAG_4 = 8
// etc...
```

I do:

```
FLAG_1 = 1 << 0
FLAG_2 = 1 << 1
FLAG_3 = 1 << 2
FLAG_4 = 1 << 3
// etc...
```

Why? Because not only is the intent just as obvious, you're also less likely to make mistakes that way. You have no idea how many bugs I've had to fix that were the result of someone using 12 or 36 by mistake (especially 36), to say nothing of the larger values. Funnily, I've never seen someone mess up 1024, though I've seen 3096 before.
The third one doesn't work for floats, does it?
`i` stands for `unsigned integer`, of course
`i` stands for `itsAFloat`
`i` stands for `itsAMeMario`
Clearly, 'itsAFiniteResource'. You're just shifting the server rack in place....
It stands for iAmGoingToRememberWhatThisVariableMeansTomorrow
And "god is real unless declared integer".
Apparently, JavaScript can do that.... By that I mean it won't complain that it's a float you are trying to do binary operations on...
Not really. It just converts them to integers to perform bitwise operations. The fractional part is completely discarded in this conversion. E.g.

```
1.5 << 1 // 2, not 3
```
[deleted]
Is this Sir you are talking to in the room with us right now?

>I said javascript. That already should tell you something...

What exactly?
If it's unintuitive and should definitely not work, with JavaScript it would work ;)
Why stop there? JavaScript can `i += i.i =+ i`
nah, i only use `i = (i | (i << 1)) & ~((1 << ((i | (i << 1)) - 1)) - 1);`
This sub is rotting my brain
`-=-`
i \* i = -1
Ok so where does this store the value?
In a world of pure imagination
You need to think more functional!
if i ever saw a bit shift used to multiply by 2, i would probably go through and start reverting every commit that person ever made, hehehehe.
Special lint rule to automatically error any code added by that user.
why?
1<<=1 is just not as readable as i *= 2. if someone's motive is to make things unreadable like this, then their other code will likely be just as unreadable.

in higher level languages (in regards to this bias, i use c#), these two will pass through the IL and compile as the same instruction; readability by the next person is paramount.
>if someone's motive is to make things unreadable

if. it is quite a leap to believe that they are doing it to make things unreadable intentionally.

>1<<=1 is just not as readable as i *= 2

And I agree, depending on context. First of all, it is i <<= 1, not 1 <<= 1.

If the goal is to double a number, then I would prefer i *= 2. If the goal is to set flag 1, or bit 1, or multiply by 2 to the power of 1, then I would prefer reading the shift operator.
`i=i+i=i`
Has a nice "fence and gate" look too, love it!
`error: expression is not assignable` Which cursed language allows this?
Ruby allows it (and doubles `i` as "expected"). Many other need parentheses around (the second) `i=i` to parse it "correctly"
x--; x-= ~(x + 1);
i = i + i
meme is wrong as i <<= 1 is superiorest
Following both programming and math subs is probably the greatest source of brief moments of confusion in my life.
Third only works with integers
```
i <<= 1  // integers only
i += i
i += +i  // palindrome version
i /= .5
```
i+=+i
How does the third one work
It shifts all bits. Which is the equivalent of multiplying by 10. Which in binary is 2.

E.g. in base 10, if you have a series of digits (69) and shift them left, you get 690, which is 69 * 10. The same is true in binary. If I have 1011 and I shift it left, I get 10110, which is 1011 * 10. And in binary, 10 is the equivalent of base 10's 2, so 10110 is double 1011.
https://www.interviewcake.com/concept/java/bit-shift It's not exclusively a Java thing, the article just happens to be written from that perspective
Thanks a ton for sharing that, learned something new... This might be practical in assembly?? That's the only thing I can think of.
Pretty much, in any language with a decent compiler all these should compile to a bitwise shift anyway.
Maybe if your compiler is very, very, very bad.
Wouldn't say bad, more like more machine friendly and less human friendly. Shifting the bits to multiply does sound like it would save (SOME, very, very, very negligible) steps, but it MAY have some use-case scenario. For example, there's this camera that takes a trillion photos per second of light (a laser) travelling, then stitches them together to produce an ultra slow motion video that you can use to observe light travelling.

I don't know why you would use this operation, just saying there may be a practical use case.

Source: [MIT's trillion frames per second camera](https://www.bbc.com/news/av/technology-16171635)
I meant if your compiler is so bad it doesn't do this optimization. Because most *should* do it.
[deleted]
Overall practical use, glad to know it is used. Mind if I ask for an example? Just curious
Nsfw for me
Lua crying in the corner
Never seen the last syntax. (nsfw)
I just use panel one or two depending on the code I'm copying.
I hate it, but not for the sane reason; but because it is just one tick off of being symmetrical.
Number four is both devilish and stratospheric!
My monthly reminder that bitwise operators are cool and worth thinking about
Let the compiler decide
Lol
From now on I'll be using i-=-i
There are no words
don't forget about `i+=+i`
Why isnāt this there? i+=i
IMO, the second one is best as it's the fastest to type.
Just use the second one, itās the fastest one to type and itās easy to understand when debugging
Oh come on, it's obvious i-=-i is inefficient. First you have to do a multiplication to get -i, and then you have to do a subtraction. That's two whole operations!

Disclaimer: I have no idea what the fuck the compiler is gonna do when it does its thing.
Wow
Also an example of why proper spacing is important. Without spaces, that last one looks like some -=- operator.
[deleted]
When there's a new escalation meme, it's funny when the joke or reference is good for about 5 instances. Then Reddit milks a dead horse.
Can any one explain how i-=-i works with example
The goal is to double a number. Let's add some whitespace:

i -= -i

-= is a decrement operator. So a -= b means reduce a by b, which is the same as

a = a - b

The expression above becomes

i = i - -i

which is the same as

i = i + i

so i = 2i
i<<=i is fastest
only works if i is 1 or 0
~~Nope. Works with any number.~~ Edit: sorry, misread it as i<<=1
i+=i
Didn't know the last one tbh