nice question and answer..!
<blockquote data-quote="RoxDrex12" data-source="post: 22381253" data-attributes="member: 562482"><p><span style="font-size: 12px"><strong>What is the longest amount of time you have spent fighting a code bug?</strong></span></p><p></p><p><span style="color: Black"><span style="font-size: 12px">Six years, with eight engineers. What’s more, we found the same bug in Windows, macOS, FreeBSD and Linux, on about six or seven devices. For the Linux and FreeBSD cases, which we could fix, the fix meant changing two characters in the source code.</span></span></p><p><span style="color: Black"><span style="font-size: 12px"></span></span></p><p><span style="color: Black"><span style="font-size: 12px">The bug goes like this:</span></span></p><p><span style="color: Black"><span style="font-size: 12px"></span></span></p><p><span style="color: Black"><span style="font-size: 12px">Wi-Fi has something called ad-hoc mode, which is very rarely used these days (probably because this bug is still out there). It allows a group of Wi-Fi devices to form a network together, without an access point, and is really quite cool.</span></span></p><p><span style="color: Black"><span style="font-size: 12px"></span></span></p><p><span style="color: Black"><span style="font-size: 12px">We were building large outdoor networks using ad-hoc mode, and we found that after around six weeks of uptime, one device would randomly become very slow. The slowness would be contagious; after that first device, every device that rebooted had a chance of coming back up slow, until the whole network would be slow and we would have to switch off all the devices, and all our laptops and test gear that had ever joined the network, and cold-start the whole thing. 
This was massively inconvenient, as some of the devices were at the top of 45 meter lighting poles in a railway yard where we had to make special arrangements to get access to the power switches…</span></span></p><p><span style="color: Black"><span style="font-size: 12px"></span></span></p><p><span style="color: Black"><span style="font-size: 12px">We searched for this bug for years. We found dozens of other bugs, and fixed them; some of those fixes have become standard parts of the Linux Wi-Fi stack. We changed to new hardware twice, once to chips where we collaborated with the designers during development of the hardware.</span></span></p><p><span style="color: Black"><span style="font-size: 12px"></span></span></p><p><span style="color: Black"><span style="font-size: 12px">We discovered many things:</span></span></p><p><span style="color: Black"><span style="font-size: 12px"></span></span></p><p><span style="color: Black"><span style="font-size: 12px">There was a minimum uptime before which this could not happen.</span></span></p><p><span style="color: Black"><span style="font-size: 12px">Wi-Fi tracks the time since the network started; even before the bug showed as performance problems, ours would be claiming to be sixty thousand years old, and getting older by about two thousand years a day.</span></span></p><p><span style="color: Black"><span style="font-size: 12px">This is done with a time variable called the TSF that is in units of 802.11 TU, each 1.024 milliseconds (1024 microseconds), since the time the network was set up.</span></span></p><p><span style="color: Black"><span style="font-size: 12px">The slow nodes would be unable to receive for up to 90% of the time, but could transmit fine and were always received properly even by another slow node.</span></span></p><p><span style="color: Black"><span style="font-size: 12px">Wi-Fi devices at the time were terrible at selecting good transmitter settings, and we could do much better at that; we fixed that problem, and while it 
was not stuck slow the network got ten times faster, but this fix actually made the slow node problem worse; the slow nodes were much slower, and the contagion spread faster.</span></span></p><p><span style="color: Black"><span style="font-size: 12px">One day we got so tired of this problem, we decided that we were going to sit in a conference room with all our kernel developers together, put the source code on the projector screen, and read it all together.</span></span></p><p><span style="color: Black"><span style="font-size: 12px"></span></span></p><p><span style="color: Black"><span style="font-size: 12px">The TSF is formally a 64 bit number, but is handled in various places in 24, 32, and 48 bit suffixes, with code having to determine the missing bits.</span></span></p><p><span style="color: Black"><span style="font-size: 12px"></span></span></p><p><span style="color: Black"><span style="font-size: 12px">We started with the file that defined the basic data structures of the Wi-Fi stack. We got a few dozen lines into that file, and spotted a line of code that I now can’t find, but it defined the type of variable that would be used to handle time values. And it said that the TSF would be a 32 bit integer. And we all looked at that line of code, and eventually I said “u32 TSF? Wonder if the arithmetic is all correct on that…”. We went and looked at every place it was used, and couldn’t figure out if it was or not.</span></span></p><p><span style="color: Black"><span style="font-size: 12px"></span></span></p><p><span style="color: Black"><span style="font-size: 12px">So we decided to do the obvious thing, and change it to a 64 bit integer. 
Then we rebuilt our code and rebooted the network, which took a good week to do.</span></span></p><p><span style="color: Black"><span style="font-size: 12px"></span></span></p><p><span style="color: Black"><span style="font-size: 12px">Three months later, the network was still fine and we declared we had fixed it.</span></span></p><p><span style="color: Black"><span style="font-size: 12px"></span></span></p><p><span style="color: Black"><span style="font-size: 12px">We tested every Wi-Fi device we could lay our hands on, and about 3/4 of them had the same bug. The ones that could run different operating systems, mostly Apple laptops, sometimes had the bug in two or three operating systems. We reported this problem to everyone we could find: Apple, Microsoft, four chip manufacturers, and so on.</span></span></p><p><span style="color: Black"><span style="font-size: 12px"></span></span></p><p><span style="color: Black"><span style="font-size: 12px">It turned out that there were quite a lot of implementations that were much worse: instead of using a 32 bit number, they had used 24 bits, and then their ad-hoc mode networks would fail after 4 hours and 46 minutes…</span></span></p><p></p><p><span style="font-size: 12px"><span style="color: Red">But if you wonder why we have Bluetooth for so many things when Wi-Fi could do just as well or better… this bug is the reason, I believe. 
Wi-Fi just wasn’t reliable in ad-hoc mode during the critical period of time, and Bluetooth became the way to do these things.</span></span> <img src="/styles/default/xenforo/smilies/default/eek.gif" class="smilie" loading="lazy" alt=":eek:" title="eek :eek:" data-shortname=":eek:" /><img src="/styles/default/xenforo/smilies/default/eek.gif" class="smilie" loading="lazy" alt=":eek:" title="eek :eek:" data-shortname=":eek:" /><img src="/styles/default/xenforo/smilies/default/eek.gif" class="smilie" loading="lazy" alt=":eek:" title="eek :eek:" data-shortname=":eek:" /></p><p></p><p>by Andrew McGregor </p><p>Site Reliability Engineer at Google (2013-present)</p><p></p><p>Taken from here: <a href="https://www.quora.com/What-is-the-longest-amount-of-time-you-have-spent-fighting-a-code-bug" target="_blank">link to quora</a></p><p></p><p>Go and have a look, there are more answers there <img src="/styles/default/xenforo/smilies/default/rolleyes.gif" class="smilie" loading="lazy" alt=":rolleyes:" title="Rolleyes :rolleyes:" data-shortname=":rolleyes:" /><img src="/styles/default/xenforo/smilies/default/rolleyes.gif" class="smilie" loading="lazy" alt=":rolleyes:" title="Rolleyes :rolleyes:" data-shortname=":rolleyes:" /></p></blockquote><p></p>
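The numbers in the quoted answer can be checked with a little arithmetic. The sketch below is only an illustration, not the driver code from the story: it assumes, as the answer describes, a counter that advances once per TU (1024 microseconds), and the helper names `wrap_seconds`, `truncate` and `extend_suffix` are made up for this example. `extend_suffix` shows one common way code might "determine the missing bits" of a shortened timestamp, by picking the full-width value nearest to a local reference clock.

```python
TU_MICROSECONDS = 1024  # one 802.11 time unit (TU) is 1024 microseconds


def wrap_seconds(bits: int) -> float:
    """Seconds until an unsigned `bits`-wide counter, ticking once per TU, wraps to zero."""
    return (1 << bits) * TU_MICROSECONDS / 1_000_000


def format_hm(seconds: float) -> str:
    """Render a duration as whole hours and minutes (seconds are dropped)."""
    minutes = int(seconds // 60)
    return f"{minutes // 60} h {minutes % 60} min"


def truncate(tsf: int, bits: int) -> int:
    """Keep only the low `bits` bits, as storing a 64-bit TSF in a narrower type would."""
    return tsf & ((1 << bits) - 1)


def extend_suffix(suffix: int, bits: int, reference: int) -> int:
    """Hypothetical helper: rebuild a full timestamp from its low `bits` bits.

    Picks the value with that suffix which lies closest to `reference`
    (for example, the local full-width clock). Only correct while the two
    clocks differ by less than half the suffix range.
    """
    span = 1 << bits
    candidate = (reference & ~(span - 1)) | suffix
    if candidate - reference > span // 2:
        candidate -= span  # suffix belongs to the previous wrap period
    elif reference - candidate > span // 2:
        candidate += span  # suffix belongs to the next wrap period
    return candidate


# A 24-bit counter wraps after the "4 hours and 46 minutes" quoted in the answer.
print(format_hm(wrap_seconds(24)))            # 4 h 46 min
# A 32-bit counter wraps after about 51 days, i.e. roughly seven weeks.
print(f"{wrap_seconds(32) / 86400:.1f} days")  # 50.9 days

# Just past the 32-bit wrap, a truncated timestamp appears to jump back to near zero...
full_tsf = (1 << 32) + 5
short_tsf = truncate(full_tsf, 32)
print(short_tsf)                               # 5
# ...unless the missing high bits are restored against a nearby reference.
print(extend_suffix(short_tsf, 32, (1 << 32) - 10) == full_tsf)  # True
```

The 32-bit figure is in the same range as the "around six weeks of uptime" in the answer, and widening the type to 64 bits pushes the wrap so far out that it never occurs in practice.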