Curve: secp521r1 Public key (b64 encoded): MIGbMBAGByqGSM49AgEGBSuBBAAjA4GGAAQBNtwf+HWIV/ifAz826Anbd6Ce5L3WPvXGBZ99EEd1QNYqzToWCCLMd5ajzFOidBESl5jjX0jwgpxvV626KBHaJMgB6zKDw3zd2v1IC7IkNCXUDe7DRgqyjFpkLTJ+aGrBRfBgJq20Sqf/RHINHvlzulzQYKV0/vrdGqdqbsQURHoWZGQ=
- GPUs use parallelism
- Floating point math is not associative
- Rounding error accumulates differently
- GPUs generate noisy computations
- Known noise vs accuracy tradeoff in data
- Noise requires overparameterization/larger network to generalize
- Overparameterization prevents the network from fully generalizing to the problem space
Therefore, GPU nondeterminism seems bad for AI. Where did I go wrong?
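For what it's worth, the first few premises are easy to reproduce without a GPU. Below is a minimal Python sketch, assuming that changing the grouping of additions is a fair stand-in for a parallel reduction. The `chunked_sum` helper is hypothetical and is not how any particular GPU library reduces. It only shows that regrouping the same numbers can change the rounded result, and it prints how far each grouping lands from a correctly rounded reference.

```python
import math
import random


def chunked_sum(values, num_chunks):
    """Sum `values` as `num_chunks` partial sums, then add the partials.

    Hypothetical stand-in for a parallel reduction: only the grouping of
    the additions changes, not the numbers being added.
    """
    size = (len(values) + num_chunks - 1) // num_chunks
    partials = [sum(values[i:i + size]) for i in range(0, len(values), size)]
    return sum(partials)


# Floating-point addition is not associative: the grouping matters.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False

# Same data, different groupings, compared against a correctly rounded sum.
random.seed(0)
values = [random.random() for _ in range(100_000)]
reference = math.fsum(values)

for num_chunks in (1, 7, 64, 1024):
    total = chunked_sum(values, num_chunks)
    rel_diff = abs(total - reference) / reference
    print(num_chunks, total, rel_diff)
```

This is a single float64 sum on a CPU, so it says nothing about how such differences compound over many training steps or at lower precision. That part is the actual question.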
Questions:
- Has this been quantified? As I understand it, the answer would be situational and tied to other details like network depth, width, architecture, learning rate, etc. At the end of the day, entropy means some sort of noise/accuracy tradeoff, but are we talking magnitudes like 10%, 1%, 0.1%?
- Because of the noise/accuracy tradeoff, it seems to follow that a smaller network trained deterministically could match the performance of a larger network trained nondeterministically. Is this true, even if the difference is only a single neuron?
- If a problem space like driving a car is too large to be fully represented in a dataset (consider the atoms of the universe as a hard drive), how can we be sure a dataset is a perfect sampling of the problem space?
- Wouldn't overparameterization guarantee the model learns the dataset and not the problem space? Is it incorrect to conceptualize this as using a polynomial of a higher degree to represent another?
- Even with perfect sampling, noisy computation seems incompatible with a system where a small amount of noise can cause an avalanche. If this noise is somehow quantified at 1%, couldn't you say the dataset's "impression" left in the network would be 1% larger than it should be, maybe spilling over in a sense? Eval data points "very close to" but not included in the training data points would be more likely to incorrectly evaluate to the same result as the "nearby" training data point. Maybe I'm reinventing edge cases and overfitting here, but I don't think overfitting just spontaneously starts happening towards the end of training.
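The polynomial analogy from the questions above can be made concrete with a toy curve fit. This is a sketch under stated assumptions, not a claim about neural networks: the underlying function (`true_fn`), the eight sample points, and the fixed noise values are all made up for illustration. A degree-7 polynomial through all eight noisy points stands in for the "overparameterized" model, and a least-squares line stands in for the smaller one.

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the unique degree len(xs)-1 polynomial through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total


def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def true_fn(x):
    # Assumed "problem space": a plain straight line.
    return 2.0 * x + 1.0


# Eight training points with small, fixed (made-up) noise.
xs = [float(i) for i in range(8)]
noise = [0.3, -0.2, 0.25, -0.3, 0.2, -0.25, 0.3, -0.2]
ys = [true_fn(x) + e for x, e in zip(xs, noise)]

slope, intercept = fit_line(xs, ys)
held_out = [x + 0.5 for x in xs[:-1]]  # points between the training points

# Worst-case error on the noisy training data.
train_err_poly = max(abs(lagrange_eval(xs, ys, x) - y) for x, y in zip(xs, ys))
train_err_line = max(abs(slope * x + intercept - y) for x, y in zip(xs, ys))

# Worst-case error against the true function at the held-out points.
test_err_poly = max(abs(lagrange_eval(xs, ys, x) - true_fn(x)) for x in held_out)
test_err_line = max(abs(slope * x + intercept - true_fn(x)) for x in held_out)

print("train error  poly:", train_err_poly, " line:", train_err_line)
print("test error   poly:", test_err_poly, " line:", test_err_line)
```

In this toy setup the high-degree polynomial reproduces every noisy training point but misses the underlying line between them by more than the simple line fit does. Whether that classical picture carries over to large networks is exactly what is being asked.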
I'm not looking to drag my ISP through the dirt, as they're already universally regarded as a bad service provider. I'm sharing my experience here to hear what others think, and to add to the dime-a-dozen cautionary tales on the internet that maybe, just maybe, you don't need an app for that.
My ISP has an exclusivity contract with the company I rent a property from. The property comes with a preinstalled router. Upon moving in and attempting to transfer service, I was told that the preinstalled router was no longer supported. To resolve this, they provided, free of charge, the same model of router that other customers rent in lieu of providing their own. I was unbothered by this proposition, because my personal router was not compatible either. The replacement worked, had capabilities for local administration, and offered all the other typical features a router should have.
Several months later, the ISP pushed a firmware update to the device that put a visual coat of paint on the admin portal. After some digging, I realized the remote administration page had been removed. I didn't see this as a problem at the time, as it was a feature I always had off, but I found that the router could now be administered remotely through the ISP's website. At first, it was simply the ability to reboot the device, but with time, more and more functionality was added to the website. This pattern continued for several more months until a tipping point was reached. Another firmware update was pushed that lobotomized the router's admin portal into nothing more than a way to look at data. The only functionality that remained was the ability to reboot the device. Changing WiFi passwords, port forwarding, disallowing clients, and so on were now only available via the website. I was perturbed by this change, but it was their router, not mine. Thankfully, I still had all the capabilities the router originally provided (foreshadowing).
In this day and age, every company and its grandparents has an app. I'm no stranger to the concept myself, having done plenty of software development for mobile, web, and general computing environments. It came as no surprise when my ISP released an app that boasted the ability to do all the same operations the web interface provided. What did surprise me, however, was when the web portal was shut down, leaving only a splash page for the iOS and Android apps. The app is now the only way to administer the router.
That's when it struck me. My ISP expects its customers to already have an internet connection in order to administer the router. Customers setting up a new router or experiencing an outage are expected to have cellular data to do something as simple as change the WiFi password. I understand how ubiquitous cellphones and data plans are these days, but I personally see it as a failure for an ISP to require something like that.
I was now forced to migrate over to the app. I have an iPhone 6s with the most recent version of iOS. Perhaps some of you can correct me in the comments, but I do not consider that prehistoric. I sucked it up and downloaded the app, but the situation went from bad to abysmal. The app uses an integrated in-app browser to authenticate, but the login portal it navigates to requires cookies to function, and the login never completes for me. In short, I'm unable to use the app, so to me, the router was effectively turned into a modem. I purchased a new router, got it set up, and thus ends my tale.