The Hardware Bottleneck For Artificial Intelligence

Research in AI has yielded models of ever-greater complexity, increasing by orders of magnitude, to solve the world's pressing machine learning problems. The current state of the art ranges from Facebook's M2M-100 model, which translates between 100 languages, to Google's BERT embeddings and beyond. However, this progress also demands more processing power as algorithms learn a larger number of variables, requiring better hardware infrastructure. With the demise of Moore's Law constraining the speed gains once delivered by packing more transistors onto circuitry, to what extent does the future of AI research face a fundamental hardware bottleneck? We explore the emerging trends that may dominate mainstream research as a consequence: the intersection with quantum computing, to gain new hardware capabilities and design hybrid quantum-based models, and tinyML, which miniaturizes existing models for compatibility with low-power embedded systems.

AI's history can be traced to McCulloch and Pitts' work on simple artificial neurons performing logic functions, which culminated in Minsky's basic neural-net machine, SNARC, in 1951. In the aftermath of the Dartmouth Conference of 1956, more complex architectures were designed, from STUDENT and ELIZA, which used large "semantic nets" to converse in English, to Gerald Tesauro's TD-Gammon, which leveraged temporal-difference learning to analyze gaming environments. The complexity stems from the transition in the "layers" used to build such models, from simpler single-input neural networks (Ivakhnenko and Lapa, 1967) to convolutions that process images for computer vision (Fukushima, 1980) and recurrent memory for sequenced data (Rumelhart, 1986). The depth and combinations of such features in a single network have significantly increased the training time needed to learn a vast number of hidden variables.

The above advancements in the size, complexity, and hyperparameters of deep neural networks have significantly outpaced research in the semiconductor technology at the core of any hardware system, as noted in studies by the University of Notre Dame. The growth of the internet has made unparalleled amounts of unstructured data available to train such models, which compounds these woes. As Shi (2019) illustrates, currently used hardware platforms such as graphics processing units lack the computational bandwidth and memory energy-efficiency to continuously scale AI development in the future. Notable examples of rising hardware requirements include DeepMind's AlphaGo Zero, the world's premier reinforcement-learning agent at the board game Go, which was trained on 4.9 million games against itself using over 64 GPUs and 19 CPU servers, compared with the chess model Fritz-3 in 1995, which ran on merely 2 CPUs. Alternatively, consider OpenAI's GPT-3, a 175-billion-parameter language model trained on over 410 billion word tokens, which would theoretically take 355 years to train on a single GPU. The size of state-of-the-art ML models grows exponentially each year, from GPT-2's 10 billion tokens in 2018 to GPT-3's 410 billion in 2020, whereas GPU advancements are linear: NVIDIA's flagship GPU increased its memory from 32 to 40 GB over two years. Thus a hardware constraint exists, attributable to the degradation of Moore's Law, that can not only impair future ML research but also immensely increase model training times.
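
To make the single-GPU figure concrete, here is a rough back-of-the-envelope sketch. It assumes the common rule of thumb of roughly 6 FLOPs per parameter per training token and a sustained throughput of about 28 TFLOPS for a single data-centre GPU of that era; both numbers are assumptions for illustration, not figures from the article.

```python
# Back-of-envelope estimate of single-GPU training time for a GPT-3-scale model.
# Assumptions (not from the article): compute ~ 6 * parameters * tokens FLOPs,
# a common rule of thumb for dense transformers, and ~28 TFLOPS of sustained
# mixed-precision throughput for a single GPU.

PARAMS = 175e9          # GPT-3 parameter count (from the article)
TOKENS = 410e9          # training tokens (from the article)
GPU_FLOPS = 28e12       # assumed sustained FLOP/s for one GPU

total_flops = 6 * PARAMS * TOKENS
seconds = total_flops / GPU_FLOPS
years = seconds / (365.25 * 24 * 3600)

print(f"Total compute: {total_flops:.2e} FLOPs")
# Prints a few hundred years, the same order of magnitude as the
# ~355-year figure cited above.
print(f"Single-GPU wall time: ~{years:.0f} years")
```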

Moore's Law, central to semiconductor design, claims that the number of transistors in a circuit, used as a proxy for hardware speed, doubles every two years, a historic trend visible since 1970 (Figure 1). While initial innovations by Bell Labs and Toshiba shrank the size of transistors by seven orders of magnitude in under five decades, fundamental limits have neared: at the current 10-nanometer transistor size, the channel on silicon chips is not always stable for current, causing electrical leakage. It is therefore much harder to shrink transistors any further, as leakage threatens chip integrity, limiting the voltage a chip can carry and consequently its processing power.
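
For intuition, the brief sketch below projects the idealized doubling curve that Moore's Law describes; the 1970 baseline transistor count is an illustrative assumption, not a figure from the article.

```python
# Illustrative projection of Moore's Law: transistor counts doubling every
# two years from an assumed 1970 baseline of ~2,000 transistors per chip
# (roughly the scale of early microprocessors; the baseline is an assumption).

def transistors(year, base_year=1970, base_count=2_000, doubling_period=2):
    """Idealized Moore's Law projection: count doubles every `doubling_period` years."""
    return base_count * 2 ** ((year - base_year) / doubling_period)

for year in (1970, 1990, 2010, 2020):
    print(f"{year}: ~{transistors(year):,.0f} transistors per chip (idealized)")
```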

The constraint is compounded by the breakdown of Dennard scaling: historically, as transistors shrank, their power density remained constant, keeping a chip's power requirement in proportion to its area; with that scaling now faltering, power and heat create a barrier on speed. Hence we are unlikely to witness the rampant growth in transistor density seen since the 1970s, meaning that current hardware is unlikely to keep pace with the requirements of AI research. This is particularly relevant since a mechanism to sustain future research is necessary to avoid an 'AI winter' that could otherwise stagnate advancements in the field. Thus, newer paradigms must be imagined. They are likely to concentrate on creating more powerful hardware with quantum computing, or on designing a novel set of algorithms with lower hardware requisites with tinyML.

Addressing the former: since the crux of classical hardware lies in encoding information in binary as strings of 1s and 0s, can we rethink computing without this constraint? The human brain does not treat decision-making as binary, but rather reasons with uncertainty. Can we develop architectures that similarly integrate this probabilistic measure into information processing, effectively birthing a new generation of hardware? Instead of bits requiring transistors to be either on (1) or off (0), can they represent a probability distribution? This elemental question fast-tracked the Noisy Intermediate-Scale Quantum (NISQ) era, where computing leverages superposition (the ability of objects to exist in multiple energy states simultaneously) and entanglement (an object's state instantly reflecting another's, even over large distances, creating dependencies). Such quantum phenomena use 'qubits', which can exist in superpositions with varied probabilities. Hence, particles can vary their probabilities over the course of quantum processes while interacting with distant particles in perfect unison through entanglement, seemingly facilitating instant information processing. Such speed gains are evidenced by Google's quantum supremacy breakthrough, where the 54-qubit Sycamore chip performed a numeric computation task in under 200 seconds, beating a traditional computer's estimated 10,000-year run-time.
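
To make the qubit idea concrete, the following minimal NumPy simulation works through the underlying linear algebra: a Hadamard gate places one qubit in an equal superposition, and a CNOT gate entangles it with a second so that their measured values always agree. This is only a classical illustration of the mathematics, not a model of real quantum hardware.

```python
import numpy as np

# A qubit holds probability amplitudes rather than a hard 0/1; entangling two
# qubits correlates their measurement outcomes. Simulated classically here.

ket0 = np.array([1, 0], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)   # Hadamard gate
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

# Superposition: H|0> gives equal probability of measuring 0 or 1.
plus = H @ ket0
print("Single-qubit probabilities:", np.abs(plus) ** 2)        # [0.5, 0.5]

# Entanglement: CNOT applied to (H|0>) tensor |0> yields a Bell state in which
# both qubits are always measured with the same value.
bell = CNOT @ np.kron(plus, ket0)
print("Two-qubit probabilities (00,01,10,11):", np.abs(bell) ** 2)  # [0.5, 0, 0, 0.5]
```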

So, how do such hardware advances benefit artificial intelligence? Firstly, they would address the hardware bottleneck by drastically reducing the training and inference times required to run neural networks across computer vision, natural language processing, and more. This facilitates novel research into complex yet resource-efficient models that scale effectively to large datasets; hybrid "quantum-classical" models, popularized by Beer et al. (2020), offer such advantages through TensorFlow Quantum, an open-source development framework. Secondly, it spearheads research into optimization problems whose sheer number of possibilities currently prevents empirical testing: what is the best order in which to assemble a Boeing airplane, consisting of millions of parts, to minimize cost and time? What is the best scheduling algorithm for traffic signals in urban neighbourhoods to reduce car wait times? How can logistics, such as multiple dependent delivery routes, be optimized for the lowest time-to-consumer?
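
As a rough illustration of the hybrid idea, the sketch below simulates a one-parameter "quantum" feature (a single-qubit rotation computed classically) feeding a classical layer, and trains the rotation angle by finite differences. It does not use the TensorFlow Quantum API; the circuit, data, and training loop are illustrative assumptions only.

```python
import numpy as np

# Conceptual sketch of a hybrid "quantum-classical" model: a parameterized
# single-qubit rotation (simulated classically) produces an expectation value,
# which a classical affine layer then consumes. Frameworks such as TensorFlow
# Quantum wire real parameterized circuits into this kind of pipeline.

def quantum_feature(x, theta):
    """<Z> after rotating |0> by an angle of theta * x about the Y axis."""
    return np.cos(theta * x)

def hybrid_model(x, theta, w, b):
    """Quantum feature followed by a classical affine layer."""
    return w * quantum_feature(x, theta) + b

rng = np.random.default_rng(0)
xs = rng.uniform(-1, 1, 32)
ys = np.cos(0.7 * xs)            # toy targets generated by an unknown rotation
theta, w, b = 0.1, 1.0, 0.0      # only the circuit parameter is trained here
lr, eps = 0.5, 1e-4

def loss(t):
    return np.mean((hybrid_model(xs, t, w, b) - ys) ** 2)

# Finite-difference gradient descent on the circuit parameter.
for _ in range(200):
    grad = (loss(theta + eps) - loss(theta - eps)) / (2 * eps)
    theta -= lr * grad

print(f"Recovered rotation parameter: {theta:.3f} (target 0.7)")
```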

Alternatively, instead of focusing on more powerful hardware with quantum enhancements to "break" the existing bottleneck, researchers expect AI literature published in the near future to be guided more by the "miniaturization" of current models to run on low-memory, low-power devices: the burgeoning field of tinyML. Drawing from Pete Warden at the O'Reilly AI Conference 2019, this newer paradigm of "data-centric" computing focuses on improving energy-efficiency, privacy, storage, and latency by leveraging IoT systems and the over 250 billion microcontrollers in use. TinyML relies on a three-pronged process (pruning, quantization, and encoding) to translate larger model designs into smaller versions that microcontrollers can effectively run. Pruning, shown in Figure 2, removes model layers or weights that have little impact on the output, reducing model size with only minor changes in accuracy. Since microcontrollers often support lower numeric precision than traditional GPUs, quantization shrinks floating-point representations to fewer bits per weight, while Huffman encoding compresses model weights to achieve similar results with less storage space, so that the result can be loaded by ML development platforms like TensorFlow Lite. These stages often miniaturize systems by orders of magnitude, as with Microsoft's Bonsai algorithm, reduced by a factor of up to 10,000 with comparable precision, allowing a novel class of algorithms to be deployed to endpoints.
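
The toy sketch below walks through two of these stages, magnitude pruning and 8-bit quantization, on a random weight matrix; the layer size, sparsity target, and quantization scheme are illustrative choices rather than what any particular toolchain does.

```python
import numpy as np

# Toy illustration of two tinyML compression stages: magnitude pruning
# (zero out the smallest weights) and 8-bit quantization (map float32 weights
# onto int8 with a scale factor). Real toolchains such as TensorFlow Lite
# apply these ideas with far more care.

rng = np.random.default_rng(42)
weights = rng.normal(0, 0.5, size=(128, 128)).astype(np.float32)

# --- Pruning: drop the 80% of weights with the smallest magnitude. ---
threshold = np.quantile(np.abs(weights), 0.80)
pruned = np.where(np.abs(weights) >= threshold, weights, 0.0)

# --- Quantization: symmetric linear mapping of float32 -> int8. ---
scale = np.abs(pruned).max() / 127.0
quantized = np.round(pruned / scale).astype(np.int8)
dequantized = quantized.astype(np.float32) * scale   # what inference would use

print(f"Nonzero weights after pruning: {np.count_nonzero(pruned)} / {weights.size}")
print(f"Raw size: {weights.nbytes} bytes -> int8 size: {quantized.nbytes} bytes")
print(f"Mean quantization error: {np.mean(np.abs(dequantized - pruned)):.5f}")
```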

What use-cases does tinyML bring to the forefront? Elementally, instead of machine learning being positioned as the final stage in a data pipeline, tinyML adds intelligent nodes throughout the process. This has two critical impacts on the hardware bottleneck in question. Firstly, by "miniaturizing" models, hardware requirements are lowered and the bottleneck ceases to exist, allowing novel AI applications.

Consider real-time voice recognition ('Ok Google' or 'Hey Siri'), which can deplete a device's battery within a few hours if the phone's main CPU must always be active to recognize commands. Thus, initial tinyML interest led to low-power hardware that can run on a single CR2032 battery for over a year while hosting those advanced voice-detection algorithms. Secondly, some processing can be moved upstream in the pipeline to these smaller models, reducing the dataset, and thus the compute time, for the more conventional large-scale models running downstream on traditional hardware. Consider a video surveillance and security system, which often produces large datasets due to high frame rates on 24/7 recordings. Advanced microcontrollers can be added to the control flow to run basic object detection and motion analysis (only save images containing an object of interest), scene-segmentation algorithms (finding patches of the recording that are relevant), or upscaling/denoising, so that the traditional threat-assessment model now has less, but more relevant, data to operate on.
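
A minimal sketch of that upstream-filtering idea follows: a cheap motion heuristic, standing in for an on-device model, decides which frames ever reach the expensive downstream model. The frame source and heavy model here are hypothetical placeholders, not any real microcontroller or surveillance API.

```python
import numpy as np

# A lightweight on-device check discards uninteresting frames so the
# heavyweight downstream model only processes a fraction of the stream.

def tiny_motion_check(frame, previous, threshold=12.0):
    """Cheap motion heuristic a microcontroller could run: mean absolute
    pixel difference against the previous frame."""
    diff = np.abs(frame.astype(np.int16) - previous.astype(np.int16))
    return diff.mean() > threshold

def run_pipeline(frames, heavy_model):
    """Forward only frames that pass the cheap check to the large model."""
    previous = frames[0]
    for frame in frames[1:]:
        if tiny_motion_check(frame, previous):
            heavy_model(frame)        # expensive threat-assessment model
        previous = frame

# Example with synthetic frames: only the 'changed' frame reaches the model.
rng = np.random.default_rng(1)
static = rng.integers(0, 256, (240, 320), dtype=np.uint8)
changed = np.clip(static.astype(np.int16) + 60, 0, 255).astype(np.uint8)
run_pipeline([static, static.copy(), changed],
             heavy_model=lambda f: print("heavy model invoked"))
```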

To conclude, the potent hardware constraint facing AI research stems from the demise of Moore's Law, which has rendered hardware advancements inadequate to match the increases in model complexity and dataset size involved in building state-of-the-art AI models. Several emerging trends therefore aim to reshape the research ecosystem, ranging from novel quantum hardware for scalable, high-compute algorithms to tinyML for deploying "compressed" models onto embedded systems. As each direction enables new use-cases, the future of artificial intelligence research stands at the cusp of a novel paradigm.
