Hardware for artificial intelligence

Specialized computer hardware, such as Lisp machines, neuromorphic chips, event cameras, and physical neural networks, is often used to execute artificial intelligence (AI) programs faster and with less energy. As of 2023, the market for AI hardware is dominated by GPUs. [1]

Lisp machines

Lisp machines were developed in the late 1970s and early 1980s to make artificial intelligence programs written in the programming language Lisp run faster.

Dataflow architecture

Dataflow architecture processors for AI take a variety of forms, such as the polymorphic dataflow [2] Convolution Engine [3] by Kinara (formerly Deep Vision), structure-driven dataflow by Hailo, [4] and dataflow scheduling by Cerebras. [5]

Component hardware

AI accelerators

Since the 2010s, advances in computer hardware have led to more efficient methods for training deep neural networks that contain many layers of non-linear hidden units and a very large output layer. [6] By 2019, graphics processing units (GPUs), often with AI-specific enhancements, had displaced central processing units (CPUs) as the dominant means to train large-scale commercial cloud AI. [7] OpenAI estimated the hardware compute used in the largest deep learning projects from AlexNet (2012) to AlphaZero (2017), and found a 300,000-fold increase in the amount of compute needed, with a doubling time of 3.4 months. [8] [9]
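
These two figures are consistent with each other: a 300,000-fold increase corresponds to about 18.2 doublings, and 18.2 doublings at 3.4 months each span roughly 62 months, matching the 2012–2017 interval. A minimal Python check:

```python
import math

factor = 300_000        # overall increase in training compute (per OpenAI)
doubling_months = 3.4   # observed doubling time

doublings = math.log2(factor)         # ~18.2 doublings
months = doublings * doubling_months  # ~62 months
print(f"{doublings:.1f} doublings over {months / 12:.1f} years")
# -> 18.2 doublings over 5.2 years, consistent with 2012-2017
```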

Artificial Intelligence Hardware Components

Central Processing Units (CPUs)

Every computer system is built on central processing units (CPUs). They schedule tasks, perform computations, and execute instructions. Even though specialized hardware handles AI workloads more efficiently, CPUs remain essential for managing general-purpose computing tasks in AI systems.

Graphics Processing Units (GPUs)

Graphics processing units (GPUs) have driven a dramatic transformation in AI. Their parallel design lets them run many computations at once, making them well suited to AI workloads that involve massive quantities of data and intensive mathematical operations. [10]
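
A minimal sketch of this parallelism, assuming PyTorch is installed and a CUDA-capable GPU is available; the same matrix multiplication runs on the CPU and is then offloaded to the GPU:

```python
import torch

# A large matrix multiplication: millions of independent multiply-adds
# that a GPU's parallel cores can execute simultaneously.
a = torch.randn(4096, 4096)
b = torch.randn(4096, 4096)

cpu_result = a @ b  # executed on the CPU

if torch.cuda.is_available():
    a_gpu, b_gpu = a.to("cuda"), b.to("cuda")
    gpu_result = a_gpu @ b_gpu  # same computation, run in parallel on the GPU
    torch.cuda.synchronize()    # GPU kernels launch asynchronously; wait for completion
```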

Tensor Processing Units (TPUs)

Google created Tensor Processing Units (TPUs) to accelerate and optimize machine learning workloads. They are designed to handle both training and inference efficiently and perform well on neural network tasks.
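
A minimal sketch of targeting a TPU from Python, assuming the JAX library and an attached TPU runtime; without a TPU, JAX transparently falls back to GPU or CPU:

```python
import jax
import jax.numpy as jnp

print(jax.devices())  # lists TPU cores when a TPU runtime is attached

@jax.jit  # compiled via XLA for the available backend (TPU, GPU, or CPU)
def dense_layer(x, w, b):
    return jnp.maximum(x @ w + b, 0.0)  # affine transform followed by ReLU

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (128, 512))
w = jax.random.normal(key, (512, 256))
b = jnp.zeros(256)
y = dense_layer(x, w, b)  # runs on the TPU when one is present
```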

Field-Programmable Gate Arrays (FPGAs)

Field-programmable gate arrays (FPGAs) are highly adaptable pieces of hardware that can be configured to carry out specific functions. This versatility makes them suitable for a variety of AI applications, including real-time image recognition and natural language processing.

Memory Systems

AI requires efficient memory systems to store and retrieve the data needed for processing. Fast interconnects and large-capacity memory are crucial to avoid bottlenecks in data access.
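
Whether a workload is limited by compute or by memory access can be estimated with a simple roofline-style calculation. The sketch below uses illustrative, assumed hardware figures (100 TFLOP/s of compute, 1 TB/s of memory bandwidth) to compare a matrix multiplication's arithmetic intensity against the machine's balance point:

```python
# Roofline-style estimate with illustrative, assumed hardware figures.
peak_flops = 100e12         # 100 TFLOP/s of compute (assumed)
memory_bw = 1e12            # 1 TB/s of memory bandwidth (assumed)

n = 4096                    # square matrix dimension
flops = 2 * n**3            # multiply-adds in an n x n matrix multiplication
bytes_moved = 3 * n**2 * 4  # read A, read B, write C (float32)

intensity = flops / bytes_moved  # FLOPs performed per byte of memory traffic
ridge = peak_flops / memory_bw   # machine balance point in FLOPs per byte
print(f"intensity {intensity:.0f} FLOP/B vs ridge {ridge:.0f} FLOP/B")
# intensity above the ridge -> compute-bound; below it -> memory-bound
```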

Storage Solutions

Artificial intelligence applications generate and use vast amounts of data. High-speed storage options such as SSDs and NVMe drives provide quick data retrieval, improving the overall performance of AI systems.
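
The practical difference is easy to quantify. The sketch below uses typical, assumed sequential-read throughputs to estimate how long streaming a 1 TB training dataset takes from different storage tiers:

```python
dataset_bytes = 1e12  # 1 TB training dataset (assumed size)

# Typical sequential-read throughputs in bytes/s (assumed, order of magnitude)
throughputs = {"HDD": 200e6, "SATA SSD": 550e6, "NVMe SSD": 7e9}

for device, bps in throughputs.items():
    minutes = dataset_bytes / bps / 60
    print(f"{device}: {minutes:.1f} minutes of raw reads per pass")
# HDD ~83 min, SATA SSD ~30 min, NVMe SSD ~2.4 min
```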

Quantum Computing

Although it is still in its early stages, quantum computing holds enormous potential for artificial intelligence. Qubits, or quantum bits, can represent and process many states at once, which has the potential to revolutionize AI tasks requiring complex simulations and optimizations.
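
The superposition idea can be illustrated with a few lines of linear algebra. This toy sketch (plain NumPy, not real quantum hardware) applies a Hadamard gate to put one simulated qubit into an equal superposition of the |0⟩ and |1⟩ basis states:

```python
import numpy as np

# Toy state-vector simulation of a single qubit (not real quantum hardware).
ket0 = np.array([1.0, 0.0])                    # |0> basis state
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard gate

state = H @ ket0                   # equal superposition of |0> and |1>
probabilities = np.abs(state)**2   # Born rule: measurement probabilities
print(probabilities)               # [0.5 0.5]
```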

Edge AI Hardware

Edge AI refers to artificial intelligence (AI) operations performed locally on a device, removing the need for constant internet access. Edge AI technology, which includes specialized chips and processors, enables immediate results for tasks such as speech recognition and object identification on smartphones and Internet of Things (IoT) devices.
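
Edge chips commonly rely on low-precision arithmetic to fit models into tight power and memory budgets. The sketch below (plain NumPy, illustrative symmetric scheme) quantizes float32 weights to int8, the kind of transformation applied before deploying a model to an edge accelerator:

```python
import numpy as np

# Illustrative symmetric int8 quantization, as used for edge deployment.
weights = np.random.randn(256, 256).astype(np.float32)

scale = np.abs(weights).max() / 127.0  # map the largest weight to +/-127
q = np.clip(np.round(weights / scale), -128, 127).astype(np.int8)

dequantized = q.astype(np.float32) * scale  # approximate recovery at inference
print("max abs error:", np.abs(weights - dequantized).max())
print("size reduction: 4x (float32 -> int8)")
```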

Networking Capabilities

AI systems frequently rely on data from several sources, so effective data exchange depends on responsive and reliable networking. High-speed data transfer enables real-time decision-making and seamless communication between AI components.
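
In distributed training, for example, network bandwidth directly bounds how often workers can synchronize. The sketch below uses assumed figures to estimate the time to exchange one full set of model gradients over two link speeds:

```python
# Assumed figures for a rough estimate of gradient-synchronization time.
params = 1e9                        # model with one billion parameters
bytes_per_param = 4                 # float32 gradients
payload = params * bytes_per_param  # ~4 GB exchanged per synchronization

for name, bits_per_s in {"10 GbE": 10e9, "400 Gb/s fabric": 400e9}.items():
    seconds = payload * 8 / bits_per_s
    print(f"{name}: {seconds:.2f} s per gradient exchange")
# 10 GbE: 3.20 s; 400 Gb/s fabric: 0.08 s -- slow links dominate step time
```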

Sources

  1. "Nvidia: The chip maker that became an AI superpower". BBC News. 25 May 2023. Retrieved 18 June 2023.
  2. Maxfield, Max (24 December 2020). "Say Hello to Deep Vision's Polymorphic Dataflow Architecture". Electronic Engineering Journal. Techfocus media.
  3. "Kinara (formerly Deep Vision)". Kinara. 2022. Retrieved 2022-12-11.
  4. "Hailo". Hailo. Retrieved 2022-12-11.
  5. Lie, Sean (29 August 2022). Cerebras Architecture Deep Dive: First Look Inside the HW/SW Co-Design for Deep Learning. Cerebras (Report).
  6. Research, AI (23 October 2015). "Deep Neural Networks for Acoustic Modeling in Speech Recognition". AIresearch.com. Retrieved 23 October 2015.
  7. Kobielus, James (27 November 2019). "GPUs Continue to Dominate the AI Accelerator Market for Now". InformationWeek. Retrieved 11 June 2020.
  8. Tiernan, Ray (2019). "AI is changing the entire nature of compute". ZDNet. Retrieved 11 June 2020.
  9. "AI and Compute". OpenAI. 16 May 2018. Retrieved 11 June 2020.
  10. "Bridging Intelligence and Technology : Artificial Intelligence Hardware Requirements". Sabuj Basinda. 22 August 2023. Retrieved 23 August 2023.
