https://web.archive.org/web/20221231221702/https://en.wikipedia.org/wiki/Floating-point_unit

Floating-point unit

   From Wikipedia, the free encyclopedia
   Part of a computer system
   Collection of the x87 family of math coprocessors by Intel

   A floating-point unit (FPU, colloquially a math coprocessor) is a part
   of a computer system specially designed to carry out operations on
   floating-point numbers.^[1] Typical operations are addition,
   subtraction, multiplication, division, and square root. Some FPUs can
   also perform various transcendental functions such as exponential or
   trigonometric calculations, but the accuracy can be very low,^[2]^[3]
   so that some systems prefer to compute these functions in software.

   In general-purpose computer architectures, one or more FPUs may be
   integrated as execution units within the central processing unit.
   Many embedded processors, however, have no hardware support for
   floating-point operations, although such support is increasingly
   standard, at least on 32-bit parts.

   When a CPU is executing a program that calls for a floating-point
   operation, there are three ways to carry it out:
     * A floating-point unit emulator (a floating-point library)
     * Add-on FPU
     * Integrated FPU

Contents

     * 1 History
     * 2 Floating-point library
     * 3 Integrated FPUs
     * 4 Add-on FPUs
     * 5 See also
     * 6 References
     * 7 Further reading

History

   In 1954, the IBM 704 had floating-point arithmetic as a standard
   feature, one of its major improvements over its predecessor the IBM
   701. This was carried forward to its successors the 709, 7090, and
   7094.

   In 1963, Digital announced the PDP-6, which had floating point as a
   standard feature.^[4]

   In 1963, the GE-235 featured an "Auxiliary Arithmetic Unit" for
   floating point and double-precision calculations.^[5]

   Historically, some systems implemented floating point with a
   coprocessor rather than as an integrated unit. Today FPUs are also
   found outside the CPU: GPUs, which are coprocessors not always built
   into the CPU, have FPUs as a rule, although the first generations of
   GPUs did not. A coprocessor could be a single integrated circuit, an
   entire circuit board or a cabinet. Where floating-point calculation
   hardware has not been provided, floating-point calculations are done
   in software, which takes more processor time but avoids the cost of
   the extra hardware. For a particular computer architecture, the
   floating-point unit instructions may be emulated by a library of
   software functions; this may permit the same object code to run on
   systems with or without floating-point hardware. Emulation can be
   implemented on any of several levels: in the CPU as microcode (not a
   common practice), as an operating system function, or in user-space
   code. When only integer functionality is available, the CORDIC
   methods are most commonly used for transcendental function
   evaluation.
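
   The integer-only approach mentioned above can be illustrated with a
   minimal CORDIC sketch in rotation mode. It is not taken from any
   particular FPU or library: the fixed-point format (30 fractional
   bits), the iteration count, and the names FRAC_BITS, ATAN_TABLE,
   K_FIXED and cordic_sin_cos are assumptions chosen for illustration.
   Floating-point math is used only to build the constant tables, which
   a real integer-only system would store precomputed.

```python
import math

FRAC_BITS = 30            # assumed fixed-point format: value = integer / 2**30
ONE = 1 << FRAC_BITS
ITERATIONS = 30           # roughly one bit of accuracy per iteration

# Table of atan(2**-i) in fixed point. Built with math here for brevity;
# an integer-only target would keep these constants in ROM.
ATAN_TABLE = [round(math.atan(2.0 ** -i) * ONE) for i in range(ITERATIONS)]

# CORDIC gain: each micro-rotation stretches the vector by sqrt(1 + 2**-2i).
# Starting from x = K pre-compensates for that growth.
_gain = 1.0
for _i in range(ITERATIONS):
    _gain /= math.sqrt(1.0 + 2.0 ** (-2 * _i))
K_FIXED = round(_gain * ONE)


def cordic_sin_cos(angle_fixed):
    """Return (sin, cos) in fixed point for an angle with |angle| <= pi/2.

    The loop uses only integer additions, subtractions and shifts.
    """
    x, y, z = K_FIXED, 0, angle_fixed
    for i in range(ITERATIONS):
        if z >= 0:
            # Rotate by +atan(2**-i) to drive the residual angle z to zero.
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
        else:
            # Rotate by -atan(2**-i).
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]
    return y, x


# Usage: convert to fixed point, run CORDIC, convert back for display.
sin_fixed, cos_fixed = cordic_sin_cos(round(0.5 * ONE))
print(sin_fixed / ONE, cos_fixed / ONE)
```

   Angles outside the convergence range would first have to be reduced
   to it, a step this sketch leaves out.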

   In most modern computer architectures, there is some division of
   floating-point operations from integer operations. This division varies
   significantly by architecture; some have dedicated floating-point
   registers, while some, like Intel x86, take it as far as independent
   clocking schemes.^[6]

   CORDIC routines have been implemented in Intel x87 coprocessors
   (8087,^[7]^[8]^[9]^[10]^[11] 80287,^[11]^[12] 80387^[11]^[12]) up to
   the 80486^[7] microprocessor series, as well as in the Motorola
   68881^[7]^[8] and 68882 for some kinds of floating-point instructions,
   mainly as a way to reduce the gate counts (and complexity) of the FPU
   subsystem.

   Floating-point operations are often pipelined. In earlier superscalar
   architectures without general out-of-order execution, floating-point
   operations were sometimes pipelined separately from integer operations.

   The modular Bulldozer microarchitecture from AMD uses a special FPU
   named FlexFPU, which uses simultaneous multithreading and is shared
   by the two integer cores of a module. Each physical integer core is
   single-threaded, in contrast with Intel's Hyper-Threading, where two
   virtual simultaneous threads share the resources of a single
   physical core.^[13]^[14]

Floating-point library

   Some floating-point hardware only supports the simplest operations:
   addition, subtraction, and multiplication. But even the most complex
   floating-point hardware has a finite number of operations it can
   support - for example, no FPUs directly support arbitrary-precision
   arithmetic.

   When a CPU is executing a program that calls for a floating-point
   operation that is not directly supported by the hardware, the CPU uses
   a series of simpler floating-point operations. In systems without any
   floating-point hardware, the CPU emulates it using a series of simpler
   fixed-point arithmetic operations that run on the integer arithmetic
   logic unit.

   The software that lists the necessary series of operations to emulate
   floating-point operations is often packaged in a floating-point
   library.
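
   As an illustration of what such a library routine does, the
   following is a simplified sketch of a binary32 multiplication
   carried out purely with integer operations. It is not the code of
   any real soft-float library: the names float_to_bits, bits_to_float
   and soft_mul are hypothetical, and the sketch assumes zero or normal
   inputs and a normal result (subnormals, infinities and NaNs are
   deliberately not handled).

```python
import struct


def float_to_bits(value):
    """Round a Python float to binary32 and return its 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as a binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def soft_mul(a_bits, b_bits):
    """Multiply two binary32 bit patterns using only integer arithmetic."""
    sign = (a_bits ^ b_bits) & 0x80000000
    exp_a = (a_bits >> 23) & 0xFF
    exp_b = (b_bits >> 23) & 0xFF
    if exp_a == 0 or exp_b == 0:
        # Simplification: zeros (and subnormals) are treated as signed zero.
        return sign

    # Restore the implicit leading 1 to get 24-bit significands.
    man_a = (a_bits & 0x7FFFFF) | 0x800000
    man_b = (b_bits & 0x7FFFFF) | 0x800000
    product = man_a * man_b              # 47 or 48 significant bits
    exponent = exp_a + exp_b - 127       # remove one copy of the bias

    # Normalise so the significand has its leading 1 in bit 23.
    if product & (1 << 47):
        shift = 24
        exponent += 1
    else:
        shift = 23
    mantissa = product >> shift
    remainder = product & ((1 << shift) - 1)
    half = 1 << (shift - 1)

    # Round to nearest, ties to even.
    if remainder > half or (remainder == half and (mantissa & 1)):
        mantissa += 1
        if mantissa == 1 << 24:          # rounding carried out of bit 23
            mantissa >>= 1
            exponent += 1

    if not 0 < exponent < 255:
        raise OverflowError("result outside the normal binary32 range")
    return sign | (exponent << 23) | (mantissa & 0x7FFFFF)


# Usage: 1.5 * 2.5 computed without any floating-point multiply.
print(bits_to_float(soft_mul(float_to_bits(1.5), float_to_bits(2.5))))
```

   Addition and division follow the same pattern of unpacking the
   fields, operating on integers, then normalising, rounding and
   repacking.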

Integrated FPUs

   In some cases, FPUs may be specialized, and divided between simpler
   floating-point operations (mainly addition and multiplication) and more
   complicated operations, like division. In some cases, only the simple
   operations may be implemented in hardware or microcode, while the more
   complex operations are implemented as software.
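
   One common way to build a complex operation from simple ones is
   Newton-Raphson iteration for the reciprocal, which needs only
   multiplication and subtraction. The sketch below is an assumption
   about how such a software division might look, not a description of
   any specific FPU; the names reciprocal and divide, the linear first
   guess and the iteration count are chosen for illustration, and
   math.frexp and math.ldexp stand in for the exponent manipulation
   that real code would do on the bit pattern.

```python
import math

# Coefficients of a linear first guess for 1/m on [0.5, 1). They are
# constants, so no division is needed at run time.
C0 = 48.0 / 17.0
C1 = 32.0 / 17.0


def reciprocal(d, iterations=5):
    """Approximate 1/d using only multiplication and subtraction."""
    if d == 0.0:
        raise ZeroDivisionError("reciprocal of zero")
    if d < 0.0:
        return -reciprocal(-d, iterations)

    # Split d = m * 2**e with m in [0.5, 1), so the guess only has to
    # cover a narrow range.
    m, e = math.frexp(d)
    x = C0 - C1 * m

    # Newton step for f(x) = 1/x - m; the error is roughly squared on
    # every pass, so a handful of iterations reaches double precision.
    for _ in range(iterations):
        x = x * (2.0 - m * x)

    # Undo the scaling: 1/d = (1/m) * 2**-e.
    return math.ldexp(x, -e)


def divide(n, d):
    """Division expressed as a multiplication by the reciprocal."""
    return n * reciprocal(d)


# Usage: a quotient computed without a hardware divide instruction.
print(divide(1.0, 3.0))
```

   The result is close to, but not guaranteed to be, the correctly
   rounded quotient; production implementations add a final correction
   step to meet IEEE 754 rounding requirements.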

   In some current architectures, the FPU functionality is combined with
   SIMD units to perform SIMD computation; an example of this is the
   augmentation of the x87 instruction set with the SSE instruction set
   in the x86-64 architecture used in newer Intel and AMD processors.

Add-on FPUs

   Main article: Coprocessor

   In the 1980s, it was common in IBM PC/compatible microcomputers for the
   FPU to be entirely separate from the CPU, and typically sold as an
   optional add-on. It would only be purchased if needed to speed up or
   enable math-intensive programs.

   The IBM PC, XT, and most compatibles based on the 8088 or 8086 had a
   socket for the optional 8087 coprocessor. The AT and 80286-based
   systems were generally socketed for the 80287, and 80386- and
   80386SX-based machines for the 80387 and 80387SX respectively,
   although early ones were socketed for the 80287, since the 80387 did
   not exist yet. Other companies, including Cyrix and Weitek, also
   manufactured coprocessors for the Intel x86 series. Acorn Computers
   opted for the WE32206 to offer single, double and extended
   precision^[15] to its ARM-powered Archimedes range.

   Coprocessors were available for the Motorola 68000 family, the 68881
   and 68882. These were common in Motorola 68020/68030-based
   workstations, like the Sun-3 series. They were also commonly added to
   higher-end models of Apple Macintosh and Commodore Amiga series, but
   unlike IBM PC-compatible systems, sockets for adding the coprocessor
   were not as common in lower-end systems.

   There are also add-on FPU coprocessor units for microcontroller units
   (MCUs/µCs) and single-board computers (SBCs), which serve to provide
   floating-point arithmetic capability. These add-on FPUs are
   host-processor-independent, have their own programming requirements
   (operations, instruction sets, etc.) and are often provided with their
   own integrated development environments (IDEs).

See also

     * Arithmetic logic unit (ALU)
     * Address generation unit (AGU)
     * Load-store unit
     * CORDIC, routines used in many FPUs to implement functions without
       greatly increasing gate count
     * Execution unit
     * IEEE 754 floating-point standard
     * IBM hexadecimal floating point
     * Graphics processing unit
     * Multiply-accumulate operation

References

    1. ^ Anderson, Stanley F.; Earle, John G.; Goldschmidt, Robert
       Elliott; Powers, Don M. (January 1967). "The IBM System/360 Model
       91: Floating-Point Execution Unit". IBM Journal of Research and
       Development. 11 (1): 34-53. doi:10.1147/rd.111.0034.
       ISSN 0018-8646.
    2. ^ Bruce Dawson (2014-10-09). "Intel Underestimates Error Bounds by
       1.3 quintillion". randomascii.wordpress.com. Retrieved 2020-01-16.
    3. ^ "FSIN Documentation Improvements in the "Intel(R) 64 and IA-32
       Architectures Software Developer's Manual"". intel.com. 2014-10-09.
       Retrieved 2020-01-16.
    4. ^ "PDP-6 Handbook" (PDF). www.bitsavers.org. Archived (PDF) from
       the original on 2022-10-09.
    5. ^ "GE-2xx documents". www.bitsavers.org.
       CPB-267_GE-235-SystemManual_1963.pdf, p. IV-4.
    6. ^ "Intel 80287 family". www.cpu-world.com. Retrieved 2019-01-15.
    7. ^ ^a ^b ^c Muller, Jean-Michel (2006). Elementary Functions:
       Algorithms and Implementation (2 ed.). Boston: Birkhaeuser. p. 134.
       ISBN 978-0-8176-4372-0. LCCN 2005048094. Retrieved 2015-12-01.
    8. ^ ^a ^b Nave, Rafi (March 1983). "Implementation of Transcendental
       Functions on a Numerics Processor". Microprocessing and
       Microprogramming. 11 (3-4): 221-225.
       doi:10.1016/0165-6074(83)90151-5.
    9. ^ Palmer, John F.; Morse, Stephen Paul (1984). The 8087 Primer
       (1 ed.). John Wiley & Sons Australia, Limited. ISBN 0471875694.
       9780471875697. Retrieved 2016-01-02.
   10. ^ Glass, L. Brent (January 1990). "Math Coprocessors: A look at
       what they do, and how they do it". Byte. 15 (1): 337-348.
       ISSN 0360-5280.
   11. ^ ^a ^b ^c Jarvis, Pitts (1990-10-01). "Implementing CORDIC
       algorithms - A single compact routine for computing transcendental
       functions". Dr. Dobb's Journal: 152-156. Retrieved 2016-01-02.
   12. ^ ^a ^b Yuen, A. K. (1988). "Intel's Floating-Point Processors".
       Electro/88 Conference Record: 48/5/1-7.
   13. ^ "Archived copy". cdn3.wccftech.com. Archived from the original on
       9 May 2015. Retrieved 14 March 2022.
   14. ^ "AMD unveils Flex FP". bit-tech.net. Retrieved 29 March 2018.
   15. ^ "Western Electric 32206 co-processor". www.cpu-world.com.
       Retrieved 2021-11-06.

Further reading

     * Filiatreault, Raymond (2003). "SIMPLY FPU".

   Retrieved from
   "https://en.wikipedia.org/w/index.php?title=Floating-point_unit&oldid=1115051546"

   Categories:
     * Central processing unit
     * Computer arithmetic
     * Coprocessors
     * Floating point

     * This page was last edited on 9 October 2022, at 15:57 (UTC).
     * Text is available under the Creative Commons Attribution-ShareAlike
       License 3.0; additional terms may apply. By using this site, you
       agree to the Terms of Use and Privacy Policy. Wikipedia(R) is a
       registered trademark of the Wikimedia Foundation, Inc., a
       non-profit organization.
