float-formats
文件大小: unknow
源码售价: 5 个金币 积分规则     积分充值
资源说明:Ruby tools for handling floating point representations
[![Gem Version](https://badge.fury.io/rb/float-formats.svg)](http://badge.fury.io/rb/float-formats)
[![Build Status](https://travis-ci.org/jgoizueta/float-formats.svg)](https://travis-ci.org/jgoizueta/float-formats)

# Introduction

Float-Formats is a Ruby package with methods to handle diverse floating-point formats.
These are some of the things that can be done with it:

* Encoding and decoding numerical values in specific floating point representations.
* Conversion of floating-point data between different formats.
* Obtaining properties of floating-point formats (ranges, precision, etc.)
* Exploring and learning about floating point representations.
* Definition and testing of new floating-point formats.

# Installation

To install the gem manually:

    gem install float-formats

You can find the code in GitHub:

* http://github.com/jgoizueta/float-formats/

# Predefined formats

A number of common formats are defined as constants in the Flt module:

## IEEE 754-2008

**Binary** floating point representations in little endian order:

* IEEE_binary16 (half precision),
* IEEE_binary32 (single precision),
* IEEE_binary64 (double precision),
* IEEE_binary80 (extended), IEEE_binary128 (quadruple precision) and
  as little endian: IEEE_binary16_BE, etc.

**Decimal** formats (using DPD):

* IEEE_decimal32, IEEE_decimal64 and IEEE_decimal128.

**Interchange binary & decimal** formats:

* IEEE_binary256, IEEE_binary512, IEEE_binary1024, IEEE_decimal192, IEEE_decimal256.

Others can be defined with IEEE.interchange_binary and IEEE.interchange_decimal
(see the IEEE module).

## Legacy

Formats of historical interest, some of which are found
in file formats still in use.

**Mainframe/supercomputer** formats:
Univac 1100 (UNIVAC_SINGLE, UNIVAC_DOUBLE),
IBM 360 etc. (IBM32, IBM64 and IBM128),
CDC 6600/7600: (CDC_SINGLE, CDC_DOUBLE),
Cray-1: (CRAY).

**Minis**: PDP11 and Vaxes: (PDP11_F, PDP11_D, VAX_F, VAX_D, VAX_G and VAX_H),
HP3000: (XS256, XS256_DOUBLE),
Wang 2200: (WANG2200).

**Microcomputers** (software implementations):
Apple II: (APPLE),
Microsoft Basic, Spectrum, etc.: (XS128),
Microsoft Quickbasic: (MBF_SINGLE, MBF_DOUBLE),
Borland Pascal: (BORLAND48).

**Embedded systems**:
Formats used in the Intel 8051 by the C51 compiler:
(C51_BCD_FLOAT, C51_BCD_DOUBLE and C51_BCD_LONG_DOUBLE).

**Minifloats**:
AI Formats: (BFLOAT16, MSFP8, MSFP9, MSFP10, MSFP11)
Khronos Vulkan unsigned formats: (KHRONOS_VULKAN_UNSIGNED10, KHRONOS_VULKAN_UNSIGNED11)

## Calculators

Formats used in HP RPL calculators: (RPL, RPL_X),
HP-71B formats (HP71B, HP71B_X)
and classic HP 10 digit calculators: (HP_CLASSIC).

# Using the pre-defined formats

    require 'rubygems'
    require 'float-formats'
    include Flt

The properties of the floating point formats can be queried (which can be
used for tables or reports comparing different formats):

Size in bits of the representations:

    puts IEEE_binary32.total_bits                        # -> 32

Numeric radix:

    puts IEEE_binary32.radix                             # -> 2

Digits of precision (radix-based)

    puts IEEE_binary32.significand_digits                # -> 24

Minimum and maximum values of the radix-based exponent:

    puts IEEE_binary32.radix_min_exp                     # -> -126
    puts IEEE_binary32.radix_max_exp                     # -> 127

Decimal precision

    puts IEEE_binary32.decimal_digits_stored             # -> 6
    puts IEEE_binary32.decimal_digits_necessary          # -> 9

Minimum and maximum decimal exponents:

    puts IEEE_binary32.decimal_min_exp                   # -> -37
    puts IEEE_binary32.decimal_max_exp                   # -> 38

## Encode and decode numbers

For each floating-point format class there is a constructor method with the same
name which can build a floating-point value from a variety of parameters:

* Using three integers:
  the sign (+1 for +, -1 for -), the significand (coefficient or mantissa)
  and the exponent.
* From a text numeral (with an optional `Numerals` format specifier)
* From a number : converts a numerical value
  to a floating point representation.

Examples:

    File.open('binary_file.dat','wb'){|f| f.write IEEE_binary80('0.1').to_bytes}
    puts IEEE_binary80('0.1').to_hex(true)           # -> CD CC CC CC CC CC CC CC FB 3F
    puts IEEE_binary80(0.1).to_hex(true)             # -> CD CC CC CC CC CC CC CC FB 3F
    puts IEEE_binary80(+1,123,-2).to_hex(true)       # -> 00 00 00 00 00 00 00 F6 03 40
    puts IEEE_decimal32('1.234').to_hex(true)        # -> 22 20 05 34

A floating-point encoded value can be converted to useful formats with the to_ and similar methods:

* `split` (split as integral sign, significand, exponent)
* `to_text`
* `to(num_class)`

Examples:

    v = IEEE_binary80.from_bytes(File.read('binary_file.dat'))
    puts v.to(Rational)                              # -> 1/10
    puts v.split.inspect                             # -> [1, 14757395258967641293, -67]
    puts v.to_text                                   # -> 0.1
    puts v.to(Float)                                 # -> 0.1
    puts v.to_hex                                    # -> CDCCCCCCCCCCCCCCFB3F
    puts v.to_bits                                   # -> 00111111111110111100110011001100110011001100110011001100110011001100110011001101
    puts v.to_bits_text(16)                          # -> 3ffbcccccccccccccccd

## Special values:

Let's show the decimal expression of some interesting values using
3 significative digits:

    fmt = Numerals::Format[mode: :general, rounding: [precision: 3]]
    puts IEEE_SINGLE.min_value.to_text(fmt)             # -> 1e-45
    puts IEEE_SINGLE.min_normalized_value.to_text(fmt)  # -> 1.18e-38
    puts IEEE_SINGLE.max_value.to_text(fmt)             # -> 3.40e38
    puts IEEE_SINGLE.epsilon.to_text(fmt)               # -> 1.19e-7

## Convert between formats

    v = IEEE_EXTENDED.from_text('1.1')
    v = v.convert_to(IEEE_SINGLE)
    v = v.convert_to(IEEE_DEC64)

# Tools for the native floating point format

This is an optional module to perform conversions and manipulate the native Float format.

    require 'float-formats/native'
    include Flt

    puts float_shortest_dec(0.1)                     # -> 0.1
    puts float_significant_dec(0.1)                  # -> 0.10000000000000001
    puts float_dec(0.1)                              # -> 0.1000000000000000055511151231257827021181583404541015625
    puts float_bin(0.1)                              # -> 1.100110011001100110011001100110011001100110011001101E-4
    puts hex_from_float(0.1)                         # -> 0x1999999999999ap-56

    puts float_significant_dec(Float::MIN_D)           # -> 5E-324
    puts float_significant_dec(Float::MAX_D)           # -> 2.2250738585072009E-308
    puts float_significant_dec(Float::MIN_N)           # -> 2.2250738585072014E-308

Together with flt/sugar (from Flt) can be use to explore or work with Floats:

    require 'flt/sugar'

    puts 1.0.next_plus-1 == Float::EPSILON              # -> true
    puts float_shortest_dec(1.0.next_plus)              # -> 1.0000000000000002
    puts float_dec(1.0.next_minus)                      # -> 0.99999999999999988897769753748434595763683319091796875
    puts float_dec(1.0.next_plus)                       # -> 1.0000000000000002220446049250313080847263336181640625
    puts float_bin(1.0.next_plus)                       # -> 1.0000000000000000000000000000000000000000000000000001E0
    puts float_bin(1.0.next_minus)                      # -> 1.1111111111111111111111111111111111111111111111111111E-1

    puts float_significant_dec(Float::MIN_D.next_plus)  # -> 1.0E-323
    puts float_significant_dec(Float::MAX_D.next_minus) # -> 2.2250738585072004E-308

# Defining new formats

New formats are defined using one of the classes defined in float-formats/classes.rb
and passing the necessary parameters in a hash to the constructor.

For example, here we define a binary floating point 32-bits format with
22 bits for the significand, 9 for the exponent and 1 for the sign
(these fields are allocated from least to most significant bits).
We'll use excess notation with bias 127 for the exponent, interpreting
the significand bits as a fractional number with the radix point after
the first bit, which will be hidden:

    Flt.define(
      :MY_FP, BinaryFormat,
      fields: [:significand,22,:exponent,9,:sign,1],
      bias: 127, bias_mode: :scientific_significand,
      hidden_bit: true
    )

Now we can encode values in this format, decode values, convet to other
formats, query it's range, etc:

Numerals::Format[mode: :general, rounding: [precision: 3]]
     puts MY_FP('0.1').to_bits_text(16)              # -> 1EE66666
     puts MY_FP.max_value.to_text                    # -> 7.8804e115

You can look at float-formats/formats.rb to see how the built-in formats
are defined.

# License

This code is free to use under the terms of the MIT license.

# References

[*Floating Point Representations.* C.B. Silio.](http://www.ece.umd.edu/class/enpm607.S2000/fltngpt.pdf)
  Description of formats used in UNIVAC 1100, CDC 6600/7600, PDP-11, IEEE754, IBM360/370

[*Floating-Point Formats.* John Savard.](http://www.quadibloc.com/comp/cp0201.htm)
  Description of formats used in VAX and PDF-11

### IEEE754 binary formats

[*IEEE-754 References.* Christopher Vickery.](http://babbage.cs.qc.edu/courses/cs341/IEEE-754references.html)

[*What Every Computer Scientist Should Know About Floating-Point Arithmetic.* David Goldberg.](http://docs.sun.com/source/806-3568/ncg_goldberg.html)

### DPD/IEEE754r decimal formats

[*Decimal Arithmetic Encoding. Strawman 4d.* Mike Cowlishaw.](http://www2.hursley.ibm.com/decimal/decbits.pdf)

[*A Summary of Densely Packed Decimal encoding.* Mike Cowlishaw.](http://www2.hursley.ibm.com/decimal/DPDecimal.html)

[*Packed Decimal Encoding IEEE-754-r.* J.H.M. Bonten.](http://home.hetnet.nl/mr_1/81/jhm.bonten/computers/bitsandbytes/wordsizes/ibmpde.htm)

[*DRAFT Standard for Floating-Point Arithmetic P754.* IEEE.](http://www.validlab.com/754R/drafts/archive/2007-10-05.pdf)

### HP 10 digits calculators

[*HP CPU and Programming*. David G.Hicks.](http://www.hpmuseum.org/techcpu.htm)
  Description of calculator CPUs from the Museum of HP Calculators.

[*HP 35 ROM step by step.* Jacques Laporte](http://www.jacques-laporte.org/HP35%20ROM.htm)
  Description of HP35 registers.

*Scientific Pocket Calculator Extends Range of Built-In Functions.* Eric A. Evett, Paul J. McClellan, Joseph P. Tanzini.
  Hewlett Packard Journal 1983-05 pgs 27-28. Describes format used in HP-15C.

### HP 12 digits calculators

*Software Internal Design Specification Volume I For the HP-71*. Hewlett Packard.
  Available from http://www.hpmuseum.org/cd/cddesc.htm

*RPL PROGRAMMING GUIDE*
  Excerpted from *RPL: A Mathematical Control Language*. by W. C. Wickes.
  Available at http://www.hpcalc.org/details.php?id=1743

### HP-3000

*A Pocket Calculator for Computer Science Professionals.* Eric A. Evett.
  Hewlett Packard Journal 1983-05 pg 37. Describes format used in HP-3000

### IBM

[*IBM Floating Point Architecture.* Wikipedia.](http://en.wikipedia.org/wiki/IBM_Floating_Point_Architecture)

[*The IBM eServer z990 floating-point unit*. G. Gerwig, H. Wetter, E. M. Schwarz, J. Haess, C. A. Krygowski, B. M. Fleischer and M. Kroener.](http://www.research.ibm.com/journal/rd/483/gerwig.html)

### MBF

[*Microsoft Knowledbase Article 35826*](http://support.microsoft.com/?scid=kb%3Ben-us%3B35826&x=17&y=12)

[*Microsoft MBF2IEEE library*](http://download.microsoft.com/download/vb30/install/1/win98/en-us/mbf2ieee.exe)

### Borland

*An Overview of Floating Point Numbers.* Borland Developer Support Staff

[*Pascal Floating-Point Page.* J R Stockton.](http://www.merlyn.demon.co.uk/pas-real.htm)

### 8-bit micros

This is the MS Basic format (BASIC09 for TRS-80 Color Computer, Dragon),
also used in the Sinclair Spectrum.

*Numbers are followed by information not in listings*
  Sinclair User October 1983 http://www.sincuser.f9.co.uk/019/helplne.htm

*Sinclair ZX Spectrum / Basic Programming.*. Steven Vickers.
  Chapter 24. http://www.worldofspectrum.org/ZXBasicManual/zxmanchap24.html

### Apple II

*Floating Point Routines for the 6502* Roy Rankin and Steve Wozniak.
  Dr. Dobb's Journal, August 1976, pages 17-19.

### C51

[*Advanced Development System* Franklin Software, Inc.](http://www.fsinc.com/reference/html/com9anm.htm)

### CDC6600

*CONTROL DATA 6400/6500/6600 COMPUTER SYSTEMS Reference Manual*
  Manuals available at http://bitsavers.org/

### Cray

*CRAY-1 COMPUTER SYSTEM Hardware Reference Manual*
  See pg 3-20 from 2240004 or pg 4-30 from HR-0808 or pg 4-21 from HP-0032.
  Manuals available at http://bitsavers.org/

### Wang 2200

[*Internal Floating Point Representation*](http://www.wang2200.org/fp_format.html)

本源码包内暂不包含可直接显示的源代码文件,请下载源码包。