Path: senator-bedfellow.mit.edu!bloom-beacon.mit.edu!newsfeed.sgi.net!pitt.edu!newsflash.concordia.ca!nntp.cs.ubc.ca!torn!watserv3.uwaterloo.ca!undergrad.math.uwaterloo.ca!neumann.uwaterloo.ca!alopez-o
From: alopez-o@neumann.uwaterloo.ca (Alex Lopez-Ortiz)
Newsgroups: sci.math,news.answers,sci.answers
Subject: sci.math FAQ: Fundamentals
Followup-To: sci.math
Date: 17 Feb 2000 22:51:57 GMT
Organization: University of Waterloo
Lines: 505
Approved: news-answers-request@MIT.Edu
Expires: Sun, 1 Mar 1998 09:55:55
Message-ID: <88hu2d$qtk$1@watserv3.uwaterloo.ca>
Reply-To: alopez-o@neumann.uwaterloo.ca
NNTP-Posting-Host: daisy.uwaterloo.ca
Summary: Part 1 of 31, New version
Keywords: Algebraic structures/What are numbers
Originator: alopez-o@neumann.uwaterloo.ca
Originator: alopez-o@daisy.uwaterloo.ca
Xref: senator-bedfellow.mit.edu sci.math:347413 news.answers:177533 sci.answers:11241

Archive-name: sci-math-faq/numbers
Last-modified: February 20, 1998
Version: 7.5

                           FUNDAMENTALS


     _________________________________________________________________
   
                             Algebraic structures
                                       
   We will attempt to give a brief explanation of the following concepts:
     * N is a monoid
     * Z is an integral domain
     * Q is a field
     * in the field R the order is complete
     * the field C is algebraically complete
       
   If you have been asked by a child to give them arithmetic problems, so
   they could show off their newly learned skills in addition and
   subtraction I'm sure that after a few problems such as: 2 + 3, 9 - 5,
   10 + 2 and 6 - 4, you tried tossing them something a little more
   difficult: 4 - 7 only to be told `` That's not allowed.''
   
   What you may not have realized is that you and the child did not just
   have different objects in mind (negative numbers) but entirely
   different algebraic systems. In other words a set of objects (they
   could be natural numbers, integers or reals) and a set of operations,
   or rules regarding how the numbers can be combined.
   
   We will take a very informal tour of some algebraic systems, but
   before we define some of the terms, let us build a structure which
   will have some necessary properties for examples and counterexamples
   that will help us clarify some of the definitions.
   
   We know that any number that is divided by six will either leave a
   remainder, or will be divided exactly (which is after all the
   remainder 0). Let us write any number by the remainder n it leaves
   after division by six, denoting it as [ n ]. This means that, 7, 55
   and 1 will all be written [1], which we call the class to which they
   all belong: i.e. 7 in [1], 55 in [1], or, a bit more technically, they
   are all equivalent to 1 modulo 6. The complete set of class will
   contain six elements, and this is called partitioning numbers into
   equivalent classes because it separates (or partitions) all of our
   numbers into these classes, and any one number in a class is
   equivalent to any other in the same class.
   
   One interesting thing we can do with these classes is to try to add or
   to multiply them. What can [1] + [3] mean? We can, rather naively try
   out what they mean in ``normal'' arithmetic: [1] + [3] = [1 + 3] =
   [4]. So far so good, let us try a second example 25 in [1] and 45 in
   [3], their sum is 70 which certainly belongs to [4]. Here we see what
   we meant above by equivalence, 25 is equivalent to 1 as far as this
   addition is concerned. Of course this is just one example, but
   fortunately it can be proven that the sum of two classes is always the
   class of the sums.
   
   Now this is the kind of thing we all do when we add hours for example,
   7 (o' clock) plus 6 hours is 1 (o' clock), and all we are really doing
   is adding hours (modulo 12).
   
   The neat part comes with multiplication, as we will see later on. But
   for now just remember, it can be proven that something like [4] x [5]
   = [2] will work: the product of two classes is the class of the
   product.
   
   Now for some of the necessary terminology.
   

Monoids and Groups

   We need to define a group.
   
   Let us take a set of objects and a rule (called a binary operation)
   which allows us to combine any two elements of this set. Addition is
   an example from math, or ANDing in some computer language.
   
   The set must be closed under the operation. That means that when two
   elements are combined the result must also be in the set. For example
   the set containing even numbers will always give us an even number
   when two elements are added together. But if we restrict ourselves to
   odd numbers, their sum is not an odd number and so we know right off
   the bat that the set of odd numbers and addition cannot constitute a
   group. Some books will consider closure in the definition of binary
   operation, and others add it as one of the requirements for a group
   along with the ones that follow below.
   
   The set and the operation is called a group if the binary operation
   satisfies the following criteria:
     * the operation is associative, which means it doesn't matter how
       you group the elements you are operating on, for example in our
       set of remainders: [1] + ([3] + [4]) = ([1] + [3]) + [4]
     * there is an identity element, meaning: one of the elements
       combined with the others in the set doesn't change them in the
       least. For example the zero in addition, or the one in
       multiplication.
     * every element has an inverse with respect to that operation. If
       you combine an element and its inverse you get the identity (of
       that operation) back.
       
   (Be careful with this last one, -3 is the inverse of 3 in addition,
   since they give us 0 when added, but 1/3 is the inverse of 3 with
   respect to multiplication, since 3 x 1/3 = 1 the identity under
   multiplication.)
   
   So we can see that the set of natural numbers N (with the operation of
   addition) is not even a group, since there is no inverse for 5, for
   example. (In other words there is no natural number which added to 5
   will give us zero.) And so the third rule for our operation is
   violated. But it still has some structure, even if it is not as rich
   as the ones we'll see later on.
   
   Sets with an associative operation (the first condition above) are
   called semigroups, and if they also have an identity element (the
   second condition) then they are called monoids.
   
   Our set of natural numbers under addition is then an example of a
   monoid, a structure that is not quite a group because it is missing
   the requirement that every element have an inverse under the operation
   (Which is why in elementary school 4 - 7 is not allowed.)
   
   What about the set of integers, is it a group?
   
   By itself this question is nonsensical. Why? Well, we have not
   mentioned under what operation. OK, let us say: the set of integers
   with addition.
   
   Now, addition is associative, the zero does not change any number when
   added to it, and for every number n we can add -n and get zero. So
   it's a group all right.
   
   In fact it is a special kind of group. When we can perform the
   operation on the two elements in any order (e.g a + b = b + a) then
   the group is called commutative, or Abelian in honor of Abel. Not
   every operation is commutative, for example three minus two is
   certainly not the same as two minus three. Our set of integers under
   addition is then an Abelian group.
   
Rings

   If we take an Abelian group (remember: a set with a binary operation)
   and we define a second operation on it we get a bit more of a
   structure than we had with just a group.
   
   If the second operation is associative, and it is distributive over
   the first then we have a ring. Note that the second operation may not
   have an identity element, nor do we need to find an inverse for every
   element with respect to this second operation. As for what
   distributive means, intuitively it is what we do in math when perform
   the following change: a x (b + c) = (a x b) + (a xc).
   
   If the second operation is also commutative then we have what is
   called a commutative ring. The set of integers (with addition and
   multiplication) is a commutative ring (with even an identity - called
   unit element - for multiplication).
   
   Now let us go back to our set of remainders. What happens if we
   multiply [5] x [1]? We see that we get [5], in fact we can see a
   number of things according to our definitions above, [5] is its own
   inverse, and [1] is the multiplicative element. We can also show
   easily enough (by creating a complete multiplication table) that it is
   commutative. But notice that if we take [3] and [2], neither of which
   are equal to the class that the zero belongs to [0], and we multiply
   them, we get [3]x[2] = [0]. This bring us to the next definition. In a
   commutative ring, let us take an element which is not equal to zero
   and call it a. If we can find a non-zero element, say b that combined
   with a equals zero ( a x b = 0) then a is called a zero divisor.
   
   A commutative ring is called an integral domain if it has no zero
   divisors. Well the set Z with addition and multiplication fullfills
   all the necessary requirements, and so it is an integral domain.
   Notice that our set of remainders is not an integral domain, but we
   can build a similar set with remainders of division by five, for
   example, and voil`, we have an integral domain.
   
   Let us take, for example, the set Q of rational numbers with addition
   and multiplication - I'll leave out the proof that it is a ring, but I
   think you should be able to verify it easily enough with the above
   definitions. But to give you a head start, notice the addition of
   rationals follow all the requirements for an abelian group. If we
   remove the zero we will have another abelian group, and that implies
   that we have something more than a ring, in fact, as we will see in
   the next section.
   
Fields

   Now we can make one step further. If the elements of a ring, excluding
   the zero, form an abelian group (with the second operation) then it is
   a field. For example, write the multiplication table of the remainders
   of division by 5, and you will see that it satisfies all the
   requirements for a group: (You will probably have noticed that the
   group does not contain the number five itself since [5] = [0].)
   (tabular)(c | c c c c) 1 2 3 4 ; 1 2 3 4 ; 2 2 4 1 3 ; 3 3 1 4 2 ; 4 4
   3 2 1(tabular)
   
   (Why isn't the set of divisors of six - excluding the zero and under
   multiplication - a group? That's easy enough, since we have excluded
   the zero we do not have the result of [2] x [3] in our set, so it
   isn't closed.)
   
Ordering

   Given a ring, we can say that it is ordered when you have a special
   subset of that ring behaves in a very special way. If any two elements
   of that special subset are added or multiplied their sum and their
   product are again in the special subset. Take the negative numbers in
   R , can they be that special subset? Well the sum seems to be
   allright, it is also a negative number. But things don't work with the
   product: it is positive. What about the positive numbers? Yep, and in
   fact we call that special subset, the set of positive elements. Now,
   we gave the definition for an ordered ring, we can also define an
   ordered field the same way.
   
   But what does a complete ordered field mean? Well the definition looks
   rather nasty: it is complete if every non-empty subset which posesses
   an upper bound has a least upper bound.
   
   Let's translate some of that, trying to lose as little information on
   the way. A bound is something that guarantees that all of the elements
   of your set are on one side of it (reasonably enough). For example,
   certainly all negative reals are less than 100, so 100 is a bound (it
   is in fact an upper bound 'cause all negatives are ``below'' it). But
   there are lots of other bounds, 1, 5, 26 will all do nicely. The
   question now is, of all of these (upper bounds) which is the smallest,
   that is which one is ``the border'' so to speak? Does it always
   exists?
   
   Let's take the rationals, and look at the following numbers:
   
   1.4, \; 1.41, \; 1.414, \; 1.4142, \; 1.41421, ... 
   
   Now each of these is a rational number (it can be written as a
   fraction), and they are getting closer and closer to a number we've
   probably seen before (just take out your calculator and find the
   square root of two). So we can write the shorthand for this series as
   sqrt(2). Certainly we can find an upper bound for this series, 3 will
   do nicely, but so can 1.5, or 1.42. But what is the smallest. Well
   there isn't any. Not among the rational at least, because no matter
   what fraction you give I can give you one closer to the square root of
   two. What about the square root of two itself? Well it's not a
   rational number (I'll skip the proof, but it is really rather easy) so
   you can't use it. If you want another series which is really neat look
   at the section on ``Euler's formula'' in the FAQ.
   
   And that is where the reals come in. Any set or reals that is bounded
   you can certainly find the smallest of these bounds. (By the way this
   ``least upper bound'' is abbreviated ``l.u.b.'', or ``sup'' for
   supremum.) We can also turn things around and talk of lower bounds,
   and of the largest of these etc. but most of that will be just a
   mirror image of what we have dealt with so far.
   
   So that should be it. And for years that did seem to be it, we seemed
   to have all the numbers we'd ever care to have.
   
   There was just one small stick in the works, but most people just sort
   of pretended not to notice, and that was that not all polynomials had
   solutions. One simple polynomial of this kind is x^2 + 1 = 0. It's so
   simple, yet there's no self respecting number that would solve this
   polynomial. There were these funny answers which seemed like they
   should be solutions but no one could make any sense out of them, so
   they were considered imaginary solutions. Which was really too bad
   because they were given the name of imaginary numbers and now that the
   name stuck we realize that they are numbers just as good as any of the
   ones we have been using for centuries. And in fact that takes us to
   the last great pinnacle in this short excursion. The field of complex
   numbers.
   
   We can define an algebraically closed field as a field where every
   nonconstant polynomial (i.e. one with an x in it from high school
   days) has a zero in the field. Whew! This in short means that as long
   as the polynomial is not a constant number (which is no fun anyways)
   but something which looks like it wants a solution, like 5 x^3 - 2 x^2
   + 6 = 0 it will always have one, if you are working with complex
   numbers and not just reals.
   
   There is another definition which is probably just as good, but may or
   may not be easier: A field is algebraically closed if every polynomial
   splits into linear factors. Linear factors are briefly factors not
   containing x to any power of two or higher, in other words in the
   form: ax + b. For example x^2 + x - 6 can be factored as (x + 3)(x -
   2), but if we are in the field of reals we cannot factor x^2 + 1, but
   we can in the field of complex numbers: x^2 + 1 = (x - i)(x + i),
   where, you may recall, i^2 = -1.


     _________________________________________________________________
   
                               What are numbers?
                                       
Introduction

   Informally:
     * N = { 0,1,... } or N = { 1,2,... } 
       Wether 0 is in N depends on where you live and what is your field
       of interest. At the informal level it is a religious topic.
     * Z = { ..., - 1,0,1,... } 
     * Q = { p/q | p, q in Z and q != 0 } 
     * R = { d_0.d_1d_2... | d_0 in Z and 0 <= d_i <= 9 for i > 0 } 
     * C = { a + b o i | a, b in R and i^2 = -1 } 
       
Construction of the Number System

   Formally (following the mainstream in math) the numbers are
   constructed from scratch out of the axioms of Zermelo Fraenkel set
   theory (a.k.a. ZF set theory) [Enderton77, Henle86, Hrbacek84]. The
   only things that can be derived from the axioms are sets with the
   empty set at the bottom of the hierarchy. This will mean that any
   number is a set (it is the only thing you can derive from the axioms).
   It doesn't mean that you always have to use set notation when you use
   numbers: just introduce the numerals as an abbreviation of the formal
   counterparts.
   
   The construction starts with N and algebraically speaking, N with its
   operations and order is quite a weak structure. In the following
   constructions the structures will be strengthen one step at the time:
   Z will be an integral domain, Q will be a field, for the field R the
   order will be made complete, and field C will be made algebraically
   complete.
   
   Before we start, first some notational stuff:
     * a pair (a,b) = { { a } , { a,b } } ,
     * an equivalence class [a] = { b | a == b } ,
     * the successor of a is s(a) = a U { a } .
       
   Although the previous notations and the constructions that follow are
   the de facto standard ones, there are different definitions possible.
   
Construction of N

     * { } in N
     * if a in N then s(a) in N
     * N is the smallest possible set such that the preceding rules hold.
       
   Informally n = { 0,...,n - 1 } (thus 0 = { } , 1 = { 0 } , 2 = { 0,1 }
   , 3 = { 0,1,2 } ). We will refer to the elements of N by giving them a
   subscript _n. The relation <_n on N is defined as: a_n <_n b_n iff a_n
   in b_n. We can define +_n as follows:
     * a_n +_n 0_n = a_n
     * a_n +_n s(b_n) = s(a_n +_n b_n)
       
   Define *_n as:
     * a_n *_n 0_n = 0_n
     * a_n *_n s(b_n) = (a_n *_n b_n) +_n a_n
       
Construction of Z

   We define an equivalence relation on N x N: (a_n,b_n) ==_z(c_n,d_n)
   iff a_n +_n d_n = c_n +_n b_n. Note that ==_z ``simulates'' a
   subtraction in N . Z = { [(a_n,b_n)]_z | a_n, b_n in N } . We will
   refer to the elements of Z by giving them a subscript _z. The elements
   of N can be embedded as follows: embed_n : N --> Z such that
   embed_n(a_n) = [(a_n,0_n)]_z. Furthermore we can define:
     * [(a_n,b_n)]_z <_z [(c_n,d_n)]_z iff a_n +_n d_n <_n c_n +_n b_n
     * [(a_n,b_n)]_z +_z [(c_n,d_n)]_z = [(a_n +_n c_n, b_n +_n d_n)]_z
     * [(a_n,b_n)]_z *_z [(c_n,d_n)]_z = 
       [((a_n *_n c_n) +_n (b_n *_n d_n), (a_n *_n d_n) +_n (c_n *_n
       b_n))]_z
       
Construction of Q

   We define an equivalence relation on Z x (Z { 0_z }): (a_z,b_z) ==_q
   (c_z,d_z) iff a_z *_z d_z = c_z *_z b_z. Note that ==_q ``simulates''
   a division in Z . Q = { [(a_z,b_z)]_q | a_z in Z and b_z in Z { 0_z }
   } . We will refer to the elements of Q by giving them a subscript _q.
   The elements of Z can be embedded as follows: embed_z : Z --> Q such
   that embed_z(a_z) = [(a_z,1_z)]_q. Furthermore we can define:
     * [(a_z,b_z)]_q <_q [(c_z,d_z)]_q iff a_z *_z d_z <_z c_z *_z b_z
       when 0_z <_z b_z and 0_z <_z d_z
     * [(a_z,b_z)]_q +_q [(c_z,d_z)]_q = [((a_z *_z d_z) +_z (c_z *_z
       b_z), b_z *_z d_z)]_q
     * [(a_z,b_z)]_q *_q [(c_z,d_z)]_q = [(a_z *_z c_z, b_z *_z d_z)]_q
       
Construction of R

   The construction of R is different (and more awkward to understand)
   because we must ensure that the cardinality of R is greater than that
   of Q .
   Set c is a Dedekind cut iff
     * { } subset c subset Q (strict inclusions!)
     * c is closed downward:
       if a_q in c and b_q <_q a_q then b_q in c
     * c has no largest element:
       there isn't an element a_q in c such that b_q <_q a_q for all b_q
       != a_q in c
       
   You can think of a cut as taking a pair of scissors and cutting Q in
   two parts such that one part contains all the small numbers and the
   other part contains all large numbers. If the part with the small
   numbers was cut in such a way that it doesn't have a largest element,
   it is called a Dedekind cut. R = { c | c is a Dedekind cut } . We will
   refer to the elements of R by giving them a subscript _r. The elements
   of Q can be embedded as follows: embed_q : Q --> R such that
   embed_q(a_q) = { b_q | b_q <_q a_q } . Furthermore we can define:
     * a_r <_r b_r iff a_r subset b_r (strict inclusion!)
     * a_r +_r b_r = { c_q +_q d_q | c_q in a_r and d_q in b_r } 
     * -_r a_r = ; { b_q | there exists an c_q in Q such that b_q <_q c_q
       and (-1)_q *_q c_q in a_r } 
     * |a_r|_r = a_r U -_r a_r
     * *_r is defined as:
          + if not a_r <_r 0_r and not b_r <_r 0_r
            then a_r *_r b_r = 0_r U { c_q *_q d_q | c_q in a_r and d_q
            in b_r } 
          + if a_r <_r 0_r and b_r <_r 0_r then a_r *_r b_r = |a_r|_r *_r
            |b_r|_r
          + otherwise a_r *_r b_r = -_r (|a_r|_r *_r |b_r|_r)
       
   There exists an alternative definition of R using Cauchy sequences: a
   Cauchy sequence is a s : N --> Q such that s(i_n) +_q((-1)_q *_q
   s(j_n)) can be made arbitrary near to 0_q for all sufficiently large
   i_n and j_n. We will define an equivalence relation ==_r on the set of
   Cauchy sequences as: r ==_r s iff r(m_n) +_q((-1)_q *_q s(m_n)) can be
   made arbitrary close to 0_q for all sufficiently large m_n. R = {
   [s]_r | s is a Cauchy sequence } . Note that this definition is close
   to ``decimal'' expansions.
   
Construction of C

   C = R x R. We will refer to the elements of C by giving them a
   subscript _c. The elements of R can be embedded as follows: embed_r :
   R --> C such that embed_r(a_r) = (a_r,0_r). Furthermore we can define:
     * (a_r,b_r) +_c (c_r,d_r) = (a_r +_r c_r, b_r +_r d_r)
     * (a_r,b_r) *_c (c_r,d_r) = ((a_r *_r c_r) +_r -_r (b_r * d_r), (a_r
       *_r d_r) +_r (b_r *_r c_r))
       
   There exists an elegant alternative definition using ideals. To be a
   bit sloppy: C = R [x]/< (x *_r x) +_r 1_r > , i.e. C is the resulting
   quotient ring of factoring ideal < (x *_r x) +_r 1_r > out of the ring
   R [x] of polynomials over R . The sloppy part is that we need to
   define concepts like quotient ring, ideal, and ring of polynomials.
   Note that this definition is close to working with i^2 = -1: (x *_r x)
   +_r 1_r = 0_r can be rewritten as (x *_r x) = (-1)_r.
   
Rounding things up

   At this moment we don't have that N is a subset of Z , Z of Q , etc.
   But we can get the inclusions if we look at the embedded copies of N ,
   Z , etc. Let
     * N' = ran embed_r o embed_q o embed_z o embed_n
     * Z' = ran embed_r o embed_q o embed_z
     * Q' = ran embed_r o embed_q
     * R' = ran embed_r
       
   For these sets we have N' subseteq Z' subseteq Q' subseteq R' subseteq
   C. Furthermore these sets have all the properties that the
   ``informal'' numbers have.
   
What's next?

   Well, for some of the more alien parts of math we can extend this
   standard number system with some exotic types of numbers. To name a
   few:
     * Cardinals and ordinals
       Both are numbers in ZF set theory [Enderton77, Henle86, Hrbacek84]
       and so they are sets as well. Cardinals are numbers that represent
       the sizes of sets, and ordinals are numbers that represent well
       ordered sets. Finite cardinals and ordinals are the same as the
       natural numbers. Cardinals, ordinals, and their arithmetic get
       interesting and ``tricky'' in the case of infinite sets.
     * Hyperreals
       These numbers are constructed by means of ultrafilters [Henle86]
       and they are used in non-standard analysis. With hyperreals you
       can treat numbers like Leibnitz and Newton did by using
       infinitesimals.
     * Quaternions and octonions
       Normally these are constructed by algebraic means (like the
       alternative C definition that uses ideals) [Shapiro75, Dixon94].
       Quaternions are used to model rotations in 3 dimensions.
       Octonions, a.k.a. Cayley numbers, are just esoteric artifacts :-).
       Well, if you know where they are used for, feel free to contribute
       to the FAQ.
     * Miscellaneous
       Just to name some others: algebraic numbers [Shapiro75], p-adic
       numbers [Shapiro75], and surreal numbers (a.k.a. Conway numbers)
       [Conway76].
       
   Cardinals and ordinals are commonly used in math. Most mortals won't
   encounter (let alone use) hyperreals, quaternions, and octonions.
   
      References
      
   J.H. Conway. On Numbers and Games, L.M.S. Monographs, vol. 6. Academic
   Press, 1976.
   
   H.B. Enderton. Elements of Set Theory. Academic Press, 1977.
   
   G.M. Dixon. Division Algebras; Octonions, Quaternions, Complex Numbers
   and the Algebraic Design of Physics. Kluwer Academic, 1994.
   
   J.M. Henle. An Outline of Set Theory. Springer Verlag, 1986.
   
   K. Hrbacek and T. Jech. Introduction to Set Theory. M. Dekker Inc.,
   1984.
   
   L. Shapiro. Introduction to Abstract Algebra. McGraw-Hill, 1975.
   
   This subsection of the FAQ is Copyright (c) 1994, 1995 Hans de
   Vreught. Send comments and or corrections relating to this part to
   J.P.M.deVreught@cs.tudelft.nl
     _________________________________________________________________
   
-- 
Alex Lopez-Ortiz                                         alopez-o@unb.ca
http://www.cs.unb.ca/~alopez-o                       Assistant Professor	
Faculty of Computer Science                  University of New Brunswick