WEEK      Monday                      Wednesday    Friday
Week 1    8/22                        8/24         8/26
Week 2    8/29                        8/31         9/2
Week 3    9/5 (NO Class: Labor Day)   9/7          9/9
Week 4    9/12                        9/14         9/16
Week 5    9/19                        9/21         9/23
Week 6    9/26                        9/28         9/30
Week 7    10/3                        10/5         10/7
Week 8    10/10                       10/12        10/14
Week 9    10/17                       10/19        10/21
Week 10   10/24                       10/26        10/28
Week 11   10/31                       11/2         11/4
Week 12   11/7                        11/9         11/11
Week 13   11/14                       11/16        11/18
Week 14   (No Classes) 11/21          11/23        11/25
Week 15   11/28                       11/30        12/2
Week 16   12/5                        12/7         12/9
Week 17   12/12 Final Exam
`A^n = A` if `n` is odd.
[Matrix display not recoverable.]
The geometry of complex arithmetic:
If z = a+bi = |z|(cos(t) + i sin(t)) and w = c+di = |w|(cos(s) + i sin(s)), then
z+w = (a+c)+(b+d)i, which corresponds geometrically to the "vector" sum of z and w in the plane, and
zw = |z|(cos(t) + i sin(t)) |w|(cos(s) + i sin(s))
= |z||w| (cos(t) + i sin(t))(cos(s) + i sin(s))
= |z||w| (cos(t)cos(s) - sin(t)sin(s) + (sin(t)cos(s) + sin(s)cos(t)) i)
= |z||w| (cos(t+s) + sin(t+s) i)
So the magnitude of the product is the product of the magnitudes of z and w, and the angle of the product is the sum of the angles of z and w.
Notation: cos(t) + i sin(t) is sometimes written as cis(t).
Note: If we consider the series for e^{x} = 1 + x + x^{2}/2! + x^{3}/3! + ...
then e^{ix} = 1 + ix + (ix)^{2}/2! + (ix)^{3}/3! + ... = 1 + ix - x^{2}/2! - ix^{3}/3! + x^{4}/4! + ix^{5}/5! - ... = cos(x) + i sin(x)
Thus `e^{i*pi} = cos(pi) + i sin(pi) = -1`; and so....
`ln(-1) = i*pi` because the definition of `ln` says `ln(a) = b`
if and only if `e^b = a`
Furthermore: `e^{a+bi} = e^a*e^{bi} = e^a(cos(b) + sin(b) i)`.
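These identities are easy to check numerically; a quick sketch with Python's standard `cmath` module (the sample point `x = 0.7` is arbitrary):

```python
import cmath
import math

# Euler's formula: e^{ix} = cos(x) + i sin(x)
x = 0.7
lhs = cmath.exp(1j * x)
rhs = complex(math.cos(x), math.sin(x))
print(abs(lhs - rhs))             # ~0 (floating point rounding only)

# e^{i*pi} = -1, and the principal branch gives log(-1) = i*pi
print(cmath.exp(1j * math.pi))    # approximately -1+0j
print(cmath.log(-1))              # 3.141592653589793j
```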
`A = ((r,0),(0,s))`
If r and s are complex numbers in the matrix A,
then consider what happens to `A^n` as n gets large:
If |r| < 1 and |s| < 1, the powers of A will get close to the zero matrix;
if r = s = 1, the powers of A will always be A.
If |r| < 1 and s = 1, the powers of A will get close to the matrix
`((0,0),(0,1))`,
and if |s| < 1 and r = 1, the powers of A will get close to the matrix
`((1,0),(0,0))`,
and otherwise the powers of A will diverge.
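These limits are easy to confirm numerically; a sketch with NumPy, assuming `A` is the 2 by 2 diagonal matrix with diagonal entries `r` and `s` (here `|r| < 1` and `s = 1`):

```python
import numpy as np

# A = diag(r, s), so A^n = diag(r^n, s^n)
r, s = 0.5, 1.0                       # |r| < 1 and s = 1
A = np.diag([r, s])
An = np.linalg.matrix_power(A, 60)    # 0.5^60 is essentially 0
print(An)                             # close to ((0,0),(0,1))
```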
Polynomials with
complex coefficients.
Because multiplication and addition make sense for complex
numbers, we can consider polynomials with coefficients that
are complex numbers and use a complex number for the
variable
This makes a complex polynomial a function from the complex
numbers to the complex numbers.
This can be visualized using one plane for the domain of
the polynomial and a second plane for the codomain/ target
range of the polynomial.
The Fundamental Theorem of Algebra: If f is a nonconstant polynomial with complex number coefficients, then there is at least one complex number z* where f(z*) = 0.
For more on complex numbers see: Dave's Short Course on Complex Numbers.


How are these questions related to Motivation Question I?
9/19
Examples: F[X] = {f in F^{∞} where f(n) = 0 for all but a finite number of n} < F^{∞}
(Internal) Sums , Intersections, and Direct Sums of Subspaces
Suppose U1, U2, ... , Un are all subspaces of V.
Definition: U1+ U2+ ... + Un = {v in V where v = u1+ u2+ ... + un for uk in Uk, k = 1,2,...,n}, called the (internal) sum of the subspaces.
Facts: (i) U1+ U2+ ... + Un < V.
(ii) Uk < U1+ U2+ ... + Un for each k, k= 1,2,...,n.
(iii) If W<V and Uk < W for each k, k= 1,2,...,n, then U1+ U2+ ... + Un <W.
So ...
U1+ U2+ ... + Un is the smallest subspace of V that contains Uk for each k, k = 1,2,...,n.
Examples: U1 = {(x,y,z): x+y+2z=0}, U2 = {(x,y,z): 3x+y-z=0}. U1 + U2 = R^{3}.
Let Uk = {f in P(F): f(x) = a_{k}x^{k} where a_{k} is in F}. Then U0+ U1+ U2+ ... + Un = {f : f(x) = a_{0} + a_{1}x + a_{2}x^{2} + ... + a_{n}x^{n} where a_{0}, a_{1}, a_{2}, ..., a_{n} are in F}.
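The claim U1 + U2 = R^{3} can be checked with ranks (a sketch; the two basis vectors chosen for each plane are one possible choice):

```python
import numpy as np

# U1 = {x+y+2z=0}, U2 = {3x+y-z=0}; rows below are bases for each plane.
U1 = np.array([[1.0, -1.0, 0.0], [0.0, 2.0, -1.0]])   # both satisfy x+y+2z=0
U2 = np.array([[1.0, -3.0, 0.0], [0.0, 1.0, 1.0]])    # both satisfy 3x+y-z=0
rank = np.linalg.matrix_rank(np.vstack([U1, U2]))
print(rank)   # 3, so U1 + U2 = R^3; dim(U1 ∩ U2) = 2 + 2 - 3 = 1
```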
The HW Exercise was presented in class: If U and W are subspaces of V and U $\cup$ W is also a subspace of V, then either U < W or W < U.
The proof was by contradiction.
Revisited Definition: U1 $\cap$ U2$\cap$ ... $\cap$ Un = {v in V where v is in Uk , for all k = 1,2,...,n} called the intersection of the subspaces.
Facts:(i) U1$\cap$ U2$\cap$ ... $\cap$ Un < V.
(ii) U1$\cap$U2$\cap$ ... $\cap$ Un < Uk for each k, k= 1,2,...,n.
(iii) If W<V and W < Uk for each k, k= 1,2,...,n, then W<U1$\cap$ U2$\cap$ ... $\cap$ Un .
So ...
U1$\cap$ U2$\cap$ ... $\cap$ Un is the largest subspace of V that is contained in Uk for each k, k= 1,2,...,n.
Examples: U1 = {(x,y,z): x+y+2z=0}, U2 = {(x,y,z): 3x+y-z=0}. U1 $\cap$ U2 = {(x,y,z): x+y+2z=0 and 3x+y-z=0} = ...
Example: Let Uk = {f in P(F): f(x) = a_{k}x^{k} where a_{k} is in F and the characteristic of F is 0}. Then Uj$\cap$Uk = {0} for j not equal to k.
Proof: Suppose f(x) = ax^{k} = bx^{j} for all x in F, with j not equal to k. Then since F has an infinite number of elements, the polynomial ax^{k} - bx^{j}
= 0 for all x. Using x = 1 this means a = b, and using x = 2 = 1+1, this means a = b = 0.
Remark: The Example need not be true when F = {0,1}, for then x^{5} = x^{3} for all x in F.
Definition: Suppose V is a v.s over F and ${U}_{1}$ and ${U}_{2}$ are subspaces of V.
We say that V is the direct sum of U1 and U2 and we write
V = ${U}_{1}$ $\oplus$ ${U}_{2}$ if (1) V = ${U}_{1}$ + ${U}_{2}$ and (2) U1$\cap$ U2 = {0}.
Example: Suppose A is the 5 by 5 diagonal real valued matrix with `A_{1,1} = A_{2,2} = 3` and `A_{3,3} = A_{4,4} = A_{5,5}=2`. `U_2` = {v where vA=2v} and `U_3`={v where vA =3v}. These are subspaces of `R^5` and `R^5 = U_2` ⊕ `U_3`.
Proposition: Suppose V = ${U}_{1}$ $\oplus$ ${U}_{2}$ and $v\in V$, with v = ${u}_{1}+{u}_{2}={w}_{1}+{w}_{2}$ where ${u}_{i}$ and ${w}_{i}$ are in ${U}_{i}$ for i = 1 and 2.
Then ${u}_{i}={w}_{i}$ for i = 1, 2.
Conversely, if V = ${U}_{1}$ + ${U}_{2}$ and if v = ${u}_{1}+{u}_{2}={w}_{1}+{w}_{2}$, with ${u}_{i}$ and ${w}_{i}$ in ${U}_{i}$ for i = 1 and 2,
always implies ${u}_{i}={w}_{i}$ for i = 1, 2, then V = ${U}_{1}$ $\oplus$ ${U}_{2}$.
Proof: From the hypothesis, `u_1 - w_1 = w_2 - u_2`; the left side is in `U_1` and the right side is in `U_2`,
so this vector is in ${U}_{1}\cap {U}_{2}$ = {0}. Thus ... ${u}_{i}={w}_{i}$ for i = 1, 2.
Conversely: Suppose the hypothesis... and $v\in {U}_{1}\cap {U}_{2}$. Then v = v + 0 = 0 + v, so v = 0. Thus V = ${U}_{1}$ $\oplus$ ${U}_{2}$.
9/26
To generalize the direct sum to U1, U2, ... , Un, we would start by assuming V = U1 + U2 + ... + Un.
Direct Sums: Suppose U1, U2, ... , Un are all subspaces of V and U1+ U2+ ... + Un = V, we say V is the direct sum of U1, U2, ... , Un if for any v in V, the expression of v as v = u_{1}+ u_{2}+ ... + u_{n} for u_{k} in Uk is unique, i.e., if v = u_{1}'+ u_{2}'+ ... + u_{n}' for u_{k}' in Uk then u_{1} = u_{1}', u_{2}=u_{2}', ... , u_{n}=u_{n}'. In these notes we will write V = U1 $\oplus$ U2 $\oplus$...$\oplus$ Un
Examples: Uk = {v in F^{n}: v = (0,...,0,a,0,...,0) where a is in F and is in the kth place on the list}. Then U1$\oplus$ U2$\oplus$ ... $\oplus$ Un = V.
Theorem: V = U1$\oplus$ U2$\oplus$ ... $\oplus$ Un if and only if (i) U1+ U2+ ... + Un = V AND (ii) 0 = u_{1}+ u_{2}+ ... + u_{n} for u_{k} in Uk implies u_{1}=u_{2}=...=u_{n}=0.
Theorem: V = U$\oplus$W if and only if V = U+W and U$\cap$W={0}.
Exercise: Prove: If V = U1 ⊕ U2 ⊕...⊕ Un then Ui∩ Uj = {0} for all i and j that are not equal.
Examples using subspaces and direct sums in applications:
Suppose A is a square matrix (n by n) with entries in the field F.
For c in F, let W_{c }= { v in F^{n} where vA = cv}.
Fact: For any A and any c, W_{c}< F^{n }. [Comment: for most c, W_{c}= {0}. ]
Definition: If W_{c} is not the trivial subspace, then c is called an eigenvalue or characteristic value for the matrix A, and nonzero elements of W_{c} are called eigenvectors or characteristic vectors for A.
Application 1 : Consider the coke and pepsi matrices:
Questions: For which c is W_{c} nontrivial?
Example A. vA = cv? where `A = ((5/6,1/6),(1/4,3/4))`
Example B. vB = cv where `B = ((2/3,1/3),(1/4,3/4))`
To answer this question we need to find (x,y) [not (0,0)] so that
Example A: `(x,y)((5/6,1/6),(1/4,3/4)) = c(x,y)`
Example B: `(x,y)((2/3,1/3),(1/4,3/4)) = c(x,y)`
Is R^{2} = W_{c1} + W_{c2} for these subspaces? Is this sum direct?
Focusing on Example B we consider for which c will the matrix equation have a nontrivial solution (x,y)?
We consider the equations: 2/3 x + 1/4 y = cx and 1/3 x + 3/4 y = cy.
Multiplying by 12 to get rid of the fractions and bringing the cx and cy to the left side we find that
(8-12c)x + 3y = 0 and 4x + (9-12c)y = 0.
Multiplying by 4 and (8-12c) respectively and then subtracting the first equation from the second we have
((8-12c)(9-12c) - 12)y = 0. For this system to have a nontrivial solution, it must be that
((8-12c)(9-12c) - 12) = 0 or `72 - (108+96)c + 144c^2 - 12 = 0` $\phantom{\rule{1ex}{0ex}}\text{or}$ `60 - 204c + 144c^2 = 0`$.$
Clearly one root of this equation is 1, so factoring we have (1-c)(60-144c) = 0, and c = 1 and c = 5/12 are the two solutions... so there are exactly two distinct eigenvalues for example B,
c = 1 and c = 5/12, and two nontrivial eigenspaces W_{1} and W_{5/12}.
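These eigenvalues can be confirmed with NumPy (a sketch; since the example uses row vectors, note that the eigenvalues of `B` and of `B^T` agree):

```python
import numpy as np

# Example B acts on row vectors: v B = c v.
B = np.array([[2/3, 1/3],
              [1/4, 3/4]])
vals = np.sort(np.linalg.eigvals(B).real)
print(vals)      # approximately [0.41666667, 1.0], i.e. 5/12 and 1
```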
General Claim:
Proposition: If c is different from k, then W_{c} $\cap$ W_{k} = {0}
Proof:?
Generalize?
What does this mean for v_{n} when n is large?
Express `v_0 = v_e + v~` with `v_e in W_1` and `v~ in W_{5/12}`.
Applying A we find `v_1 = v_0 A = v_e A + v~ A = v_e + 5/12 v~`.
Repeating this yields `v_n = v_e + (5/12)^n v~`. As `n → oo` we have `v_n → v_e`.
Does the distribution of v_{n} when n is large depend on v_{0}?
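A small experiment suggests the answer: for starting vectors whose entries sum to 1, `v_n` approaches the same steady state `(3/7, 4/7)` regardless of `v_0` (a sketch; the two starting vectors are arbitrary distributions):

```python
import numpy as np

B = np.array([[2/3, 1/3],
              [1/4, 3/4]])
for v0 in (np.array([1.0, 0.0]), np.array([0.25, 0.75])):
    v = v0
    for _ in range(200):
        v = v @ B                 # v_{n+1} = v_n B
    print(v)                      # both print the steady state (3/7, 4/7)
```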
9/28
Application 2: For c a real number let
W_{c} = {f in C^{∞}(R) where f '(x)=c f(x)} < C^{∞}(R).
What is this subspace explicitly?
Let V = {f in C^{∞}(R) where f''(x) - f(x) = 0} < C^{∞}(R).
Claim: V = W_{1} $\oplus$ W_{-1}
Begin? We'll come back to this later in the course!
If c is different from k, then W_{c} $\cap$ W_{k} = {0}
Proof:...
Back to looking at things from the point of view of individual vectors:
Linear combinations:
Def'n. Suppose S is a set of vectors in a vector space V over the field F. We say that a vector v in V is a linear combination of vectors in S if there are vectors u_{1}, u_{2}, ... , u_{n} in S and scalars a_{1}, a_{2}, ..., a_{n} in F where v = a_{1}u_{1}+ a_{2}u_{2}+ ... + a_{n}u_{n} .
Comment: For many introductory textbooks: S is a finite set.
Recall. Span (S) = {v in V where v is a linear combination of vectors in S}
If S is finite and Span (S) = V we say that S spans V and V is a "finite dimensional" v.s.
Linear Independence.
Def'n. A set of vectors S is linearly dependent if there are vectors u_{1}, u_{2}, ... , u_{n} in S and scalars `α_1,α_2, ...,α_n in F` NOT ALL 0 where `0=α_1u_1+α_2u_2+ ...+α_n u_n` .
A set of vectors S is linearly independent if it is not linearly dependent.
Other ways to characterize linearly independent.
A set of vectors S is linearly independent if whenever there are vectors u_{1}, u_{2}, ... , u_{n} in S and scalars `α_1,α_2, ...,α_n in F` where `0=α_1u_1+α_2u_2+ ...+α_n u_n` , the scalars are all 0, i.e. `α_1=α_2= ...=α_n = 0` .
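Linear (in)dependence of finitely many vectors in F^{n} can be tested by row rank (a sketch; the three vectors are an arbitrary example with a built-in dependence):

```python
import numpy as np

# Vectors are linearly independent iff the matrix with them as rows
# has rank equal to the number of vectors.
S = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, -2.0]])   # row3 = row1 - 2*row2, so S is dependent
print(np.linalg.matrix_rank(S))    # 2 < 3: linearly dependent
```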
9/30
Examples: Suppose A is an n by m matrix: the row space of A= span ( row vectors of A) , the column space of A = Span(column vectors of A).
Relate to R(A)
Recall R(A) = "the range space of A" = { w in F^{k} where for some v in F^{n}, vA= w } < F^{k}.
w is in R(A) if and only if w is a linear combination of the row vectors, i.e., R(A) = the row space of A.
If you consider Av instead of vA, then R*(A) = the column space of A.
"Infinite dimensional" v.s. examples: P(F), F[X], F^{∞}, C^{∞} (R)
F[X] was shown to be infinite dimensional. [If p is in SPAN(p1,...,pn) then the degree of p is no larger than the maximum of the degrees of {p1,...,pn}. So F[X] cannot equal SPAN(p1,...,pn) for any finite set of polynomials, i.e., F[X] is NOT finite dimensional.]
Some Standard examples.
Bases def'n.
Definition: A set B is called a basis for the vector space V over F if (i) B is linearly independent and (ii) SPAN( B) = V.
Bases and representation of vectors in a f.d.v.s.
10/3
Suppose B is a finite basis for V with its elements in a list, (u_{1}, u_{2}, ... , u_{n}). [A list is an ordered set.]
If v is in V, then there are unique scalars `α_1, α_2, ..., α_n in F` where `v = α_1u_1+α_2u_2+ ...+α_n u_n`.
The scalars are called the coordinates of v w.r.t. B, and we will write
`v = [α_1,α_2, ...,α_n]_B.`
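Finding coordinates numerically amounts to solving a linear system (a sketch; the two basis vectors are an arbitrary choice, which happen to be the eigenvectors that arise in Example B):

```python
import numpy as np

# Coordinates of v with respect to the list B = (u1, u2):
# solve a1*u1 + a2*u2 = v for the scalars (a1, a2).
u1, u2 = np.array([1.0, 4/3]), np.array([1.0, -1.0])
v = np.array([1.0, 0.0])
coords = np.linalg.solve(np.column_stack([u1, u2]), v)
print(coords)        # [3/7, 4/7]: v = (3/7)u1 + (4/7)u2
```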
Linear Independence Theorems
Theorem 1 : Suppose S is a linearly independent set and `v` is not an element of Span(S), then `S ∪ {v}` is also linearly independent.
Proof Outline: Suppose vectors u_{1}, u_{2}, ... , u_{n} in S and scalars `α_1,α_2, ...,α_n,α in F` where `0=α_1u_1+α_2u_2+ ...+α_n u_n+α v` . If $\alpha$ is not 0 then
`v = -α^{-1}(α_1u_1+α_2u_2+ ...+α_n u_n) in Span(S)`, contradicting the hypothesis. So $\alpha =0$. But then `0=α_1u_1+α_2u_2+ ...+α_n u_n` and since S is linearly independent,
`α_1=α_2= ...=α_n=0`. Thus `S ∪ {v}` is linearly independent. EOP.
Theorem 2: Suppose S is a finite set of vectors with V = Span (S) and T is a subset of vectors in V. If n( T) > n(S) then T is linearly dependent.
Proof Outline: Suppose n(S) = N. Then by the assumption ... [Proof works by finding N homogeneous linear equations with N+1 unknowns.]
10/5
Theorem 3: Every finite dimensional vector space has a basis.
Proof outline: How to construct a basis, `B`, for a nontrivial finite dimensional v.s., `V`. Since `V` is finite dimensional it has a finite subset S with `Span(S) = V`.
Start with the empty set. This is linearly independent. Call this `B_0`. If `span(B_0) = V` then you are done. `B_0` is a basis.
If `Span(B_0)` is not `V` then there is a vector `v_1` in `V` where `v_1` is not in `Span(B_0)`. Apply Theorem 1 to obtain `B_1 = B_0∪{v_1}`, which is linearly independent. If `Span(B_1) = V` then `B_1` is a basis for `V`. Otherwise continue applying Theorem 1 repeatedly; if the process never stopped, the resulting set would eventually have more elements than the finite spanning set S, which contradicts Theorem 2. So at some stage of the process, `Span(B_k) = V`, and `B_k` is a basis for `V`. IRMC
Comment:The proof of the Theorem also shows that given T, a linearly independent subset of V and V a finite dimensional vector space, one can step by step add elements to T, so that eventually you have a new set S where S is linearly independent with Span(S) = V and T contained in S. In other words we can construct a set B that is a basis for V with T contained in B. This proves
Corollary: Every Linearly independent subset of a finite dimensional vector space can be extended to a basis of the vector space.
Theorem 4. If V is finite dimensional vs and B and B' are bases for V, then `n(B) = n(B')`.
Proof: fill in ... based on the Theorem 2. `n(B) <= n(B')` and `n(B') <= n(B)` so...
Definition: The dimension of a finite dimensional v.s. over F is the number of elements in a(ny) basis for V.
Discuss dim({0})= ???.
The empty set is linearly independent!... so The empty set is a basis for {0} and the dimension of {0} is 0!
What is Span of the empty set? Recall we characterized SPAN(S) = the intersection of all subspaces that contain S. Then Span (empty set) = Intersection of all subspaces= {0}.
More Dimension Results:Prop: A Subspace of a finite dimensional vector space is finite dimensional.
Proposition: Suppose dim(V) = n and S is a set of vectors with n(S) = n. Then
(1) If S is Linearly independent, then S is a basis.
(2) If Span(S) = V, then S is a basis.
Proof: (1) S is contained in a basis, B. If B is larger than S, then B has more than n elements, which contradicts the fact that any basis for V has exactly n elements. So B = S and S is a basis.
(2) Outline: V has a basis of n elements, B. Suppose S is linearly dependent and show that there is a set with fewer than n elements that spans V. Then by Theorem 2 the basis B, with n elements, would be linearly dependent, a contradiction. Thus, S is a basis.
IRMC
Theorem: Sums, intersections and dimension: If U, W <V are finite dimensional, then so is U+W and
dim(U+W) = dim(U) + dim(W) - dim(U$\cap$W).
Proof (idea): Build up bases of U and W from a basis of U$\cap$W ... then check that the union of these bases is a basis for U+W.
10/10
Example [from problem in Axler.]: Suppose p_{0},...,p_{m} are in P_{m}(R) and p_{i}(2) = 0 for all i.
Prove {p_{0},...,p_{m}} is linearly dependent.
Proof: Suppose {p_{0},...,p_{m}} is linearly independent.
Notice that by the assumption, for any coefficients,
(a_{0}p_{0}+..+a_{m}p_{m})(2) = a_{0}p_{0}(2)+..+a_{m}p_{m}(2) = 0, and since u(x) = 1 has u(2) = 1, u is not in SPAN(p_{0},...,p_{m}).
Thus SPAN(p_{0},...,p_{m}) is not P_{m}(R).
But SPAN(1, x, ..., x^{m}) = P_{m}(F) and {1, x, ..., x^{m}} is linearly independent (proof?),
so dim(P_{m}(R)) = m+1. Then {p_{0},...,p_{m}}, being a linearly independent set of m+1 vectors, would be a basis, giving SPAN(p_{0},...,p_{m}) = P_{m}(R), a contradiction. So {p_{0},...,p_{m}} is not linearly independent.
End of proof.
Example (visualize in R^{2}): In P_{4}(R), any 5 polynomials of degree less than 5 that pass through (2,0) are linearly dependent.
Connect to Coke and Pepsi example: find a basis of eigen vectors using the B example for R^{2}. [Use the online technology]
Example B
`(x,y)((2/3,1/3),(1/4,3/4)) = c(x,y)`
We considered the equations: 2/3 x + 1/4 y = cx and 1/3 x + 3/4 y = cy and showed that
there are exactly two distinct eigenvalues for example B,
c= 1 and c = 5/12 and two non trivial eigenspaces W_{1} and W_{5/12} .
Now we can use technology to find eigenvectors in each of these subspaces.
The Matrix calculator gives as a result that the eigenvalue 1 has an eigenvector (1,4/3) while 5/12 has an eigenvector (1,-1). These two vectors are a basis for R^{2}.
Linear Transformations: V and W vector spaces over F.
Definition: A function T:V $\to$ W is a linear transformation if for any x, y in V and a in F, T(x+y) = T(x) + T(y) and T(ax) = aT(x).
Examples: T(x,y) = (3x+2y, x-3y) is a linear transformation T: R^2 → R^2.
G(x,y) = (3x+2y, x^2 - 2y) is not a linear transformation:
G(1,1) = (5, -1), G(2,2) = (10, 0)... 2*(1,1) = (2,2) but 2*(5,-1) = (10,-2) is not (10,0)!
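These checks are easy to run numerically; a minimal sketch (the test vector is arbitrary):

```python
import numpy as np

def T(v):                      # T(x,y) = (3x+2y, x-3y): linear
    x, y = v
    return np.array([3*x + 2*y, x - 3*y])

def G(v):                      # not linear because of the x**2 term
    x, y = v
    return np.array([3*x + 2*y, x**2 - 2*y])

v = np.array([1.0, 1.0])
print(T(2*v), 2*T(v))          # equal:  [10. -4.] [10. -4.]
print(G(2*v), 2*G(v))          # differ: [10.  0.] vs [10. -2.]
```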
Notice that T(x,y) can be thought of as the result of a matrix multiplication:
`T(x,y) = (x,y)((3,1),(2,-3))`
So the two key properties are the direct consequence of the properties of matrix multiplication... (v+w)A = vA + wA and (cv)A = c(vA).
For A a k by n matrix : T_{A} (left argument) and _{A}T (right) are linear transformations on F^{k} and F^{n}.
T_{A} (x) = x A for x in F^{k} and _{A}T(y) = A[y]^{tr} for y in F^{n} and [y]^{tr} indicates the entries of the vector treated as a one column matrix.
10/12
The set of all linear transformations from V to W is denoted L(V,W).
Consequences of the definition: If T: V → W is a linear transformation, then for any x and y in V and a in F,
(i) T(0) = 0.
(ii) T(-x) = -T(x)
(iii) T(x+ay) = T(x) + aT(y).
Quick test: If T: V → W is a function and (iii) holds for any x and y in V and a in F, then the function is a linear transformation.
D... Differentiation is a linear transformation: on polynomials, on ...
Example: (D(f))(x) = f'(x) or D(f) = f'.
(D(f + $\alpha$ g))(x) = (f+$\alpha$g)' (x) = f'(x) + $\alpha$g'(x) = (f'+$\alpha$g') (x) or
D(f+$\alpha$g) = f'+ $\alpha$g'= D(f) +$\alpha$ D(g).
Theorem: T: V → W linear and B a basis give the restriction T|_{B}: B → W.
Suppose S: B → W; then there is a unique linear transformation T_S: V → W such that T_S|_{B} = S.
Proof: Let T_S(v) be defined as follows: Suppose v is expressed (uniquely) as a linear combination of elements of B, ie. v = a_{1}u_{1}+ a_{2}u_{2}+ ... + a_{n}u_{n} ... then let T_S(v) = a_{1}S(u_{1})+ a_{2}S(u_{2})+ ... + a_{n}S(u_{n}) .
This is well defined since the representation of v is unique.
Uniqueness of T_S: If U is any linear transformation U: V → W with U|_{B} = S, then for v = a_{1}u_{1}+ a_{2}u_{2}+ ... + a_{n}u_{n}, U(v) = a_{1}U(u_{1})+ ... + a_{n}U(u_{n}) = a_{1}S(u_{1})+ ... + a_{n}S(u_{n}) = T_S(v),
so U must be T_S.
Show T_S is linear: Choose v and v' in V, α in F, with v = a_{1}u_{1}+ a_{2}u_{2}+ ... + a_{n}u_{n}and v' = a'_{1}u_{1}+ a'_{2}u_{2}+ ... + a'_{n}u_{n}
where T_S(v) = a_{1}S(u_{1})+ a_{2}S(u_{2})+ ... + a_{n}S(u_{n}) and T_S(v') = a'_{1}S(u_{1})+ a'_{2}S(u_{2})+ ... + a'_{n}S(u_{n})
Then check that T_S(v+av') = T_S(v) + aT_S(v'). [Details omitted here.]
Finally... for any u in B, u = 1u, so T_S(u) = 1S(u) = S(u) and T_S|_{B} = S. EOP
Comment: "A linear transformation is completely determined by what it does to a basis."
or... "T_(T|_{B}) = T: V → W and T_S|_{B} = S: B → W".
Example: T: P(F) $\to$ P(F).... Use for a basis {x^{n} for n = 0, 1, 2, ...} and S(x^{n}) = nx^{n-1}.
Or another example: S(x^{n}) = 1/(n+1) x^{n+1}.
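Since such maps are determined by what they do to the basis {x^n}, they can be sketched directly on coefficient lists (the names `D` and `J` are just labels chosen here for the two examples):

```python
# Represent a_0 + a_1 x + ... + a_n x^n as the list [a_0, a_1, ..., a_n].
def D(coeffs):                 # S(x^n) = n x^{n-1}, extended linearly
    return [k * coeffs[k] for k in range(1, len(coeffs))]

def J(coeffs):                 # S(x^n) = x^{n+1}/(n+1), extended linearly
    return [0] + [c / (k + 1) for k, c in enumerate(coeffs)]

p = [4, 0, 3]                  # the polynomial 4 + 3x^2
print(D(p))                    # [0, 6]      i.e. 6x
print(J(D(p)))                 # [0, 0, 3]   i.e. 3x^2
```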
10/17
Key Spaces related to T: V → W
Null Space of T = kernel of T = {v in V where T(v) = 0 [in W]} = N(T) < V
Range of T = Image of T = T(V) = {w in W where w = T(v) for some v in V} <W.
Comment: Connect these to matrix subspaces R(A) and N(A).
 N(A) ="the null space of A" = { v in F^{n} where vA= (0,0,...0) in F^{k} } < F^{n}
 R(A) = "the range space of A" = { w in F^{k} where for some v in F^{n}, vA= w } < F^{k}
Major result of the day:
Theorem: Suppose T: V → W is linear and V is a finite dimensional v.s. over F. Then N(T) and R(T) are also finite dimensional and Dim(V) = Dim(N(T)) + Dim(R(T)).
Proof:
Outline: start with a basis C for N(T) and extend this to a basis B for V.
Show that T(B \ C), the images of the basis vectors of B not in C, is a basis for R(T).
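For a matrix map `v ↦ vA` the theorem can be checked by computing ranks (a sketch; the matrix is an arbitrary example):

```python
import numpy as np

# Rank-nullity for v ↦ v A, mapping R^3 into R^2.
A = np.array([[1.0, 2.0],
              [0.0, 0.0],
              [2.0, 4.0]])
rank = np.linalg.matrix_rank(A)      # dim R(T)
nullity = A.shape[0] - rank          # dim N(T)
print(rank, nullity)                 # 1 2, and 1 + 2 = 3 = dim(R^3)
```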
Algebraic structure on L(V,W)
Definition of the sum and scalar multiplication:
T, U in L(V,W), a in F, (T+U)(v) = T(v) + U(v).
Fact: T+U is also linear.
(aT)(v) = a T(v) .
Fact:aT is also Linear.
Check: Proposition: L(V,W) is a vector space over F.
10/19
Note: If T': V → W and U': W → Z are also linear, then U(T+T') = UT + UT' and (U+U')T = UT + U'T. If S: Z → Y is also linear then S(UT) = (SU)T.
Composition: T:V > W and U : W > Z both linear, then UT:V>Z where UT(v) = U(T(v)) is linear.
Linear Transformations and Bases
Theorem: If V and W are finite dimensional then so is `L(V,W)` and `dim(L(V,W)) = dim(V) dim(W)`.
Outline: Use bases for V and W to find a basis for L(V,W). That basis for L(V,W) also establishes a function from L(V,W) to the matrices that is a linear transformation!
More details will be supplied.
Details: Let `B={v_1,v_2, ..., v_k}` is a basis for V and `C= {w_1,w_2, ... ,w_q}` be a basis for W. We can define a linear transformation `T_{i,j}: V >W` by `T_{i,j}(v_r)=w_j` when `r=i` and otherwise `T_{i,j}(v_r)=0_W`.
Claim: `D = {T_{i,j} : i = 1,...,k ; j = 1,...,q}` is a basis for `L(V,W)` and thus `dim(L(V,W)) = dim(V) dim(W)`.
`D` is linearly independent: Suppose `a_{i,j} in F` with `U = sum_{(i,j)} a_{i,j}T_{i,j} = 0`. Then with `r` fixed, `U(v_r) = sum_{(i,j)} a_{i,j}T_{i,j}(v_r) = sum_{j} a_{r,j}T_{r,j}(v_r) = sum_{j} a_{r,j}w_j = 0`. Since C is a basis this means that `a_{r,j} = 0` for all j, and thus `a_{i,j} = 0` for all `i` and `j`. This shows that D is linearly independent.
`D` spans `L(V,W)`: Suppose `R in L(V,W)`. Then for each `r` there are some `b_{r,j} in F`, `R(v_r)= sum_{j} b_{r,j}w_j`. This shows that
`R = sum_{(i,j)} b_{i,j}T_{i,j}` and thus `R in SPAN(D)`. EOP
Matrices and Linear transformations.
10/24
Footnote on notation for Matrices: If the basis for V is B and for W is C as above and `R in L(V,W); R: V >W`,
the matrix of `R`, with respect to those bases can be denoted `M_B^C(R)` .
Note  this establishes a convention on the representation of a transformation.
The matrix for a vector `v` in the basis B is denoted `M_B(v)`, and the matrix for `R(v)` is denoted `M_C(R(v))`. If we treat these as row vectors we have
`M_C(R(v)) = M_B(v) M_B^C(R)`.
This all can be transposed using column vectors for the matrices of the vectors and transposing the matrix `M_B^C(R)` we have with this transposed view:
`M_C(R(v))^{tr} = M_B^C(R)^{tr} M_B(v)^{tr}`
To conform with the usual convention for function notation, the transposed version using column vectors matrices is used most frequently. When it is not ambiguous we will denote the transposed matrix by switching the position of B and C to the left of M: ` M_B^C(R)^{tr}= _C^BM(R) `.
So `_B^CM(R) _BM(v) = _CM(R(v))`
Example: Suppose `T(x,y,z) = (2x+y-z, 3x+4z); T: R^3 → R^2.` Let `B` and `C` be the standard ordered bases, `B={(1,0,0),(0,1,0),(0,0,1)}; C={(1,0),(0,1)}`. Then
`T(1,0,0) = 2(1,0)+3(0,1)`
`T(0,1,0) = 1(1,0)+0(0,1)` and
`T(0,0,1) = -1(1,0)+4(0,1)`.
`M_B^C(T) = [(2,3),(1,0),(-1,4)]` or transposed `[(2,1,-1),(3,0,4)] = _B^CM(T)`
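A quick check of the transposed (column-vector) matrix with NumPy (the test vector (1,2,3) is arbitrary):

```python
import numpy as np

# Column-vector form of the matrix of T(x,y,z) = (2x+y-z, 3x+4z)
M = np.array([[2.0, 1.0, -1.0],
              [3.0, 0.0, 4.0]])
v = np.array([1.0, 2.0, 3.0])
print(M @ v)        # [ 1. 15.] = (2+2-3, 3+12), matching T(1,2,3)
```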
Example: Suppose `T(f)=f'; T:P_2(R) >P_1(R).` Let `B` and `C` be the ordered bases, `B={1, x, x^2} ; C={1,x}`. Then
`T(1) = 0*1+0*x`
`T(x) = 1*1+ 0*x` and
`T(x^2) =0*1 + 2x`.
`M_B^C(T) = [(0,0),(1,0),(0,2)]` or transposed ` [(0,1,0),(0,0,2)]= _B^CM(T) `
The function `M : L(V,W) → Mat(k,q;F)` is a linear transformation.
Recall definition of "injective" or "1:1" function.
Recall definition of "surjective" or "onto" function.
Theorem: T is 1:1 (injective) if and only if N(T) = {0}
Proof: => Suppose T is 1:1. We know that T(0) = 0, so if T(v) = 0, then v = 0. Thus 0 is the only element of N(T), i.e., N(T) = {0}.
<= Suppose N(T) = {0}. If T(v) = T(w) then T(v-w) = T(v) - T(w) = 0, so v-w is in N(T).... that must mean that v-w = 0, so v = w and T is 1:1.
Theorem: T is onto if and only if the Range of T = W.
Theorem: T is onto if and only if for any (some) basis, B, of V, Span(T(B)) = W.
Theorem: If V and W are finite dimensional v.s. / F, dimV = dim W, T : V $\to$ W is linear, then T is 1:1 if and only if T is onto.
Proof: We know that dim V = dim N(T) + dim R(T).
=> If T is 1:1, then dim N(T) = 0, so dim V = dim R(T) . Thus dim R(T) = dim W and T is onto.
<= If T is onto, then dimR(T) = dim W. So dim N(T) = 0 and thus N(T) = {0} and T is 1:1.
Application to `M: L(V,W) > Mat(k,q;F)`:
Assuming `dim(V) = k` and `dim(W)=q` then `dim(L(V,W)) = kq = dim (Mat(k,q;F)).` Then to show M is 1:1 it suffices to show that M is onto. If we suppose `A in Mat(k,q;F)` with entries `A_{i,j}` and let `T = sum A_{i,j} T_{i,j}` then it should be direct to see that `M(T) = A.`
Definition: If `T:V > W` is a linear transformation that is 1:1 and onto, then T is called a linear (vector space) isomorphism and W is said to be (linearly) isomorphic to V.
Cor.: Mat(k,q;F) is linearly isomorphic to L(V,W).
Remark: There is more "good" stuff happening in this isomorphism. If Z is another vector space over F with dim(Z) = r and basis D, and `U in L(W,Z)`, then `M_B^C(T)M_C^D(U) = M_B^D(UT)`, or using the transposed version (which is generally preferred for the niceness of its appearance), `_C^DM(U)_B^CM(T) = _B^DM(UT)`.
The importance of the Null Space of T, N(T), is in understanding what T does in general.
Example 1. D: P(R) → P(R)... D(f) = f'. Then N(D) = {f: f(x) = C for some constant C} [from calculus 109!]
Notice: If f'(x) = g'(x) then f(x) = g(x) + C for some C.
Proof: consider D(f - g) = D(f) - D(g) = 0, so f - g is in N(D).
Example 2: Solving a system of homogeneous linear equations. This was connected to finding the null space of a linear transformation connected to a matrix. Then what about a non homogeneous system with the same matrix. Result: If z is a solution of the non homogeneous system of linear equations and z ' is another solution, then z' = z + n where n is a solution to the homogeneous system.
Consider `p in R[x]` and `T in L(V,V)`, where `p = a_0 + a_1x + a_2x^2 + ... + a_nx^n`. We define `p(T) in L(V,V)` by `p(T) = a_0Id + a_1T + a_2T^2 + ... + a_nT^n`
Example 3: More differential equations: `D: C^{oo}(R) → C^{oo}(R)` where `Df(x) = f'(x)`. Suppose `p = x^2 - 5x + 4`. Then `N(p(D)) = {f : p(D)(f) = z}` where `z` denotes the zero function, i.e., `N(p(D)) = {f: f''(x)-5f'(x)+4f(x) = 0` for all `x in R}`. This is the set of solutions to the homogeneous linear differential equation `f''(x)-5f'(x)+4f(x) = 0`.
Note that `p = (x-4)(x-1)`, which has roots (zeroes) 1 and 4, and that the solutions of the differential equation are all of the form `f = a_1e^x + a_2e^{4x}`. More on this example later in the course.
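A numerical spot-check that such functions solve the equation (the coefficients `a1`, `a2` and the sample points are arbitrary):

```python
import math

# f(x) = a1*e^x + a2*e^(4x) should satisfy f'' - 5f' + 4f = 0:
# for e^x the coefficients give 1 - 5 + 4 = 0; for e^(4x), 16 - 20 + 4 = 0.
a1, a2 = 2.0, -3.0
f   = lambda x: a1*math.exp(x) +      a2*math.exp(4*x)
fp  = lambda x: a1*math.exp(x) +  4 * a2*math.exp(4*x)
fpp = lambda x: a1*math.exp(x) + 16 * a2*math.exp(4*x)
for x in (-1.0, 0.0, 0.5):
    print(fpp(x) - 5*fp(x) + 4*f(x))   # ~0 up to rounding
```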
General Proposition: `T: V → W`. If b is a vector in W and a is in V with T(a) = b, then T^{-1}({b}) = {v in V: v = a + n where n is in N(T)} = a + N(T)
Comment: a + N(T) is called the coset of a mod N(T)...these are analogous to lines in R^{2}. More on this later in the course.
Note:Why this called a "linear" transformation:
The geometry of linear: A line in R^2 is {(x,y): Ax + By = C where A and B are not both 0} = {(x,y): (x,y) = (a,b) + t(u,v)} = L, the line through (a,b) in the direction of (u,v).
Suppose T is a linear transformation:
Let T(L) = L' = {(x',y'): (x',y') = T(x,y) for (x,y) in L}, so
T(x,y) = T(a,b) + tT(u,v).
If T(u,v) = (0,0) then L' = T(L) = {T(a,b)}.
If not, then L' is also a line through T(a,b) in the direction of T(u,v).
[View this in winplot?]
10/31
Coke/Pepsi example B: T(x,y) =(2/3 x +1/4 y, 1/3 x+3/4 y)
T(v_{0}) = v_{1}, T(v_{1}) = v_{2}, ..., T(v_{k}) = v_{k+1}.
T(v*)=v* means a nonzero v* is an eigenvector with eigenvalue 1. T(1, 4/3) = (1,4/3).
Also `T(3/7, 4/7) = T[(3/7)(1,4/3)] = 3/7T(1,4/3) =3/7(1,4/3) =(3/7,4/7)`.
`T(1,-1) = (5/12,-5/12) = (5/12)(1,-1)` means that `(1,-1)` is an eigenvector with eigenvalue 5/12.
Invertibility of Linear Transformations
Def'n: T: V → W, `T in L(V,W)`, is invertible if and only if
there is a linear transformation S: W → V where TS = Id_{W} and ST = Id_{V}.
Fact: If T is invertible then the S: W → V used in the definition is also invertible!
S is unique: If S' satisfies the same properties as S, then
S = S Id_{W} = S(TS') = (ST)S' = Id_{V} S' = S'. S is called "the inverse of T".
Prop: T is invertible iff T is 1:1 and onto (injective and surjective).
Outline of Proof:
(i) => Assume S... (a)show T is 1:1. [This uses ST = Id].(b) show T is onto [This uses TS = Id].
(ii) <= Assume T is 1:1 and onto. Define S. [This uses that T is onto and 1:1] Show S is linear [This uses T is linear.] and TS =I and ST = I
Def: If there is a T: V → W that is invertible, we say W is isomorphic with V. (V =_{T} W)
Comments: (i)V=_{Id} V (ii)If V=_{T} W then W=_{S} V (iii)If V=_{T} W and W=_{U} Z then V=_{UT} Z.
Theorem: Suppose V and W are finite dimensional v.s./F. Then V=_{T} W if and only if dim(V) = dim(W).
Proof: =>: Suppose V =_{T} W. Then since there is a T: V → W that is an isomorphism, dim(V) = dim N(T) + dim R(T). But R(T) = W and N(T) = {0}, so dim N(T) = 0 and dim(V) = dim(W).
<= (outline) Assume dim(V) = dim ( W) = n. Choose a basis B for V, B = {v1,v2,...,vn} and a basis C for W, C = {w1,w2,...,wn}. Then use T defined by T(vi) = wi and show this is invertible.
Theorem: Suppose B= (v_{1},...,v_{n}) and C=(w_{1},...,w_{m}) are finite bases (lists) for V and W respectively. The linear transformation M: L(V,W) > Mat(m,n,F) is an isomorphism.
Proof: Show injective by Null(M)= {Z} where Z is the zero transformation.
Show M is onto by giving T_{A} where M(T_{A} ) = A based on knowing A.
[OR use dimensions of these vector spaces are equal proved previously.]
Cor. [if isomorphism is established directly]: Dim L(V,W) = Dim(V) Dim(W).
11/2
Note on HW: For SOS problem 6.69, assume `V = U ⊕ W`.
Prop. V a f.d.v.s. If T is in L(V) then the following are equivalent:
(i) T is invertible.
(ii) T is 1:1.
(iii) T is onto.
Proof: (i) =>(ii). Immediate.
(ii)=>(iii) . Dim V = Dim(N(T)) + Dim(R(T)). Since T is 1:1, N(T)={0}, so Dim(N(T))= 0 and thus Dim V = Dim (R(T)) so R(T) = V and T is onto.
(iii) =>(i) Dim V = Dim(N(T)) + Dim(R(T)) Since T is onto, R(T) = V... so Dim(N(T)) = 0. ... so N(T) = {0} and T is 1:1, so T is invertible.
Connection to square matrices:
A is invertible is equivalent to....Systems of equations statements.
[Motivation]
Look at Coke/Pepsi example B: T(x,y) =(2/3 x +1/4 y, 1/3 x+3/4 y)= (x,y)A
T(v_{0}) = v_{1}, T(v_{1}) = v_{2}, ..., T(v_{k}) = v_{k+1}.
v_{2} = T(v_{1}) = TT(v_{0}); ... v_{k} = T^{k}(v_{0}) = (x_{0},y_{0})A^{k}.
We considered the equations: 2/3 x +1/4 y = cx and 1/3 x+3/4 y = cy and showed that
there are exactly two distinct eigenvalues for example B,
c= 1 and c = 5/12 and two non trivial eigenspaces W_{1} and W_{5/12} .
Now we can use technology to find eigenvectors in each of these subspaces.
The Matrix calculator gave as a result that the eigenvalue 1 has an eigenvector `(1,4/3) = v_1` while 5/12 has an eigenvector `(1,-1) = v_2`. These two vectors are a basis for R^{2}.
B=(v1,v2)
What is the matrix of T using this basis?
`M_B^B(T) = _B^BM(T) = ((1,0),(0,5/12))`
Using this basis and matrix makes it easy to see what happens when the transformation is applied repeatedly:
`M_B^B(T^n) = [M_B^B(T)]^n = ((1,0),(0,5/12))^n = ((1,0),(0,(5/12)^n))`
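The same computation can be sketched in NumPy using the column-vector (transposed) convention, where `P` has the eigenvectors as its columns and `A = P D P^{-1}` gives `A^n = P D^n P^{-1}`:

```python
import numpy as np

# Example B, transposed so that eigenvectors are columns of P.
A = np.array([[2/3, 1/4],
              [1/3, 3/4]])
P = np.array([[1.0, 1.0],            # columns: eigenvectors (1,4/3)
              [4/3, -1.0]])          # and (1,-1)
D = np.diag([1.0, 5/12])
n = 25
An = P @ np.linalg.matrix_power(D, n) @ np.linalg.inv(P)
print(np.allclose(An, np.linalg.matrix_power(A, n)))   # True
```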
11/4
Recall this footnote on notation for Matrices: If the basis for V is B and for W is C as above and `R in L(V,W); R: V >W`,
the matrix of `R`, with respect to those bases can be denoted `M_B^C(R)` .
Note  this establishes a convention on the representation of a transformation.
The matrix for a vector `v` in the basis B is denoted `M_B(v)` and for `R(v)` is denoted `M_C(R(v))`. If we treat these as row vectors we have
`M_C(R(v)) = M_B(v) M_B^C(R)`. This can all be transposed: using column vectors for the matrices of the vectors and transposing the matrix `M_B^C(R)`, we have with this transposed view:
`M_C(R(v))^{tr} = M_B^C(R)^{tr} M_B(v)^{tr}`
To conform with the usual convention for function notation, the transposed version using column-vector matrices is used most frequently. When it is not ambiguous, we will denote the transposed matrix by switching the positions of B and C to the left of M:
`M_B^C(R)^{tr} = _B^CM(R)`, AND
`so _B^CM(R) _BM(v) = _CM(R(v))`
If Z is another vector space over F with dim(Z) = r and basis D, and `U in L(W,Z)` then `M_B^C(T)M_C^D(U) = M_B^D(UT)` or using the transposed version (which is generally preferred for the niceness of its appearance)
`thus... _C^DM(U)_B^CM(T) = _B^DM(UT)`.
Let's prove this!
Outline: Choose your bases, then find the matrix coefficients for T and U. Now apply U to T(v), where v is in the basis of V, and find the coefficient for a basis element of Z. Compare this with how matrix multiplication works!
11/7
Change of basis [Again]: [Caveat: Some of the matrices in the work that follows may be transposed.]
Consider a linear operator T on a vector space V, `T: V → V, T in L(V,V)`.
So ... What is the relationship between the very nice matrix we had for the Coke/Pepsi model for T that results from using the basis B of eigenvectors and the matrix for T that uses the standard basis, `E = (e_1,e_2)`? `M_B^B(T) = ((1,0),(0, 5/12))`
`M_E^E(T) =((2/3,1/3),(1/4,3/4))`.
The key to understanding the relationship between these matrices is the identity map!
We consider the matrix for the identity operator using B for the source space and E for the target space: `M_B^E(Id) = ((1,4/3),(1,-1))`. And for the identity operator using E for the source and B for the target, M_{E}^{B}(Id).
Notice that M_{E}^{B}(Id) M_{B}^{E}(Id) = M_{B}^{B}(Id Id) = M_{B}^{B}(Id) = I_{n}, the n by n identity matrix, and similarly M_{B}^{E}(Id) M_{E}^{B}(Id) = M_{E}^{E}(Id) = I_{n}. Thus both these matrices are invertible and each is the inverse of the other!
Furthermore:`M_B^B(T) =((1,0),(0,5/12))`.
Now we see how to express `M_B^B(T)` in terms of `M_E^E(T)` and vice versa:
`M_B^E(Id)M_E^E(T)M_E^B(Id) = M_B^B(IdTId) = M_B^B(T)`
and
`M_E^B(Id)M_B^B(T)M_B^E(Id) = M_E^E(IdTId) = M_E^E(T)`
Now let `P = M_E^B(Id)` and `Q = M_B^E(Id) = P^{-1}`, then we have
`M_E^B(Id)M_B^B(T)M_B^E(Id) = P M_B^B(T) P^{-1} = M_E^E(T)`
or
PM_{B}^{B}(T)= M_{E}^{E}(T)P
and
QM_{E}^{E}(T)P=M_{B}^{B}(T).
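A numeric check of this conjugation for the Coke/Pepsi example, sketched in Python with exact fractions. It uses the column-vector convention (so these matrices are the transposes of the row-convention ones above): with P's columns the eigenvectors (1,4/3) and (1,-1), conjugating the diagonal matrix M_B^B(T) recovers the standard-basis matrix of T.

```python
from fractions import Fraction as F

def matmul(A, B):
    # product of two 2x2 matrices
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(M):
    # inverse of a 2x2 matrix
    d = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / d, -M[0][1] / d],
            [-M[1][0] / d, M[0][0] / d]]

P = [[F(1), F(1)], [F(4, 3), F(-1)]]      # columns: eigenvectors v_1, v_2
D = [[F(1), F(0)], [F(0), F(5, 12)]]      # M_B^B(T)
A = matmul(matmul(P, D), inv2(P))         # P D P^{-1}
assert A == [[F(2, 3), F(1, 4)], [F(1, 3), F(3, 4)]]  # column-form M_E^E(T)
```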
Change of Basis, Invertibility and similar matrices.
The previous example works in general:
The Change of Basis Theorem:
Suppose V is a f.d.v.s over F, dim(V) = n, and B and E are two bases for V. Suppose `T:V > V` is a linear operator, then
M_{B}^{E}(Id)M_{E}^{E}(T)M_{E}^{B}(Id)=M_{B}^{B}(T)
and
M_{E}^{B}(Id)M_{B}^{B}(T)M_{B}^{E}(Id)=M_{E}^{E}(T).
If we let P = M_{E}^{B}(Id) and Q = M_{B}^{E}(Id) = P^{-1},
[QP = M_{B}^{E}(Id)M_{E}^{B}(Id) = M_{B}^{B}(Id) = I_{n}]
then we have PM_{B}^{B}(T)Q = PM_{B}^{B}(T)P^{-1} = M_{E}^{E}(T), or
PM_{B}^{B}(T) = M_{E}^{E}(T)P and likewise QM_{E}^{E}(T)P = M_{B}^{B}(T).
Def'n: We say that two n by n matrices A and B are similar if there is an invertible n by n matrix P so that B = P^{-1}AP, and write A~B.
Proposition: i) A~A; ii) if A~B then B~A; iii) if A~B and B~C then A~C.
Proof Outline: i) P = I_{n}. ii) Use Q = P^{-1}. iii) If C = Q^{-1}BQ then C = Q^{-1}P^{-1}APQ = (PQ)^{-1}A(PQ).
Cor. Suppose V is a f.d.v.s over F, dim(V) = n, and B and E are two bases for V. Suppose `T: V → V` is a linear operator; then M_{B}^{B}(T) and M_{E}^{E}(T) are similar matrices.
There is a "converse" to the theorem based on the following
Proposition: Suppose P is an invertible n by n matrix.
The linear transformation `T_P: F^n → F^n` defined by the matrix P, where E is the standard ordered basis for F^{n} and M_{E}^{E}(T_{P}) = P, maps every basis B of F^{n} to a basis T_{P}(B) = B'.
11/9
Eigenvectors, Eigenvalues, Eigenspaces, Matrices, Diagonalizability, and Polynomials!
Definition: Suppose T is a linear operator on V, then a is an eigenvalue for T if there is a nonzero vector v where T(v) = av. The vector v is called an eigenvector for T.
Proposition: a is an eigenvalue for T if and only if Null(T-aId) is nontrivial.
Def'n:T is diagonalizable if V has a basis of eigenvectors.
T is diagonalizable if and only if M(T) is similar to a diagonal matrix, i.e., a matrix A where A_{i,j}=0 for indices i, j where i is not equal to j.
Fact: If T is diagonalizable with distinct eigenvalues a_{1},...,a_{n}, then S = (T-a_{1}Id)(T-a_{2}Id)...(T-a_{n}Id) = 0.
Proof: It suffices to show that S(v) = 0 for any v in a basis for V. Choose a basis for V of eigenvectors, and suppose v is an element of this basis with T(v) = a_{j}v. Since the factors commute, S(v) = (T-a_{1}Id)...(T-a_{j-1}Id)(T-a_{j+1}Id)...(T-a_{n}Id)(T-a_{j}Id)(v) = 0, because (T-a_{j}Id)(v) = T(v) - a_{j}v = 0.
What about the converse? If there are distinct scalars a_{1},...,a_{n} where S(v) = (T-a_{1}Id)(T-a_{2}Id)...(T-a_{n}Id)(v) = 0 for any v in V, is T diagonalizable? We will return to this later!
A Quick trip into High School/Precalculus Algebra and Formal Polynomials: Recall... F[X]
F[X] = { f in F^{∞}, where f(n) = 0 for all but a finite number of n} =
{ formal polynomials with coefficients in F using the "variable" X}< F^{∞}.
X = (0,1,0,0,....). example: 2+X + 5X^{2} +7 X^{4} = (2,1,5,0,7,0,0,...)
Notice: F[X] is an algebra over F... that is it has a multiplication defined on its elements...
Definition: If `f,g in F[X]` with `f = (a_0,a_1,...,a_n,0,0,...)` with `a_n` not 0 and `g = (b_0,b_1,...,b_k,0,0,...)` with `b_k` not 0, then `f*g = (c_0,c_1,...,c_{n+k},0,0,...)` where
`c_j = sum_{i=0}^{j} a_i b_{j-i}`
In fact it has a multiplicative unity, 1 =(1,0,0,0...), and furthermore, this algebra has a commutative multiplication: if f,g are in F[X] then f*g = g*f.
Notice: If f is not 0, then deg(f) = the largest n where f(n) is not 0, and
Theorem: If f and g are not 0 = (0,0,0...), then f*g is also nonzero, with deg(f*g) = deg(f) + deg(g).
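The convolution definition of f*g translates directly into code. A short Python sketch (polynomials as coefficient lists, constant term first):

```python
def poly_mul(f, g):
    # c_j = sum_{i=0}^{j} a_i * b_{j-i}
    c = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b
    return c

# (1 + X)(2 + 3X + X^2) = 2 + 5X + 4X^2 + X^3, and deg(f*g) = deg(f) + deg(g)
assert poly_mul([1, 1], [2, 3, 1]) == [2, 5, 4, 1]
```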
11/14
Polynomials and Algebras:
If A is any algebra over F with unity, and `f in F[X]`,
`f = (a_0 , a_1, ... ,a_n, 0 , 0 , ...)` then we have a function f: A → A defined by
f`(t) = a_0 I + a_1t + ... + a_nt^n` where I is the unity for A and `t in A`.
In particular, (i) A can be the field F itself, so f `in P(F)`.
Example: F = Z_{2}, f = X^{2} + X in F[X]. Then f = (0,1,1,0,0,...) is not (0,0,0,...) but f(t) = 0 for all t in F.
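A one-line check of this example in Python: over Z_2 the formal polynomial X^2 + X has nonzero coefficients, yet it evaluates to 0 at every element of the field.

```python
# over F = Z_2, f = X^2 + X evaluates to 0 at every t in F,
# even though f = (0,1,1,0,...) is not the zero polynomial
for t in (0, 1):
    assert (t * t + t) % 2 == 0
```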
(ii) A can be L(V) where V is a finite dimensional vector space over F.
Then f (T) is also in L(V).
(iii) A can be M(n;F), the n by n matrices with entries from F.
Then f (M) is also in M(n;F).
The Division Algorithm, [proof?]
If g is not zero, for any f there exist unique q , r in F[X] where f = q*g +r and either (i) r = 0 or (ii) deg(r) < deg(g).
The Remainder and Factor Theorems [Based on the DA]
Suppose c is in F, for any f there exist unique q , r in F[X]
where f = q*(X-c) + r and r = f(c).
Suppose c is in F; then f(c) = 0 if and only if f = (X-c)*q for some q in F[X].
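The Remainder Theorem can be checked with synthetic division (Horner's scheme); the running value left at the end is exactly f(c). A Python sketch:

```python
def divide_by_linear(f, c):
    # divide f (coefficients, constant term first) by (X - c)
    vals, t = [], 0
    for a in reversed(f):       # Horner's scheme
        t = t * c + a
        vals.append(t)
    r = vals.pop()              # remainder, which equals f(c)
    return list(reversed(vals)), r

# f = X^2 - 1: dividing by (X - 1) gives quotient X + 1 and remainder f(1) = 0
q, r = divide_by_linear([-1, 0, 1], 1)
assert q == [1, 1] and r == 0
# dividing by (X - 2) leaves remainder f(2) = 3
assert divide_by_linear([-1, 0, 1], 2)[1] == 3
```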
11/16
Roots and degree.
If c is in F and f (c) = 0, then c is called a root of f.
If f is not 0 and deg(f) = n, then there can be at most n distinct roots of f in F.
Factoring polynomials. A polynomial f in F[X] is called reducible if there exist polynomials p and q, with deg(p) > 0 and deg(q) > 0, where f = p*q.
If deg(f )>0 and f is not reducible it is called irreducible (over F).
Example: X^{2} + 1 is irreducible over R but reducible over C.
(I)The FTof Alg for C[X].
Theorem: If f is nonzero in C[X] with deg(f)>0, then there is a complex number r where f (r) = 0
(II) The FT of Alg for R[X].
Theorem: If f is nonzero in R[X] and f is irreducible, then deg(f)= 1 or 2.
Proof of II assuming (I): If f is in R[X] and deg(f)>2, then f is in C[X].
If r is a root of f and r is a real number then f is reducible by the factor theorem.
If r = a+bi is not a real number, then because the complex conjugate of a sum (product) of complex numbers is the sum (product) of the conjugates of the numbers, and the complex conjugate of a real number is the same real number, we can show that f(a+bi) = 0 = f(a-bi). Now by the factor theorem (applied twice)
f = (X-(a+bi))*(X-(a-bi))*q = ((X-a)^{2} + b^{2})*q and deg(q) = deg(f) - 2 > 0. Thus f is reducible.
Back to Linear Algebra, Eigenvalues and "the Minimal Polynomial for a Linear Operator":
Theorem: Suppose V is nontrivial f.d.v.s over the complex numbers, C and `T in L(V)`. Then T has an eigenvalue.
Comment: First consider this with the Coke/Pepsi example B: [Corrected 11/17]
T(x,y) =(2/3 x +1/4 y, 1/3 x+3/4 y).
Consider (e_{1} = (1,0), T(e_{1}) = (2/3,1/3), T(T(e_{1})) = (4/9+1/12, 2/9+1/4)). This set must be linearly dependent because it has 3 vectors in R^{2}. This gives some coefficients in R, not all zero, where a_{0}Id(e_{1}) + a_{1}T(e_{1}) + a_{2}T^{2}(e_{1}) = 0. Thus we have f in R[X], f = a_{0} + a_{1}X + a_{2}X^{2},
and f(T)(e_{1}) = 0. In fact, we can use f = (X-1)(X-5/12): f(T)(e_{1}) = (T-Id)(T-5/12 Id)(e_{1}) = (T-Id)((T-5/12 Id)(e_{1})) = (T-Id)(2/3-5/12, 1/3) = 0. Thus we find that (T-Id)(1/4,1/3) = 0, so (1/4,1/3) is an eigenvector for T with eigenvalue 1.
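A check of this computation in Python with exact fractions (a sketch; `apply_T` is just the Coke/Pepsi operator written out):

```python
from fractions import Fraction as F

def apply_T(v):
    # T(x, y) = (2/3 x + 1/4 y, 1/3 x + 3/4 y)
    x, y = v
    return (F(2, 3) * x + F(1, 4) * y, F(1, 3) * x + F(3, 4) * y)

e1 = (F(1), F(0))
Tx, Ty = apply_T(e1)
w = (Tx - F(5, 12) * e1[0], Ty - F(5, 12) * e1[1])   # (T - 5/12 Id)(e_1)
assert w == (F(1, 4), F(1, 3))
assert apply_T(w) == w          # (T - Id)(w) = 0: eigenvector for eigenvalue 1
```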
Now here is a Proof (outline): Suppose dim V = n >0.
Consider v, a nonzero element of V, and the set (v, T(v), T^{2}(v),T^{3}(v)....T^{n}(v)).
Since this set has n+1 vectors it must be linearly dependent. ...
...
This means there is a nonzero polynomial, f, in C[X] where f (T)(v) = 0.
Let m = deg(f ).
Using the FT of Alg for C we have that f = a(X-c_{1})...(X-c_{m}).
Now apply this to T as a product and ....
for some i and some w (not 0), (T-c_{i}Id)(w) = 0. Thus T has an eigenvalue.
Theorem: V a fdvs / F, T in L(V,V).
Then there is some nonzero polynomial f in F[X] where f (T) = 0,
i.e., for all v in V, f (T)(v)= 0.
Proof (outline). Suppose dim(V)=n. Consider the set (Id, T, T^{2},T^{3}....T^{n*n}).
This set has n*n+1 vectors in L(V), where dim(L(V)) = n*n, so it must be linearly dependent. ...
...
This means there is a nonzero polynomial, f, in F[X] where f (T) = 0,
i.e., f(T)(v) = 0 for all v in V.
11/18
Definition: Ann(T) = {f in F[X] : f(T) = 0}. The previous Theorem shows Ann(T) has a nontrivial element.
Prop: f, g in Ann(T), h in F[X] then f+g and h*f are in Ann(T). [Ann(T) is an "ideal".]
Theorem (The minimal polynomial): There is a nonzero monic polynomial in Ann(T) of smallest degree. This polynomial is unique, and any polynomial in Ann(T) has this polynomial as a factor.
Proof: The previous theorem has shown Ann(T) has a nonzero polynomial element. Considering the degrees of the nonzero polynomials in Ann(T) there is a smallest positive degree, call it m and a polynomial g in Ann(T) with deg(g) = m. If g = bX^{m} +....terms of lower degree, where b is not 0,
then f = 1/b*g is also in Ann(T) and f is a monic polynomial.
Now suppose h is also in Ann(T); then by the division algorithm, h = q*f + r where either r = 0 or deg(r) < m. But since h and f are in Ann(T), h - q*f = r is also in Ann(T). Since deg(f) = m, which is the lowest degree for an element of Ann(T), it must be that r = 0, so h = q*f. Now if h is also monic and deg(h) = m, then deg(q) = 0, and since h and f are both monic, q = 1 and h = f. Thus the nonzero monic polynomial in Ann(T) of smallest degree is unique! EOP
Prop. (Min'l Poly meets eigenvalues) Suppose m in F[X] is the minimal polynomial for T. Then T has eigenvalue c if and only if X-c is a factor of m.
Proof: => Suppose c is an eigenvalue for T. Then W_{c} = Null(T-cId) is a nontrivial subspace of V, and for w in W_{c},
T(w) = cw is also in W_{c}. Let S(w) = T(w) for w in W_{c}. Notice that S is in L(W_{c}).
As a linear operator on W_{c}, (X-c)(S) = 0, so X-c is the minimal polynomial for S.
But for any w in W_{c} (and thus in V), m(S)(w) = m(T)(w) = 0, so m is in Ann(S), and X-c is a factor of m.
<= Suppose X-c is a factor of m: m = (X-c)*q. Since m is the minimal polynomial for T and deg(q) = deg(m) - 1,
q(T) is not the 0 operator. Thus there is some v in V where w = q(T)(v) is not 0.
But m(T)(v) = (T-cId)(q(T)(v)) = (T-cId)(w) = 0. So c is an eigenvalue for T. EOP
11/28
Discussion of determinants and the characteristic polynomial for a matrix. The Cayley-Hamilton theorem was discussed, with its implication that the degree of the minimal polynomial for an n by n matrix is no greater than n. [Details to be added.]
Cor. T is invertible if and only if the constant term of m, m(0), is not 0.
Cor. If T is invertible then T^{-1} = -1/m(0) ((m - m(0))/X)(T).
Remark: This also can be applied to square matrices thus expressing the matrix inverse of a matrix M as a polynomial applied to M.
Example: Let `M = ((3,-5),(0,2))`; then M has minimal polynomial `m = (X-3)(X-2) = X^2 - 5X + 6`. Since m(0) = 6, we have that `M^{-1} = -1/6 (M - 5I) = 1/6((2,5),(0,3))`. Check!
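Checking this "inverse from the minimal polynomial" formula in Python with exact fractions (a sketch of the corollary above):

```python
from fractions import Fraction as F

def matmul(A, B):
    # product of two 2x2 matrices
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

M = [[F(3), F(-5)], [F(0), F(2)]]
I = [[F(1), F(0)], [F(0), F(1)]]
# m = X^2 - 5X + 6, so (m - m(0))/X = X - 5 and M^{-1} = -1/6 (M - 5I)
Minv = [[-(M[i][j] - 5 * I[i][j]) / 6 for j in range(2)] for i in range(2)]
assert matmul(M, Minv) == I
```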
11/30
Invariant Subspaces: V, T as usual.
Def'n:W is called an invariant subspace for T if for all w in W, T(w) is also in W... i.e. T(W)<W.
If W is an invariant subspace of T, then T: W → W is a linear operator as well, denoted T_{W}.
Fact: If W is an invariant subspace of T, then the minimal polynomial of T_{W} is a factor of the minimal polynomial of T.
Invariant Subspaces, Diagonal and Block Matrices:
If V is a fdvs / F and T is in L(V) with W_{1},W_{2},...,W_{k} invariant subspaces for T and V = W_{1} `oplus` W_{2} `oplus` ... `oplus` W_{k},
and if the basis B for V is composed of bases for each of W_{1},W_{2},...,W_{k} in order, then M(T) is composed of matrix blocks, each of which is M(T_{Wi}). Furthermore, if m_{1},m_{2},...,m_{k} are the minimal polynomials for T restricted to W_{1},W_{2},...,W_{k}, then the minimal polynomial for T is the lowest common multiple of the polynomials m_{1},m_{2},...,m_{k}.
Examples of Matrices, characteristics values and minimal polynomials.
[Some references here to the characteristic polynomial from previous course work in Linear algebra.]
Example 1. `((3,0,0),(0,3,0),(0,0,2))` is a diagonal matrix that has "characteristic polynomial" `(x-3)^2 (x-2)` but minimal polynomial `(x-3)(x-2)`.
Example 2. `((3,1),(0,3))` is not diagonalizable. It has characteristic polynomial and minimal polynomial `(x-3)^2`.
These examples are relevant because of the following
Prop. Suppose m in F[X] is the minimal polynomial for T.
T is diagonalizable if and only if there are distinct c_{1},...,c_{m} in F where m = (X-c_{1})...(X-c_{m}).
Proof: => Suppose T is diagonalizable and T has eigenvalues c_{1},...,c_{m}. By the preceding Proposition, each of (X-c_{1}),...,(X-c_{m}) is a factor of the minimal polynomial, and by our previous work, S = (X-c_{1})*...*(X-c_{m}) is in Ann(T), so m = S.
<=[ Modified from proof in Hoffman and Kunze Linear algebra 2nd Ed]
Suppose there are distinct c_{1},...,c_{m} in F where m = (X-c_{1})*...*(X-c_{m}).
Let W be the subspace spanned by all the characteristic vectors of T. So W = W_{1} + W_{2} + ... + W_{m} where W_{k} = Null(T-c_{k}Id).
We will show that V = W indirectly. Suppose V is not W.
Lemma: There is a vector v* not in W and a characteristic value c_{j} where w* = (T-c_{j}Id)(v*) is in W. [Proof is below.]
Now express w* in terms of vectors in the W_{k}: w* = w_{1} + ... + w_{m} with each w_{k} in W_{k}.
Then for any polynomial h,
h(T)(w*) = h(c_{1})w_{1} + ... + h(c_{m})w_{m}. THUS...
Now m = (X-c_{j})q, and q - q(c_{j}) = (X-c_{j})k for some polynomial k.
q(T)(v*) - q(c_{j})(v*) = k(T)(T-c_{j}Id)(v*) = k(T)(w*), which is in W!
BUT 0 = m(T)(v*) = (T-c_{j}Id)(q(T)(v*)), so q(T)(v*) is in Null(T-c_{j}Id), which is in W.
Thus q(c_{j})(v*) = q(T)(v*) - k(T)(w*) is in W. But we assumed v* is not in W, so q(c_{j}) = 0. So the factor (X-c_{j}) appears twice in m! A contradiction!
Proof of Lemma: We must find a vector v* not in W and a characteristic value c_{j} where w* = (T-c_{j}Id)(v*) is in W.
Suppose b is in V but not in W.
Consider C={ all polynomials f where f (T) (b) is in W}.
[There are some nontrivial elements of C since m(T)(b) = 0, so m is in C.]
Of all the elements of C, there is a unique nonzero monic polynomial of least degree, which we will call g. [Proof left as exercise for Friday.]
Then g is a factor of m. [Proof left as exercise for Friday.] Since b is not in W, g is not a constant, and so deg(g) > 0.
Since we know all the factors of m, for some c_{j},
(X-c_{j}) is a factor of g. So g = (X-c_{j})*h, and because g was of minimal degree for polynomials where f(T)(b) is in W,
h(T)(b) = v* is not in W. But w* = g(T)(b) = (T-c_{j}Id)h(T)(b) = (T-c_{j}Id)(v*) is in W.
End of lemma's proof.
Remark: Every monic polynomial is the minimal polynomial for some linear operator:
Suppose `f` is the polynomial and the degree of `f` is `n`: `f = a_0 + a_1 X + a_2 X^2 + ... + a_{n-1}X^{n-1} + X^n`.
Let T: `R^n → R^n` be defined by T(`e_i`) = `e_{i+1}` for `i = 1,2,...,n-1` and
T(`e_n`) = `-(a_0e_1 + a_1e_2 + a_2e_3 + ... + a_{n-1}e_n)`. Then the minimal polynomial of T is `f`.
Example: `f = (X-2)(X-1)^2 = (X-2)(X^2 - 2X + 1) = X^3 - 4X^2 + 5X - 2`. Then using the standard basis T has matrix:
`((0,0,2),(1,0,-5),(0,1,4))` and `f` is the minimal polynomial for T.
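A check that f(T) = 0 for this companion-style matrix, sketched in Python (integer arithmetic, column convention as in the construction above):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

C = [[0, 0, 2], [1, 0, -5], [0, 1, 4]]   # matrix of T for f = X^3 - 4X^2 + 5X - 2
C2 = matmul(C, C)
C3 = matmul(C2, C)
# f(C) = C^3 - 4C^2 + 5C - 2I should be the zero matrix
fC = [[C3[i][j] - 4 * C2[i][j] + 5 * C[i][j] - 2 * (i == j) for j in range(3)]
      for i in range(3)]
assert all(x == 0 for row in fC for x in row)
```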
Nilpotent Operators and Jordan Canonical Form.
Now what about operators where the minimal polynomial splits into powers of linears? or where the minimal polynomial has nonlinear irreducible factors?
First Consider the case of powers of linear factors.
The simplest is just a power of X (or of (X-c)).
Example: D: P_{3}(R) → P_{3}(R), the derivative. Then the minimal polynomial for D is X^{4}.
Definition: In general, an operator N is called nilpotent if for some k>0, N^{k} =0. The smallest such k is called the index of nilpotency for N, and if k is the index of nilpotency for N, then the minimal polynomial for N is X^{k} .
Proposition: If N is nilpotent of index k and dim V = k, then there is a basis for V , {b_{1},b_{2},...b_{k}} where N(b_{k})= 0 and N(b_{i}) = b_{i+1} for all i <k.
Proof: Since the minimal polynomial for N is X^{k}, there is some vector v* in V where N^{k-1}(v*) is not zero but N^{k}(v*) = 0. Let b_{1} = v* and b_{2} = N(b_{1}), b_{3} = N(b_{2}), ..., b_{k} = N(b_{k-1}).
Then clearly N(b_{k}) = N^{k}(v*) = 0. It suffices to show that (b_{1},b_{2},...,b_{k}) is linearly independent. Suppose a_{1}b_{1} + a_{2}b_{2} + ... + a_{k}b_{k} = 0. Then a_{1}v* + a_{2}N(v*) + ... + a_{k}N^{k-1}(v*) = 0. Now apply N^{k-1} to obtain N^{k-1}(a_{1}b_{1} + a_{2}b_{2} + ... + a_{k}b_{k}) = 0, or a_{1}N^{k-1}b_{1} + a_{2}N^{k-1}b_{2} + ... + a_{k}N^{k-1}b_{k} = 0. But N^{k-1}b_{j} = 0 for j > 1, so a_{1}N^{k-1}b_{1} = 0. But N^{k-1}(v*) is not zero, so a_{1} = 0. Now by using N^{k-2}(a_{2}b_{2} + ... + a_{k}b_{k}) = 0, a similar analysis shows that a_{2} = 0. Continuing, we can show that a_{3} = 0, ..., a_{k-1}
= 0. But that leaves a_{k}b_{k} = 0 and thus a_{k} = 0, so (b_{1},b_{2},...,b_{k}) is linearly independent.
Alternatively: Let f = (a_{1}, a_{2},...,a_{k},0,0,...), the polynomial of degree at most k-1 with a_{1},a_{2},...,a_{k} for coefficients. If f is not the zero polynomial and f(N)(v*) = 0, then the monic polynomial of least degree annihilating v* divides both f and X^{k}; being a power of X of degree at most k-1, it would force N^{k-1}(v*) = 0. But the assumption is that N^{k-1}(v*) is not zero, so f must be the zero polynomial and all of the coefficients are 0. Hence (b_{1},b_{2},...,b_{k}) is linearly independent.
Example: Find this basis for D: P_{3}(R) → P_{3}(R).
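Working out this example in Python (polynomials as coefficient lists, constant term first): starting from v* = x^3, the proposition's basis is x^3, 3x^2, 6x, 6, and one more derivative gives 0, so D^4 = 0 on P_3(R).

```python
def deriv(p):
    # derivative of a polynomial, coefficients listed constant term first
    return [i * a for i, a in enumerate(p)][1:]

basis = [[0, 0, 0, 1]]              # b_1 = v* = x^3
for _ in range(3):
    basis.append(deriv(basis[-1]))  # b_{i+1} = D(b_i)
assert basis == [[0, 0, 0, 1], [0, 0, 3], [0, 6], [6]]
assert deriv(basis[-1]) == []       # D(b_4) = 0, i.e. D^4(x^3) = 0
```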
Comments:
 Using the basis (b_{1},b_{2},...b_{k}) the matrix for N, M(N) has 0 everywhere except for 1's that are below the main diagonal.
If T is an operator on a vector space V with dim V = k and the minimal polynomial for T is (X-c)^{k},
then let N = T - cId, so N^{k} = 0. So N is nilpotent with index of nilpotency k. Using the basis for V determined in the last proposition, we have M(T-cId) = M(T) - cM(Id), so the matrix of T is M(T) = M(T-cId) + cM(Id). This matrix is called the Jordan block matrix of dimension k for the characteristic value c, denoted J(c;k).
The BIG Picture.
 Theorem (Jordan Canonical Form): Suppose T is a linear operator on V and m is the minimal polynomial for T.
If m = (X-c_{1})^{p_1}...(X-c_{m})^{p_m} then
there are T-invariant subspaces W_1,...,W_q with V = W_1 `oplus` ... `oplus` W_q. Furthermore, the minimal polynomial for T restricted to each subspace W_i is of the form (X-c_{j})^{r_i}, where dim(W_i) = r_i and r_i <= p_j, and r_i = p_j for at least one i for each j.
Thus there is a basis for V so that, using that basis, M(T) is a block matrix where each block has the form J(c_{j}; r_i). This matrix is called the Jordan Canonical Form for T.
Fill in examples of possible matrices for T when m = (X-2)^{3}(X+3)^{2}X^{3}(X-1) where dim(V) = 12. Discuss uniqueness.
 By the Fundamental Theorem of Algebra, If V is a finite dimensional vector space over C, then any linear operator on V has a basis for which M(T) is a Jordan Canonical Form.
 Corollary: For T a linear operator on V, a finite dimensional vs over R or C, the degree of the minimal polynomial for T is not greater than the dimension of V.
Proof: By the Jordan Theorem, the degree of m is p_1 + p_2 + ... + p_m, which is no greater than r_1 + r_2 + ... + r_q = dim(V).
 Application to powers of matrices:
Suppose A is an n by n matrix with complex coefficients. Then there is a linear operator T on C^{n} that has A for its matrix using the standard basis for C^{n}. By the Jordan Theorem, there is a basis for C^{n} where M(T) = J is in Jordan form. Thus there is an invertible matrix S where A = S^{-1}JS.
Now consider A^{n} = (S^{-1}JS)^{n} = S^{-1}J^{n}S.
So we can determine the behavior of A^{n} by studying powers of the Jordan blocks.
Do an example with J(1/2;3), J(-1;3), J(2;3), and J(i;3).
What conclusions can we infer about the convergence of A^{n} for large n based on the eigenvalues of A?
If any eigenvalue c has |c| > 1, then the sequence diverges. If |c| = 1 and c is not 1, then the sequence will not converge; if c = 1 and the block is J(1;k) with k > 1, then the sequence diverges.
Otherwise, the sequence will converge! Look at J(c;3) and J(c;4) to see with examples... using technology.
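For instance, a quick Python check that the powers of J(1/2;3) go to zero (the block is written with 1's below the diagonal, following the convention in these notes):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

J = [[0.5, 0, 0], [1, 0.5, 0], [0, 1, 0.5]]   # J(1/2; 3)
P = J
for _ in range(199):                          # P = J^200
    P = matmul(P, J)
# |1/2| < 1 beats the polynomial growth from the off-diagonal 1's
assert all(abs(x) < 1e-12 for row in P for x in row)
```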
 Application to Markov Chains:
A Markov chain consists of a finite number, n, of states and a matrix T where T(i,j) = the probability that something currently in state j will move to state i after one period of transition. Thus 0 <= T(i,j) <= 1 for all 1 <= i,j <= n. It is assumed all possible changes of state from j to i are listed, so T(1,j) + T(2,j) + ... + T(n,j) = 1 for any j. If v = (v_{1}, v_{2}, ..., v_{n}) gives the initial distribution of objects in the various states, then T(v) is the likely distribution of objects after one period of transition: T(v)_{i} = T(i,1)v_{1} + T(i,2)v_{2} + ... + T(i,n)v_{n}, so T gives a linear operator on R^{n} and C^{n}. Notice that, using the standard basis, M(T) = T. If we assume that the probabilities of T remain the same for every period of transition, then T is called a Markov transition matrix. Given the initial distribution v, the distribution after k periods of transition is T(...(T(v))...) = T^{k}(v). Treating T as a linear operator on C^{n}, there is a basis for C^{n} for which the matrix of T is composed of Jordan blocks. Thus the question of what happens to the distribution between the states in the long run is determined by the eigenvalues of T. T is called a regular Markov process if, for some k, all the entries of T^{k} are positive.
 DO example to illustrate how this works on graph.
Theorem: If T is a regular Markov process, then 1 is an eigenvalue for T and all other eigenvalues of T have magnitude less than 1. Furthermore: Dim Null(TId)=1 and the power of X1 in the minimal polynomial of T is 1.
Corollary: For k large, T^{k}(v) is close to the unique vector v* in C^{n} where
T(v*) = v* and v_{1} + v_{2} + ... + v_{n} = v*_{1} + v*_{2} + ... + v*_{n}. Illustrate this with an example... and technology.
Use the result to find the long run for a 5-state Markov process, using a graph of transitions and technology to find the limit as the solution to T(v*) = v* or (T-Id)(v*) = 0.
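As an alternative to solving (T-Id)(v*) = 0 directly, the limit can be found by iterating T. Here is a Python sketch with a hypothetical 3-state column-stochastic matrix (made up for illustration, not the course's example):

```python
# hypothetical regular Markov matrix: each column sums to 1
T = [[0.8, 0.1, 0.3],
     [0.1, 0.8, 0.2],
     [0.1, 0.1, 0.5]]

def apply(T, v):
    # matrix-vector product T v
    return [sum(T[i][j] * v[j] for j in range(3)) for i in range(3)]

v = [1.0, 0.0, 0.0]       # start with everything in state 1
for _ in range(500):      # T^k(v) converges to the fixed vector v*
    v = apply(T, v)
assert abs(sum(v) - 1.0) < 1e-9                 # the coordinate sum is preserved
assert all(abs(a - b) < 1e-9 for a, b in zip(v, apply(T, v)))   # T(v*) = v*
```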
Proof (outline, based in part on Friedberg, Insel, Spence): Suppose T is a regular Markov process and let J be the Jordan matrix for T.
Fact: If c is an eigenvalue for T, then c is an eigenvalue for the transpose of T, T^{t}.
Proof of fact (outline): check that f(T)^{t} = f(T^{t}), so T and T^{t} have the same minimal polynomial, hence the same eigenvalues.
Lemma 1: If c is an eigenvalue for T, then |c| ≤ 1.
Proof of Lemma 1: Consider c as an eigenvalue for T^{t}. Let x be an eigenvector with eigenvalue c for T^{t}. Then (T^{t}(x))_{i} = T(1,i)x_{1} + T(2,i)x_{2} + ... + T(n,i)x_{n} = cx_{i} for each i. Let b = Max{|x_{1}|,|x_{2}|,...,|x_{n}|}.
Then for some k with |x_{k}| = b,
|c| b = |c x_{k}| = |T(1,k)x_{1} + T(2,k)x_{2} + ... + T(n,k)x_{n}|
≤ |T(1,k)x_{1}| + |T(2,k)x_{2}| + ... + |T(n,k)x_{n}| [triangle inequality]
≤ T(1,k)|x_{1}| + T(2,k)|x_{2}| + ... + T(n,k)|x_{n}| [|ab| = |a||b|]
≤ T(1,k)b + T(2,k)b + ... + T(n,k)b [b = Max{|x_{1}|,...,|x_{n}|}]
≤ [T(1,k) + T(2,k) + ... + T(n,k)]b = b,
since T(1,j) + T(2,j) + ... + T(n,j) = 1 and 0 ≤ T(i,j). So |c|b ≤ b and, since b > 0, |c| ≤ 1.
Lemma 2: 1 is an eigenvalue for T.
Proof: 1 is an eigenvalue for T^{t}: T(1,j)·1 + T(2,j)·1 + ... + T(n,j)·1 = 1, so T^{t} has the eigenvector (1,...,1) = u.
Lemma 3: J has single blocks for c = 1, of the form J(1;1).
Powers of T have real number entries that are never larger than 1. [Prove as exercise.]
But if there were a Jordan block J(1;k) in J with k > 1, the powers would get much larger than 1.
Lemma 4: If c is an eigenvalue for T with |c| = 1, then c = 1 and Dim(Null(T-Id)) = 1.
Proof: Since T is a regular Markov process, we may assume that all entries of T are positive.
First, since |c| = 1, all the previous inequalities in Lemma 1 are actually equalities for any eigenvector x for the eigenvalue c:
b = |c| b = |c x_{k}| = |T(1,k)x_{1} + T(2,k)x_{2} + ... + T(n,k)x_{n}|
= |T(1,k)x_{1}| + |T(2,k)x_{2}| + ... + |T(n,k)x_{n}| (*)
= T(1,k)|x_{1}| + T(2,k)|x_{2}| + ... + T(n,k)|x_{n}|
= T(1,k)b + T(2,k)b + ... + T(n,k)b (**)
= [T(1,k) + T(2,k) + ... + T(n,k)]b = b.
And we already have that T(1,k) + T(2,k) + ... + T(n,k) = 1. Notice that for any complex numbers a and b, |a+b| = |a| + |b| only if a and b are "colinear", that is, there is some complex number z with |z| = 1 and nonnegative real numbers r and s where a = rz and b = sz.
Using this fact for (*), there must be some complex number z* with |z*| = 1 and nonnegative real numbers r_{j} where
r_{j}z* = T(j,k)x_{j} for all j. From (**), since b > 0, either T(j,k) = 0 or |x_{j}| = b; but the assumption of regularity is that T(j,k) > 0, so |x_{j}| = b > 0 for all j. But then r_{j}z*/T(j,k) = x_{j}, so
b = |x_{j}| = |r_{j}z*/T(j,k)| = r_{j}/T(j,k) for all j. Thus x_{j} = bz* for all j.
SO.... x = bz*(1,1,...,1) and c = 1.
Thus if c is an eigenvalue for T with |c| = 1, then c = 1, and if x is an eigenvector with eigenvalue 1 then x is a scalar multiple of (1,1,...,1), so Dim(Null(T-Id)) = 1.
EOP for Lemma.
To finish the proof of the Theorem, note that since Dim(Null(T-Id)) = 1, the factor of the minimal polynomial that corresponds to the eigenvalue 1 can only be (X-1), since all blocks in the Jordan matrix for the eigenvalue 1 are of the form J(1;1).
Proof of the Corollary: By the Jordan Theorem, there is a basis for C^{n} where M(T) = J is in Jordan form. Thus there is an invertible matrix S where T = S^{-1}JS.
Now consider T^{k} = (S^{-1}JS)^{k} = S^{-1}J^{k}S. Since T has only 1 for a complex eigenvalue of magnitude 1, all other eigenvalues have magnitude less than 1, and there is only one block (J(1;1)) for the eigenvalue 1, for k large T^{k} is close to the matrix S^{-1}KS, where K(1,1) = 1 and K(i,j) = 0 for (i,j) different from (1,1).
Now one can show that S^{-1}KS is the square matrix which has each column equal to the eigenvector with eigenvalue 1 and with the sum of its components equal to one.
From this it can be shown that for any v, T^{k}(v) is close to the unique vector v* in C^{n} where
T(v*) = v* and v_{1} + v_{2} + ... + v_{n} = v*_{1} + v*_{2} + ... + v*_{n}.
Do Example. We looked at a 3 by 3 example of a regular Markov chain matrix modelling the movement of rental cars between SF, LA, and Vegas. We assumed there are 400 cars in the fleet. To find the v* suggested by the Theorem, we need to solve Tv* = v* with 400 = v*_{1} + v*_{2} + v*_{3}. This is equivalent to solving the system of equations (T-I)v* = 0 with 400 = v*_{1} + v*_{2} + v*_{3}. Since the columns of T add to 1, the matrix T-I has linearly dependent rows (they add to 0). Replacing the final row with 1 1 1 and the right-hand side 0 with (0,0,400), the equations had a unique solution, 400(3/11, 6/11, 2/11). Obviously, if the total number of cars were N, then the result would be N(3/11, 6/11, 2/11). The proportions are fixed by the characteristic vector (3/11, 6/11, 2/11). Furthermore, this result is independent of the starting distribution, so if all cars started in SF the result would be the same, and similarly for LA or Vegas. So for n large, T^{n} is approximately the matrix with each column equal to the transpose of (3/11, 6/11, 2/11).
Notice that T^{n} → T* where each column of T* is v* with v*_{1} + ... + v*_{n} = 1.
Proof of JCF Theorem: ?? Outline