Instruction

Read the instruction carefully and think about how to develop R code to answer each questions.

Question 1

Consider matrices

\[ \mathbf{A} = \begin{bmatrix} 7 & 1 \\ 4 & -3 \\ 2 & 0 \end{bmatrix} ~~ \mathbf{B} = \begin{bmatrix} 2 & 1 & 7 \\ 0 & -1 & 4 \end{bmatrix} ~~ \mathbf{C} = \begin{bmatrix} 1 & 2 & -1 \\ 2 & 2 & 4 \\ -1 & 4 & 7 \end{bmatrix} \]


  A <- matrix(c(7,4,2,1,-3,0),ncol=2)
  B <- matrix(c(2,0,1,-1,7,4),ncol=3)
  C <- matrix(c(1,2,-1,2,2,4,-1,4,7),ncol=3)
review their algebra and properties

01: edit/access

a) add a new row column \(\mathbf{b}_3\) to matrix \(\mathbf{b}\)

\[ \mathbf{\acute{B}} = \begin{bmatrix} \mathbf{B} \\ \mathbf{b}_3 \end{bmatrix} \qquad \mbox{ when }\qquad \mathbf{b}_3 = \begin{bmatrix} 3 & -8 & 0 \end{bmatrix} \]

  b3.row  <- c(3,-8,0)
  B.acute <- rbind(B,b3.row)
add an identity matrix as column of matrix \(\mathbf{b}\)

\[ \mathbf{\acute{B}} = \begin{bmatrix} \mathbf{B} & \mathbf{I} \end{bmatrix} \]

  cbind(B,diag(2))
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    2    1    7    1    0
## [2,]    0   -1    4    0    1
b) access sub matrix first two columns and rows of matrix \(\mathbf{C}\)
  C[c(1,2),c(1,2)]
##      [,1] [,2]
## [1,]    1    2
## [2,]    2    2

note

  • this is a trick from two-phase simplex method when constraints are not all less than equal.
  • dim(), ncol(), nrow() find dimension/columns/rows of matrix/array
  • colnames()/rownames get column/row names

02: Multiplication

c) find relationship among \((\mathbf{AB})\mathbf{C}\), \(\mathbf{A(BC)}\), \(\mathbf{C^T B^T A^T}\) and \(\mathbf{(ABC)^T}\)
### SOLUTION TO QUESTION 1A ###
  ##-- check class and bytecode as well
  identical((A%*%B)%*%C , A%*%(B%*%C))
## [1] TRUE
  all.equal( t(C)%*%t(B)%*%t(A), t(A%*%B%*%C) )
## [1] TRUE

note

  • identical() compare two objects at many levels, e.g., identical(3.0,3L)
  • all.equal() compare multiple objects

03: det()

d) find relationship among \(\det(\mathbf{C})\) \(\det(\mathbf{C^3})\) \(\det(\mathbf{C})^3\), \(\det(\mathbf{C^{-1}})\) and \(\det(10\mathbf{C})\)
### SOLUTION TO QUESTION 1B ###
  all.equal(det(C)^3, det(C%*%C%*%C))

  all.equal(det(10*C),10^ncol(C)*det(C))

04: eigen()

e) find eigenvalue and eigenvector of \(\mathbf{C}\) and \(\mathbf{C^T}\)
### SOLUTION TO QUESTION 1C ###
  eigen.C <- eigen(C)

  eigen.Ct<- eigen(t(C))
  
  all.equal(eigen.C$values,eigen.Ct$values)

note

  • what is mean by equal, here?
  • why eigen(C) and eigen(t(C)) are equal?

05: power

f) find triangle decomposition of \(\mathbf{C}\) or (\(\mathbf{C} \equiv \mathbf{S^{-1}} \boldsymbol{\Lambda} \mathbf{S}\)) and show that \(\mathbf{C}^2 = \mathbf{S^{-1}} \boldsymbol{\Lambda}^2 \mathbf{S}\)
### SOLUTION TO QUESTION 1D ###

  eValue <- eigen(C)$values
  lambda <- eigen(C)$vectors
  
  C.dec  <- lambda%*%(diag(3)*eValue)%*%solve(lambda)
  
  C2.dec <- lambda%*%(diag(3)*eValue)%*%(diag(3)*eValue)%*%solve(lambda)
  
  all.equal(C%*%C,C2.dec)
  
  identical(C%*%C,C2.dec)

note

  • compare result with svd(C) as they are another possible popular decomposition of \(\mathbf{C}\)

Question 2

consider the following matrix \(\mathbf{E}\)

\[ \mathbf{E} = \begin{bmatrix} 0.50 & 0.87 & 1.00 & 0.87 & 0.50 & 0.00 & -0.50 \\ 0.87 & 0.50 & 0.00 & -0.50 & -0.87 & -1.00 & -0.87 \\ \end{bmatrix} \]

E <- matrix( 
  c(5.000000e-01,8.660254e-01,  8.660254e-01,  5.000000e-01,  1.000000e+00,
    6.123032e-17  ,8.660254e-01, -5.000000e-01,  5.0000e-01, -8.660254e-01,
    1.224606e-16, -1.000000e+00, -5.000000e-01, -8.660254e-01),
  nrow=2)
explain the meaning of the following multiplication and visualize the result comparing with matrix \(\mathbf{E}\)

01: scale

Matrix multiplication with positive diagonal matrix \(\mathbf{A}\) scale is scaling of each vector and det() is scaling factor (note: 1.0 is current ratio)

\[ \begin{bmatrix} 2 & 0 \\ 0 & 1.5 \end{bmatrix} \times \mathbf{E} \]

 plot(E[1,],E[2,],pch=16,col="black",cex=0.8,
       xlim=c(-3,3),ylim=c(-3,3),
       xlab="x-axis",ylab="y-axis")
 abline(h=seq(-3,3,0.5),col="gray")
 abline(v=seq(-3,3,0.5),col="gray")

 scale <- c(2.000000,1.5)*diag(2)
 E.scale <- scale %*% E
 points(E.scale[1,],E.scale[2,],pch=16,col="brown",cex=0.8 ) 


02: reflection

Matrix multiplication with reverse diagonal matrix that has det() = -1 is selection of each vector

\[ \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \times \mathbf{E} \]

 plot(E[1,],E[2,],pch=16,col="black",cex=0.8,
       xlim=c(-3,3),ylim=c(-3,3),
       xlab="x-axis",ylab="y-axis")
 abline(h=seq(-3,3,0.5),col="gray")
 abline(v=seq(-3,3,0.5),col="gray")
 
 ##-- reflection
 abline(a=0,b=1,col="red",lty=2) 
 E.ref <- matrix( c(0,1,1,0),ncol=2) %*% E
 points(E.ref[1,],E.ref[2,],pch=16,col="blue",cex=0.8 )


03: projection

In general, Matrix multiplication with any matrix that has det() \(\neq 1\) is projection (note: , e.g., projection usually means reduction of dimension, such as mapping projection or linear regression)

\[ \begin{bmatrix} \sin(\frac{1}{6}\pi) & -\cos(\frac{1}{6}\pi) \\ \cos(\frac{1}{6}\pi) & \sin(\frac{1}{6}\pi) \end{bmatrix} \times \mathbf{E} \] note This example is a special case, which are

  • rotating x-axis and y-axis \(\frac{1}{6}\pi\)
  • perpendicular transformation (orthogonal, and square)
  • no scaling as det()=1
 plot(E[1,],E[2,],pch=16,col="black",cex=0.8,
       xlim=c(-3,3),ylim=c(-3,3),
       xlab="x-axis",ylab="y-axis")
 abline(h=seq(-3,3,0.5),col="gray")
 abline(v=seq(-3,3,0.5),col="gray")
 
 

 odd <- seq(from=1,to=ncol(E),by=2) 
 text(E[1,odd]-.1,E[2,odd]-.1,odd,col="black",cex=0.9)

 abline(h=0,col="red")
 abline(v=0,col="red")
 
 theta <- pi/6
 
 rotat <- matrix( c( sin(theta), cos(theta),-cos(theta),sin(theta)),nrow=2)
 E.rotat <- rotat%*% E
 

 points(E.rotat[1,],E.rotat[2,],pch=16,col="blue",cex=0.8)
 text(E.rotat[1,odd]+0.2,E.rotat[2,odd]+0.2,odd,col="dark green",cex=0.9)

More Explaination

  • Every points are moved \(\frac{\pi}{6}\) counter-clock-wise of \((0,0)\)
  • some points are overlapped. Try \(\theta = \frac{\pi}{7}\)

04: SVD

Single Value Decomposition (SVD) is sequential projection which combination of rotate/scale/re-rotate. This is one way to interprete multiplication of matrices
For example, consider matrix \(\mathbf{A}\) that can be decomposed into:

\[ \mathbf{T}=\mathbf{U} \, \boldsymbol{\Lambda} \, \mathbf{V}^{T} \qquad \qquad \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} = \begin{bmatrix} \sqrt{\frac{1}{2}} & \sqrt{\frac{1}{2}} \\ \sqrt{\frac{1}{2}} & - \sqrt{\frac{1}{2}} \end{bmatrix} \, \begin{bmatrix} 4 & 0 \\ 0 & 2 \end{bmatrix} \, \begin{bmatrix} \sqrt{\frac{1}{2}} & \sqrt{\frac{1}{2}} \\ \sqrt{\frac{1}{2}} & - \sqrt{\frac{1}{2}} \end{bmatrix} \]

The multiplication of matrices \(\mathbf{T}\) and \(\mathbf{E}\), particularly

\[ \mathbf{T} \, \mathbf{E} = \mathbf{U} \, \mathbf{\Lambda} \, \mathbf{E_{V}} \]

where,

\[ \mathbf{E_{V}} \equiv \begin{bmatrix} \sqrt{\frac{1}{2}} & \sqrt{\frac{1}{2}} \\ \sqrt{\frac{1}{2}} & - \sqrt{\frac{1}{2}} \end{bmatrix} \, \begin{bmatrix} 0.50 & 0.87 & 1.00 & 0.87 & 0.50 & 0.00 & -0.50 \\ 0.87 & 0.50 & 0.00 & -0.50 & -0.87 & -1.00 & -0.87 \\ \end{bmatrix} \]

  ##-- plot points in matrix E   
 plot(E[1,],E[2,],pch=16,col="black",cex=0.8,
       xlim=c(-4.0,4.0),ylim=c(-4.0,4.0),
       xlab="x-axis",ylab="y-axis")
 abline(h=seq(-4,4,0.5),col="gray")
 abline(v=seq(-4,4,0.5),col="gray")
 
 T <- matrix(c(3,1,1,3),ncol=2) 
 
 ##-- plot points of matrix E after projection T 
 E.fin <- T %*% E
 points(E.fin[1,],E.fin[2,],pch=16,col="brown",cex=0.5 )
 
 
 ##-- svd decomposition; Why -1?
 T.svd  <- svd(T)
 
 rotat1 <-  -1*T.svd$v
 scale  <-  T.svd$d*diag(2)
 rotat2 <-  -1*T.svd$u
 
 ##-- decomposition of multipication. 
 E.rotat1 <- rotat1 %*% E
 E.rotScl <- scale %*% E.rotat1  
 E.fin2   <- rotat2%*%E.rotScl
 
 odd <- seq(from=1,to=ncol(E),by=2)
 
 ##-- projection of matrix E after first rotation 
 points(E.rotat1[1,],E.rotat1[2,],pch=15,col="green",cex=0.8 )
 text(E.rotat1[1,odd],E.rotat1[2,odd],odd,col="black",cex=0.9)
 
 ##-- projection of matrix E after first rotation & scale
 points(E.rotScl[1,],E.rotScl[2,],pch=15,col="orange",cex=0.8 )
 text(E.rotScl[1,odd],E.rotScl[2,odd],odd,col="black",cex=0.9)
 
 ##-- projection of matrix E after first rotation & scale & second rotation 
 points(E.fin2[1,],E.fin2[2,],pch=1,col="brown",cex=1.2 )
 text(E.fin2[1,odd]+.3,E.fin2[2,odd]+.3,odd,col="brown",cex=1.5)


EXTRA

Based on this observation, what is geometric meaning when we multiply \(\mathbf{E}\) by non-square matrix \(\mathbf{A}\)?

\[ \mathbf{A} = \begin{bmatrix} 1 & 0 \\ sin(\frac{\pi}{6}) & -cos(\frac{\pi}{6})\\ cos(\frac{\pi}{6}) & sin(\frac{\pi}{6}) \end{bmatrix} \qquad , \qquad \mathbf{A} \, \mathbf{E} =? \]


Question 3

Consider SLE

\[ \mathbf{A} \mathbf{x} = \mathbf{b} \qquad \mbox{particularly, }\qquad \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \\ 4 \\ 1 \end{bmatrix} \]


  A <- matrix(c(1,0,1,0,1,0,0,1,0,1,1,0,0,1,0,1),ncol=4)

01: independent

a) show that \(\det(\mathbf{A}) = \textbf{0}\)
### SOLUTION TO QUESTION 2A ###

  ##-- find det of matrix A
  det(A) 

  ##-- determin rank of matrix A
  qr(A)$rank 

note

  • what are relationship between det() and rank()? (hint min(dim(A))

02: infinite solution

b) show that this linear has infinite number of solution
### SOLUTION TO QUESTION 2N ###
 identical( A %*% matrix( c(1,1,3,0),ncol=1) ,b)
 identical( A %*% matrix( c(2,0,2,1),ncol=1) ,b)

note

\[ \begin{bmatrix} 1 \\ 1 \\3 \\0 \end{bmatrix} \qquad \mbox{and} \qquad \begin{bmatrix} 2 \\ 0 \\ 2 \\ 1 \end{bmatrix} \]

note

  • How to find these two vectors?
  • Does the average of these two vectors work?

03: SLE

c) can you solve unique \(x\) for system of linear equation \(\mathbf{\acute{A} | \acute{b} }\), when \(\acute{\mathbf{A}}\) and \(\acute{\mathbf{b}}\) are reduced matrix \(\mathbf{A}\) and reduced vector \(\mathbf{b}\)
### SOLUTION TO QUESTION 2C ###

  
 nonBasic1 <- c(1,2,3)
 solve(A[nonBasic1,nonBasic1],b[nonBasic1])

note * Why need to set only one col/row to zero? * There are four possible non basis solution. Can you find other three?


Question 4

Plot the level set of the following functions (using R command contour())

01: \(f(x,y)\)

a) \(f(x,y) = (x + y)\,e^{-(x + y)}\)
### SOLUTION TO QUESTION 3A ###
  fExpr <- expression( (x+y)*exp( -(x+y)) )
  fFunc <- function(x,y){ }
  body(fFunc) <- fExpr
  xRange<- seq(0,5,0.5)
  yRange<- seq(0,5,0.5)  
  value <- outer(xRange,yRange,fFunc)
  colnames(value) <- yRange
  rownames(value) <- xRange
  contour(value,x=xRange,y=yRange)

note * where is minima (a point in which returns minimal value)? * can you find the root (\(f(x,y)=[0,0]^T\) ) of this function


02: \(g(x_1,x_2)\)

b) \(g(\mathbf{x}) = 100(x_2-x_1^2)^2 + (1-x_1)^2\)
### SOLUTION TO QUESTION 3B ###
  gExpr <- expression( 100*(x2-x1^2)^2 + (1-x1)^2 )
  gFunc <- function(x1,x2){ }
  body(gFunc) <- gExpr
  x1Range<- seq(-5,5,0.5)
  x2Range<- seq(-5,5,0.5)  
  value <- outer(x1Range,x2Range,gFunc)
  colnames(value) <- x1Range
  rownames(value) <- x2Range
  contour(value,x=x1Range,y=x2Range,levels = c(10,100,1000,10000))

note * where is minima (a point in which returns minimal value)? * can you find the root (\(g(x,y)=[0,0]^T\) ) of this function


Question 5

Consider function \(f: \mathbb{R} \rightarrow \mathbb{R}\)

\[f(x) = \frac{2x^6-4x^4+9x^2+12x-18}{x^4 + x^2 -12x - 12}\]


01: plot

a) plot function \(f(x)\) for \(x \in [-100,100]\) using R command
### SOLUTION TO QUESTION 4A ###

  fExpr <- expression( (2*x^6-4*x^4+9*x^2+12*x-18)/(x^4+x^2-12*x-12)  )

  curve((2*x^6-4*x^4+9*x^2+12*x-18)/(x^4+x^2-12*x-12),-5,5
        ,xlab="x",ylab="f(x)")
  abline(h=0,col="dark gray")
  abline(v=0,col="dark gray")

note

  • can you find root (\(f(x)=0\)) of this function

02: FOC

b) find the optimal solution using FOC
### SOLUTION TO QUESTION 4B ###
  fFunc <- function(x){}
  body(fFunc) <- fExpr
  
  fFirst <- function(x){}
  body(fFirst) <- D(fExpr,"x")
  
  xOpt <- uniroot(fFirst,c(-1,4.5))
  xOpt
## $root
## [1] 3.198986
## 
## $f.root
## [1] -0.0009457353
## 
## $iter
## [1] 9
## 
## $init.it
## [1] NA
## 
## $estim.prec
## [1] 6.103516e-05
  xRange <- seq(-5,5,0.1)
  plot(xRange,fFunc(xRange),type="l",col="black")
  points(xRange,sapply(xRange,fFirst),type="l",col="green")
  abline(h=0,col="red")
  points(xOpt$root,fFunc(xOpt$root),pch=16,col="red",cex=2)

note

  • does the approach correct?
  • how do we get maxima and minima?

02: optimize()

c) find the optimal solution using optimize()
### SOLUTION TO QUESTION 4B ###
  optimize(fFunc,c(-100,100),maximum = F)
## $minimum
## [1] 2.437623
## 
## $objective
## [1] -170441.8

note

  • how this function difference from first-order derivative?
  • can we apply to get all maxima and minima?

Question 6

find first- and second-order derivative of following functions and evaluate them at given coordinate

01: \(f_1(x)\)

evaluate function \(f_1(x)\) at \(x=3\)

\[ f_1(x) = x^2-7x+3e^xsin(\frac{1}{6}\pi \cdot x) \]

  f1.expr <- expression(x^2 - 7*x + 3*exp(x)*sin( 1/6 * pi * x))

  eval(D(f1.expr,"x"),envir=list(x=3))
## [1] 59.25661
  eval(D(D(f1.expr,"x"),"x"),envir=list(x=3)) 
## [1] 45.73692
  firstText <- c("fist-order: ", eval(D(f1.expr,"x"),envir=list(x=3)) )
  
  seconText <-c("second-order:",eval(D(D(f1.expr,"x"),"x"),envir=list(x=3)))
  
  print(firstText)
## [1] "fist-order: "    "59.256610769563"
  print(paste(seconText,collapse=""))
## [1] "second-order:45.7369188016184"

02: \(f_2(x_1,x_2)\)

evaluate function \(f_2(x1,x2)\) at \((x1,x2)=(3,5)\)

\[ f_2(x_1,x_2) = x_1^2 + sin(\frac{1}{2}\pi \cdot x1 \cdot x2) \cdot x1 \cdot x2 + x_2^2 \]

  f2.expr <- expression(x1^2 + sin(1/2*pi*x1*x2) * x1*x2 + x2^2)
  eval(D(f2.expr,"x1"),envir=list(x1=1,x2=1))
## [1] 3
  eval(D(f2.expr,"x1"),envir=list(x1=1,x2=1))
## [1] 3
  eval(D(D(f2.expr,"x1"),"x1"),envir=list(x1=1,x2=1)) 
## [1] -0.4674011
  eval(D(D(f2.expr,"x1"),"x2"),envir=list(x1=1,x2=1)) 
## [1] -1.467401
  eval(D(D(f2.expr,"x2"),"x1"),envir=list(x1=1,x2=1)) 
## [1] -1.467401
  eval(D(D(f2.expr,"x2"),"x2"),envir=list(x1=1,x2=1)) 
## [1] -0.4674011

note

  • can you arrange the results into gradient vector and hessian matrix?
  • apply deriv3()

03: \(f_3(x_1,x_2,x_3)\)

evaluate function \(f_3(x_1,x_2,x_3)\) at \((x_1,x_2,x_3)=(0,0,0)\)

\[ f_1(x_1,x_2,x_3) = 100(x_3 - x_2^2)^2 + (x_2 -1)^2 + 10(x_2 - x_1^2)^2 + (x_1 -1)^2 \]

  f3.expr <- expression( 100*(x3-x2^2)^2 + (x2-1)^2 + 100*(x2-x1^2)^2 + (x1-1)^2 )

  f3.grad <- function(x1,x2,x3){
    c(eval(D(f3.expr,"x1"),envir=list(x1,x2,x3)),
      eval(D(f3.expr,"x2"),envir=list(x1,x2,x3)),
      eval(D(f3.expr,"x3"),envir=list(x1,x2,x3))
    )
  }
  ##-- this code block is NOT executed  
  f3.hess <- function(x1,x2,x3){
     return("???")
  }

note

  • this function is called Rosendbrock function (Banana function) the minimal value of function is \(f_3(x_1=1,x_2=1,x_3=1) = 0\)
  • apply deriv3() and compare the result

Question 7

find the gradient vector and hessian of the following function at given points.

01: \(f_1(x,y)\)

a) \(f_1(x,y) = x^2 +y^3-3\,x- 3\,y + 5\) at \((-1,2)\)
  ##-- this code block is NOT executed
  f1Expr <- expression(x^2 + y^3 - 3*x-3*y+5)
  ##-- this code block is NOT executed
  f1Expr <- expression(x^2 + y^3 - 3*x-3*y+5)
  f1All  <- function(x,y){}
  body(f1All) <- deriv3(f1Expr,c("x","y"))
  f1All.value <- f1All(-1,2)
  f1All.grad  <- as.vector(attr(f1All.value,"gradient"))
  f1All.hess  <- matrix(attr(f1All.value,"hessian"),ncol=2)

note

  • Does hessian matrix is positive definite?

02: \(f_2(x,y,z)\)

b) \(f_2(x,y,z) = 100(y-x^2)^2 + (1-x^2)\) at \((1/2,4/3)\)
  ##-- this code block is NOT executed
  f2Expr <- expression( 100*(y-x^2)^2 + (1-x^2) )
  ##-- this code block is NOT executed
  f2Expr <- expression( 100*(y-x^2)^2 + (1-x^2) )
  f2All  <- function(x,y){}
  body(f2All) <- deriv3(f2Expr,c("x","y"))
  f2All.value <- f2All(1/2,4/3)
  f2All.grad  <- as.vector(attr(f2All.value,"gradient"))
  f2All.hess  <- matrix(attr(f2All.value,"hessian"),ncol=2)

note

  • Does hessian matrix is positive definite?

03: \(f_3(x,y,z)\)

c) \(f_3(x,y,z) = \cos(x\,y) - \sin(x\,z)\) at \((0,\pi,\pi/2)\)
    ##-- this code block is NOT executed
  f3Expr <- expression( cos(x*y)-sin(x*z) )
    ##-- this code block is NOT executed
  f3Expr <- expression( cos(x*y)-sin(x*z) )
  f3All  <- function(x,y,z){}
  body(f3All) <- deriv3(f3Expr,c("x","y","z"))
  f3All.value <- f3All(0,pi,pi/2)
  f3All.grad  <- as.vector(attr(f3All.value,"gradient"))
  f3All.hess  <- matrix(attr(f3All.value,"hessian"),ncol=3)

note

  • Does hessian matrix is positive definite?


Copyright 2019   Oran Kittithreerapronchai.   All Rights Reserved.   Last modified: 2023-51-13,