\documentclass[12pt]{article}
\include{preamble}
\newtoggle{spacingmode}
%\toggletrue{spacingmode} %STUDENTS: DELETE or COMMENT this line
\newtoggle{professormode}
%\toggletrue{professormode} %STUDENTS: DELETE or COMMENT this line
\newcommand{\spc}[1]{\iftoggle{spacingmode}{\\ \vspace{#1cm}}{}}
\title{MATH 241 Spring 2015 Homework \#5}
\author{Student Jason Farkas Section B} %STUDENTS: write your name here
\iftoggle{professormode}{
\date{Due 5PM, Tuesday, Mar 17, 2015 \\ \vspace{0.5cm} \small (this document last updated \today ~at \currenttime)}
}{}
\renewcommand{\abstractname}{Instructions and Philosophy}
\begin{document}
\maketitle
\iftoggle{professormode}{
\begin{abstract}
The path to success in this class is to do many problems. Unlike other courses, exclusively doing reading(s) will not help. Coming to lecture is akin to watching workout videos; thinking about and solving problems on your own is the actual ``working out''. Feel free to \qu{work out} with others; \textbf{I want you to work on this in groups.}
Reading is still \textit{required}. But for this homework set, I can't find anything from the 7th edition of Ross except the first few pages of Chapter 4 that are \qu{worth it} for you to read.
The problems below are color coded: \ingreen{green} problems are considered \textit{easy} and marked \qu{[easy]}; \inorange{yellow} problems are considered \textit{intermediate} and marked \qu{[harder]}, \inred{red} problems are considered \textit{difficult} and marked \qu{[difficult]}, \inpurple{purple} problems are extra credit. The \textit{easy} problems are intended to be ``giveaways'' if you went to class. Do as much as you can of the others; I expect you to at least attempt the \textit{difficult} problems.
This homework is worth 100 points but the point distribution will not be determined until after the due date. Late homework will be penalized 10 points per day.
Between 1 and 15 points are arbitrarily given as a bonus (conditional on quality) if the homework is typed using \LaTeX. Links for installing \LaTeX~and a program for compiling \LaTeX~are found on the syllabus. You are encouraged to use \url{overleaf.com} (make sure you upload both the hwxx.tex and the preamble.tex file). If you are handing in homework this way, read the comments in the code; there are two lines to comment out and you should replace my name with yours and write your section. If you are asked to make drawings, you can take a picture of your handwritten drawing and insert it as a figure, or leave space using the \qu{$\backslash$vspace} command and draw them in after printing, or attach them stapled.
The document is available with spaces for you to write your answers. If not using \LaTeX, print this document and write in your answers. \textbf{Handing it in without this printout is NO LONGER ACCEPTABLE.} Keep this page printed for your records. Write your name and section below where section A is if you're registered for the 9:15AM--10:30AM lecture and section B is if you're in the 12:15PM-1:30PM lecture.
\end{abstract}
\thispagestyle{empty}
\vspace{1cm}
NAME: \line(1,0){250} ~~SECTION (A or B): \line(1,0){35}
\pagebreak
}{}
\iftoggle{professormode}{
\paragraph{Random Variables} We now begin questions about the second unit of this class: r.v.'s. \\ \\
}{}
\problem In class we spoke about how random variables map outcomes from the sample space to a number \ie $X: \Omega \rightarrow \reals$. That is they are set functions, just like the probability function which is $\mathbb{P}: 2^\Omega \rightarrow \zeroonecl$. We will be investigating this concept here.
\iftoggle{professormode}{
\begin{figure}[htp]
\centering
\includegraphics[width=2.5in]{rv.jpg}
\end{figure}
\FloatBarrier
}{}
\begin{enumerate}
\easysubproblem Here is a way to produce $X \sim \bernoulli{\half}$ using the $\Omega$ from a roll of a die. Map outcomes 1,2,3 to 0 and outcomes 4,5,6 to 1. This works because
\beqn
&&\prob{X=0} = \prob{\braces{\omega : X(\omega) = 0}} = \prob{\braces{1} \cup \braces{2} \cup \braces{3}} = 1/2 ~~\text{and} \\
&&\prob{X=1} = \prob{\braces{\omega : X(\omega) = 1}} = \prob{\braces{4} \cup \braces{5} \cup \braces{6}} = 1/2.
\eeqn
Describe three other scenarios or devices that produce their own $\Omega$'s that also result in $X \sim \bernoulli{\half}.$ \spc{6}\newline ANSWER: Flipping two fair coins, where we map the outcome of both landing on the same side to 1 and landing on opposite sides to 0. Choosing 1 ball at random from a bag of 10 balls of which 5 are blue, where we map a blue ball to 0 and any other color to 1. Lastly, spinning a wheel split evenly between two colors, where one color maps to 0 and the other to 1.
\intermediatesubproblem We talked about in class how the sample space no longer needs to be considered once the random variable is described. Why? Use your answer to (a) to inspire this answer. Write it \textit{in English} below. \spc{3} \newline ANSWER: All we care about are the results that pop out. We don't care how we arrived at a 1 or a 0; at the end of the day the different sample spaces produce the same r.v. Only the results of the mapping matter, not what went on behind them.
\hardsubproblem Back to philosophy... Let's say $X$ models the price difference that IBM stock moves in one day of trading. For instance, if the stock closed yesterday at \$56.24 and today it closed at \$57.24, the random variable would be \$1 for today. According to our definition of a random variable, there is a sample space with outcomes being drawn ($\omega \in \Omega$) that is \qu{controlling} the value of $X$. Describe it the best you can \textit{in English}. There are no right or wrong answers here, but your answer must be coherent and demonstrate you understand the question. \spc{6} \newline ANSWER: There is a sample space of all the possible price differences that could have occurred between the two days. A difference of \$1 is one potential outcome, and it is the one that was actually drawn.
\end{enumerate}
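As a quick check on the two-coin scenario from part (a), here is a sketch of why it gives $X \sim \bernoulli{\half}$. It assumes (this is not stated in the problem) that both coins are fair and tossed independently, and it writes the outcomes as $HH, HT, TH, TT$, which is notation introduced only for this illustration.
\[
\Omega = \braces{HH, HT, TH, TT}, \qquad X(HH) = X(TT) = 1, \qquad X(HT) = X(TH) = 0
\]
\[
\prob{X=1} = \prob{\braces{HH} \cup \braces{TT}} = \frac{1}{4} + \frac{1}{4} = \frac{1}{2}, \qquad \prob{X=0} = \prob{\braces{HT} \cup \braces{TH}} = \frac{1}{4} + \frac{1}{4} = \frac{1}{2}
\]
This $\Omega$ has four outcomes instead of the six from the die, yet the resulting r.v. has the same distribution.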
\problem We will now study probability mass functions (PMF's) denoted as $p(x)$ and cumulative distribution functions (CDF's) denoted as $F(x)$ and review the r.v.'s we did in class.
% pretty good link to help http://sites.stat.psu.edu/~lsimon/stat250/sp98/handouts/handout07.pdf
\begin{enumerate}
\easysubproblem Draw the PMF for $X \sim \bernoulli{p}$. \spc{4}
\easysubproblem Draw the CDF for $X \sim \uniformdiscrete{1,3,4,9}$. \spc{6}
\intermediatesubproblem Using the r.v. from the previous question, what is $\prob{X \in (3,9)}$? I am trying to trick you here. \spc{3}
\easysubproblem Take a r.v. $X$ with $\support{X} = \zeroonecl$. Is this a \qu{discrete r.v.?} Yes / no and explain. \spc{1.5}\newline ANSWER: No, because its support is uncountable. A discrete r.v. has a support that is finite or countably infinite.
\hardsubproblem In class we defined the Bernoulli r.v. as:
\beqn
X \sim \begin{cases}
1 \withprob p \\
0 \withprob 1-p
\end{cases}
\eeqn
and put its PMF on the board. Write $p(x)$ for $X \sim \bernoulli{p}$ that is valid not only for all values in $\support{X}$ but for all values in $\reals$. Use the indicator function and set theory notation. \spc{3}\newline \includegraphics[width=4in]{indica1__14.png}% check size and answer
\hardsubproblem What is the parameter space of $X$ where $X \sim \bernoulli{p}$ and why? \spc{3} \newline ANSWER: $p \in [0,1]$. This is because the parameter is a probability, which can only be between 0 and 1, although $p = 0$ and $p = 1$ give degenerate r.v.'s.
\hardsubproblem Sometimes knowing the $\Omega$ matters a little bit. Let's say $X_1 \sim \bernoulli{\half}$ is generated from one coin and $X_2 \sim \bernoulli{\half}$ is generated from another coin independently tossed. Create a new r.v. $T = X_1 + X_2$. Describe its PMF using the $\sim$ notation like in the previous problem. Thus write \qu{$T \sim$} something. \spc{8} \newline ANSWER: $T$ represents the number of heads (but could equally be the number of tails). \newline \beqn
T \sim \begin{cases}
0 \withprob \frac{1}{4} \\
1 \withprob \frac{1}{2} \\
2 \withprob \frac{1}{4}
\end{cases}
\eeqn % NO CLUE IF THIS IS CORRECT
\hardsubproblem Consider the PMF we discussed for $X \sim \bernoulli{\half}$. Does $\myint{x}{}{}{p(x)} = F(x) + C$ where the constant $C \in \reals$? Explain. Think carefully about what integration really means. \spc{6} \newline ANSWER: No. The r.v. is discrete, so $p(x)$ is nonzero only at the two points $x = 0$ and $x = 1$ and there is no curve to take the area under. % TALK TO ERIC ABOUT THIS It may be no due to horizontal vs vertical
\hardsubproblem How about the opposite? Consider the CDF we discussed for $X \sim \bernoulli{\half}$. Does $\text{d} / \text{d}x[F(x)] = p(x)$? Explain. Think carefully about what differentiation really means. \spc{4} \newline ANSWER: No. At $x = 0$ and $x = 1$ the CDF jumps, so the limit from the left doesn't equal the limit from the right and the derivative does not exist there. % No Clue...
\end{enumerate}
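The PMF of $T = X_1 + X_2$ from part (g) can be checked by summing over the outcomes of the two coins. This is a sketch that relies only on the independence of $X_1$ and $X_2$ stated in the problem.
\[
\prob{T=0} = \prob{X_1=0}\,\prob{X_2=0} = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}
\]
\[
\prob{T=1} = \prob{X_1=1}\,\prob{X_2=0} + \prob{X_1=0}\,\prob{X_2=1} = \frac{1}{4} + \frac{1}{4} = \frac{1}{2}
\]
\[
\prob{T=2} = \prob{X_1=1}\,\prob{X_2=1} = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}
\]
The three probabilities sum to 1, and $T = 1$ is twice as likely as either extreme because two outcomes of the underlying $\Omega$ map to it. This is the sense in which knowing $\Omega$ matters here.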
\iftoggle{professormode}{
\paragraph{Hypergeometric Distribution} Since we haven't covered much else, the majority of this assignment will be about this distribution.\\ \\
}{}
\problem The hypergeometric is sampling \qu{without replacement.} Imagine you have this bag of marbles with 37 marbles and 17 of them are black. We will define a \qu{success} as drawing a black marble.
\iftoggle{professormode}{
\begin{figure}[htp]
\centering
\includegraphics[width=2.5in]{marble.jpg}
\end{figure}
\FloatBarrier
}{}
\begin{enumerate}
\easysubproblem Let's say you draw one marble. Call this r.v. $X$. Is it hypergeometric? \spc{0.2} \newline ANSWER: Yes. % confirm with eric
\easysubproblem The hypergeometric distribution has three parameters. What are the parameters for $X$? \spc{2} \newline ANSWER: The parameters are the sample size $n$ (here $n = 1$), the number of successes in the population $K$ (here $K = 17$), and the population size $N$ (here $N = 37$). Other parameterizations that carry the same information are also possible.
\easysubproblem Write, but do not draw, the PMF, $p(x)$, for the r.v. $X$ where $x$ is the number of successes. \spc{2} % IDK HOW
\newline ANSWER: $p(x) = \frac{\binom{17}{x} \binom{37-17}{1-x}}{\binom{37}{1}}$
\easysubproblem What is the support of this r.v.? \spc{2} \newline ANSWER: $\support{X} = \{0, 1\}$ %$\support{X} = \{ \max{0, n-(N-k)}, ... , \min {n,k} \} $
% VERY CONFUSED BETWEEN THESE TWO
\intermediatesubproblem There is another variable we learned about in class with this same support. Show that $X$ is distributed as this type of r.v. and find its parameter(s). \spc{2}\newline ANSWER: $X \sim \bernoulli{\frac{17}{37}}$, since $p(1) = \frac{17}{37}$ and $p(0) = \frac{20}{37}$. % $X_1 \sim \bernoulli{N/K}$
\easysubproblem Now imagine you draw 4 marbles without replacement. Call this r.v. $X$ (and forget about the previous r.v. $X$ from this question, parts a-e). How is $X$ distributed? Use the notation in class and find its parameters. \spc{2} \newline ANSWER: $X \sim \mathrm{Hyper}(4, 17, 37)$
\easysubproblem What is the support of $X$? \spc{2} \newline ANSWER: $\support{X} = \{0, \ldots, 4\}$ %$\support{X} = \{ \max{0, n-(N-k)}, ... , \min {n,k} \} $
\easysubproblem Write, but do not draw, the PMF of $X$. \spc{3} \newline ANSWER: $p(x) = \frac{\binom{17}{x} \binom{20}{4-x}}{\binom{37}{4}}$
\easysubproblem Draw the PMF of $X$. \spc{6}
\easysubproblem Draw the CDF of $X$. \spc{6}
\easysubproblem What is the probability of getting 4 successes in a row? Use the PMF. \spc{3} \newline ANSWER: $p(4) = \frac{\binom{17}{4} \binom{20}{0}}{\binom{37}{4}} \approx 0.036$
\easysubproblem What is the probability of getting 4 successes in a row? Use conditional probability. This should yield the same answer. \spc{3} \newline ANSWER: I'm not sure by what you mean to use conditional probability. What condition?
\easysubproblem Now imagine you draw 27 marbles without replacement. Call this r.v. $X$ (and forget about the previous r.v. $X$). How is $X$ distributed? Use the notation in class and find its parameters. \spc{2} \newline ANSWER: $X \sim \mathrm{Hyper}(27, 17, 37)$
\easysubproblem What is the support of $X$? Why is $0 \notin \support{X}$? \spc{4} \newline ANSWER: $\support{X} = \{7, \ldots, 17\}$. It is impossible to choose 27 failures when there are only 20 failures in the bag. Therefore at least 7 successes must be chosen. %$\support{X} = \{ \max{0, n-(N-k)}, ... , \min {n,k} \} $
\easysubproblem Write, but do not draw, the PMF of $X$. \spc{3} \newline ANSWER: $p(x) = \frac{\binom{17}{x} \binom{20}{27-x}}{\binom{37}{27}}$
\hardsubproblem Find the mode of this distribution. \qu{Mode} is defined as the most likely outcome result. \spc{4} \newline ANSWER: You are most likely to get 12 successes.
\end{enumerate}
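The two routes to the probability of 4 successes in 4 draws, parts (k) and (l), can be compared directly. The sketch below conditions each draw on all earlier draws being black; the events $B_i$ (draw $i$ is black) are notation assumed here, not taken from the problem.
\[
\prob{X=4} = \frac{\binom{17}{4}\binom{20}{0}}{\binom{37}{4}} = \frac{2380}{66045} = \frac{4}{111} \approx 0.036
\]
\[
\prob{B_1 \cap B_2 \cap B_3 \cap B_4} = \prob{B_1}\,\prob{B_2 \mid B_1}\,\prob{B_3 \mid B_1 \cap B_2}\,\prob{B_4 \mid B_1 \cap B_2 \cap B_3} = \frac{17}{37} \cdot \frac{16}{36} \cdot \frac{15}{35} \cdot \frac{14}{34} = \frac{4}{111}
\]
The mode in part (p) can be checked in a similar spirit. One possible approach, offered as a sketch rather than the method intended in class, is to compare consecutive PMF values when $n = 27$, for $x \in \{7, \ldots, 16\}$:
\[
\frac{p(x+1)}{p(x)} = \frac{(17-x)(27-x)}{(x+1)(x-6)} \geq 1 \iff 465 \geq 39x \iff x \leq 11
\]
So $p(x)$ increases up to $x = 12$ and decreases afterwards, which agrees with a mode of 12.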
\problem Generally, the hypergeometric has three parameters. We will solve for its support here under several disjoint conditions and then in class we will generalize it. Call $X$ a hypergeometric r.v. with all its parameters free - meaning they can take on any value, so please use the notation $n,~K,~N$ in your answers as we did in class.
\begin{enumerate}
\easysubproblem Using the usual parameterization of the hypergeometric, describe the parameter space. You need to say what sets each of the parameters \qu{lives} in. \spc{4} \newline ANSWER: $N \in \{2, 3, 4, \ldots\}$ \newline $K \in \{1, 2, \ldots, N-1\}$ \newline $n \in \{1, 2, \ldots, N-1\}$
\easysubproblem Write, but do not draw, the PMF of $X$. \spc{3} \newline ANSWER: $p(x) = \frac{\binom{K}{x} \binom{N-K}{n-x}}{\binom{N}{n}}$
\intermediatesubproblem $x$ is the free variable in $p(x)$ which you wrote in (b) and it designates the number of successes. Show that successes and failure are essentially the same thing by finding $p(n-x)$ and replacing $K$ with $N-K$. What does this teach you? \spc{2.5} \newline ANSWER: $p(n-x) = \frac{\binom{K}{n-x} \binom{N-K}{x}}{\binom{N}{n}}$, and replacing $K$ with $N-K$ gives $\frac{\binom{N-K}{n-x} \binom{K}{x}}{\binom{N}{n}} = p(x)$. This shows that what we call a success or a failure is arbitrary and doesn't affect the math, which is based on selecting a desired result from a larger group without replacement.
\intermediatesubproblem Let's say $n \leq K$ and $n \leq N-K$. What is the support of $X$ in this situation? \spc{3} \newline ANSWER: $\{0, 1, \ldots, n\}$
\intermediatesubproblem Let's say $n \leq K$ and $n > N-K$. What is the support of $X$ in this situation? \spc{3} \newline ANSWER: $\{n-(N-K), \ldots, n\}$
\intermediatesubproblem Let's say $n > K$ and $n \leq N-K$. What is the support of $X$ in this situation? \spc{3} \newline ANSWER: $\{0, \ldots, K\}$
\hardsubproblem Let's say $n > K$ and $n > N-K$. What is the support of $X$ in this situation? \spc{2.5} \newline ANSWER: $\{n-(N-K), \ldots, K\}$
\extracreditsubproblem Describe the CDF of the general hypergeometric r.v. \spc{6}
\end{enumerate}
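The four cases in parts (d) through (g) can be collected into a single expression. It reproduces each case above, but treat it as a sketch ahead of the generalization promised for class:
\[
\support{X} = \braces{\max(0,\, n-(N-K)),\, \ldots,\, \min(n,\, K)}
\]
For instance, with $N = 37$, $K = 17$ and $n = 27$ (the marble example from the previous problem), the lower end is $\max(0,\, 27-20) = 7$ and the upper end is $\min(27,\, 17) = 17$, which agrees with the support $\braces{7, \ldots, 17}$ found there.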
\problem We will look at hypergeometric distributions with large $N$. If $N$ is really large, sampling without replacement can be approximated by sampling with replacement. In the limit, it is sampling with replacement.
\begin{enumerate}
\easysubproblem We will now begin deriving the binomial in pieces. Parameterize a hypergeometric by setting $K = pN$. What is the parameter space for $p$? \spc{2} \newline ANSWER: $X \sim \mathrm{Hyper}(n, pN, N)$ \newline The parameter space is $p \in \{\frac{1}{N}, \frac{2}{N}, \ldots, \frac{N-1}{N}\}$, since $p = \frac{K}{N}$ and $K \in \{1, 2, \ldots, N-1\}$.
\easysubproblem Write the PMF $p(x)$ for this r.v. using the $p$ parameterization using $x$ as the free variable. \spc{3} \newline ANSWER: $p(x) = \frac{\binom{pN}{x} \binom{N-pN}{n-x}}{\binom{N}{n}}$
\easysubproblem What limit do we take and why are we taking this limit? \spc{3} \newline ANSWER: We take the limit as $N \rightarrow \infty$ because when $N$ is really large, sampling without replacement is approximated by sampling with replacement.
\easysubproblem Rewrite the PMF without choose notation using only factorials and simplify the fraction by moving the factorial terms from denominator, $\binom{N}{n}$, to the numerator. \spc{4}\newline ANSWER: $p(x) = \frac{(pN)! \, (N-pN)! \, n! \, (N-n)!}{x! \, (pN-x)! \, (n-x)! \, (N-pN-n+x)! \, N!}$
\easysubproblem Which three terms can you factor out from the limit expression? Show that they are equivalent to $\binom{n}{x}$. \spc{7} \newline ANSWER: Since $n!$, $x!$ and $(n-x)!$ do not depend on $N$, they are constants when taking the limit as $N \rightarrow \infty$ and can be factored out as $\frac{n!}{(n-x)! \, x!} = \binom{n}{x}$. % END HERE
\end{enumerate}
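The steps in parts (d) and (e) can be written out term by term. This is a sketch that uses only the PMF from part (b) and the definition of the choose notation; no new symbols are introduced.
\[
p(x) = \frac{\binom{pN}{x}\binom{N-pN}{n-x}}{\binom{N}{n}} = \frac{(pN)!}{x!\,(pN-x)!} \cdot \frac{(N-pN)!}{(n-x)!\,(N-pN-n+x)!} \cdot \frac{n!\,(N-n)!}{N!}
\]
\[
p(x) = \binom{n}{x} \cdot \frac{(pN)!}{(pN-x)!} \cdot \frac{(N-pN)!}{(N-pN-n+x)!} \cdot \frac{(N-n)!}{N!}
\]
Only the last three fractions depend on $N$, so they are what remains inside the limit as $N \rightarrow \infty$.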
\end{document}