2 Sets, Operations, and Functions
2.1 Sets
Sets are the fundamental building blocks of mathematics. Events are not inherently numerical: the onset of war or the stock market crashing is not inherently a number. Sets can define such events, and we wrap math around so that we have a transparent language to communicate about those events. Combining sets with operations, relations, metrics, measures, etc… allows us to define useful mathematical structures. For example, the set of real numbers (\(\mathbb{R}\)) has a notion of order as well as defined operations of addition and multiplication.
Set : A set is any well defined collection of elements. If \(x\) is an element of \(S\), \(x \in S\).
Examples:
- The set of choices available to a player in Rock-Paper-Scissors \(\{\text{Rock}, \text{Paper}, \text{Scissors}\}\)
- The set of possible outcomes of a roll of a six-sided die \(\{1, 2, 3, 4, 5, 6\}\)
- The set of all natural numbers \(\mathbb{N}\)
- The set of all real numbers \(\mathbb{R}\)
Common mathematical notation relevant to sets:
- \(\in\) = “is an element of”; \(\notin\) = “is not an element of”
- \(\forall\) = “for all” (univeral quantifier)
- \(\exists\) = “there exists” (existential quantifier)
- \(:\) = “such that”
Subset: If every element of set \(A\) is also in set \(B\), then \(A\) is a subset of \(B\). \(A \subseteq B\). If, in addition to being a subset of \(B\), \(A\) is not equal to \(B\), \(A\) is a proper subset \(A \subset B\).
Empty Set: a set with no elements. \(S = \{\}\). It is denoted by the symbol \(\emptyset\).
Cardinality: The cardinality of a set \(S\), typically written \(|S|\) is the number of members of \(S\).
Many sets are infinite. For example, \(\mathbb{N}\) the set of natural numbers \(\mathbb{N} = \{0, 1, 2, 3, 4, \dotsc\}\)
- Sets with cardinality less than \(|\mathbb{N}|\) are countable
- Sets with the same cardinality as \(|\mathbb{N}|\) are countably infinite
- Sets with greater cardinality than \(|\mathbb{N}|\) are uncountably infinite (e.g. the real numbers).
Set operations:
- Union: The union of two sets \(A\) and \(B\), \(A \cup B\), is the set containing all of the elements in \(A\) or \(B\). \(A_1 \cup A_2 \cup \cdots \cup A_n = \bigcup_{i=1}^n A_i\)
- Intersection: The intersection of sets \(A\) and \(B\), \(A \cap B\), is the set containing all of the elements in both \(A\) and \(B\). \(A_1 \cap A_2 \cap \cdots \cap A_n = \bigcap_{i=1}^n A_i\)
- Complement: If set \(A\) is a subset of \(S\), then the complement of \(A\), denoted \(A^C\), is the set containing all of the elements in \(S\) that are not in \(A\).
Properties of set operations:
- Commutative: \(A \cup B = B \cup A\); \(A \cap B = B \cap A\)
- Associative: \(A \cup (B \cup C) = (A \cup B) \cup C\); \(A \cap (B \cap C) = (A \cap B) \cap C\)
- Distributive: \(A \cap (B \cup C) = (A \cap B) \cup (A \cap C)\); \(A \cup (B \cap C) = (A \cup B) \cap (A \cup C)\)
- de Morgan’s laws: \((A \cup B)^C = A^C \cap B^C\); \((A \cap B)^C = A^C \cup B^C\)
- Disjointness: Sets are disjoint when they do not intersect, such that \(A \cap B = \emptyset\). A collection of sets is pairwise disjoint (mutually exclusive) if, for all \(i \neq j\), \(A_i \cap A_j = \emptyset\). A collection of sets form a partition of set \(S\) if they are pairwise disjoint and they cover set \(S\), such that \(\bigcup_{i = 1}^k A_i = S\).
Example 2.1 (Sets) Let set \(A\) be \(\{1, 2, 3, 4\}\), \(B\) be \(\{3, 4, 5, 6\}\), and \(C\) be \(\{5, 6, 7, 8\}\). Sets \(A\), \(B\), and \(C\) are all subsets of the \(S\) which is \(\{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}\)
Write out the following sets:
- \(A \cup B\)
- \(C \cap B\)
- \(B^c\)
- \(A \cap (B \cup C)\)
Exercise 2.1 (Sets) Suppose you had a pair of four-sided dice. You sum the results from a single toss.
What is the set of possible outcomes?
Consider subsets \(A=\{2, 8\}\) and \(B=\{2,3,7\}\) of the sample space you found. What is
- \(A^c\)
- \((A \cup B)^c\)
2.2 Metric spaces
A metric space is a set that has a notion of distance - called a “metric” - defined between any two elements (sometimes referred to as “points”).
The distance function \(d(x,y)\) defines the distance between element \(x\) and element \(y\)
- The real numbers \(\mathbb{R}\) have a single distance function: \(d(x,y) = |x - y|\)
- In higher-dimensional real space (e.g. \(\mathbb{R}^2)\), we can define multiple distance metrics between \(x=(x_1, x_2)\) and \(y=(y_1, y_2)\)
- “Euclidean” distance: \(d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}\)
- “Taxicab” distance: \(d(x, y) = |x_1 - y_1| + |x_2 - y_2|\)
- Chebyshev distance: \(d(x, y) = \text{max}\{|x_1 - y_1| + |x_2 - y_2|\}\)
- All of these generalize to \(\mathbb{R}^n\)
A metric is a function that satisfies the following axioms
- A distance between a point and itself is zero \(d(x,x) = 0\)
- The distance between two points is strictly positive \(d(x,y) > 0 \forall x \neq y\)
- Distance from \(x\) to \(y\) is the same as the distance from \(y\) to \(x\) (\(d(x,y) = d(y,x)\))
- The “triangle inequality” holds: \(d(x,z) \le d(x,y) + d(y,z)\)
Once we have a metric space, we can define some additional useful concepts
Ball: A ball of radius \(r\) centered at \(x_0\) is a set that contains all points with a distance less than \(r\) from \(x_0\).
Sphere: A sphere of radius \(r\) centered at \(x_0\) is the set that contains all points with a distance exactly \(r\) from \(x_0\).
Interior Point: The point \(x\) is an interior point of the set \(S\) if \(x\) is in \(S\) and if there is some \(\epsilon\)-ball around \(x\) that contains only points in \(S\). The interior of \(S\) is the collection of all interior points in \(S\). The interior can also be defined as the union of all open sets in \(S\).
- If the set \(S\) is circular, the interior points are everything inside of the circle, but not on the circle’s rim.
- Example: The interior of the set \(\{ (x,y) : x^2+y^2\le 4 \}\) is \(\{ (x,y) : x^2+y^2< 4 \}\) .
Boundary Point: The point \(\mathbf x\) is a boundary point of the set \(S\) if every \(\epsilon\)-ball around \(\mathbf x\) contains both points that are in \(S\) and points that are outside \(S\). The boundary is the collection of all boundary points.
- If the set \(S\) is circular, the boundary points are everything on the circle’s rim.
- Example: The boundary of \(\{ (x,y) : x^2+y^2\le 4 \}\) is \(\{ (x,y) : x^2+y^2 = 4 \}\).
Open: A set \(S\) is open if for each point \(\mathbf x\) in \(S\), there exists an open \(\epsilon\)-ball around \(\mathbf x\) completely contained in \(S\).
- If the set \(S\) is circular and open, the points contained within the set get infinitely close to the circle’s rim, but do not touch it.
- Example: \(\{ (x,y) : x^2+y^2<4 \}\)
Closed: A set \(S\) is closed if it contains all of its boundary points.
- Alternatively: A set is closed if its complement is open.
- If the set \(S\) is circular and closed, the set contains all points within the rim as well as the rim itself.
- Example: \(\{ (x,y) : x^2+y^2\le 4 \}\)
- Note: a set may be neither open nor closed. Example: \(\{ (x,y) : 2 < x^2+y^2\le 4 \}\)
2.3 Operators; Sum and Product notation
Addition (+), Subtraction (-), multiplication and division are basic operations of arithmetic. In statistics or calculus, we will often want to add a sequence of numbers that can be expressed as a pattern without needing to write down all its components. For example, how would we express the sum of all numbers from 1 to 100 without writing a hundred numbers?
For this we use the summation operator \(\sum\) and the product operator \(\prod\).
Summation:
\[\sum\limits_{i=1}^{100} x_i = x_1+x_2+x_3+\cdots+x_{100}\]
The bottom of the \(\sum\) symbol indicates an index (here, \(i\)), and its start value \(1\). At the top is where the index ends. The notion of “addition” is part of the \(\sum\) symbol. The content to the right of the summation is the meat of what we add. While you can pick your favorite index, start, and end values, the content must also have the index.
A few important features of sums:
- \(\sum\limits_{i=1}^n c x_i = c \sum\limits_{i=1}^n x_i\)
- \(\sum\limits_{i=1}^n (x_i + y_i) = \sum\limits_{i=1}^n x_i + \sum\limits_{i=1}^n y_i\)
- \(\sum\limits_{i=1}^n c = n c\)
Product:
\[\prod\limits_{i=1}^n x_i = x_1 x_2 x_3 \cdots x_n\]
Properties:
- \(\prod\limits_{i=1}^n c x_i = c^n \prod\limits_{i=1}^n x_i\)
- \(\prod\limits_{i=k}^n c x_i = c^{n-k+1} \prod\limits_{i=k}^n x_i\)
- \(\prod\limits_{i=1}^n (x_i + y_i) =\) a total mess
- \(\prod\limits_{i=1}^n c = c^n\)
Other Useful Operations
Factorials!:
\[x! = x\cdot (x-1) \cdot (x-2) \cdots (1)\]
Modulo: Tells you the remainder when you divide the first number by the second.
- \(17 \mod 3 = 2\)
- \(100 \ \% \ 30 = 10\)
- Example 2.2 (Operators)
\(\sum\limits_{i=1}^{5} i =\)
\(\prod\limits_{i=1}^{5} i =\)
\(14 \mod 4 =\)
\(4! =\)
Exercise 2.2 (Operators) Let \(x_1 = 4, x_2 = 3, x_3 = 7, x_4 = 11, x_5 = 2\)
\(\sum\limits_{i=1}^{3} (7)x_i\)
\(\sum\limits_{i=1}^{5} 2\)
\(\prod\limits_{i=3}^{5} (2)x_i\)
2.4 Introduction to Functions
A function is a mapping, or transformation, that relates members of one set to members of another set. For instance, if you have two sets: set \(A\) and set \(B\), a function from \(A\) to \(B\) maps every value \(a\) in set \(A\) such that \(f(a) \in B\). Functions can be “many-to-one”, where many values or combinations of values from set \(A\) produce a single output in set \(B\), or they can be “one-to-one”, where each value in set \(A\) corresponds to a single value in set \(B\). A function by definition has a single function value for each element of its domain. This means, there cannot be “one-to-many” mapping.
Dimensionality: \({\mathbf R}^1\) is the set of all real numbers extending from \(-\infty\) to \(+\infty\) — i.e., the real number line. \({\mathbf R}^n\) is an \(n\)-dimensional space, where each of the \(n\) axes extends from \(-\infty\) to \(+\infty\).
- \({\mathbf R}^1\) is a one dimensional line.
- \({\mathbf R}^2\) is a two dimensional plane.
- \({\mathbf R}^3\) is a three dimensional space.
Points in \({\mathbf R}^n\) are ordered \(n\)-tuples (just means an combination of \(n\) elements where order matters), where each element of the \(n\)-tuple represents the coordinate along that dimension.
For example:
- \({\mathbf R}^1\): (3)
- \({\mathbf R}^2\): (-15, 5)
- \({\mathbf R}^3\): (86, 4, 0)
Examples of mapping notation:
Function of one variable: \(f:{\mathbf R}^1\to{\mathbf R}^1\)
- \(f(x)=x+1\). For each \(x\) in \({\mathbf R}^1\), \(f(x)\) assigns the number \(x+1\).
Function of two variables: \(f: {\mathbf R}^2\to{\mathbf R}^1\).
- \(f(x,y)=x^2+y^2\). For each ordered pair \((x,y)\) in \({\mathbf R}^2\), \(f(x,y)\) assigns the number \(x^2+y^2\).
We often use variable \(x\) as input and another \(y\) as output, e.g. \(y=x+1\)
Example 2.3 (Functions) For each of the following, state whether they are one-to-one or many-to-one functions.
For \(x \in [0,\infty]\), \(f : x \rightarrow x^2\) (this could also be written as \(f(x) = x^2\)).
For \(x \in [-\infty, \infty]\), \(f: x \rightarrow x^2\).
Exercise 2.3 (Functions) For each of the following, state whether they are one-to-one or many-to-one functions.
For \(x \in [-3, \infty]\), \(f: x \rightarrow x^2\).
For \(x \in [0, \infty]\), \(f: x \rightarrow \sqrt{x}\)
Some functions are defined only on proper subsets of \({\mathbf R}^n\).
- Domain: the set of numbers in \(X\) at which \(f(x)\) is defined.
- Range: elements of \(Y\) assigned by \(f(x)\) to elements of \(X\), or \(f(X)=\{ y : y=f(x), x\in X\}\) Most often used when talking about a function \(f:{\mathbf R}^1\to{\mathbf R}^1\).
- Image: same as range, but more often used when talking about a function \(f:{\mathbf R}^n\to{\mathbf R}^1\).
Some General Types of Functions
Monomials: \(f(x)=a x^k\)
\(a\) is the coefficient. \(k\) is the degree.
Examples: \(y=x^2\), \(y=-\frac{1}{2}x^3\)
Polynomials: sum of monomials.
Examples: \(y=-\frac{1}{2}x^3+x^2\), \(y=3x+5\)
The degree of a polynomial is the highest degree of its monomial terms. Also, it’s often a good idea to write polynomials with terms in decreasing degree.
2.5 Logarithms and Exponents
Exponential Functions: Example: \(y=2^x\)
Relationship of logarithmic and exponential functions: \[y=\log_a(x) \iff a^y=x\]
The log function can be thought of as an inverse for exponential functions. \(a\) is referred to as the “base” of the logarithm.
Common Bases: The two most common logarithms are base 10 and base \(e\).
- Base 10: \(\quad y=\log_{10}(x) \iff 10^y=x\). The base 10 logarithm is often simply written as “\(\log(x)\)” with no base denoted.
- Base \(e\): \(\quad y=\log_e(x) \iff e^y=x\). The base \(e\) logarithm is referred to as the “natural” logarithm and is written as ``\(\ln(x)\)“.
Properties of exponential functions:
- \(a^x a^y = a^{x+y}\)
- \(a^{-x} = 1/a^x\)
- \(a^x/a^y = a^{x-y}\)
- \((a^x)^y = a^{x y}\)
- \(a^0 = 1\)
Properties of logarithmic functions (any base):
Generally, when statisticians or social scientists write \(\log(x)\) they mean \(\log_e(x)\). In other words: \(\log_e(x) \equiv \ln(x) \equiv \log(x)\)
\[\log_a(a^x)=x\] and \[a^{\log_a(x)}=x\]
- \(\log(x y)=\log(x)+\log(y)\)
- \(\log(x^y)=y\log(x)\)
- \(\log(1/x)=\log(x^{-1})=-\log(x)\)
- \(\log(x/y)=\log(x\cdot y^{-1})=\log(x)+\log(y^{-1})=\log(x)-\log(y)\)
- \(\log(1)=\log(e^0)=0\)
Change of Base Formula: Use the change of base formula to switch bases as necessary: \[\log_b(x) = \frac{\log_a(x)}{\log_a(b)}\]
Example: \[\log_{10}(x) = \frac{\ln(x)}{\ln(10)}\]
You can use logs to go between sum and product notation. This will be particularly important when you’re learning how to optimize likelihood functions.
\[\begin{eqnarray*} \log \bigg(\prod\limits_{i=1}^n x_i \bigg) &=& \log(x_1 \cdot x_2 \cdot x_3 \cdots \cdot x_n)\\ &=& \log(x_1) + \log(x_2) + \log(x_3) + \cdots + \log(x_n)\\ &=& \sum\limits_{i=1}^n \log (x_i) \end{eqnarray*}\]
Therefore, you can see that the log of a product is equal to the sum of the logs. We can write this more generally by adding in a constant, \(c\):
\[\begin{eqnarray*} \log \bigg(\prod\limits_{i=1}^n c x_i\bigg) &=& \log(cx_1 \cdot cx_2 \cdots cx_n)\\ &=& \log(c^n \cdot x_1 \cdot x_2 \cdots x_n)\\ &=& \log(c^n) + \log(x_1) + \log(x_2) + \cdots + \log(x_n)\\\\ &=& n \log(c) + \sum\limits_{i=1}^n \log (x_i)\\ \end{eqnarray*}\]
Example 2.4 (Logarithms) Evaluate each of the following logarithms
\(\log_4(16)\)
\(\log_2(16)\)
Simplify the following logarithm. By “simplify”, we actually really mean - use as many of the logarithmic properties as you can.
- \(\log_4(x^3y^5)\)
Exercise 2.4 Evaluate each of the following logarithms
- \(\log_\frac{3}{2}(\frac{27}{8})\)
Simplify each of the following logarithms. By “simplify”, we actually really mean - use as many of the logarithmic properties as you can.
\(\log(\frac{x^9y^5}{z^3})\)
\(\ln{\sqrt{xy}}\)
2.6 Graphing Functions
What can a graph tell you about a function?
- Is the function increasing or decreasing? Over what part of the domain?
- How ``fast” does it increase or decrease?
- Are there global or local maxima and minima? Where?
- Are there inflection points?
- Is the function continuous?
- Is the function differentiable?
- Does the function tend to some limit?
- Other questions related to the substance of the problem at hand.
2.7 Solving for Variables and Finding Roots
Sometimes we’re given a function \(y=f(x)\) and we want to find how \(x\) varies as a function of \(y\). Use algebra to move \(x\) to the left hand side (LHS) of the equation and so that the right hand side (RHS) is only a function of \(y\).
Example 2.5 (Solving) Solve for x:
\(y=3x+2\)
\(y=e^x\)
Solving for variables is especially important when we want to find the roots of an equation: those values of variables that cause an equation to equal zero. Especially important in finding equilibria and in doing maximum likelihood estimation.
Procedure: Given \(y=f(x)\), set \(f(x)=0\). Solve for \(x\).
Multiple Roots: \[f(x)=x^2 - 9 \quad\Longrightarrow\quad 0=x^2 - 9 \quad\Longrightarrow\quad 9=x^2 \quad\Longrightarrow\quad \pm \sqrt{9}=\sqrt{x^2} \quad\Longrightarrow\quad \pm 3=x\]
Quadratic Formula: For quadratic equations \(ax^2+bx+c=0\), use the quadratic formula: \[x=\frac{-b\pm\sqrt{b^2-4ac}}{2a}\]
Exercise 2.5 (Roots) Solve for x:
\(f(x)=3x+2 = 0\)
\(f(x)=x^2+3x-4=0\)
\(f(x)=e^{-x}-10 = 0\)
Special thanks to Shiro Kuriwaki for developing the original version of this tutorial↩︎
Special thanks to Shiro Kuriwaki and Yon Soo Park for developing the original module↩︎
Module originally written by Shiro Kuriwaki, Connor Jerzak, and Yon Soo Park↩︎
Module originally written by Connor Jerzak and Shiro Kuriwaki↩︎