Calculus
- Limits & Continuity: A limit $\lim_{x\to a}f(x)=L$ means $f(x)$ approaches $L$ as $x$ approaches $a$. Key limit rules include linearity and product/quotient rules. Notable limits: $\lim_{x\to0}\frac{\sin x}{x}=1$, $\lim_{n\to\infty}\left(1+\frac{1}{n}\right)^n=e$. A function $f(x)$ is continuous at $x=a$ if $\lim_{x\to a}f(x)=f(a)$. Discontinuities occur where the limit differs from the function value or does not exist.
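The two notable limits above can be checked numerically; a minimal stdlib-Python sketch (evaluation near the limit point, not a proof — the step sizes are arbitrary choices):

```python
import math

# Evaluate sin(x)/x close to 0 and (1 + 1/n)^n for large n; both should
# land near the limits quoted above (1 and e). Purely illustrative.
def sinc(x):
    return math.sin(x) / x

approx_one = sinc(1e-6)                       # close to 1
approx_e = (1 + 1 / 1_000_000) ** 1_000_000   # close to e ≈ 2.71828
```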
- Differentiation: The derivative $f'(x)=\lim_{h\to0}\frac{f(x+h)-f(x)}{h}$ gives the instantaneous rate of change. Basic rules: $(u+v)'=u'+v'$, $(cu)'=c\,u'$, $(uv)'=u'v+uv'$, $(u/v)'=\frac{u'v-uv'}{v^2}$. By the chain rule, if $y=f(u)$ and $u=g(x)$, then $\frac{dy}{dx}=f'(g(x))\cdot g'(x)$. Higher derivatives $f''(x), f'''(x), \dots$ are obtained by differentiating repeatedly. L’Hôpital’s Rule: for indeterminate forms $0/0$ or $\infty/\infty$, $\displaystyle \lim_{x\to c}\frac{f(x)}{g(x)} = \lim_{x\to c}\frac{f'(x)}{g'(x)}$ (provided the latter limit exists). This helps evaluate tricky limits.
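The limit definition suggests a simple numerical check; a sketch using a central difference (the step size $h$ is an ad hoc choice):

```python
import math

# Central-difference approximation of f'(x) from the limit definition.
def derivative(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

d = derivative(math.sin, 1.0)            # should be near cos(1) ≈ 0.5403

# L'Hôpital spot check: lim_{x->0} (e^x - 1)/x = e^0 / 1 = 1.
ratio = (math.exp(1e-6) - 1) / 1e-6      # near 1
```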
- Mean Value Theorem (MVT): If $f(x)$ is continuous on $[a,b]$ and differentiable on $(a,b)$, then there exists some $c\in(a,b)$ such that the instantaneous slope at $c$ equals the average slope over $[a,b]$: $\displaystyle f'(c) = \frac{f(b)-f(a)}{b-a}.$ A special case is Rolle’s Theorem: if additionally $f(a)=f(b)$, then $\exists\, c$ with $f'(c)=0$ (a stationary point in $(a,b)$). These theorems guarantee at least one such point in the interval.
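A worked instance may help: for $f(x)=x^2$ on $[0,2]$ the average slope is $2$, and $f'(c)=2c=2$ gives $c=1\in(0,2)$. The same arithmetic as a tiny sketch:

```python
# MVT for f(x) = x^2 on [0, 2]: average slope (f(2) - f(0)) / 2 = 2,
# and f'(c) = 2c = 2 gives c = 1, which lies inside (0, 2).
f = lambda x: x ** 2
a, b = 0.0, 2.0
avg_slope = (f(b) - f(a)) / (b - a)   # 2.0
c = avg_slope / 2                     # since f'(x) = 2x
```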
- Maxima and Minima: Critical points occur where $f'(x)=0$ or is undefined. For a twice-differentiable function, $f''(x)>0$ at a critical point implies a local minimum, and $f''(x)<0$ implies a local maximum. If $f''(x)=0$, higher-order tests or monotonicity analysis is needed. Endpoints of a closed interval should also be checked for absolute extrema.
- Integration: The antiderivative or indefinite integral $F(x)=\int f(x)\,dx$ satisfies $F'(x)=f(x)$. Key rules: linearity $\int [af(x)+bg(x)]\,dx = a\int f(x)\,dx + b\int g(x)\,dx$, power rule $\int x^n\,dx = \frac{x^{n+1}}{n+1}+C$ (for $n\neq -1$), $\int e^x\,dx = e^x + C$, $\int \sin x\,dx = -\cos x + C$, $\int \cos x\,dx = \sin x + C$, etc. Integration by parts: $\displaystyle \int u\,dv = uv - \int v\,du$ (choose $u$ to be the factor that simplifies upon differentiation). Trigonometric integrals and partial fractions are common techniques for more complex integrals. The definite integral $\int_a^b f(x)\,dx$ gives the area under the curve and is evaluated by the Fundamental Theorem of Calculus: $\displaystyle \int_a^b f(x)\,dx = F(b)-F(a),$ where $F'(x)=f(x)$. Properties: $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$; if $a<b<c$, then $\int_a^c f = \int_a^b f + \int_b^c f$. Improper integrals (infinite limits or discontinuous integrands) are evaluated as limits of definite integrals.
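The Fundamental Theorem can be sanity-checked numerically; a midpoint-rule sketch (the rule and step count are arbitrary choices):

```python
import math

# Midpoint-rule approximation of a definite integral. By the FTC,
# the integral of cos(x) on [0, pi/2] equals sin(pi/2) - sin(0) = 1.
def midpoint_integral(f, a, b, n=10_000):
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

area = midpoint_integral(math.cos, 0.0, math.pi / 2)   # close to 1
```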
- Improper Integrals & Convergence: $\int_a^{\infty} f(x)\,dx = \lim_{M\to\infty}\int_a^M f(x)\,dx$ (the integral converges if this limit exists and is finite). The same limiting approach applies to integrals with vertical asymptotes in the interval. A common benchmark is $\int_1^\infty \frac{1}{x^p}\,dx$, which converges for $p>1$ and diverges for $p\le1$.
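The $p$-integral benchmark can be probed by truncating at a large $M$; a sketch (the truncation point and grid are arbitrary, and the truncation error is roughly $M^{1-p}/(p-1)$):

```python
# Approximate the integral of x^(-p) over [1, M] by a midpoint sum; for
# p = 2 the full improper integral is 1/(p-1) = 1, truncated value 1 - 1/M.
def truncated_p_integral(p, M=10_000.0, n=100_000):
    h = (M - 1.0) / n
    return sum((1.0 + (i + 0.5) * h) ** (-p) for i in range(n)) * h

approx = truncated_p_integral(2.0)   # close to 1
```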
- Multivariable Calculus: For a function of several variables $z=f(x,y,\dots)$, partial derivatives measure the rate of change in one direction while holding the others constant. For example: $\displaystyle f_x(x,y) = \frac{\partial f}{\partial x} = \lim_{h\to0}\frac{f(x+h,y)-f(x,y)}{h}$ (treating $y$ as constant). Similarly $f_y=\partial f/\partial y$ is defined. Higher-order partials like $f_{xx}, f_{xy}$ are obtained by successive differentiation. The gradient $\nabla f(x,y) = (f_x,\,f_y)$ points in the direction of steepest increase of $f$. Critical points of $f(x,y)$ occur where $\nabla f = (0,0)$. For a differentiable $f(x,y)$, the tangent plane at $(a,b)$ is given by $z \approx f(a,b) + f_x(a,b)(x-a)+f_y(a,b)(y-b)$. Multiple Integrals: The double integral $\displaystyle \iint_R f(x,y)\,dx\,dy$ gives the volume under $f$ over region $R$. It can be computed as an iterated integral: $\int_{x=x_1}^{x_2}\!\int_{y=y_1(x)}^{y_2(x)} f(x,y)\,dy\,dx$ (or with the order swapped). For example, over the rectangle $[a,b]\times[c,d]$, $\iint_{[a,b]\times[c,d]} f(x,y)\,dx\,dy = \int_a^b\int_c^d f(x,y)\,dy\,dx$. Change of variables: in polar coordinates $(r,\theta)$, with $x=r\cos\theta,\;y=r\sin\theta$, the area element $dx\,dy$ becomes $r\,dr\,d\theta$ (the Jacobian). Thus $\iint f(x,y)\,dx\,dy = \int_{\theta=\theta_1}^{\theta_2}\int_{r=r_1}^{r_2} f(r\cos\theta,\,r\sin\theta)\,r\,dr\,d\theta$. Similarly, triple integrals can be evaluated in Cartesian or other coordinate systems (with the appropriate Jacobian determinants).
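The polar change of variables can be illustrated by computing the area of the unit disk: with integrand 1, $\int_0^{2\pi}\!\int_0^1 r\,dr\,d\theta=\pi$. A sketch:

```python
import math

# Area of the unit disk via polar coordinates: the theta-integral of the
# constant integrand contributes 2*pi, leaving 2*pi * (integral of r dr) = pi.
def unit_disk_area(n_r=1000):
    dr = 1.0 / n_r
    radial_sum = sum((i + 0.5) * dr for i in range(n_r)) * dr  # midpoint rule
    return 2 * math.pi * radial_sum

area = unit_disk_area()   # close to pi
```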
Linear Algebra
- Matrices and Vectors: An $m\times n$ matrix $A=[a_{ij}]$ represents a linear transformation. Matrix operations: addition and scalar multiplication (elementwise), and matrix multiplication (row-by-column dot products). Matrix multiplication is associative and distributive but not commutative in general. The $n\times n$ identity matrix $I_n$ has ones on the diagonal and zeros elsewhere (acts as multiplicative identity: $AI=IA=A$). A zero matrix has all entries 0. A matrix is symmetric if $A=A^T$. Two useful properties: $(A^T)^T = A$ and $(AB)^T = B^T A^T$.
- Determinants: For a square matrix $A$, $\det(A)$ is a scalar characterizing certain properties. Key properties: $\det(AB)=\det(A)\det(B)$ and $\det(A^T)=\det(A)$. The determinant of an $n\times n$ matrix can be computed by cofactor expansion or via row reduction (with careful sign/scale tracking). If any two rows are identical, $\det=0$. Swapping two rows multiplies $\det$ by $-1$; scaling a row by $c$ scales $\det$ by $c$. A matrix is singular (non-invertible) iff $\det(A)=0$. Conversely, if $\det(A)\neq0$, $A$ is invertible (has a unique inverse $A^{-1}$ satisfying $AA^{-1}=I$). For $2\times2$: $\det\begin{pmatrix}a&b\\c&d\end{pmatrix}=ad - bc$. For $3\times3$: $\det\begin{pmatrix}a&b&c\\d&e&f\\g&h&i\end{pmatrix}=a(ei - fh) - b(di - fg) + c(dh - eg)$.
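The cofactor expansion is easy to code directly for small matrices; a sketch (recursive and $O(n!)$, so use a linear-algebra library for real work):

```python
# Determinant by cofactor expansion along the first row.
def det(m):
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j, with alternating signs.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

d2 = det([[1, 2], [3, 4]])                   # 1*4 - 2*3 = -2
d3 = det([[1, 0, 2], [0, 3, 0], [4, 0, 5]])  # 1*(15-0) - 0 + 2*(0-12) = -9
```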
- Inverse of a Matrix: $A^{-1}$ exists only if $\det(A)\neq0$. For $2\times2$, with $A=\begin{pmatrix}a&b\\c&d\end{pmatrix}$, $A^{-1}=\frac{1}{ad-bc}\begin{pmatrix}d & -b\\ -c & a\end{pmatrix}$. In general, inverses can be found via the adjugate (the transposed matrix of cofactors) or by augmenting with $I$ and performing Gaussian elimination. If $AX=B$, the solution is $X=A^{-1}B$.
- Systems of Linear Equations: $A\mathbf{x}=\mathbf{b}$ has a solution if $\mathbf{b}$ is in the column space of $A$. The system is consistent iff $\mathrm{rank}(A)=\mathrm{rank}([A|\mathbf{b}])$. If $\det(A)\neq0$ (for a square system), there is a unique solution. If $\det(A)=0$, either no solution or infinitely many (dependent equations). Cramer’s rule: for $n$ independent equations, the solution components $x_i=\frac{\det(A_i)}{\det(A)}$, where $A_i$ is $A$ with its $i$-th column replaced by $\mathbf{b}$. (Useful for theoretical purposes; not used for large $n$ in practice.)
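Cramer’s rule for a $2\times2$ system can be written out explicitly; a sketch with made-up coefficients:

```python
# Solve 2x + y = 5, x + 3y = 10 by Cramer's rule (requires det(A) != 0).
a11, a12, a21, a22 = 2.0, 1.0, 1.0, 3.0
b1, b2 = 5.0, 10.0

det_a = a11 * a22 - a12 * a21        # det(A) = 5
x = (b1 * a22 - a12 * b2) / det_a    # column 1 of A replaced by b
y = (a11 * b2 - b1 * a21) / det_a    # column 2 of A replaced by b
```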
- Eigenvalues and Eigenvectors: For an $n\times n$ matrix $A$, an eigenpair $(\lambda,\mathbf{v}\neq\mathbf{0})$ satisfies $A\mathbf{v} = \lambda \mathbf{v}$. Here $\lambda$ is an eigenvalue and $\mathbf{v}$ the corresponding eigenvector. To find eigenvalues, solve the characteristic equation $\det(A-\lambda I)=0$ (an $n$th-degree polynomial in $\lambda$). Eigenvectors are found by solving $(A-\lambda I)\mathbf{v}=\mathbf{0}$ for each eigenvalue. Important facts: the trace of $A$ (sum of diagonal entries) equals the sum of the eigenvalues, and $\det(A)$ equals their product (counting algebraic multiplicity). That is, if the eigenvalues are $\lambda_1,\lambda_2,\dots,\lambda_n$, then $\displaystyle \operatorname{tr}(A)=\sum_{i=1}^n \lambda_i$ and $\displaystyle \det(A)=\prod_{i=1}^n \lambda_i.$ If $A$ is invertible, its eigenvalues are all non-zero and the eigenvalues of $A^{-1}$ are $1/\lambda_i$. Special matrices: a real symmetric matrix has all real eigenvalues and orthogonal eigenvectors. A matrix is diagonalizable if it has $n$ independent eigenvectors (one can form $P$ such that $P^{-1}AP=D$ is diagonal with the eigenvalues on its diagonal). Definiteness: a symmetric matrix is positive definite if all eigenvalues $\lambda_i>0$, negative definite if all $\lambda_i<0$, etc. For example, positive definiteness $\iff \mathbf{x}^TA\mathbf{x}>0$ for all nonzero $\mathbf{x}$, which occurs iff all $\lambda_i>0$.
- Examples: If $A=\begin{pmatrix}2 & 0\\0 & 3\end{pmatrix}$, the eigenvalues are $2,3$ (the diagonal entries). If $A=\begin{pmatrix}4 & 1\\2 & 3\end{pmatrix}$, solve $\det\begin{pmatrix}4-\lambda & 1\\2 & 3-\lambda\end{pmatrix}=0 \implies (4-\lambda)(3-\lambda)-2 = \lambda^2 -7\lambda +10 =0$, giving eigenvalues $\lambda=5,2$. The sum $5+2=7$ equals $\operatorname{tr}(A)=4+3$ and the product $5\cdot2=10$ equals $\det(A)$. Eigenvectors: for $\lambda=5$, solve $(A-5I)\mathbf{v}=0$, etc.
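The $2\times2$ example can be verified by applying the quadratic formula to the characteristic polynomial $\lambda^2-7\lambda+10$; a sketch:

```python
import math

# A = [[4, 1], [2, 3]]: characteristic polynomial is
# lambda^2 - tr(A)*lambda + det(A), solved by the quadratic formula.
a, b, c, d = 4.0, 1.0, 2.0, 3.0
tr = a + d                               # 7
det_a = a * d - b * c                    # 10
disc = math.sqrt(tr ** 2 - 4 * det_a)    # discriminant: sqrt(49 - 40) = 3
lam1 = (tr + disc) / 2                   # 5
lam2 = (tr - disc) / 2                   # 2
```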
- Matrix Decompositions: Any matrix can be reduced to echelon form by Gaussian elimination; the number of non-zero rows equals the rank. More advanced factorizations exist: LU decomposition factors $A$ into a product of a lower and an upper triangular matrix (useful for solving systems efficiently), QR decomposition factors $A=QR$ (orthogonal $Q$, upper-triangular $R$), etc. For symmetric matrices, the spectral decomposition expresses $A = Q \Lambda Q^T$ with $Q$ holding orthonormal eigenvectors and $\Lambda$ diagonal with the eigenvalues.
Probability & Statistics
- Probability Basics: An experiment has sample space $S$ of outcomes. An event $A\subseteq S$ has probability $P(A)$ between 0 and 1, with $P(S)=1$ and $P(\emptyset)=0$. If events $A$ and $B$ are mutually exclusive (disjoint), $P(A\cup B)=P(A)+P(B)$. In general, the addition law: $P(A\cup B)=P(A)+P(B)-P(A\cap B)$. The complement $A^c$ satisfies $P(A^c)=1-P(A)$.
- Conditional Probability: The probability of $A$ given $B$ (assuming $P(B)>0$) is $\displaystyle P(A\mid B) = \frac{P(A\cap B)}{P(B)}$. It represents the updated likelihood of $A$ when $B$ is known to occur. Independence: Events $A$ and $B$ are independent if $P(A\cap B)=P(A)\,P(B)$ (equivalently $P(A\mid B)=P(A)$). For independent events, knowledge of one does not change the probability of the other.
- Bayes’ Theorem: This gives the probability of a cause $A$ given an effect $B$, using prior information. For two events: $\displaystyle P(A\mid B) = \frac{P(B\mid A)\,P(A)}{P(B)},$ where $P(B)=P(B\mid A)P(A)+P(B\mid\neg A)P(\neg A)$ by the law of total probability. Bayes’ theorem is especially useful when $B$ can arise from multiple mutually exclusive scenarios $A_i$: $P(A_i\mid B)=\frac{P(B\mid A_i)P(A_i)}{\sum_j P(B\mid A_j)P(A_j)}$. It underlies many inference problems (e.g. medical test accuracy, spam filtering).
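A standard Bayes computation with hypothetical numbers (the 1% prevalence, 95% sensitivity, and 90% specificity below are all invented for illustration):

```python
# P(disease | positive test) via Bayes' theorem with made-up figures.
p_d = 0.01                # prior: prevalence of the disease
p_pos_given_d = 0.95      # sensitivity
p_pos_given_nd = 0.10     # false-positive rate = 1 - specificity

# Total probability of a positive test, then Bayes' theorem.
p_pos = p_pos_given_d * p_d + p_pos_given_nd * (1 - p_d)
p_d_given_pos = p_pos_given_d * p_d / p_pos   # roughly 0.09 despite the 95%
```

Even with a sensitive test, a low prior keeps the posterior small — the classic base-rate effect.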
- Random Variables: A random variable $X$ assigns a real number to each outcome in $S$. The distribution of $X$ is described by its probability mass function (pmf) for discrete $X$ or probability density function (pdf) for continuous $X$. The expectation (mean) is the long-run average value: $\displaystyle E[X] = \sum_{x} x\,P(X=x)$ for discrete $X$, or $E[X] = \int_{-\infty}^{\infty} x\,f_X(x)\,dx$ for continuous $X$. Linear properties: $E[aX+b] = a\,E[X]+b$ and $E[X+Y]=E[X]+E[Y]$. The variance $\mathrm{Var}(X)=E[(X-\mu)^2]$ (where $\mu=E[X]$) measures spread; a convenient computational form is $\displaystyle \mathrm{Var}(X) = E[X^2] - (E[X])^2$. The standard deviation is $\sigma = \sqrt{\mathrm{Var}(X)}$. Common measures of a distribution’s center are the mean $E[X]$ and the median (the value $m$ such that $P(X\le m)=0.5$). The mode is the most likely value (the peak of the distribution). For a symmetric unimodal distribution (like the normal), $\text{mean}=\text{median}=\text{mode}$.
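Expectation and variance from a pmf, using a fair die as the worked example (the die is an illustrative choice):

```python
# E[X] = sum of x * P(X=x); Var(X) = E[X^2] - (E[X])^2, for a fair die.
pmf = {k: 1 / 6 for k in range(1, 7)}
mean = sum(x * p for x, p in pmf.items())                # 3.5
second_moment = sum(x ** 2 * p for x, p in pmf.items())  # 91/6
variance = second_moment - mean ** 2                     # 35/12, about 2.917
```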
- Discrete Distributions:
- Binomial($n,p$): Models the number of successes in $n$ independent trials with success probability $p$. $\displaystyle P(X=k)=\binom{n}{k}p^k(1-p)^{n-k}$ for $k=0,1,\dots,n$. Mean $E[X]=np$, variance $\mathrm{Var}(X)=np(1-p)$. Example: tossing 10 fair coins ($p=0.5$), the probability of exactly $k$ heads is $\binom{10}{k}/2^{10}$. The distribution is symmetric around $np$ when $p=0.5$. For general $p$, the mode (most likely $k$) is $\lfloor (n+1)p \rfloor$.
- Poisson($\lambda$): Models the count of events occurring in a fixed interval, when events happen independently at a constant average rate $\lambda$. $\displaystyle P(X=k) = \frac{e^{-\lambda}\lambda^k}{k!},\quad k=0,1,2,\dots$ Mean $E[X]=\lambda$, variance $\mathrm{Var}(X)=\lambda$. The Poisson is the limit of Binomial$(n,p)$ as $n\to\infty$, $p\to0$ with $np=\lambda$. Example: if calls arrive at 5 per hour on average, the probability of exactly 3 calls in an hour is $e^{-5}5^3/3!$. (Note: for small $\lambda$, the Poisson is skewed right; as $\lambda$ grows, it approaches symmetry.)
- Geometric($p$): Models the trial count of the first success (with success prob $p$ each trial). $P(X=k)=(1-p)^{k-1}p$ for $k=1,2,\dots$. Memoryless property: $P(X>m+n \mid X>m)=P(X>n)$. Mean $1/p$, variance $(1-p)/p^2$. (This distribution often appears in analysis of algorithms and reliability.)
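The three discrete pmfs above can be written directly with the standard library; a sketch, including the coin and call-center examples from the text:

```python
import math

# Pmfs for Binomial(n, p), Poisson(lam), and Geometric(p).
def binomial_pmf(k, n, p):
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def geometric_pmf(k, p):   # first success on trial k = 1, 2, ...
    return (1 - p) ** (k - 1) * p

heads_5_of_10 = binomial_pmf(5, 10, 0.5)   # C(10,5)/2^10 ≈ 0.2461
calls_3 = poisson_pmf(3, 5.0)              # e^-5 * 5^3/3! ≈ 0.1404
```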
- Continuous Distributions:
- Uniform($a,b$): The continuous uniform distribution on $[a,b]$ has constant density $f_X(x)=\frac{1}{b-a}$ for $x\in[a,b]$. $E[X]=\frac{a+b}{2}$, $\mathrm{Var}(X)=\frac{(b-a)^2}{12}$. As a special case, Uniform$(0,1)$ has mean $0.5$ and variance $1/12$. Many models start from an assumption of uniform randomness over an interval.
- Normal (Gaussian) $(\mu,\sigma^2)$: Density: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\Big(-\frac{(x-\mu)^2}{2\sigma^2}\Big)$ (the bell-shaped curve). Mean $E[X]=\mu$, variance $\mathrm{Var}(X)=\sigma^2$. The standard normal $Z\sim N(0,1)$ has $\mu=0,\sigma^2=1$; other normals can be standardized by $Z=\frac{X-\mu}{\sigma}$. 68–95–99.7 rule: for $N(\mu,\sigma^2)$, about 68% of values lie within $[\mu\pm\sigma]$, 95% within $[\mu\pm2\sigma]$, and 99.7% within $[\mu\pm3\sigma]$. Linear combinations of independent normals are also normal: e.g. $\bar{X} \sim N(\mu,\sigma^2/n)$ for the sample mean of $n$ i.i.d. normals. Central Limit Theorem: the sum or average of a large number of independent random variables (under mild conditions) is approximately normal, regardless of their original distribution. This justifies the normal model in many situations.
- Exponential($\lambda$): Models the waiting time between independent events occurring at rate $\lambda$. PDF: $f_X(x)=\lambda e^{-\lambda x}$ for $x\ge0$; CDF: $1-e^{-\lambda x}$. Mean $E[X]=1/\lambda$, $\mathrm{Var}(X)=1/\lambda^2$. Notably memoryless: $P(X>t+s \mid X>t)=P(X>s)$ (the future waiting time is independent of how much time has elapsed). The exponential is the continuous analogue of the geometric distribution. Example: if the average lifetime of a component is 5 years ($\lambda=0.2$ per year), the probability it lasts more than 8 years is $e^{-0.2\cdot8}\approx0.20$.
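Some spot checks for the continuous distributions; a sketch covering the component-lifetime example, memorylessness, and the 68% band of the normal via the error function:

```python
import math

# Exponential survival: P(X > x) = e^(-lam*x); component example, lam = 0.2.
p_lasts_8 = math.exp(-0.2 * 8)        # about 0.202

# Memorylessness: P(X > t+s | X > t) should equal P(X > s).
lam, t, s = 0.2, 3.0, 8.0
lhs = math.exp(-lam * (t + s)) / math.exp(-lam * t)
rhs = math.exp(-lam * s)

# Standard normal CDF Phi(z) = (1 + erf(z / sqrt(2))) / 2; check the 68% rule.
phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))
within_one_sigma = phi(1) - phi(-1)   # about 0.683
```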
- Expectation and Variance in Practice: For any distribution, the linearity of expectation is powerful: $E\left[\sum_i X_i\right]=\sum_i E[X_i]$ even if the $X_i$ are dependent. The variance of a sum is $\mathrm{Var}\big(\sum_i X_i\big)=\sum_i\mathrm{Var}(X_i)+2\sum_{i<j}\mathrm{Cov}(X_i,X_j)$; for independent variables, $\mathrm{Var}\big(\sum_i X_i\big)=\sum_i\mathrm{Var}(X_i)$. Covariance $\mathrm{Cov}(X,Y)=E[XY]-E[X]E[Y]$, and the correlation $\mathrm{Corr}(X,Y)=\frac{\mathrm{Cov}(X,Y)}{\sigma_X\sigma_Y}$ measures linear association ($\mathrm{Corr}=0$ means uncorrelated, though not necessarily independent).
- Statistical Measures: For a dataset or distribution:
- Mean ($\bar{x}$ or $\mu$): arithmetic average.
- Median: middle value (50th percentile); robust to outliers.
- Mode: most frequent value (peak of distribution).
- Variance ($\sigma^2$): average squared deviation from mean; standard deviation $\sigma$ is its square root, giving dispersion in original units.
- Range: max $-$ min, and Interquartile Range (IQR): $Q_3-Q_1$ (spread of middle 50%).
- Standard scores: $z$-score $(x-\mu)/\sigma$ measures how many SDs $x$ is above/below mean.
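The measures above map directly onto Python’s `statistics` module; a sketch on a small made-up sample (values chosen so the results come out clean):

```python
import statistics

# Descriptive statistics for a small sample.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.mean(data)        # 5
median = statistics.median(data)    # 4.5 (average of the middle pair)
mode = statistics.mode(data)        # 4 (most frequent value)
sd = statistics.pstdev(data)        # population standard deviation = 2
z = (9 - mean) / sd                 # z-score: 9 is 2 SDs above the mean
```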
Using these formulas and concepts, a student can confidently tackle problems in limits, differentiation, integration, linear algebra computations, and probability calculations. Each topic’s key formulas serve as a quick reference for solving typical engineering mathematics problems.