Chapter 16: Root Finding & Newton’s Method#

Topics Covered:

  • What is a root and why do we need to find it numerically?

  • Polynomial roots with np.roots

  • Bisection method: slow but guaranteed

  • Newton’s method: fast but requires a derivative

  • scipy.optimize.brentq: robust black-box root finder

  • scipy.optimize.fsolve: systems of nonlinear equations

  • ChE application: van der Waals equation, reaction equilibrium

Motivation: Equations You Cannot Solve by Hand#

Chemical engineering is full of equations where the unknown appears in a complicated way — inside an exponential, a logarithm, or a cubic. For example:

Van der Waals equation of state (find molar volume \(V\) at given \(T\), \(P\)):

\[\left(P + \frac{a}{V^2}\right)(V - b) = RT\]

Rearranging gives a cubic in \(V\) — three possible roots, and no simple formula for the one you want.

Antoine equation (find boiling point \(T_b\) at a given pressure \(P\)):

\[\log_{10} P = A - \frac{B}{C + T_b}\]

Solvable analytically here, but the same structure embedded in a larger model is not.

Reaction equilibrium (find equilibrium conversion \(X_e\)):

\[K_{eq}(T) = \frac{X_e^2}{(1 - X_e)^2}\]

For more complex stoichiometry the algebraic expression becomes intractable.

The strategy in all cases: rewrite as \(f(x) = 0\) and find the root numerically.

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import brentq, fsolve

16.1 What Is a Root?#

A root (or zero) of a function \(f(x)\) is any value \(x^*\) where:

\[f(x^*) = 0\]

Almost any equation can be rewritten in this form. For example:

| Original equation | Rewrite as \(f(x) = 0\) |
|---|---|
| \(x^2 = 4\) | \(f(x) = x^2 - 4\) |
| \(e^x = 2x + 1\) | \(f(x) = e^x - 2x - 1\) |
| \((P + a/V^2)(V-b) = RT\) | \(f(V) = (P + a/V^2)(V-b) - RT\) |

Geometrically, a root is where the curve \(y = f(x)\) crosses zero — where it intersects the x-axis.

The key idea: always plot your function first to see how many roots there are and roughly where they are.

# Example: f(x) = e^x - 2x - 1  has two roots
def f_demo(x):
    return np.exp(x) - 2*x - 1

x = np.linspace(-1.5, 2.5, 300)

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(x, f_demo(x), 'steelblue', linewidth=2.5)
ax.axhline(0, color='k', linewidth=1)

# find second root precisely
root2 = brentq(f_demo, 1, 2)
ax.scatter([0, root2], [0, 0], color='red', s=80, zorder=5)
ax.annotate(f'x* = 0', (0, 0), textcoords='offset points', xytext=(8, 12), fontsize=10)
ax.annotate(f'x* ≈ {root2:.3f}', (root2, 0), textcoords='offset points', xytext=(6, 12), fontsize=10)
ax.set_xlabel('x'); ax.set_ylabel('f(x)')
ax.set_title(r'$f(x) = e^x - 2x - 1$: two roots')
ax.set_ylim(-2, 5)
plt.tight_layout()
plt.show()

16.2 Special Case: Polynomial Roots with np.roots#

If your equation happens to be a polynomial, NumPy can find all roots at once — including complex ones — using np.roots. This is much faster than running a bracketed solver multiple times.

How it works#

np.roots finds roots by computing the eigenvalues of the companion matrix of the polynomial. For a degree-\(n\) polynomial, the companion matrix is an \(n \times n\) matrix whose eigenvalues are exactly the roots. This is a linear algebra approach — no bracketing and no initial guess needed (the eigenvalue solver does iterate internally, but it requires no input from you).

Syntax: pass the coefficients from highest to lowest degree:

\[p(x) = c_0 x^n + c_1 x^{n-1} + \cdots + c_n \quad \Rightarrow \quad \texttt{np.roots([}c_0,\, c_1,\, \ldots,\, c_n\texttt{])}\]
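For instance, the quadratic \(x^2 - 4 = 0\) from the table above has coefficients `[1, 0, -4]` — the explicit 0 for the missing \(x\) term is required:

```python
import numpy as np

# x^2 - 4 = 0  →  coefficients [1, 0, -4] (highest degree first)
roots = np.sort(np.roots([1, 0, -4]))
print(roots)   # roots are -2 and 2
```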

Example: van der Waals as a cubic polynomial#

Expanding the van der Waals equation \((P + a/V^2)(V - b) = RT\) gives a cubic in \(V\):

\[PV^3 - (Pb + RT)V^2 + aV - ab = 0\]

We can find all three roots directly with np.roots and pick the physically meaningful one.

R_gas = 0.08314   # L·bar / (mol·K)
a_co2 = 3.658     # L²·bar / mol²
b_co2 = 0.04286   # L/mol
T_co2 = 310.0     # K
P_co2 = 70.0      # bar

# Cubic coefficients: P*V^3 - (P*b + R*T)*V^2 + a*V - a*b = 0
coeffs = [
    P_co2,
    -(P_co2 * b_co2 + R_gas * T_co2),
     a_co2,
    -a_co2 * b_co2
]

all_roots = np.roots(coeffs)

print("All roots of the van der Waals cubic:")
for i, r in enumerate(all_roots):
    if np.isreal(r):
        print(f"  Root {i+1}: V = {r.real:.5f} L/mol  (real)")
    else:
        print(f"  Root {i+1}: V = {r}  (complex — not physical)")

# The physical root is the real positive root greater than b
physical = [r.real for r in all_roots if np.isreal(r) and r.real > b_co2]
print(f"\nPhysical molar volume: V = {physical[0]:.5f} L/mol")
All roots of the van der Waals cubic:
  Root 1: V = 0.21940 L/mol  (real)
  Root 2: V = (0.09582797990203343+0.032026426303389866j)  (complex — not physical)
  Root 3: V = (0.09582797990203343-0.032026426303389866j)  (complex — not physical)

Physical molar volume: V = 0.21940 L/mol

Under the hood: why eigenvalues = roots?#

Start with a degree-2 polynomial \(p(x) = x^2 + a_1 x + a_2\).

Its companion matrix is the \(2 \times 2\) matrix:

\[\begin{split}C = \begin{bmatrix} -a_1 & -a_2 \\ 1 & 0 \end{bmatrix}\end{split}\]

Now compute the characteristic polynomial of \(C\) — i.e., set \(\det(C - \lambda I) = 0\):

\[\begin{split}\det\begin{bmatrix} -a_1 - \lambda & -a_2 \\ 1 & -\lambda \end{bmatrix} = (-a_1 - \lambda)(-\lambda) - (-a_2)(1)\end{split}\]
\[= \lambda^2 + a_1 \lambda + a_2\]

That is exactly \(p(\lambda)\). So \(\det(C - \lambda I) = 0\) is the same equation as \(p(\lambda) = 0\) — the eigenvalues of \(C\) are the roots of \(p\).

The same construction works for any degree \(n\): the companion matrix is always built so that its characteristic polynomial equals the original polynomial.

General form for a monic degree-\(n\) polynomial \(x^n + a_1 x^{n-1} + \cdots + a_n\):

\[\begin{split}C = \begin{bmatrix} -a_1 & -a_2 & \cdots & -a_{n-1} & -a_n \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix}\end{split}\]

The code below demonstrates this with a quadratic first, then the cubic.

# --- 2x2 demo: p(x) = x^2 - 5x + 6  (roots are 2 and 3) ---
# a1 = -5, a2 = 6
C2 = np.array([[ 5., -6.],
               [ 1.,  0.]])

print("2x2 companion matrix C:")
print(C2)
print(f"det(C - lI) = l^2 - 5l + 6  =>  eigenvalues: {np.linalg.eigvals(C2).real}")
print(f"np.roots result:                              {np.sort(np.roots([1, -5, 6]).real)}")

print()

# --- cubic demo: p(x) = x^3 - 6x^2 + 11x - 6  (roots are 1, 2, 3) ---
c = np.array([1., -6., 11., -6.])
n = len(c) - 1
C3 = np.zeros((n, n))
C3[0, :] = -c[1:]              # first row: 6, -11, 6
C3[1:, :-1] = np.eye(n - 1)    # sub-diagonal of 1s

print("3x3 companion matrix C:")
print(C3)
print(f"eigvals(C) result: {np.sort(np.linalg.eigvals(C3).real)}")
print(f"np.roots result:   {np.sort(np.roots(c).real)}")
print(f"True roots:        [1. 2. 3.]")
2x2 companion matrix C:
[[ 5. -6.]
 [ 1.  0.]]
det(C - lI) = l^2 - 5l + 6  =>  eigenvalues: [3. 2.]
np.roots result:                              [2. 3.]

3x3 companion matrix C:
[[  6. -11.   6.]
 [  1.   0.   0.]
 [  0.   1.   0.]]
eigvals(C) result: [1. 2. 3.]
np.roots result:   [1. 2. 3.]
True roots:        [1. 2. 3.]

16.3 Bisection Method#

Mathematical background#

The bisection method is grounded in the Intermediate Value Theorem (IVT) from calculus:

If \(f\) is continuous on \([a, b]\) and \(f(a)\) and \(f(b)\) have opposite signs, then there exists at least one root \(x^* \in (a, b)\).

The sign change \(f(a) \cdot f(b) < 0\) is not just a programming check — it is the mathematical guarantee that a root exists in the interval.

Error analysis:

After \(n\) iterations, the interval has width \((b-a)/2^n\), so the root is known to within:

\[|x_n - x^*| \leq \frac{b - a}{2^n}\]

To achieve a tolerance \(\epsilon\), you need at least \(n \geq \log_2\!\left(\dfrac{b-a}{\epsilon}\right)\) iterations.

For example, starting from an interval of width 1 and wanting \(\epsilon = 10^{-8}\): you need \(\log_2(10^8) \approx 27\) iterations. That is slow but completely predictable — this is why bisection is called linearly convergent.
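The iteration-count formula is easy to check in code; this short helper reproduces the ≈27-iteration estimate above:

```python
import numpy as np

def bisection_iters(a, b, eps):
    """Minimum number of bisection steps so that (b - a) / 2^n <= eps."""
    return int(np.ceil(np.log2((b - a) / eps)))

print(bisection_iters(0.0, 1.0, 1e-8))   # 27
```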

Algorithm:

  1. Start with \([a, b]\) where \(f(a) \cdot f(b) < 0\)

  2. Compute midpoint \(m = (a + b)/2\)

  3. If \(f(a) \cdot f(m) < 0\): root is in \([a, m]\) → set \(b = m\); else set \(a = m\)

  4. Repeat until \((b - a)/2 < \epsilon\)

Strengths: Always works if the bracket is valid. No derivative needed. Error bound is known in advance.

Weaknesses: Slow (\(\sim 27\) iterations for 8 decimal places). Needs a bracket.

Visual: bisection step by step#

Each panel below shows one iteration of the algorithm:

  • The current bracket \([a, b]\) is shaded in red (the region we know contains the root)

  • The midpoint \(m = (a+b)/2\) is shown as a dashed vertical line

  • Based on the sign of \(f(a) \cdot f(m)\), we keep one half and discard the other

def f_demo(x):
    return np.exp(x) - 2*x - 1

a0, b0 = 1.0, 2.0   # initial bracket
n_panels = 6

# Run bisection and record state BEFORE each step
states = []   # (a, b, m)
a, b = a0, b0
for _ in range(n_panels):
    m = (a + b) / 2
    states.append((a, b, m))
    if f_demo(a) * f_demo(m) < 0:
        b = m
    else:
        a = m

fig, axes = plt.subplots(2, 3, figsize=(13, 7.5))
axes = axes.flatten()
x_plot = np.linspace(0.8, 2.2, 400)

for k, (ak, bk, mk) in enumerate(states):
    ax = axes[k]

    # function curve
    ax.plot(x_plot, f_demo(x_plot), "steelblue", lw=2)
    ax.axhline(0, color="k", lw=0.8)

    # shade current bracket
    ax.axvspan(ak, bk, alpha=0.15, color="red")

    # midpoint vertical line + dot
    ax.axvline(mk, color="red", ls="--", lw=1.5)
    ax.plot(mk, f_demo(mk), "ro", ms=7, zorder=5)

    # label a and b on x-axis with their values
    ax.plot(ak, 0, "k<", ms=7, clip_on=False)
    ax.plot(bk, 0, "k>", ms=7, clip_on=False)
    ax.annotate(f"$a$={ak:.4f}", (ak, 0), xytext=(0, -22),
                textcoords="offset points", ha="center", fontsize=7.5, color="black")
    ax.annotate(f"$b$={bk:.4f}", (bk, 0), xytext=(0, -22),
                textcoords="offset points", ha="center", fontsize=7.5, color="black")

    # label m with its value on the x-axis
    ax.annotate(f"$m$={mk:.4f}", (mk, 0), xytext=(0, -35),
                textcoords="offset points", ha="center", fontsize=7.5, color="red")

    # drop a vertical line from f(m) down to x-axis to show f(m) value
    ax.vlines(mk, 0, f_demo(mk), color="red", lw=1, ls=":", alpha=0.6)
    ax.annotate(f"$f(m)$={f_demo(mk):.3f}", (mk, f_demo(mk)),
                xytext=(8, 4), textcoords="offset points", fontsize=7.5, color="red")

    # decision logic annotation
    if f_demo(ak) * f_demo(mk) < 0:
        decision = (f"$m = ({ak:.3f}+{bk:.3f})/2 = {mk:.4f}$\n"
                    f"$f(a)={f_demo(ak):.3f},\\ f(m)={f_demo(mk):.3f}$\n"
                    f"same sign as $f(a)$? No → $b \\leftarrow m={mk:.4f}$")
    else:
        decision = (f"$m = ({ak:.3f}+{bk:.3f})/2 = {mk:.4f}$\n"
                    f"$f(a)={f_demo(ak):.3f},\\ f(m)={f_demo(mk):.3f}$\n"
                    f"same sign as $f(a)$? Yes → $a \\leftarrow m={mk:.4f}$")

    ax.set_title(f"Step {k+1}  |  width = {bk-ak:.4f}", fontsize=9, fontweight="bold")
    ax.text(0.98, 0.97, decision, transform=ax.transAxes,
            fontsize=7, va="top", ha="right",
            bbox=dict(boxstyle="round,pad=0.3", fc="lightyellow", alpha=0.85))

    ax.set_xlim(0.8, 2.2)
    ax.set_ylim(-1.2, 3.5)
    ax.set_xlabel("x", fontsize=8)
    ax.set_ylabel("f(x)", fontsize=8)

plt.suptitle(r"Bisection on $f(x)=e^x-2x-1$, bracket $[1,\ 2]$", fontsize=12, y=1.01)
plt.tight_layout()
plt.show()
from scipy.optimize import bisect

def bisection(f, a, b, tol=1e-8, max_iter=100):
    """Find root of f in [a, b] by bisection."""
    assert f(a) * f(b) < 0, "f(a) and f(b) must have opposite signs"

    history = []   # track midpoints for visualization
    for i in range(max_iter):
        m = (a + b) / 2.0
        history.append(m)
        if abs(f(m)) < tol or (b - a) / 2 < tol:
            break
        if f(a) * f(m) < 0:
            b = m
        else:
            a = m
    return m, history

# Find the root of f(x) = e^x - 2x - 1  near x = 1.2
root_bis, history = bisection(f_demo, a=1.0, b=2.0)

print(f"Root found: x* = {root_bis:.10f}")
print(f"f(x*)      = {f_demo(root_bis):.2e}")
print(f"Iterations : {len(history)}")

# Plot convergence
errors = [abs(m - root_bis) for m in history]
fig, ax = plt.subplots(figsize=(7, 4))
ax.semilogy(range(1, len(errors)+1), errors, 'o-', color='steelblue', linewidth=2, markersize=5)
ax.set_xlabel('Iteration'); ax.set_ylabel('|error|  (log scale)')
ax.set_title('Bisection convergence: error halves each step')
plt.tight_layout()
plt.show()
Root found: x* = 1.2564312071
f(x*)      = -2.37e-09
Iterations : 26

(Optional) Using scipy.optimize.bisect#

SciPy provides a built-in bisection solver so you do not need to implement it yourself:

from scipy.optimize import bisect

root = bisect(f, a, b)              # basic usage
root = bisect(f, a, b, xtol=1e-12)  # tighter tolerance
root = bisect(f, a, b, args=(p1,))  # pass extra parameters to f

Requirements: same as hand-coded bisection — \(f(a)\) and \(f(b)\) must have opposite signs.

The result is identical to our hand-coded version, but bisect is optimized internally and handles edge cases for you.

# scipy.optimize.bisect — same problem, one line
root_scipy = bisect(f_demo, 1.0, 2.0)
root_hand,  _ = bisection(f_demo, 1.0, 2.0)

print(f"scipy bisect:      x* = {root_scipy:.10f}")
print(f"hand-coded:        x* = {root_hand:.10f}")
print(f"difference:             {abs(root_scipy - root_hand):.2e}")
scipy bisect:      x* = 1.2564312086
hand-coded:        x* = 1.2564312071
difference:             1.57e-09

16.4 Newton’s Method#

Mathematical background#

Newton’s method comes from a first-order Taylor expansion of \(f\) around the current guess \(x_n\):

\[f(x) \approx f(x_n) + f'(x_n)(x - x_n)\]

We want \(f(x^*) = 0\), so we set the right-hand side to zero and solve for \(x\):

\[0 = f(x_n) + f'(x_n)(x_{n+1} - x_n)\]
\[\boxed{x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}}\]

Geometrically: draw the tangent line to \(f\) at \((x_n,\, f(x_n))\). The next guess \(x_{n+1}\) is where that tangent line crosses the x-axis.

Convergence: quadratic vs. linear#

The error \(e_n = x_n - x^*\) satisfies:

\[|e_{n+1}| \approx \frac{|f''(x^*)|}{2|f'(x^*)|} \, |e_n|^2\]

The error is squared at each step — this is called quadratic convergence. Concretely:

| Iteration | Approximate error |
|---|---|
| 0 | \(10^{-1}\) |
| 1 | \(10^{-2}\) |
| 2 | \(10^{-4}\) |
| 3 | \(10^{-8}\) |
| 4 | \(10^{-16}\) (machine precision) |

Compare to bisection: to reach \(10^{-16}\) from an interval of width 1 takes \(\log_2(10^{16}) \approx 53\) iterations, versus ~5 for Newton.

When Newton’s method can fail#

  • \(f'(x_n) \approx 0\): the tangent line is nearly horizontal and the next guess shoots far away

  • Poor initial guess: if \(x_0\) is far from the root, the tangent may point toward a completely different region

  • Multiple roots close together: the method may oscillate between roots

Rule of thumb: Plot \(f(x)\) first, pick \(x_0\) close to the root you want, and verify \(f'(x_0) \neq 0\).
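To see the first failure mode concretely, here is a single Newton step started next to the flat spot of this chapter's demo function (\(f'(x) = e^x - 2 = 0\) at \(x = \ln 2 \approx 0.693\)):

```python
import numpy as np

def f(x):  return np.exp(x) - 2*x - 1
def df(x): return np.exp(x) - 2     # zero at x = ln(2) ≈ 0.693

x0 = 0.70                           # just past the flat spot
x1 = x0 - f(x0) / df(x0)            # tangent is nearly horizontal
print(f"x0 = {x0},  f'(x0) = {df(x0):.4f},  next guess x1 = {x1:.2f}")
# the step shoots far from both roots (x* = 0 and x* ≈ 1.256)
```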

Newton’s method step by step#

At each step we:

  1. Evaluate \(f(x_n)\) and \(f'(x_n)\) at the current guess

  2. Draw the tangent line at \((x_n,\ f(x_n))\): \(\quad y = f(x_n) + f'(x_n)(x - x_n)\)

  3. Find where that tangent crosses zero → this is \(x_{n+1} = x_n - \dfrac{f(x_n)}{f'(x_n)}\)

  4. Stop when either:

    • \(|f(x_n)| < \epsilon\) (residual is small enough), or

    • \(|x_{n+1} - x_n| < \epsilon\) (step size is negligible)

Each panel below shows one iteration with the numbers filled in.

def f(x):  return np.exp(x) - 2*x - 1
def df(x): return np.exp(x) - 2

x0 = 2.0
n_panels = 6
colors = ["darkorange", "seagreen", "purple", "firebrick", "teal", "saddlebrown"]

# Collect iterates
iterates = [x0]
for _ in range(n_panels):
    xn = iterates[-1]
    iterates.append(xn - f(xn) / df(xn))

true_root = brentq(f, 1.0, 2.0)
x_plot = np.linspace(0.8, 2.3, 400)

fig, axes = plt.subplots(2, 3, figsize=(14, 8))
axes = axes.flatten()

for k in range(n_panels):
    ax = axes[k]
    xn  = iterates[k]
    xn1 = iterates[k + 1]
    fn  = f(xn)
    dfn = df(xn)
    color = colors[k]

    # --- curve ---
    ax.plot(x_plot, f(x_plot), "steelblue", lw=2, zorder=2)
    ax.axhline(0, color="k", lw=0.8)
    ax.axvline(true_root, color="gray", lw=0.8, ls=":", alpha=0.6)

    # --- tangent line through (xn, fn) ---
    # extends a bit left and right so it clearly crosses zero
    x_left  = min(xn, xn1) - 0.15
    x_right = max(xn, xn1) + 0.15
    x_tan = np.linspace(x_left, x_right, 100)
    y_tan = fn + dfn * (x_tan - xn)
    ax.plot(x_tan, y_tan, "--", color=color, lw=1.8, zorder=3)

    # --- vertical drop from (xn, fn) to x-axis ---
    ax.vlines(xn, 0, fn, color=color, lw=1.2, ls=":", alpha=0.7)

    # --- current point (xn, f(xn)) ---
    ax.plot(xn, fn, "o", color=color, ms=8, zorder=5)

    # --- next guess xn1 on x-axis ---
    ax.plot(xn1, 0, "^", color=color, ms=9, zorder=6)

    # --- x-axis labels ---
    ax.annotate(f"$x_{k}$\n{xn:.5f}", (xn, 0),
                xytext=(0, -30), textcoords="offset points",
                ha="center", fontsize=8, color=color)
    ax.annotate(f"$x_{{{k+1}}}$\n{xn1:.5f}", (xn1, 0),
                xytext=(0, 14), textcoords="offset points",
                ha="center", fontsize=8, color=color)

    # --- annotation box: show the arithmetic ---
    box_text = (
        f"$x_{k} = {xn:.5f}$\n"
        f"$f(x_{k}) = {fn:.5f}$\n"
        f"$f'(x_{k}) = {dfn:.5f}$\n"
        f"$x_{{{k+1}}} = {xn:.4f} - {fn:.4f}/{dfn:.4f}$\n"
        f"$x_{{{k+1}}} = {xn1:.6f}$"
    )
    ax.text(0.97, 0.97, box_text, transform=ax.transAxes,
            fontsize=7.5, va="top", ha="right",
            bbox=dict(boxstyle="round,pad=0.35", fc="lightyellow", alpha=0.9))

    ax.set_title(f"Step {k+1}:  $|error|$ = {abs(xn1 - true_root):.2e}", fontsize=9, fontweight="bold")
    ax.set_xlim(0.8, 2.3)
    ax.set_ylim(-1.5, 4.5)
    ax.set_xlabel("x", fontsize=8)
    ax.set_ylabel("f(x)", fontsize=8)

plt.suptitle(
    r"Newton's method on $f(x)=e^x-2x-1$, starting from $x_0=2.0$",
    fontsize=12, y=1.01
)
plt.tight_layout()
plt.show()


from scipy.optimize import newton

def f(x):   
    return np.exp(x) - 2*x - 1
def df(x):  
    return np.exp(x) - 2        # f'(x) = e^x - 2

# Run 4 Newton steps from x0 = 2.0
x0 = 2.0
iterates = [x0]
for _ in range(4):
    xn = iterates[-1]
    xn2 = xn - f(xn) / df(xn)
    iterates.append(xn2)

print(f"{'Step':>4}  {'x_n':>14}  {'f(x_n)':>14}  {'|error|':>14}")
print("-" * 52)
true_root = brentq(f, 1, 2)
for i, xi in enumerate(iterates):
    print(f"  {i:>2}  {xi:>14.10f}  {f(xi):>14.2e}  {abs(xi - true_root):>14.2e}")
Step             x_n          f(x_n)         |error|
----------------------------------------------------
   0    2.0000000000        2.39e+00        7.44e-01
   1    1.5566837578        6.30e-01        3.00e-01
   2    1.3271236710        1.16e-01        7.07e-02
   3    1.2616298127        7.91e-03        5.20e-03
   4    1.2564623176        4.71e-05        3.11e-05

(Optional) Using scipy.optimize.newton#

SciPy also provides a built-in Newton solver:

from scipy.optimize import newton

root = newton(f, x0)           # secant method (no derivative needed)
root = newton(f, x0, fprime=df)  # full Newton (faster, uses derivative)
root = newton(f, x0, fprime=df, tol=1e-12)  # tighter tolerance

  • Without fprime: SciPy uses the secant method — approximates the derivative from two nearby points

  • With fprime: uses true Newton’s method with the analytical derivative

Providing fprime is faster and more reliable when you can compute \(f'(x)\).

# scipy.optimize.newton — same problem, one line
root_secant = newton(f, x0=2.0)              # secant method (no fprime)
root_newton = newton(f, x0=2.0, fprime=df)   # full Newton with derivative

print(f"scipy newton (secant):   x* = {root_secant:.10f}")
print(f"scipy newton (Newton):   x* = {root_newton:.10f}")
print(f"hand-coded Newton:       x* = {true_root:.10f}")
print(f"difference (secant vs Newton): {abs(root_secant - root_newton):.2e}")
scipy newton (secant):   x* = 1.2564312086
scipy newton (Newton):   x* = 1.2564312086
hand-coded Newton:       x* = 1.2564312086
difference (secant vs Newton): 4.44e-16

16.4.1 Newton vs. bisection: convergence comparison#

Notice how Newton’s error shrinks quadratically — the curve drops steeply compared to bisection’s straight linear decay on a log plot.

def newton_method(f, df, x0, tol=1e-12, max_iter=30):
    """Hand-coded Newton iteration (renamed so it does not shadow scipy.optimize.newton)."""
    x = x0
    history = [x]
    for _ in range(max_iter):
        x = x - f(x) / df(x)
        history.append(x)
        if abs(f(x)) < tol:
            break
    return x, history

_, bis_hist  = bisection(f, 1.0, 2.0, tol=1e-12)
_, newt_hist = newton_method(f, df, x0=2.0)

bis_errors  = [abs(m - true_root) for m in bis_hist]
newt_errors = [abs(m - true_root) for m in newt_hist]

fig, ax = plt.subplots(figsize=(8, 4))
ax.semilogy(range(len(bis_errors)),  bis_errors,  'o-', color='steelblue',
            linewidth=2, markersize=5, label=f'Bisection  ({len(bis_errors)} iters)')
ax.semilogy(range(len(newt_errors)), newt_errors, 's-', color='darkorange',
            linewidth=2, markersize=5, label=f"Newton  ({len(newt_errors)} iters)")
ax.set_xlabel('Iteration'); ax.set_ylabel('|error|  (log scale)')
ax.set_title('Bisection vs. Newton convergence')
ax.legend()
plt.tight_layout()
plt.show()

16.5 scipy.optimize.brentq: The Practical Choice#

Mathematical background#

Brent’s method (1973) is the algorithm inside brentq. It is a hybrid that switches intelligently between three techniques:

  1. Bisection — always makes progress, but slow (\(O(1/2^n)\))

  2. Secant method — approximates the derivative using two recent points instead of computing it analytically: \(x_{n+1} = x_n - f(x_n)\,\dfrac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})}\). This is like Newton's method but replaces \(f'(x_n)\) with a finite-difference slope. Converges superlinearly (order ≈ 1.6) but has no bracket guarantee on its own.

  3. Inverse quadratic interpolation — fits a quadratic through three recent \((f, x)\) pairs and uses its inverse to predict the root. Even faster than the secant method when \(f\) is smooth.

At each step, Brent’s method tries inverse quadratic interpolation or the secant step first. If either would step outside the current bracket, it falls back to bisection. This gives near-Newton speed with bisection’s guarantee — the best of both worlds.
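The secant update is simple enough to sketch in a few lines — a minimal illustration of the idea, not Brent's actual safeguarded implementation:

```python
import numpy as np

def secant(f, x0, x1, tol=1e-12, max_iter=50):
    """Bare secant iteration: Newton with a finite-difference slope."""
    for _ in range(max_iter):
        f0, f1 = f(x0), f(x1)
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)   # secant step
        if abs(x2 - x1) < tol:
            return x2
        x0, x1 = x1, x2
    return x1

# Same demo function as earlier in the chapter
root = secant(lambda x: np.exp(x) - 2*x - 1, 1.0, 2.0)
print(f"secant root: {root:.10f}")
```

Note there is no bracket check anywhere — if the two starting points are poorly chosen, nothing stops the iterates from wandering off, which is exactly why Brent's method keeps bisection as a fallback.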

Key syntax:

from scipy.optimize import brentq

root = brentq(f, a, b)               # basic usage
root = brentq(f, a, b, xtol=1e-12)   # tighter tolerance
root = brentq(f, a, b, args=(p1,))   # pass extra parameters to f

Requirements:

  • \(f\) must be continuous on \([a, b]\)

  • \(f(a)\) and \(f(b)\) must have opposite signs — if not, you get a ValueError

  • Always plot first to confirm the sign change and locate the bracket
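And here is what happens when the bracket does not contain a sign change:

```python
from scipy.optimize import brentq

def g(x):
    return x**2 + 1        # always positive — no real root

try:
    brentq(g, -1.0, 1.0)   # f(-1) and f(1) are both positive
except ValueError as err:
    print("brentq raised ValueError:", err)
```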

16.5.1 Finding the van der Waals molar volume#

The van der Waals equation of state for CO₂ at \(T = 310\) K, \(P = 70\) bar:

\[\left(P + \frac{a}{V^2}\right)(V - b) = RT \quad \Longrightarrow \quad f(V) = \left(P + \frac{a}{V^2}\right)(V - b) - RT = 0\]

van der Waals constants for CO₂: \(a = 3.658\) L²·bar/mol², \(b = 0.04286\) L/mol.

Step 1: Always plot \(f(V)\) first to find a good bracket.

R = 0.08314   # L·bar / (mol·K)
a = 3.658     # L²·bar / mol²   (CO2)
b = 0.04286   # L/mol           (CO2)
T = 310.0     # K
P = 70.0      # bar

def f_vdw(V):
    return (P + a / V**2) * (V - b) - R * T

# Plot to find brackets (avoid V < b where the equation is undefined)
V_range = np.linspace(b + 0.001, 1.0, 500)
fV = f_vdw(V_range)


fig, ax = plt.subplots(figsize=(7, 4))
ax.plot(V_range, fV, 'steelblue', linewidth=2)
ax.axhline(0, color='k', linewidth=1)
ax.set_ylim(-5, 10)
ax.set_xlabel('Molar volume V (L/mol)')
ax.set_ylabel('f(V)')
ax.set_title('van der Waals: f(V) = 0  →  find molar volume of CO₂')
plt.tight_layout()
plt.show()

# Sign changes reveal brackets:
# for i in range(len(fV)-1):
#     if fV[i] * fV[i+1] < 0:
#         print(f"Sign change between V = {V_range[i]:.4f} and {V_range[i+1]:.4f}")
# Solve with brentq using the bracket found above; compare speed with bisect
import time

start = time.time()
V_root = brentq(f_vdw, a=0.1, b=0.5)
mid = time.time()
V_root2 = bisect(f_vdw, a=0.1, b=0.5)
end = time.time()

print(f"brentq : V = {V_root:.10f}  ({mid - start:.2e} s)")
print(f"bisect : V = {V_root2:.10f}  ({end - mid:.2e} s)")

# Compare to ideal gas: V_ideal = RT/P
V_ideal = R * T / P

print(f"van der Waals molar volume : V = {V_root:.5f} L/mol")
print(f"Ideal gas molar volume     : V = {V_ideal:.5f} L/mol")
print(f"Compressibility factor Z = PV/RT = {P*V_root/(R*T):.4f}  (1.000 = ideal)")
print(f"Residual f(V_root) = {f_vdw(V_root):.2e}")
brentq : V = 0.2193954688  (3.19e-05 s)
bisect : V = 0.2193954688  (3.89e-05 s)
van der Waals molar volume : V = 0.21940 L/mol
Ideal gas molar volume     : V = 0.36819 L/mol
Compressibility factor Z = PV/RT = 0.5959  (1.000 = ideal)
Residual f(V_root) = -3.55e-15

16.6 Systems of Nonlinear Equations: scipy.optimize.fsolve#

Mathematical background#

brentq handles one equation, one unknown. For \(n\) equations and \(n\) unknowns, we need to solve:

\[\mathbf{F}(\mathbf{x}) = \mathbf{0}, \quad \mathbf{F}: \mathbb{R}^n \to \mathbb{R}^n\]

fsolve uses Newton’s method in multiple dimensions. Recall the scalar Newton step:

\[x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}\]

In \(n\) dimensions, the derivative \(f'(x)\) becomes the Jacobian matrix \(J\), where each entry is a partial derivative:

\[J_{ij} = \frac{\partial F_i}{\partial x_j}\]

The multidimensional Newton step becomes:

\[\mathbf{x}_{n+1} = \mathbf{x}_n - J(\mathbf{x}_n)^{-1}\, \mathbf{F}(\mathbf{x}_n)\]

or equivalently, solve the linear system \(J(\mathbf{x}_n)\,\Delta\mathbf{x} = -\mathbf{F}(\mathbf{x}_n)\) and set \(\mathbf{x}_{n+1} = \mathbf{x}_n + \Delta\mathbf{x}\).

fsolve estimates \(J\) numerically via finite differences — you don’t need to derive it by hand.
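The update above can be hand-rolled for a tiny system. Here is a minimal sketch (not how fsolve is implemented internally) with an analytical Jacobian, on the toy system \(x^2 + y^2 = 1\), \(y = x\):

```python
import numpy as np

def F(v):
    x, y = v
    return np.array([x**2 + y**2 - 1.0,   # unit circle
                     y - x])              # diagonal line

def J(v):
    x, y = v
    return np.array([[2*x, 2*y],
                     [-1.0, 1.0]])

v = np.array([1.0, 0.5])                  # initial guess
for _ in range(20):
    dx = np.linalg.solve(J(v), -F(v))     # solve J Δx = -F
    v = v + dx
    if np.linalg.norm(dx) < 1e-12:        # step negligible → converged
        break

print(v)   # converges to (1/√2, 1/√2) ≈ (0.7071, 0.7071)
```

Note that the code solves the linear system rather than forming \(J^{-1}\) explicitly — the standard, cheaper, and numerically safer choice.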

Key differences from brentq#

| Property | brentq | fsolve |
|---|---|---|
| Equations | 1 | \(N\) |
| Unknowns | 1 | \(N\) |
| Needs bracket | Yes | No |
| Needs initial guess | No | Yes — critical |
| Guaranteed to converge | Yes (bracket valid) | No |
| Silent wrong answers | No | Possible — always verify |

Because fsolve has no bracket guarantee, it can converge to a wrong answer without warning. Always check residuals after solving.

Syntax:

from scipy.optimize import fsolve

def F(vars):
    x, y = vars
    return [eq1, eq2]      # list of residuals, one per equation

solution = fsolve(F, x0=[x_guess, y_guess])
print(F(solution))          # must be near [0, 0]

16.6.1 Simple example: two nonlinear equations#

Find the intersection of \(y = x^2\) and \(y = 2 - x\):

\[\begin{split}\begin{cases} y - x^2 = 0 \\ y + x - 2 = 0 \end{cases}\end{split}\]

def equations(vars):
    x, y = vars
    eq1 = y - x**2       # parabola
    eq2 = y + x - 2      # line
    return [eq1, eq2]

# Two intersection points — use different initial guesses to find each
sol1 = fsolve(equations, x0=[-2.0, 3.0])
sol2 = fsolve(equations, x0=[ 1.0, 2.0])

print(f"Solution 1: x = {sol1[0]:.6f},  y = {sol1[1]:.6f}   residuals: {equations(sol1)}")
print(f"Solution 2: x = {sol2[0]:.6f},  y = {sol2[1]:.6f}   residuals: {equations(sol2)}")

# Visualize
x_p = np.linspace(-2.5, 2.0, 300)
fig, ax = plt.subplots(figsize=(5, 5))
ax.plot(x_p, x_p**2,   'steelblue',  linewidth=2, label='$y = x^2$')
ax.plot(x_p, 2 - x_p,  'darkorange', linewidth=2, label='$y = 2 - x$')
ax.scatter([sol1[0], sol2[0]], [sol1[1], sol2[1]], color='red', s=80, zorder=5, label='Intersections')
ax.set_xlim(-2.5, 2.0); ax.set_ylim(-0.5, 5)
ax.set_xlabel('x'); ax.set_ylabel('y')
ax.set_title('Intersection of two curves')
ax.legend()
plt.tight_layout()
plt.show()
Solution 1: x = -2.000000,  y = 4.000000   residuals: [0.0, 0.0]
Solution 2: x = 1.000000,  y = 1.000000   residuals: [0.0, 0.0]

16.7 Chemical Engineering Application: Reaction Equilibrium#

Problem setup#

Consider a gas-phase reaction at constant temperature and pressure:

\[\text{A} \rightleftharpoons 2\,\text{B}\]

Starting with pure A (\(n_{A0} = 1\) mol, \(n_{B0} = 0\)), the equilibrium mole fractions are:

\[y_A = \frac{1 - X_e}{1 + X_e}, \quad y_B = \frac{2X_e}{1 + X_e}\]

where \(X_e\) is the equilibrium conversion (fraction of A that reacted). The equilibrium condition at \(T = 600\) K, \(P = 2\) atm is:

\[K_{eq}(T) = \frac{y_B^2}{y_A} \cdot P = \frac{\left(\dfrac{2X_e}{1+X_e}\right)^2}{\dfrac{1-X_e}{1+X_e}} \cdot P\]

with \(K_{eq} = 1.5\) at this temperature.

Goal: Find \(X_e\) by solving \(f(X_e) = K_p(X_e) - 1.5 = 0\).

P   = 2.0    # atm
Keq = 1.5   # equilibrium constant at T=600K

def Kp_expr(X):
    """K_p as a function of conversion X (must equal Keq at equilibrium)."""
    y_A = (1 - X) / (1 + X)
    y_B = (2 * X) / (1 + X)
    return (y_B**2 / y_A) * P

def f_equil(X):
    return Kp_expr(X) - Keq

# --- Step 1: plot to find bracket ---
X_arr = np.linspace(0.01, 0.99, 300)

fig, ax = plt.subplots(figsize=(7, 4))
ax.plot(X_arr, Kp_expr(X_arr), 'steelblue', linewidth=2.5, label=r'$K_p(X)$')
ax.axhline(Keq, color='red', linestyle='--', linewidth=2, label=f'$K_{{eq}} = {Keq}$')
ax.set_xlabel('Conversion $X_e$'); ax.set_ylabel(r'$K_p(X)$')
ax.set_title(r'Find $X_e$ where $K_p(X) = K_{eq}$')
ax.legend()
plt.tight_layout()
plt.show()
# --- Step 2: solve with brentq ---
X_eq = brentq(f_equil, 0.01, 0.99)

y_A_eq = (1 - X_eq) / (1 + X_eq)
y_B_eq = (2 * X_eq) / (1 + X_eq)

print(f"Equilibrium conversion:   X_e  = {X_eq:.4f}  ({100*X_eq:.1f}%)")
print(f"Mole fraction of A:       y_A  = {y_A_eq:.4f}")
print(f"Mole fraction of B:       y_B  = {y_B_eq:.4f}")
print(f"Check K_p(X_e) = {Kp_expr(X_eq):.6f}  (target: {Keq})")
print(f"Residual f(X_e) = {f_equil(X_eq):.2e}")
Equilibrium conversion:   X_e  = 0.3974  (39.7%)
Mole fraction of A:       y_A  = 0.4313
Mole fraction of B:       y_B  = 0.5687
Check K_p(X_e) = 1.500000  (target: 1.5)
Residual f(X_e) = -1.69e-12

How does equilibrium conversion change with pressure?#

Le Chatelier’s principle tells us that increasing pressure shifts A ⇌ 2B toward A (fewer moles), reducing \(X_e\). We can verify this computationally by solving the root-finding problem over a range of pressures.

pressures = np.linspace(0.5, 10.0, 100)
X_equil_list = []

for P_val in pressures:
    def f_P(X):
        y_A = (1 - X) / (1 + X)
        y_B = (2 * X) / (1 + X)
        return (y_B**2 / y_A) * P_val - Keq
    X_equil_list.append(brentq(f_P, 0.01, 0.999))

fig, ax = plt.subplots(figsize=(7, 4))
ax.plot(pressures, np.array(X_equil_list) * 100, 'steelblue', linewidth=2.5)
ax.axvline(2.0, color='red', linestyle='--', linewidth=1.5, label='P = 2 atm (our case)')
ax.set_xlabel('Pressure (atm)')
ax.set_ylabel('Equilibrium conversion $X_e$ (%)')
ax.set_title(r'A $\rightleftharpoons$ 2B: higher pressure suppresses conversion')
ax.legend()
plt.tight_layout()
plt.show()

Chapter 16 Summary#

| Method | Section | Tool | Convergence | Needs | Use when |
|---|---|---|---|---|---|
| Polynomial roots | 16.2 | np.roots(coeffs) | Exact (eigenvalue) | Coefficients | Equation is a polynomial |
| Bisection | 16.3 | (hand-coded) | Linear \(O(1/2^n)\) | Bracket \([a,b]\) | Learning / guaranteed fallback |
| Newton | 16.4 | (hand-coded) | Quadratic \(|e_{n+1}| = O(|e_n|^2)\) | \(f'(x)\), good \(x_0\) | You can derive the derivative |
| Brent’s method | 16.5 | brentq(f, a, b) | Superlinear | Bracket \([a,b]\) | 1 equation, 1 unknown |
| Newton (N-D) | 16.6 | fsolve(F, x0) | Quadratic | Initial guess | N equations, N unknowns |

The root-finding workflow#

  1. Rewrite your equation as \(f(x) = 0\)

  2. Plot \(f(x)\) to count roots and identify brackets

  3. Choose your tool — polynomial? use np.roots. Single equation? use brentq. System? use fsolve.

  4. Verify by checking that \(f(x^*) \approx 0\)

Common pitfalls#

  • np.roots returns all roots including complex ones — filter for real, physically meaningful values.

  • brentq requires a sign change across the bracket. If \(f(a)\) and \(f(b)\) have the same sign, it raises a ValueError. Plot first.

  • fsolve can converge silently to a wrong answer. Always print the residuals.

  • Newton’s method diverges when \(f'(x_n) \approx 0\). If the derivative is small near your guess, pick a different starting point.