## Calculus in SymPy

## Calculus in SymPy¶

Working with densities involves calculus which can sometimes be time-consuming. This course gives you two ways of reducing the amount of calculus involved.

- Probabilistic methods can help reduce algebra and calculus. You've seen this with algebra in the discrete case. You'll see it with calculus as we learn more about densities.
- Python has a symbolic math module called
`SymPy`

that does algebra, calculus, and much other symbolic math. In this section we will show you how to do calculus using`SymPy`

.

We will demonstrate the methods in the context of an example. Suppose $X$ has density given by

$$ f(x) = \begin{cases} 105x^2(1-x)^4 ~~~ \text{if } 0 \le x \le 1 \\ 0 ~~~~~~~~~ \text{otherwise} \end{cases} $$As you can see from its graph below, $f$ could be used to model the distribution of a random proportion that you think is likely to be somewhere between 0.2 and 0.4.

The density $f$ is a polynomial on the unit interval, and in principle the algebra and calculus involved in integrating it are straightforward. But they are tedious. So let's get `SymPy`

to do the work.

First, we will import all the functions in `SymPy`

and set up some printing methods that make the output look nicer than the retro typewritten pgf output you saw in a previous section. In future sections of this text, you can assume that this importing and initialization will have been done at the start.

```
from sympy import *
init_printing()
```

Next, we have to tell Python which variables are symbolic and what their possible values are. The function `declare`

lets us do this. It takes as its arguments the string representing the variable, and an option `interval`

specifying the interval of possible values of the variable. In our example, the variable `x`

takes values in the unit interval. In later examples we will show you how to declare infinite intervals of possible values.

```
declare('x', interval=(0, 1))
```

Now we will assign the name `density`

to the expression that defines $f$. The expression looks just like a numerical calculation, but the output is algebraic!

```
density = 105 * x**2 * (1-x)**4
density
```

That's the density $f$ defined by the equation at the start of the section. Notice that what we naturally think of as $1 - x$ is expressed as $-x + 1$. That's because `SymPy`

is writing the polynomial leading with the term of highest degree.

Let's not simply accept that this function is a density. Let's check that it is a density by integrating it from 0 to 1. To do this, we use the method `Integral`

that takes the name of a function and a *tuple* (a sequence in parentheses) consisting of the variable of integration and the lower and upper limits of integration. We have assigned this integral to the name `total_area`

```
total_area = Integral(density, (x, 0, 1))
total_area
```

The output of displays the integral, which is nice, but what we really want is its numerical value. In `SymPy`

, this is achieved by rather rudely instructing the method to `doit()`

.

```
total_area.doit()
```

This confirms that the function $f$ is a density.

We can use `Integral`

again to find the chance of any interval. Here is $P(0.2 < X < 0.4)$.

```
p_02_04 = Integral(density, (x, 0.2, 0.4)).doit()
p_02_04
```

For $x$ in the unit interval, the cdf of $X$ is $$ F(x) ~ = ~ P(X \le x) ~ = ~ \int_0^x f(s)ds ~ = ~ I(s)~ \Big{\rvert}_0^x ~ = ~ I(x) - I(0) $$

where $I$ is the indefinite integral of $f$.

To get the indefinite integral, simply ask `SymPy`

to integrate the density; there are no limits of integration.

```
indefinite = Integral(density).doit()
indefinite
```

Now $F(x) = I(x) - I(0)$. You can see at a glance that $I(0) = 0$ but here is how `SymPy`

would figure that out.

To evaluate $I(0)$, `SymPy`

must substitute $x$ with 0 in the expression for $I$. This is achieved by the method `subs`

that takes the variable as its first argument and the specified value as the second.

```
I_0 = indefinite.subs(x, 0)
I_0
```

```
cdf = indefinite - I_0
cdf
```

To find the value of the cdf at a specified point, say 0.4, we have to substitute $x$ with 0.4 in the formula for the cdf.

```
cdf_04 = cdf.subs(x, 0.4)
cdf_04
```

Thus $P(X \le 0.4)$ is roughly 58%. Earlier we calulated $P(0.2 < X < 0.4) = 43.2\%$, which we can confirm by using the cdf:

```
cdf_02 = cdf.subs(x, 0.2)
cdf_04 - cdf_02
```

The expectation $E(X)$ is a definite integral from 0 to 1:

```
expectation = Integral(x*density, (x, 0, 1)).doit()
expectation
```

Notice how simple the answer is. Later in the course, you will see why.

Here is $E(X^2)$, which turns out to be another simple fraction. Clearly, the density $f$ has interesting properties. We will study them later. For now, let's just get the numerical answers.

```
expected_square = Integral((x**2)*density, (x, 0, 1)).doit()
expected_square
```

Now you can find $SD(X)$.

```
sd = (expected_square - expectation**2)**0.5
sd
```

### SymPy and the Exponential Density¶

One of the primary distributions in probability theory, the exponential distribution has a positive parameter $\lambda$ known as the "rate", and density given by

$$ f(t) ~ = \lambda e^{-\lambda t}, ~~~ t \ge 0 $$The density is 0 on the negative numbers. Here is its graph when $\lambda = 3$.

To check that $f$ is a density, we have to confirm that its integral is 1. So we will declare two positive symbolic variables `t`

and `lamda`

. Notice the incorrectly spelled `lamda`

. That is because `lambda`

has another meaning in Python, as some of you might know.

In fact `lamda`

is a constant, not a variable. But `SymPy`

needs to know that it's an algebraic object, so we have to declare it as such.

Note the use of `positive=True`

to declare positive variables.

```
declare('lamda', positive=True)
declare('t', positive=True)
```

Now we will define the density function. Notice the use of `exp`

for the exponential function. Notice also that the form of the answer looks different from the way we have written it above, though it's algebraically the same.

```
expon_density = lamda * exp(-lamda * t)
expon_density
```

This is an unavoidable aspect of computer aided algebra, and it is the reason we will use `SymPy`

purely for computation, not for display.

To see that the function is a density, we can check that its integral from 0 to $\infty$ is 1. The symbol that `SymPy`

uses for $\infty$ is `oo`

, a double lower case o. It looks very much like $\infty$.

```
Integral(expon_density, (t, 0, oo)).doit()
```

Suppose $T$ has the exponential $(\lambda)$ density. Then for $t \ge 0$ the cdf of $T$ is

$$ F_T(t) ~ = ~ P(T \le t) ~ = ~ \int_0^t \lambda e^{-\lambda s}ds $$This is a straightforward integral that you can probably do in your head. However, let's get some more practice using `SymPy`

to find cdf's.
We will use the same method that we used to find the cdf in the previous example.

where $I$ is the indefinite integral of the density. To get this indefinite integral we will use `Integral`

as before, except that this time we must specify `t`

as the variable of integration. That is because `SymPy`

sees two algebraic quantities `t`

and `lamda`

in the density, and doesn't know which one is the variable unless we tell it.

```
indefinite = Integral(expon_density, t).doit()
indefinite
```

Now use $F_T(t) = I(t) - I(0)$:

```
I_0 = indefinite.subs(t, 0)
I_0
```

```
cdf = indefinite - I_0
cdf
```

Thus the cdf of the exponential $(\lambda)$ density is $$ F_T(t) ~ = ~ 1 - e^{-\lambda t} $$ Also, $$ E(T) ~ = ~ \int_0^\infty t \lambda e^{-\lambda t} dt ~ = ~ \frac{1}{\lambda} $$

which you can check by integration by parts. But `SymPy`

is faster:

```
expectation = Integral(t*expon_density, (t, 0, oo)).doit()
expectation
```

And $$ E(T^2) = \frac{2}{\lambda^2} $$

```
expected_square = Integral(t**2 * expon_density, (t, 0, oo)).doit()
expected_square
```

So $$ Var(T) ~ = ~ \frac{2}{\lambda^2} - \frac{1}{\lambda^2} ~ = ~ \frac{1}{\lambda^2} $$ and hence $$ SD(T) ~ = ~ \frac{1}{\lambda} $$

The purpose of this section has been to give you a workout in `SymPy`

. We will take a closer look at the exponential distribution in the next section.

```
```