## Finding hypergeometric function expressions

In Uncategorized on May 7, 2011 by nessgrh

The core of the new code will be algorithms for expressing Hypergeometric and Meijer G functions in terms of known special functions. This is fairly complicated, so let me use this blog post to order my thoughts on the Hypergeometric case. The Meijer G case is fairly similar, there are just more indices to keep track of and more possible transformations, I’ll describe this in a later post.

Most of this post is based on the paper by Kelly Roach.

So first of all the Hypergeometric function is initially defined as a series: ${}_p F_q\left({a_p \atop b_q} \middle| z \right) = \sum_{n=0}^\infty \frac{(a_1)_n \dots (a_p)_n}{(b_1)_n \dots (b_q)_n} \frac{z^n}{n!}$, where $a_p$ and $b_q$ represent vectors of respectively $p$ and $q$ complex parameters, and $(a)_n = a(a+1)\dots(a+n-1)$. This converges in some disc if $p \le q+1$ and none of the $b_j$ is a negative integer (it also converges in some other cases if one of the $a_i$ is a non-positive integer). Notice that the Hypergeometric function is symmetric in all numerator parameters (the $a_p$), and also in all denominator parameters (the $b_q$).

It turns out that there are certain differential operators that can change the $a_p$ and $p_q$ parameters by integers. If a sequence of such operators is known that converts the set of indices $a_r^0$ and $b_s^0$ into $a_p$ and $b_q$, then we shall say the pair $a_p, b_q$ is reachable from $a_r^0, b_s^0$. Our general strategy is thus as follows: given a set $a_p, b_q$ of parameters, try to look up an origin $a_r^0, b_s^0$ for which we know an expression, and then apply the sequence of differential operators to the known expression to find an expression for the Hypergeometric function we are interested in.

## Notation

In the following, the symbol $a$ will always denote a numerator parameter and the symbol $b$ will always denote a denominator parameter. The subscripts $p, q, r, s$ denote vectors of that length, so e.g. $a_p$ denotes a vector of $p$ numerator parameters. The subscripts $i$ and $j$ denote “running indices”, so they should usually be used in conjuction with a “for all $i$“. E.g. $a_i < 4$ for all $i$. Uppercase subscripts $I$ and $J$ denote a chosen, fixed index. So for example $a_I > 0$ is true if the inequality holds for the one index $I$ we are currently interested in.

## Incrementing and decrementing indices

Suppose $a_i \ne 0$. Set $A(a_i) = \frac{z}{a_i}\frac{\mathrm{d}}{dz}+1$. It is then easy to show that $A(a_i) {}_p F_q\left({a_p \atop b_q} \middle| z \right) = {}_p F_q\left({a_p + e_i \atop b_q} \middle| z \right)$, where $e_i$ is the i-th unit vector. Similarly for $b_j \ne 1$ we set $B(b_j) = \frac{z}{b_j-1} \frac{\mathrm{d}}{dz}+1$ and find $B(b_j) {}_p F_q\left({a_p \atop b_q} \middle| z \right) = {}_p F_q\left({a_p \atop b_q - e_i} \middle| z \right)$. Thus we can increment upper and decrement lower indices at will, as long as we don’t go through zero. The $A(a_i)$ and $B(b_j)$ are called shift operators.

It is also easy to show that $\frac{\mathrm{d}}{dz} {}_p F_q\left({a_p \atop b_q} \middle| z \right) = \frac{a_1 \dots a_p}{b_1 \dots b_q} {}_p F_q\left({a_p + 1 \atop b_q + 1} \middle| z \right)$, where $a_p + 1$ is the vector $a_1 + 1, a_2 + 1, \dots$ and similarly for $b_q + 1$. Combining this with the shift operators,  we arrive at one form of the Hypergeometric differential equation: $\left[ \frac{\mathrm{d}}{dz} \prod_{j=1}^q B(b_j) - \frac{a_1 \dots a_p}{(b_1-1) \dots (b_q-1)} \prod_{i=1}^p A(a_i) \right] {}_p F_q\left({a_p \atop b_q} \middle| z \right) = 0$. This holds if all shift operators are defined, i.e. if no $a_i = 0$ and no $b_j = 1$. Clearing denominators and multiplying through by z we arrive at the following equation: $\left[ z\frac{\mathrm{d}}{dz} \prod_{j=1}^q \left(z\frac{\mathrm{d}}{dz} + b_j-1 \right) - z \prod_{i=1}^p \left( z\frac{\mathrm{d}}{\mathrm{d}z} + a_i \right) \right] {}_p F_q\left({a_p \atop b_q} \middle| z\right) = 0$. Even though our derivation does not show it, it can be checked that this equation holds whenever the ${}_p F_q$ is defined.

Notice that, under suitable conditions on $a_I, b_J$, each of the operators $A(a_i)$, $B(b_j)$ and $z\frac{\mathrm{d}}{\mathrm{d}z}$ can be expressed in terms of $A(a_I)$ or $B(b_J)$. Our next aim is to write the Hypergeometric differential equation as follows: $[X A(a_I) - b] {}_p F_q\left({a_p \atop b_q} \middle| z\right) = 0$, for some operator $X$ to be determined. If $b \ne 0$, then we can write this as $\frac{-1}{b} X {}_p F_q\left({a_p + e_I \atop b_q} \middle| z\right) = {}_p F_q\left({a_p \atop b_q} \middle| z\right)$, and so $\frac{-1}{b}X$ undoes the shifting of $A(a_I)$, whence it will be called an inverse-shift operator.

Now $A(a_I)$ exists if $a_I \ne 0$, and then $z\frac{\mathrm{d}}{\mathrm{d}z} = a_I A(a_I) - a_I$. Observe also that all the operators $A(a_i)$, $B(b_j)$ and $z\frac{\mathrm{d}}{\mathrm{d}z}$ commute. We have $\prod_{i=1}^p \left( z\frac{\mathrm{d}}{\mathrm{d}z} + a_i \right) = \left(\prod_{i=1, i \ne I}^p \left( z\frac{\mathrm{d}}{\mathrm{d}z} + a_i \right)\right) a_I A(a_I)$, so this gives us the first half of $X$. The other half does not have such a nice expression. We find $z\frac{\mathrm{d}}{dz} \prod_{j=1}^q \left(z\frac{\mathrm{d}}{dz} + b_j-1 \right) = \left(a_I A(a_I) - a_I\right) \prod_{j=1}^q \left(a_I A(a_I) - a_I + b_j - 1\right)$. Since the first half had no constant term, we infer $b = -a_I\prod_{j=1}^q(b_j - 1 -a_I)$.

This tells us under which conditions we can “un-shift” $A(a_I)$, namely when $a_I \ne 0$ and $b \ne 0$. Substituting $a_I - 1$ for $a_I$ then tells us under what conditions we can decrement the index $a_I$. Doing a similar analysis for $B(a_J)$, we arrive at the following rules:

• An index $a_I$ can be decremented if $a_I \ne 1$ and $a_I \ne b_j$ for all $b_j$.
• An index $b_J$ can be incremented if $b_J \ne -1$ and $b_J \ne a_i$ for all $a_i$.

Combined with the conditions (stated above) for the existence of shift operators, we have thus established the rules of the game!

## Reduction of Order

Notice that, quite trivially, if $a_I = b_J$, we have ${}_p F_q\left({a_p \atop b_q} \middle| z \right) = {}_{p-1} F_{q-1}\left({a_p^* \atop b_q^*} \middle| z \right)$, where $a_p^*$ means $a_p$ with $a_I$ omitted, and similarly for $b_q^*$. We call this reduction of order.

In fact, we can do even better. If $a_I - b_J \in \mathbb{Z}_{>0}$, then it is easy to see that $\frac{(a_I)_n}{(b_J)_n}$ is actually a polynomial in $n$. It is also easy to see that $(z\frac{\mathrm{d}}{\mathrm{d}z})^k z^n = n^k z^n$. Combining these two remarks we find:

• If $a_I - b_J \in \mathbb{Z}_{>0}$, then there exists a polynomial $p(n) = p_0 + p_1 n + \dots$ (of degree $a_I - b_J$) such that $\frac{(a_I)_n}{(b_J)_n} = p(n)$ and ${}_p F_q\left({a_p \atop b_q} \middle| z \right) = \left(p_0 + p_1 z\frac{\mathrm{d}}{\mathrm{d}z} + p_2 \left(z\frac{\mathrm{d}}{\mathrm{d}z}\right)^2 + \dots \right) {}_{p-1} F_{q-1}\left({a_p^* \atop b_q^*} \middle| z \right)$.

Thus any set of parameters $a_p, b_q$ is reachable from a set of parameters $c_r, d_s$ where $c_i - d_j \in \mathbb{Z}$ implies $c_i < d_j$. Such a set of parameters $c_r, d_s$ is called suitable. Our database of known formulae should only contain suitable origins. The reasons are twofold: firstly, working from suitable origins is easier, and secondly, a formula for a non-suitable origin can be deduced from a lower order formula, and we should put this one into the database instead.

## Moving Around in the Parameter Space

It remains to investigate the following question: suppose $a_p, b_q$ and $a_p^0, b_q^0$ are both suitable, and also $a_i - a_i^0 \in \mathbb{Z}$, $b_j - b_j^0 \in \mathbb{Z}$. When is $a_p, b_q$ reachable from $a_p^0, b_q^0$? It is clear that we can treat all parameters independently that are incongruent mod 1. So assume that $a_i$ and $b_j$ are congruent to $r$ mod 1, for all $i$ and $j$. The same then follows for $a_i^0$ and $b_j^0$.

If $r \ne 0$, then any such $a_p, b_q$ is reachable from any $a_p^0, b_q^0$. To see this notice that there exist constants $c, c^0$, congruent mod 1, such that $a_i < c < b_j$ for all $i$ and $j$, and similarly $a_i^0 < c^0 < b_j^0$. If $n = c - c^0 > 0$ then we first inverse-shift all the $b_j^0$ $n$ times up, and then similarly shift shift up all the $a_i^0$ $n$ times. If $n < 0$ then we first inverse-shift down the $a_i^0$ and then shift down the $b_j^0$. This reduces to the case $c = c^0$. But evidently we can now shift or inverse-shift around the $a_i^0$ arbitrarily so long as we keep them less than $c$, and similarly for the $b_j^0$ so long as we keep them bigger than $c$. Thus $a_p, b_q$ is reachable from $a_p^0, b_q^0$.

If $r = 0$ then the problem is slightly more involved. WLOG no parameter is zero. We now have one additional complication: no parameter can ever move through zero. Hence $a_p, b_q$ is reachable from $a_p^0, b_q^0$ if and only if the number of $a_i < 0$ equals the number of $a_i^0 < 0$, and similarly for the $b_i$ and $b_i^0$. But in a suitable set of parameters, all $b_j > 0$! This is because the Hypergeometric function is undefined if one of the $b_j$ is a non-positive integer and all $a_i$ are smaller than the $b_j$. Hence the number of $b_j \le 0$ is always zero.

We can thus associate to every suitable set of parameters $a_p, b_q$, where no $a_i = 0$, the following invariants:

• For every $r \in [0, 1)$ the number $\alpha_r$ of parameters $a_i \equiv r \pmod{1}$, and similarly the number $\beta_r$ of parameters $b_i \equiv r \pmod{1}$.
• The number $\gamma$ of integers $a_i$ with $a_i < 0$.

The above reasoning shows that $a_p, b_q$ is reachable from $a_p^0, b_q^0$ if and only if the invariants $\alpha_r, \beta_r, \gamma$ all agree. Thus in particular “being reachable from” is a symmetric relation on suitable parameters without zeros.

## Applying the Operators

If all goes well then for a given set of parameters we find an origin in our database for which we have a nice formula. We now have to apply (potentially) many differential operators to it. If we do this blindly then the result will be very messy. This is because with Hypergeometric type functions, the derivative is usually expressed as a sum of two contiguous functions. Hence if we compute $N$ derivatives, then the answer will involve $2N$ contiguous functions! This is clearly undesirable. In fact we know from the Hypergeometric differential equation that we need at most $\max(p, q+1)$ contiguous functions to express all derivatives.

Hence instead of differentiating blindly, we will work with a $\mathbb{C}(z)$-module basis: for an origin $a_r^0, b_s^0$ we either store (for particularly pretty answers) or compute a set of $N$ functions (typically $N = \max(r, s+1)$) with the property that the derivative of any of them is a $\mathbb{C}(z)$-linear combination of them. In formulae, we store a vector $B$ of $N$ functions, a matrix $M$ and a vector $C$ (the latter two with entries in $\mathbb{C}(z)$), with the following properties:

• ${}_r F_s\left({a_r^0 \atop b_s^0} \middle| z \right) = C B$
• $\frac{\mathrm{d}}{\mathrm{d}z} B = M B$.

Then we can compute as many derivatives as we want and we will always end up with $\mathbb{C}(z)$-linear combination of at most $N$ special functions.

As hinted above, $B$, $M$ and $C$ can either all be stored (for particularly pretty answers) or computed from a single ${}_p F_q$ formula.

### 6 Responses to “Finding hypergeometric function expressions”

1. I’m a little confused what you mean by $a_I$ and $a_J$ with capital subscripts as opposed to lowercase subscripts.

2. For whatever reason, 2N ($2N$) is rendering as 3- ($3-$) in the last paragraph (but the mouseover text says “2N”).

So did I read it right that everything has to differ by integers? In other words, if you have 3/4 in your result, then you only need to look for items in the table that are congruent to 3/4 mod 1 for that index?

This looks good. Very interesting. Now I’d like to read about how you plan to implement. Based on this, I’d say that just the MeijerG class should be non-trivial to implement (if it includes all these things as methods), so I’d like to see ideas how you plan to design it.

• I think that latex thing is a WordPress bug. If I create a new post on my blog with the text from the first sentence and preview it, it says “For whatever reason, 2N (2N) is rendering as 3- (2N) in the last paragraph (but the mouseover text says “2N”).” Weird.

3. I already commented on this, but it must have gotten lost. I am unclear on the difference between $a_I$ and $a_i$ in your notation (how are the capital indices different?).

4. For some reason wordpress wants me to moderate every comment. Got to tweak the settings somehow…

I will write about implementation in the next post(s).

As for capitalised indices, there is not much of a difference. I’m mentally fixing the I and/or J, and then ask “can I increment index I on a”. I’ll try to make this clearer.

• OK, that’s what I thought it was (lowercase implies for all), but it wasn’t clear.