A question of the argument—or: Tricky branch cuts revisited

In Uncategorized on July 13, 2011 by nessgrh

This post is a continuation of my previous one on branch cuts. My usual status update will again come on friday, and this time I actually have something to show off.

As before, the aim of this post is two-fold: both to clarify my thoughts, and to invite suggestions for solution.

It turn out my previous treatment of branched functions was a bit short-sighted (who would have guessed that, hu? :D). In order to make progress, I have for now been ignoring problems related to branches/arguments; but by now this is the last big problem before my branch can finally become “pull request ready” (more on that in friday’s post). Let me first describe the problem(s) at hand. I have picked five (six) representative examples.

  1. IFT\left[\frac{1}{1 + 2\pi i k}\right](x), x > 0. (IFT stands for inverse fourier transform) As explained previously, this comes out as (roughly) e^{-x} \left(\Gamma(0, e^{2 \pi i}) - \Gamma(0, 1) \right). The e^{2 \pi i} must not be simplified to 1 here.
  2. IFT\left[ \frac{1}{1 + 2\pi i k} \right](x), x < 0. As also explained previously, the G-function algorithm cannot compute this without realising that exp(x) is unbranched, i.e. that the argument of this function may indeed be converted to (-\pi, \pi].
  3. MT\left[\frac{(b + \sqrt{b^2 + x^2})^a}{-\sqrt{b^2 + x^2}} \right], \text{where } b < 0. (MT stands for mellin transform) This is curious. To understand what is going on, we have to realise that the function is actually represented as C F(x^2/b^2) for some constant C independent of x, and a function F(z). Thus b and x together determine the branch of the square root taken. Thus putting a negative b in above expression yields \frac{(b - \sqrt{b^2 + x^2})^a}{-\sqrt{b^2 + x^2}}, where \sqrt{z} for once denotes the principal branch. This is surely (?) not what the user had in mind when entering the above expression (and btw we cannot compute the answer for this argument anyway).
  4. \sqrt{x^2 + t^2} + t matches \sqrt{x^2 + b^2} - b with t = -b. This is wrong for the same reason as the example above, but this time the problem is that matching code thinks (-1)^2 = 1.
  5. MT \left[e^{ax} \sin{ax} \right]. This mellin transform only exists for a in the left half plane. The unsimplified result has a term (roughly) like (e^{2 \pi i}a)^s, except that it is more complicated so that powdenest() doesn’t put the s into the exponent. This only happens when calling simplify. The end result is that when we substitute the real \pi for the dummy an important factor of e^{2 \pi i} is lost. (In case this explanation is too confusing, look at this session log. Crucially, the last equation has a factor of e^{-2\pi i s} right before the gamma function, whereas the first does not.)
  6. ILT \frac{a^\nu (s + \sqrt{s^2-a^2})^{-\nu}}{\sqrt{s^2 - a^2}}. (ILT stands for inverse laplace transform) In this case a double minus sign is cancelled, for similar reasons as above. (Here is a log.)

I think the following conclusions can be drawn from these examples:

  1. powdenest(force=True) is not a reliable way to propagate exponents and thus not sufficient to be sure that we can safely substitute \pi for a dummy. [It has other problems as well, e.g. powdenest(abs(x), force=True) == x, and that’s probably not even a bug.]
  2. It is (thus) not tenable to try to hide all branch computations from the user in the way I tried so far (i.e. to return a result with fully computed branches, where -1-s can be cancelled in etc).
  3. There needs to be a function similar to arg() and unbranched_argument() that can return reduced arguments if this is known to be correct.
  4. It would be desirable to to have functions somewhat like powdenest(force=True) that do all simplifications valid on the riemann surface of the logarithm. Note that fore=True works by pretending all symbols are positive, this is not necessarily the same thing.

I’m looking for an unintrusive way to solve the above four problems.

I’m toying with the idea of adding a new assumption “polar” to sympy. I think almost no code needs to know about it, and it seems like it could solve my problems. I’m not very experienced with the sympy core, though, so I may be wrong. I imagine that this assumption can be set on various objects to make them behave like numbers on the riemann surface of the logarithm. In particular exp(z, polar=True) stops exp(2*pi*I) == 1. We could also allow this on e.g. I or -1, so that exp(pi*I, polar=True) can return -1 with the polar=True flag set. Multiplying two such factors together will propagate back to exp. Also polar=True should stop exp from being expanded trigonometrically (since adding polar numbers does not really make sense, as far as I can see—adding shouldn’t be disallowed though, since it can be useful formal notation). Matching code could propagate this flag (I’m fuzzy on the details). Similarly powdenest etc could watch out for it.

This solves all but (3), for which I could just make a new function. My existing machinery for handling branch cuts should be adapted, obviously (actually most of it could go now). argument_dummy can go of course.

It might even be advantageous to have an assumption “period” which can have a numerical value (where the old behaviour means period=None, and the above behaviour means period=oo); but I think assumptions must be flags so this cannot work. If it would, it might allow for a unified treatment of (1-4), but that’s probably not worth it.


Status Update—Week 7

In Uncategorized on July 8, 2011 by nessgrh

I’m starting to get into a phase were results of my new code are not as impressive as one might wish. Basically I hit one obscure integral that doesn’t work, and then I spend hours hunting down some bugs to fix it. These bugs can be minor typos or rather deep conceptual problems (see the branch cuts post), but nonetheless they are not very user-visible.

In any case, this week I fixed the combsimp algorithm for gamma functions, I implemented most of the expansion algorithm for G functions, I added classes to represent unevaluated transforms (as suggested in a comment to my last blog post), and I just about started to to extend the lookup tables. The last part turns out to be a real stress test for my code (since obviously we want to derive the formulas wherever possible, instead of putting them all into tables).

As I said before, there is not a lot to show off. Perhaps one word of clarification: there was code to expand meijer g-functions all the time. It works by evaluating the defining integral(s) using the residue theorem, i.e. using the (generalised) slater theorem. However, in some cases this does not work, because the series are properly non-hypergeometric. This can be quite annoying. Consider, for example G_{1, 2}^{2, 1}\left( y \atop {0, y} \middle| x \right) = e^{x} \Gamma\left(1 - y \right) \Gamma(y, x). This formula holds for all y that are not positive integers (that is for all y so that the G-function on the left hand side makes sense). But for negative integers, slater’s theorem does not apply! This is quite a shame, since obviously substituting them in the end result is no problem at all, there does not seem to be a limit involved. But this is not the case, there is some gamma function cancellation going on. In fact, whenever Slater’s theorem applies, we know that the G-function is a sum of terms of the form z^a F(z), where F(z) is unbranched at the origin. But \Gamma(-n, z) has logarithmic monodromy at the origin. Thus there really is a limiting process.

Hence I spent about 1.5 days developing this new code. And then another half a day thinking about the branch cuts. Let’s see how much longer it is going to take until there is a satisfactory solution ^^.


Tricky branch cuts

In Uncategorized on July 7, 2011 by nessgrh

This post will elaborate on the recent problems I’m having with “multivalued” functions. My usual status update will come tomorrow. I shall try to be almost pedantically clear in my description, both in order to clear my own thoughts and to increase the chance of someone suggesting a way out. In what follows, \mathcal{S} will denote the riemann surface of the logarithm (this only means infinitely many copies of the complex line, glued together approprietely so that \log{z} becomes an entire, single-valued function on \mathcal{S}). Zero is not an element of \mathcal{S}.

The problem we will study is how to compute the inverse fourier transform f(t) = \int_{-\infty}^\infty e^{i t x} \frac{\mathrm{d}x}{i x + 1}. First of all it is easy to see that the integral converges only for real, non-zero t. Also using contour integration it is easily established that f(t) = 2\pi H(t) e^{-t}, where H(t) denotes the Heaviside step function.

How does the Meijer G-function algorithm compute this integral? First of all it is split up into two parts, I_1 = \int_0^\infty e^{i t x} \frac{\mathrm{d}x}{i x + 1} and I_2 = \int_{-\infty}^0 e^{i t x} \frac{\mathrm{d}x}{i x + 1}. Let’s concentrate on I_1 for now. The integral exists if the exponential has a decaying component, or if it is oscillatory. That is to say it converges for Im(t) \ge 0, t \ne 0. But let’s be a bit more specific: all meijer G functions are defined on \mathcal{S}, and may or may not descend to \mathbb{C}. Thus define I_1: \mathcal{S} \supset D \rightarrow \mathbb{C} to be our integral (where D denotes the region of convergence). Of course I_1(z) = I_1(e^{2\pi i} z) when the integral converges (note that this equation is not vacuous: e^{2\pi i} \ne 1 on \mathcal{S}). We thus find:

The integral I_1(t) converges if (2n-1) \pi \le Arg(t) \le 2n \pi, for some integer n.

Having established this, let’s see what the algorithm computes. I should explain the output. Note that debugging output is enabled. The first piecewise result is what would ordinarily be presented to the user: there is a closed expression, holding under various conditions. And there is the “otherwise” part of the piecewise, which just says that if the conditions do not hold, we don’t know the answer. Consider the conditions. unbranched_argument(t) means the argument of t \in \mathcal{S} (whereas arg(t) would mean some value between -\pi and \pi, and that function is not continuous either). Thus the conditions are just a funny way of saying 0 \le Arg(t) \le \pi. Now consider the claimed result. It looks a bit daunting, but this is because the functions in which it is written are defined on \mathbb{C}, not \mathcal{S}. The debugging output line “Result before branch substitutions …” shows what the result would be on \mathcal{S}. I have computed this in output [8] (notice that what looks like \pi here is a dummy, to stop evaluation of e^{\pi i} etc). Of course \exp(z) is an unbranched function on \mathbb{C}, so the first two factors are uninteresting. But the last factor is not: we see that the result is related to the upper incomplete gamma function. This is a branched function, i.e. it is properly defined on \mathcal{S} and it does not descend to \mathbb{C}. If t is a positive real number, then the incomplete gamma function is to be evaluated at argument -\pi. This is where the extra terms (like argument_period()) come from: on \mathbb{C} argument -\pi means negative reals. But with the branch cut conventions as implemented (in mpmath, effectively, but also enforced throughout sympy), negative reals mean argument +\pi. So the answer cannot just be \Gamma(0, -t), because this would end up on the wrong branch. Thus we conclude (or rather the algorithm did for us):

Define g: \mathcal{S} \rightarrow \mathbb{C} by g(z) = -i e^{-t} \Gamma(0, e^{-i \pi} t). Then for 0 \le Arg(t) \le \pi, I_1(t) = g(t).

Notice in particular that our original integral has two natural extensions to \mathcal{S}: the naive definition I_1(t), and the analytic continuation g(t). The algorithm has to return some subset of the first definition, and not the second. In a sense this is why the convergence condition is necessary (although technically it is there because the integral representation of the G-functions used to establish this formula diverges outside the region specified by the convergence conditions).

We could now do a similar analysis for I_2. The story is essentially the same, with the upper half plane replaced by the lower half plane (-\pi \le Arg(t) \le 0) throughout. So when can we compute the combined integral? We already noted that this should be the case exactly when t is real, and working in \mathbb{C} this is easy to see: the intersection of the (closed) upper and lower half planes is the real axis. But notice how this fails on \mathcal{S}. The intersection of 0 \le Arg(t) \le \pi (where we can evaluate the first integral) and -\pi \le Arg(t) \le(t) is Arg(t) = 0, i.e. only positive t.

To further amplify the confusion, computing said fourier transform with argument -t, where t is declared positive, works. This is because before “switching” to working on \mathcal{S}, the algorithm carelessly cancels some minus signs … which is fine for entire functions, of course.

Here ends this rather lengthy explanation. It remains to figure out how we can teach the algorithm that the integrals are periodic in argument 2\pi, instead of knowing them only on one half-sheet. I have some vague ideas for this, but input is appreciated. In any case, I’ll think about this for a while. For now there is enough other work to do.


Status Update—Week 6

In Uncategorized on July 2, 2011 by nessgrh

This week saw less spectacular addition of functionality, and more fixing of details. This is what I did:

  • Laplace Inversion integral for G functions
  • Fourier Transforms
  • Simplification of Gamma functions
  • Miscellaneous extensions and fixes.

Laplace Inversion integral for G functions

The inverse laplae tranform can be defined as f(t) = \int_{c - i\infty}^{c + i\infty} e^{zt} F(z) \mathrm{d}z. As discussed last week, this is related to the mellin inversion formula, and indeed mellin inversion can be used to compute many inverse laplace transforms. On the other hand, if F is expressible as a G function of argument az^b for b rational, it turns out that the answer itself is expressible as a G function. This opens another route to computation. However, I could only find a formula in the literature for b = \pm 1. For this reason I worked out in some detail the formula, and, more importantly, the accompanying convergence conditions in a documentation file. As expected, this allows some more computations.

Fourier Transforms

The last big transform to take care of is the Fourier transform. This first involved extending my integration code to handle integrals from -\infty to \infty; it does this by splitting up the integral in (hopefully) intelligent way into two parts. Many of the basic fourier transforms work already, but some do not. This will require some more work. Here are a few examples. As can be seen in particular inverse fourier transforms tend not to work (where I omitted them); this is because of some bugs in the way the integral is split, and also because some integrands are just not recognised (e.g. I should add code to recognise general rational functions). However, it is very pleasing to see that many non-trivial transforms are computed correctly essentially from the integral definition, almost without intervention.

Simplification of Gamma functions

Another fun thing I did this week was to extend the combsimp() function to handle two gamma function identities, namely the multiplication theorem and the reflection formula. These are used to reduce the number of gamma functions as far as possible. The algorithm works quite well now, bit it is getting somewhat slow. Here are a few examples. Indeed the last one is evidently a bug…

Miscellaneous extensions and fixes

Now that the release is out, Aaron has started reviewing my gsoc pull request in detail. This stirs up lots of little bugs/inconsistencies I have been working on. Also my work on the integration code introduced some test failures here and there, which I also have been fixing today. Furthermore, I sometimes hit bugs in sympy and have to fix them along the way, for example the implementations of is_irrational() in Add and Mul are broken (one would not expect this to be much trouble, but actually from this the assumptions engine can infer in a number of steps that certain things are e.g. positive which most definitely are not; for example abs(I*pi) used to return I*pi for me…). Finally I implemented a few simplifications of bessel functions of integral orders, using e.g. symmetry J_n(-x) = (-1)^n J_n(x) etc.


Status Update—Week 5

In Uncategorized on June 24, 2011 by nessgrh

This was one busy week. Here is a list of things I did:

  • Start inverse mellin transform.
  • Implement Laplace transform, start inverse Laplace transform.
  • Implement expansion of G functions of finite confluence.
  • Implement the recursive mellin transform method.
  • Re-organise the integration code, Performance improvements.

Let’s look at everything in turn.

Inverse Mellin Transform

The inverse mellin transform is a cornerstone of the recursive mellin transform method. Since the meijer G function is essentially defined as the inverse mellin transform of a fairly general function, this is “only” about recognising a well-defined set of building blocks (rational functions, exponentials, trigonometric functions, gamma function), rewriting them using gamma functions, and turning this information into a G function. As usual the devil is in the details, but the code works quite well by now. Here are a few examples. Note that inverse mellin transforms are only defined relative to a fundamental strip (i.e. a strip parallel to the imaginary axis, free of singularities): they are defined as a contour integral along a line in the strip. This is what is being passed as the last parameter. But notice that the fundamental strip can be recovered unambiguously from only one of its edges. Namely if we say know the left edge, then any contour sufficiently close to the right of it will give the right answer—indeed any contour to the left of all singularities (since we can only work with a small set of functions anyway, finding the singularities is easy). This can be handy sometimes, and is triggered by passing None for one of the edges (see last example).

Currently unimplemented are inverse mellin transforms of trigonometric functions, and some transformations on gamma functions (Basically for the meijer G function, all gamma functions must have argument \pm s + c. If e.g. it is an integer multiple of s, we can use the gamma function multiplication theorem.)

Laplace Transforms

The laplace transform of a function is defined as F(s) = \int_0^\infty e^{-st} f(t) \mathrm{d}t. This is evidently an integral of the form we can handle. Notice also that the laplace transform is related by a change of variables to the mellin transform. But computing laplace transforms via mellin transforms is unlikely to work since the functions become much more complicated (multiply by a step function, and introduce a logarithm in the innermost part). Hence the laplace_transform function directly tries to compute the integral. Here are some examples (the very first example used to be quite slow—see the section below). Similarly to mellin transforms, laplace transforms of all “sensible” functions converge for Re(s) > a, i.e. in a half plane. Results are returned as (transform, a, cond) triples where `a` designates the half plane of convergence, and cond are auxiliary convergence conditions (these are always True here, because of the assumptions we set up on a, b, c, d).

The inverse laplace transform consists of a Mellin-type contour integral. Again it can be expressed via inverse mellin transforms, but this time chances that this works are much higher: the change of variables effectively happens after the integration, so the integrand is not made more complicated. This is what I currently implemented, and it works for many interesting examples. Notice that the inverse laplace transform is also defined (in principle) relative to a half plane. But here we can use the capabilities of inverse_mellin_transform to pass the left edge of the “strip” (a, \infty) as None. This is why the plane does not have to be specified.

However there are common examples that are unlikely to ever work this way (e.g. inverse laplace transform of laplace transform bessel functions, or even inverse laplace transform of s^a for symbolic/non-integer a). But it turns out that the laplace inversion integral of a G function is again expressible as a G function. This yields another route to computation, but I have not yet implemented it.

G functions of finite confluence

Slater’s theorem can be used to expand a G function in terms of hypergeometric functions if all poles of the integrand in the definition are simple. If there are multiple poles, this does not work; the resulting series are not hypergeometric. We hit such a case when computing the inverse mellin (or laplace) transform of e.g. 1/s^2.

Even though it is not expressible (directly) using hypergeometric functions only, if there are only finitely many multiple poles, then we can write the G function as a finite sum of residues, plus some hypergeometric series (I didn’t find this worked out in the literature, but its a fairly simple generalisation of Slater’s theorem. Just messier. I wrote it down in detail in a documentation file.) Thus we need to compute residues of gamma products, for which we need laurent expansions of gamma functions. I added these as well. This all works now. The last answer comes out incredibly messy without the simpilfy—a bug.

Recursive Mellin Transforms; Optimisations

I described the rescursive mellin transform method in my last blog post. This is now implemented, allowing us to compute e.g. this.

However, this recursive code can be fairly slow. I sped it up a little, using a simple heuristic and caching. For the heuristic, recall that we try to write the integrand as a product of two G functions. The heuristic is to try to make only one power of x occur in every term that is supposed to be a G function. E.g. if we want to rewrite e^{x^2} e^x e^{3x^2} e^{2x}, this code will first try to rewrite this as \left(e^{x^2} e^{3x^2} \right) \left( e^x e^{2x} \right).

Also I made the use of dummies in my code more caching-friendly (not creating too many new ones), and declared a key function @cacheit. The algorithm implementation could be changed so as to avoid redundant calls, but this way is much easier. Computing above integral now takes about 500ms, mostly due to the heuristic. The above-mentioned laplace transform now takes about 1.7 seconds, mostly due to caching. These times are not great, but I think they are ok for now. I don’t want to hit the “premature optimisation trap”.


That’s it for this week. Next week I will add the laplace inversion integral, and Fourier transforms. I also have a growing list of assorted bugs and todos. With some luck, by the end of next week I will be able to start extending the lookup tables.


Status Update—Week 4

In Uncategorized on June 17, 2011 by nessgrh


This week I finally started writing code to compute integrals. Let’s quickly recap the general strategy employed: given an integral \int_I f(x) \mathrm{d}x, rewrite it in one of the following forms:

  1. \int x^s G(x) \mathrm{d}x
  2. \int_0^\infty x^s G(ax) \mathrm{d}x
  3. \int_0^\infty x^s G(ax) G(bx) \mathrm{d}x

Here G(x) represents any G function. If this is possible, we have a chance of evaluating the integral in terms of more G functions, and the code from the previous weeks takes over to expand these in terms of known functions (hopefully). Note that in cases (2) and (3) there are of course very many conditions under which the integrals may or may not converge. These have been tabulated extensively, and this way we can, given an integrand with parameters, actually determine convergence regions! My current code contains more than 30 conditions for (3) which seem to take care of almost all cases.

There are two parts to the re-writing: (a) converting the limits into the form 0, \infty, and (b) rewriting the integrand using G-functions. Both seem to be more of an art than a science. Much of my efforts during the rest of GSOC will probably involve improving the code for (a) and (b) in some way. For the curious, (b) is essentially equivalent to computing the Mellin transform of a function. This of course again involves integration, and opens the door for a recursive algorithm. I have not implemented that, yet.


Let’s take a look at some examples. Another GSOC student needs to compute integrals of Gaussian functions. This works to some extent. There are several interesting things to notice. First of all, declaring \sigma, \mu to be positive is crucial, otherwise the code would return a piecewise expression with conditions of convergence. Second, we see in (2, 3) that general gaussian integrals can be computed. What happens is that the code expands the integrand and then ends up with a product of two exponentials, both of which are easily rewritten as G-functions. On the other hand, (4) does not work: we end up with three exponentials, and we have no formula for that. Of course the two with linear  exponent could be combined, but the algorithm does not recognise that. Here the recursive mellin transform technique would have been helpful. (Also note the new parameter meijerg for integrate(): it can be either True, False, or None (default), which mean, respectively, to always use meijer G function methods, never use meijer G function methods, and to use all methods available.)

That brings us to another very important application: integral transforms. It is clear that the code should be able to compute any common integral transform (Fourier, Laplace, Mellin, etc) of any G function, which means integral transforms of *lots* of functions. I will add code that wraps integrate() to apply various conventions of integral transforms, e.g. to provide an analytic continuation of the laplace transform, and the critical strip of a mellin transform, etc. For now there is only mellin_transform. Given any function f(x), it computes the Mellin Transform F(s), the critical strip (a, b) (such that the defining integral converges for a < Re(s) < b), and possibly auxiliary convergence conditions. Here are a few examples. Again there are a few noteworthy things. First of all we see that the convergence conditions returned by the code are strong enough to compute the fundamental strips—those shown indeed agree with what can be found in the literature. Secondly, we see that sometimes the result is expressed using factorials, where gamma functions would be more appropriate. This comes from the function combsimp(), which I will have to improve. Thirdly, we see that all mellin transforms shown are essentially products exponentials and gamma functions involving s. This is no surprise, since this is exactly the condition for being expressible as a G function. In particular, example (3) demonstrates that using recursive mellin transforms, the non-working example (4) from the previous link could be made to work. Finally, there is a catch: the algorithm could not determine any convergence conditions for positive real t. This probably (hopefully) means that I mis-typed some of the convergence conditions—I will have to investigate this.


So much for this week. Next week I will concentrate on improving all the problems pointed out above (that’s of course quite open-ended…), and on extending the tables used to rewrite functions as G functions. This should hopefully yield many non-trivial integrals.


Status Update — Week 3

In Uncategorized on June 11, 2011 by nessgrh

This week I had exams. As agreed upon with my mentor, I did not do any serious work on SymPy.