
2015-04-17

I am happy to announce the release of fast-dm version 30.2. This release fixes some memory leaks found by Valgrind. You can download the program from the Fast-dm homepage.

2015-02-06

A new paper, written together with my PhD student Mark Webster and my colleague Stuart Barber, was published today!

- S. Barber, J. Voss and M. Webster:
The Rate of Convergence for Approximate Bayesian Computation.
Electronic Journal of Statistics, vol. 9, pp. 80–105, 2015.

DOI:10.1214/15-EJS988, arXiv:1311.2038, more…

This paper took a long time to be published, but I think it is worth the wait.

2014-03-04

I am very happy to announce that a very old (nearly forgotten?) article has finally appeared. We chose the journal because we had heard the rumour that they have a fast turnaround, but this plan didn't work out: the article was submitted on 23 September 2011 and accepted on 21 February 2012, but then came a two-year wait, and the article has only appeared now:

- K.V. Mardia and J. Voss:
Some Fundamental Properties of a Multivariate von Mises Distribution.
Communications in Statistics — Theory and Methods, vol. 43,
pp. 1132–1144, 2014.

DOI:10.1080/03610926.2012.670353, arXiv:1109.6042, more…

2013-11-22

Do you want to do a PhD on Bayesian Uncertainty Quantification with me? Are you good with maths and computers? As part of the new, NERC-funded doctoral training centre I am advertising a PhD position. Details can be found on the DTP web page; the deadline for applications is 24 January 2014, and the PhD starts in October 2014.

2013-11-08

I am very happy to report that after a long struggle we have finally finished (and submitted) a new paper today:

S. Barber, J. Voss and M. Webster:
The Rate of Convergence for Approximate Bayesian Computation.
Electronic Journal of Statistics, vol. 9, pp. 80–105, 2015.

DOI:10.1214/15-EJS988, arXiv:1311.2038, more…

The paper studies the convergence of ABC estimates to the correct value. In particular,

- we prove that the method converges under very weak conditions,
- we show that the tolerance parameter δ should be chosen proportional to *n*^{-1/4}, where *n* is the number of accepted ABC samples, and
- we determine the speed of convergence under the optimal choice of δ.

Hopefully, these results prove useful for others, too.
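The results concern plain rejection ABC. As a rough illustration only (this is not code from the paper; the function names and the toy model are invented here), a minimal rejection-ABC sampler might look like this:

```python
import random

def abc_sample(prior, simulate, observed, delta, n_accept):
    """Toy rejection ABC: draw theta from the prior, simulate a data
    summary, and accept theta when the simulated summary lies within
    delta of the observed summary."""
    accepted = []
    while len(accepted) < n_accept:
        theta = prior()
        if abs(simulate(theta) - observed) <= delta:
            accepted.append(theta)
    return accepted

# toy model (invented for illustration): we observe a noisy version
# of theta and try to recover theta itself
random.seed(1)
samples = abc_sample(prior=lambda: random.uniform(-5, 5),
                     simulate=lambda t: t + random.gauss(0, 0.1),
                     observed=1.0,
                     delta=0.25,
                     n_accept=100)
estimate = sum(samples) / len(samples)
```

In the notation of the paper, the tolerance `delta` would then be shrunk proportionally to *n*^{-1/4} as the number of accepted samples *n* grows.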

2013-10-21

I just tried to find out what the default random number generator of the Go programming language is. The source code for this RNG can be found in the file `rng.go`; the only reference in the source code is the cryptic comment "algorithm by DP Mitchell and JA Reeds". Does anybody have a reference for this, or can at least identify "DP Mitchell" or "JA Reeds"? Some things I tried:

- I found one reference which claims that the Go random number generator is the same as the one used in Plan 9. The participants of this discussion do not seem to know the origin of the method either.
- While searching for "Mitchell" in the context of random number generation, I found references to the following article: D. Mitchell, Nonlinear Key Generators, Cryptologia, vol. 14, pp. 350–354, 1990. I hoped that this article might allow me to find an affiliation for D. Mitchell, but unfortunately it seems to have been removed(!) from the journal: the table of contents shows a gap between pages 349 and 355.
- A Google search for "mitchell reeds random number generator" shows no obvious good candidates.

Any hints about where the algorithm comes from would be most welcome.

2013-10-14

Hooray!!!

I am very happy to report that my book has finally appeared. While the book has officially been out for a few days already, it only appeared on Amazon very recently. You can get your copy by clicking on the thingy below (if it shows; otherwise by clicking here).

The book is written as a textbook for postgraduate students and covers basic topics in the area of Monte Carlo methods. Some more information can be found on the publisher's web page.

This is quite exciting for me. It's a book!!! I have a paper copy: you can touch it and all, it's really real!

2013-08-20

Today I got an email from a "Dr. R. K. Dixit (HON)", bringing good news:

I came across to your research paper titled "Upper and Lower Bounds in
Exponential Tauberian Theorems" and feel that your research is having a
very good impact.
With a view to begin a long-term fruitful association with you, I invite
you to submit your upcoming research articles / papers for publication
in Global Journal of Science Frontier Research (GJSFR), an international
double blind peer reviewed research journal.

It seems that I could be an expert in "Frontier Research"!!! The email refers to the following paper:

- J. Voss:
Upper and Lower Bounds in Exponential Tauberian Theorems.
Tbilisi Mathematical Journal, vol. 2, pp. 41–50, 2009.

link, arXiv:0908.0642, more…

While this is a nice paper (it got much better during the refereeing process; had I known this in advance I would have submitted it to a better-known journal), it is still a relief to hear that the paper has "very good impact". Google currently doesn't know of any citations of my paper.

As it happens, the publisher of GJSFR, Global Journals Inc. (US), is also listed on Beall's list of "potential, possible, or probable predatory scholarly open-access publishers".

2013-08-07

A while ago, the computer game Infinity Blade 2 was given away for free as part of some advertisement campaign and I got myself a copy at the time.

One of the features in the game is a "gem forge", where one can put in three gems and, after a few hours, gets a different gem in return. Gems have both financial value (gems can be sold) and practical value (gems can be used to enhance fighting skills and equipment). I searched for advice about how to best use the gem forge, but found only vague rumours and lists which made no sense to me.

To understand the gem forge better, I started taking notes about which output I got for which combination of inputs. The procedure is a bit slow, so I only have 16 data points so far, but I already got a few results:

- I found two cases where the information displayed for the input gems was identical, but which resulted in different output gems. Thus, either randomness is involved, or the gems have some hidden state which is not displayed in the gem screen.
- For all 16 cases I have data for, the time to forge the new gem (in hours wall-clock time) can be found by taking the price of the most expensive input gem, dividing by 15000, adding 0.25, and rounding to the nearest 30 minutes. (I found this rule using linear regression.) Let's see whether this rule successfully predicts the forging time for future gems.
- In all cases I observed, the sale price of the output gem was higher than the sale price of the most expensive of the input gems.
- The correlation between the sale prices of the output gem and of the most expensive input gem is higher than the correlation between output price and the sum of the input prices.
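The forging-time rule above is easy to state as code. Here is a small sketch (the helper name and the example prices are my own, not from the game):

```python
import math

def predicted_forge_hours(input_prices):
    """Predicted forging time in hours: price of the most expensive
    input gem, divided by 15000, plus 0.25, rounded to the nearest
    30 minutes (i.e. to the nearest multiple of 0.5 hours)."""
    raw = max(input_prices) / 15000 + 0.25
    return math.floor(raw * 2 + 0.5) / 2  # round half up, in 0.5 steps

# hypothetical example: most expensive input gem sells for 20000
hours = predicted_forge_hours([20000, 4000, 750])
```

For a 20000-coin gem this predicts 20000/15000 + 0.25 ≈ 1.58 hours, which rounds to 1.5 hours.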

I haven't looked at other properties (shape, colour, function, etc.) of the gems yet.

If anybody knows more about the gem forge, or if there is a good reference available on the internet, I'd be happy to learn about this.

2013-07-23

Over the last few days I finished an implementation of the Fortuna random number generator in the Go programming language. The result can be found on the new Fortuna home page and on github.com.

This is very different from the random number generators I usually deal with: Normally I am interested in random number generators for use in Monte Carlo simulations; this is, for example, the topic of my upcoming book (click the image below for more information). In contrast, the Fortuna random number generator is made for use in cryptographic algorithms; for example it could be used to generate session tokens for a web app.
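Fortuna itself is written in Go; just to illustrate the session-token use case, here is a sketch in Python (my own addition, using the standard library's CSPRNG interface; in the Go setting, Fortuna would play the role of the underlying generator):

```python
import secrets

def new_session_token():
    """Return a fresh 32-byte session token, hex-encoded.
    The random bytes come from the operating system's CSPRNG;
    a cryptographic generator like Fortuna could serve the
    same purpose."""
    return secrets.token_hex(32)

token = new_session_token()
```

The important property is unpredictability: unlike a Monte Carlo generator, an attacker who sees some tokens must not be able to guess past or future ones.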

2013-07-02

I have just published a small package for the excellent Go programming language: The new package, trace, implements a simple tracing framework. Code using this framework can emit diagnostic messages using the trace.T() function. In normal operation, calls to trace.T() have no effect and are very fast. When tracing is enabled (in order to track down a problem or to explore the inner workings of a program), listeners can record/display the trace messages.

Details can be found on the trace package home page. Comments are very welcome!

2013-03-20

Today I submitted a new paper. Preparing this took forever, but it's finally out:

- M. Dashti, K.J.H. Law, A.M. Stuart and J. Voss:
MAP Estimators and their Consistency in Bayesian Nonparametric Inverse Problems.
Inverse Problems, vol. 29, no. 9, 2013.

DOI:10.1088/0266-5611/29/9/095017, arXiv:1303.4795, more…

2013-03-20

I am very happy to report that I have finished writing my first-ever
book! The title is An Introduction to Statistical Computing

and the
book will be published by
Wiley.
This book is a textbook, mostly for postgraduate students, and it covers
random number generation, Monte Carlo methods, and a few more advanced
topics.

2012-11-12

The Police and Crime Commissioner elections will take place on Thursday. The question is whom to vote for.

Some thoughts:

- To me it seems that the persons standing for election (mostly local councillors) may not be the right persons to supervise the police. Wouldn't this be a much better job for the independent police complaints commission or the chair of the police authority or similar? So I wish there was a "none of the above" option at the end of the list of candidates.
- I will be able to choose between "The Labour Party Candidate", "The Conservative Party Candidate", "Independent" and "Liberal Democrats". I feel a bit uneasy about how this seems to go against the principle of separation of powers.

The candidates for Leeds are the following (summarised from the choose my PCC web page):

- Mark Burns-Williamson (Labour). Until recently, he was the chair of the West Yorkshire police authority, which probably makes him qualified for the job as a PCC. He optimistically writes that he will "stand against the £100m Government cuts of around 2000 police officers and staff in West Yorkshire", but I would be surprised if he succeeded in this.
- Geraldine Mary Carter (Conservative). Geraldine "has served on West Yorkshire Police Authority", whatever this means. She writes "there are communities that the police feel unable to tackle crime in because they fear the consequences", and I wish she would tell us which communities she means by this (is this about gang crime? Muslims? Corrupt politicians?). Also, she writes "victims should be able to decide the community punishment of offenders", possibly indicating that she favours revenge over rehabilitation? She "will recruit 850 new Community Special Constables", but doesn't say how this will be funded. She uses all the Conservative buzz-words from "zero-tolerance" to "unnecessary red tape".
- Cedric Mark Christie (independent). Cedric seems to be a retired police officer. He writes "I will chase the corrupt and incompetent out of office" and "I know the difference between right and wrong. I never allowed negative traits to influence my behaviour whilst serving you." He seems VERY focussed on bullying.
- Andrew Clive Glover Marchington (LibDem). Among the four candidates, Andrew seems to have the least connection to the police. He uses the phrases "reduce red tape" and "payback sentencing", but also "reduce reoffending". He seems to have a stronger focus on domestic violence than the other candidates do.

2012-09-17

For (my own) future reference, here are example commands to create a 3D (surface) plot in R:

```
library("rgl")

open3d(windowRect=c(50, 50, 800, 800))

x <- seq(-10, 10, length=20)
y <- seq(-10, 10, length=20)
z <- outer(x, y, function(x, y) dnorm(x, 2, 3)*dnorm(y, 3, 7))

palette <- colorRampPalette(c("blue", "green", "yellow", "red"))
col.table <- palette(256)
col.ind <- cut(z, 256)
persp3d(x, y, z, col=col.table[col.ind])
```

These commands open the plot in a new window. The `windowRect` parameter in the call to `open3d` determines the initial position and size of the window (many thanks to Jem Corcoran for pointing this out). The output can be rotated and scaled using the mouse.

2012-08-17

Following up on my recent blog entry about generating publication quality figures with R, here are some more hints, this time about setting the figure margins.

An R figure consists of several regions. From the outside in, these are as follows:

- The device region. This region contains the figure region, surrounded by the outer margin. The device region may also contain text (for titles etc.). The size of this region is determined by the `width` and `height` arguments of the `pdf` command.
- The figure region contains the plot region, surrounded by the inner margin. The axis tick marks and labels are part of the figure region. The size of the figure region is determined by the size of the device region and the width of the outer margin.
- The plot region contains the plot of the data, not including any labels and tick marks. The size of the plot region is normally determined by the size of the figure region and the width of the inner margin.

figure 1. The different regions of a figure generated in R

These different regions are illustrated in figure 1, above. The figure was created using the following R code:

```
par(oma=c(2, 2, 2, 2), mar=c(4, 4, 2, 2))
set.seed(1)
plot(rnorm(10), xlab="x label", ylab="y label")

box("outer", col="red", lwd=2)
mtext("device region", side=3, line=1, col="red", outer=TRUE)
box("inner", col="green")
mtext("figure region", side=3, line=1, col="green")
box("plot", col="blue")
text(5, 0, "plot region", col="blue")
```

A plot for use in a publication will normally need to make efficient use of the available space. For this purpose, the default margins used by R are too generous. This is illustrated in figure 2.

figure 2. A simple R plot, using the default margin settings. The red box was added to show the total area covered by the figure. There are unnecessarily large margins on the top and the right of the figure.

To maximise the area available for displaying information, unnecessary white space around the plot should be avoided. To achieve this, the following suggestions may be useful:

- Outer margins. A figure in a publication will normally have a legend containing all accompanying information, so no title etc. will be required for the figure. Thus, the outer margin size will normally be 0: `par(oma=c(0, 0, 0, 0))`
- Inner margins. The inner margins need to be as small as possible, but large enough to contain the axis labels and tick marks. For the plot in figure 2, we can use the following margins: `par(mai=c(0.85, 0.85, 0.1, 0.1))`. Here we use `mai` (margin in inches) instead of `mar` (margin in lines) to have better control over the size of the margins. The four numbers give the widths of the margins at the bottom, left, top and right, in this order. Since the plot has no labels on the top or on the right, the last two numbers can be chosen close to 0.

A version of the above plot, with the margins adjusted as shown above, is displayed in figure 3:

figure 3. The plot from figure 2, but with the margin sizes adjusted to maximise the area available for displaying information. This figure will take up the same amount of space as figure 2 in the enclosing document, but provides much more space for displaying the plot data.

References:

- Earl F. Glynn's page about R Graphics Basics

2012-07-24

I am happy to notice that one of my papers has been published, after a bit of a wait:

- J. Voss:
The Effect of Finite Element Discretisation on the Stationary Distribution of SPDEs.
Communications in Mathematical Sciences, vol. 10, no. 4,
pp. 1143–1159, 2012.

DOI:10.4310/CMS.2012.v10.n4.a6, arXiv:1110.4653, more…

If you are interested in the numerical solution of stochastic partial differential equations, this may be of interest. Comments about the article are very welcome.

2012-07-17

For one of my programming experiments, I want to compute timestamps as the number of seconds since the beginning of 1970. The resulting value will be used for communication between a web browser and my web server. My aim is to find a method which gives very similar values for the timestamp in JavaScript code running in the web browser and in a Python program running on my web server.

To solve this problem, I need a method which does not depend on the time zone the web user is in, and which does not break when daylight saving time (DST) starts or ends. One solution is to perform the computation of the timestamp in Coordinated Universal Time (UTC, a successor of Greenwich Mean Time). For reference, this post describes what I found.

In the web browser, I am using the following JavaScript code:

```
now = new Date()
timestamp = now.getTime() / 1000
```

Here, the `getTime` method does all the work; since JavaScript timestamps use milliseconds since the beginning of 1970, we have to divide by 1000 to convert to seconds. According to the ECMAScript specification (which also governs JavaScript), the resulting value should be independent of time zone and DST. For testing, I changed the clock of my computer to a different time zone; everything still worked: after restarting the browser, `now` reflects the new time zone, but `timestamp` is only changed by the few seconds I needed to restart the browser.

The code on the web server is written in Python. After a lot of experimenting, I've found two working combinations. The more straightforward way of computing a timestamp is as follows.

```
import datetime
import time

now = datetime.datetime.now()
timestamp = time.mktime(now.timetuple()) + now.microsecond * 1e-6
```

Despite the fact that there is no explicit conversion to UTC in this code, the computed timestamp gives the number of seconds since the start of 1970 in UTC. This depends on `time.mktime`'s ability to magically know the current time zone of the computer this code is running on. The same computation can also be done using times in UTC directly.

```
import datetime
import calendar

now = datetime.datetime.utcnow()
timestamp = calendar.timegm(now.utctimetuple()) + now.microsecond * 1e-6
```

This time, the code depends on the `utcnow` function knowing the current timezone. An ugly point of the second solution is that it requires the slightly obscure calendar module from the Python standard library. Again, I tested this code by changing the time zone of the computer's clock, and all worked as expected.
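As a quick consistency check (my own addition, not part of the original experiment), one can run both computations side by side and verify that they agree up to the time elapsed between the two calls:

```python
import calendar
import datetime
import time

# method 1: local time combined with time.mktime
now = datetime.datetime.now()
ts_local = time.mktime(now.timetuple()) + now.microsecond * 1e-6

# method 2: UTC time combined with calendar.timegm
utc = datetime.datetime.utcnow()
ts_utc = calendar.timegm(utc.utctimetuple()) + utc.microsecond * 1e-6

# both are seconds since 1970 in UTC, so the difference should be tiny
difference = abs(ts_local - ts_utc)
```

If the two values disagreed by a whole number of hours, that would indicate a time-zone or DST handling bug in one of the two methods.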

2012-06-13

I just learned a new trick: the "texdoc" program automatically finds and displays information for LaTeX packages. On my shiny Mac, using the MacTeX distribution, I can type commands like the following:

```
texdoc geometry
texdoc tikz
texdoc jvlisting
```

This shows the documentation for the corresponding package as a PDF file.

2012-05-26

I am currently in the process of writing a book, and I plan to produce several figures for this project using the statistical computing package R. While R is generally quite good at creating scientific graphics, some adjustments help to create high-quality images for inclusion in another document. This blog entry gives some simple hints about the required steps.

- Including figures in LaTeX: My aim is to create figures which can be included in a LaTeX document as follows:

```
\documentclass{article}
\usepackage{graphicx}

\begin{document}

Some useful information is shown in figure~\ref{figure1}.

\begin{figure}
\begin{center}
\includegraphics{figure1}
\end{center}
\caption{\label{figure1}This is a figure.  It shows useful information.}
\end{figure}

\end{document}
```

- PDF output: The figures need to be generated in one of the image formats supported by LaTeX. To get high-quality output it is best to avoid raster-based formats like JPG or PNG. Since I use pdfLaTeX, the most convenient file format for my figures is PDF. R can directly generate a PDF file from a script; this can be achieved with the `pdf()` command. The required calls are as follows:

```
pdf("figure1.pdf", width=4.6, height=3)
... commands to generate the plot
invisible(dev.off())
```

The options `width` and `height` give the outer dimensions of the plot (including labels and margins) in inches. The `dev.off()` command at the end is required to make sure all output is written to the PDF file. I enclose `dev.off` inside `invisible()` in order to suppress an annoying message about the "null device" being printed to the screen.

- Reproducibility: For scientific use, it is important that the plot can be easily recreated at a later time. For this reason, I find it useful to create a separate R source file for every plot. For example, the file `figure1.R` might contain the R code required to generate `figure1.pdf`. On Linux and MacOS we can use a shebang line to turn the R script into a stand-alone program. This requires two steps. First, we add a line like the following at the start of the R script:

```
#! /usr/bin/env Rscript
pdf("figure1.pdf", width=4.6, height=3)
... commands to generate the plot
invisible(dev.off())
```

And secondly, we need to make the script executable by using (in a terminal window) a command like `chmod +x figure1.R`. After these steps, `figure1.R` can simply be executed to regenerate `figure1.pdf`. This is particularly useful in Makefiles, to automatically regenerate the figure every time the R script is changed.

2012-04-10

I am happy to announce release 0.6.2 of Wisent, the Python parser generator. This release fixes a few minor issues which have been found in the previous release of Wisent. The source code of Wisent 0.6.2 can be downloaded from the Wisent homepage or from github.com.

2012-04-09

I am happy to announce release 0.2 of the jvqplot data plotting program. One of the main features of jvqplot is that the plot is automatically updated whenever the data in the input file changes.

Changes in this release:

- Empty lines in the input file can now be used to separate datasets. This makes it possible to create plots with disconnected components, as in the picture below.
- The program source code has been cleaned up and split into several files.
- Some minor (mostly cosmetic) issues with the previous release have been fixed.

The new jvqplot release can be downloaded from the jvqplot homepage.

2011-12-11

This weekend, I finished a new program: JvJsDoc, a program to extract documentation from JavaScript source code and to present the collected information in a set of HTML pages. It is meant to be used together with the Google Closure Library and the Google Closure Compiler. As an example: JvJsDoc can convert the JavaScript source file goog/disposable/disposable.js into the HTML documentation in goog/Disposable.html.

More information can be found on the JvJsDoc homepage, and you can download the program here:

- jvjsdoc version 0.5, 2011-12-11
First public release.

- archive: jvjsdoc-0.5.tar.gz (99KB)
  signature: jvjsdoc-0.5.tar.gz.asc
  sha1: 5e4f50a6fb941bd2fcfd9974cfddef89103779d4
  md5: c2c837150df941fe17b4466516802724

2011-10-20

After taking much longer than I intended to, today I finally submitted a new paper (about a year later than planned). You can have a look at the preprint here:

- J. Voss:
The Effect of Finite Element Discretisation on the Stationary Distribution of SPDEs.
Communications in Mathematical Sciences, vol. 10, no. 4,
pp. 1143–1159, 2012.

DOI:10.4310/CMS.2012.v10.n4.a6, arXiv:1110.4653, more…

2011-10-17

`jvlisting` is a LaTeX package which provides an alternative to LaTeX's built-in `verbatim` environment. The new release, version 0.5, fixes some bugs and cleans up the TeX source code a bit.

- `jvlisting` can be downloaded from my webserver.
- There is a nice users' manual.
- The package is also available on CTAN in the directory /macros/latex/contrib/jvlisting/ (but it may take a day or so until the new version shows up there).

I hope you find this package useful. Bug reports and comments are very welcome!

2011-09-27

Last week, Kanti and I submitted a new paper:

- K.V. Mardia and J. Voss:
Some Fundamental Properties of a Multivariate von Mises Distribution.
Communications in Statistics — Theory and Methods, vol. 43,
pp. 1132–1144, 2014.

DOI:10.1080/03610926.2012.670353, arXiv:1109.6042, more…

I am particularly curious about how the referees will react to my nice figures on pages 6 and 7.

2011-09-23

I have typed up my lecture notes for the statistical computing module I am teaching this term. The notes are available online:

Comments are very welcome!

2011-09-06

I just started playing with Google+, not sure yet whether this will be useful or interesting to me. If you also want to play, and don't have an account yet, you can probably sign up at this link.

2011-07-28

Just for reference, here is a recipe for watching your DVDs on an Apple iPad. I have read that some of the required steps may be illegal to perform when you are in the USA (or if you are an American citizen?); don't follow these instructions if you are not allowed to!

The instructions provided here use MPlayer/MEncoder on Ubuntu Linux to convert the DVD content into a file of the required format, but many other options are available.

- Use lsdvd (contained in the Ubuntu package `lsdvd`) to find out which track of the DVD contains the actual movie. Normally this will be the longest track; `lsdvd` helpfully lists the index of this track at the bottom of its output.

```
lsdvd
```

- Use MPlayer (Ubuntu package `mplayer`) to verify that this is indeed the track you want:

```
TRACKNO=1
mplayer dvd://$TRACKNO
```

You have to replace the value of `TRACKNO` with the track number found in the previous step.

- Use `mplayer` to determine any cropping required:

```
mplayer dvd://$TRACKNO -vf cropdetect
```

This should print an option of the form `-vf crop=720:430:0:48` to the console.

- Use MEncoder to convert the data on the DVD to something an iPad will play. For the method described here, you need a version of mencoder with faac support included. One such version can be found in the Medibuntu repository. The call to `mencoder` requires about a million arguments. The following is inspired by a post at Mike McCandless' blog.

```
CROP=720:430:0:48
X264OPTS=crf=28:vbv_maxrate=1500:nocabac:global_header\
:frameref=3:threads=auto:bframes=0:subq=6:mixed-refs=0\
:weightb=0:8x8dct=1:me=umh:partitions=all:qp_step=4\
:qcomp=0.7:trellis=1:direct_pred=auto
FAACOPTS=br=160:mpeg=4:object=2:raw
mencoder dvd://$TRACKNO \
    -of lavf -lavfopts format=mp4 \
    -vf crop=$CROP,pp=lb \
    -ovc x264 -x264encopts $X264OPTS \
    -oac faac -faacopts $FAACOPTS -channels 2 -srate 48000 \
    -o out.mp4
```

You should replace the value of `CROP` with the value found in the previous step. This command can take a considerable amount of time to complete (comparable to the playing time of the DVD track), and should produce a big file `out.mp4` (about 225MB per hour of playing time).

- Copy the resulting file to a computer running Apple iTunes and add the file to your iTunes library. Finally, sync with your iPad.

2011-07-27

A while ago I finished my first home-made LaTeX package. The new package is called "jvlisting" and provides a replacement for LaTeX's verbatim environment.

This package provides the LaTeX environment `listing`, an alternative to the built-in `verbatim` environment. The `listing` environment is specially tailored for including listings of computer program source code in documents. The main advantages over the original `verbatim` environment are that

- the `listing` environment automatically fixes leading whitespace, so that the environment and program listing can be indented with the rest of the document source, and
- `listing` environments may easily be customised and extended.

You can download the package and its documentation from my webserver or from the jvlisting page at the Comprehensive TeX Archive Network (CTAN). I'd be happy if you could give my package a try. Comments about `jvlisting` are very welcome!

2011-06-21

Recently I spent some time understanding how one can execute a macro repeatedly, once for every line of text in a LaTeX environment. Since the solution is a bit tricky and I found it difficult to find answers on the web, here is a summary of what I learned.

Step 1. In order to prevent input lines from being concatenated by TeX before we get access to them, we can use the `\obeylines` macro. This makes it possible to define a macro which matches everything until the end of line:

```
\def\doline#1{line found: `#1'\par}

{\obeylines \gdef\getline#1
{\doline{#1}}}

{\obeylines \getline This is the \textbf{first} line.
This is the second line.}
```

Note that `\obeylines` must be in effect both while we define the `\getline` macro and while scanning the text. The outer pairs of enclosing brackets are there to contain the effect of `\obeylines`. To make `\getline` visible outside these brackets we use `\gdef` to define the macro in the global namespace. The output of the above TeX code is

```
line found: ‘This is the first line.’
This is the second line.
```

Step 2. We would like to have the `\getline` macro called for every line instead of just for the first one. This can be achieved by putting a `\getline` (without arguments) at the end of the `\getline` replacement text. The only complication is that we somehow need to stop this procedure once the end of the region of interest has been reached:

```
\def\marker{END}

{\obeylines \gdef\getlines#1
{\def\text{#1}%
\ifx\text\marker \let\next\empty
\else \doline{#1}\let\next\getlines \fi
\next}}

{\obeylines \getlines This is the \textbf{first} line.
This is the second line.
END
}
```

The `\ifx` command is used to test whether the matched line is the same as `\marker`. Until we find the marker, we put a new call to `\getlines` at the end of the `\getlines` replacement text, thereby looping over all lines until the marker is found. For this to work, the END needs to occur on a line on its own; therefore we have to move the closing brackets down one line. The TeX code above leads to the following output.

```
line found: ‘This is the first line.’
line found: ‘This is the second line.’
```

Step 3. We are now ready to pack the commands from above into the definition of a LaTeX environment:

```
\def\marker{\end{dolines}}

{\obeylines \gdef\getlines#1
{\def\text{#1}%
\ifx\text\marker \let\next\text
\else \doline{#1}\let\next\getlines \fi
\next}}

\newenvironment{dolines}{\begingroup\obeylines\getlines}%
{\endgroup}

\begin{dolines}
This is the \textbf{first} line.
This is the second line.
\end{dolines}
```

Here we had to change the definition of `\getlines` in order to include the `\end{dolines}` in the replacement text when the `\ifx` is true; otherwise it would have been swallowed by `\dolines`. The resulting output is, a bit surprisingly, as follows:

```
line found: ‘’
line found: ‘This is the first line.’
line found: ‘This is the second line.’
```

The extra empty line at the beginning is caused by the `\getlines` macro matching the (empty) text between `\begin{dolines}` and the end of line. The `\end{dolines}` must be on a line on its own for this environment to work.

Step 4. If we want to prevent processing of the TeX commands inside the `dolines` environment, we can do so by changing the category codes of special characters like `\` to 12 ("other"). A complication with this plan is that `\ifx` also compares category codes. Therefore, when we try to detect the end of our environment while special characters are switched off, we need to define `\marker` as the string `\end{dolines}`, but with the category codes of all special characters (the backslash and the curly brackets) set to 12. This can be done by using the following, contorted sequence of commands:

```
\begingroup
\catcode`|=0 \catcode`[=1 \catcode`]=2
\catcode`\{=12 \catcode`\}=12 \catcode`\\=12
|gdef|marker[\end{dolines}]
|endgroup
```

Two more changes are required to make this work: first, we need to disable all special characters at the beginning of the environment:

```
\let\do=\@makeother\dospecials
```

If you want to use this command outside a style file, you will need to turn `@` into a letter by bracketing your definitions with `\makeatletter ... \makeatother`.

Secondly, inside `\getlines`, the expansion of `\text` still has the category codes as found in the input. Therefore, if the input was read with special characters switched off, we cannot write `\let\next\text` to get the closing `\end{dolines}` (since the category codes of the backslash and the curly brackets would be wrong). Instead, we need to use a definition like `\def\next{\end{dolines}}`.

Example. Using the techniques explained above, we can define a replacement for the LaTeX comment environment (contained in the `verbatim` package) as follows:

```
\documentclass{article}

\makeatletter
\begingroup
\catcode`|=0 \catcode`[=1 \catcode`]=2
\catcode`\{=12 \catcode`\}=12 \catcode`\\=12
|gdef|marker[\end{comment}]
|endgroup
{\obeylines
\gdef\getlines#1
{\def\text{#1}%
\ifx\text\marker
\def\next{\end{comment}}
\else
\let\next\getlines
\fi
\next}}
\newenvironment{comment}{\begingroup
\let\do=\@makeother\dospecials
\obeylines\getlines}%
{\endgroup}
\makeatother

\begin{document}
\begin{comment}
Everything here will be ignored.
\Invalid LaTeX code and incomplete constructs like
\begin{itemize}
without the closing end are no problem.
\end{comment}
\end{document}
```

2011-03-23

Funny coincidence: Yesterday I learned that the following paper which we first submitted in 2010 has been accepted:

- M. Hairer and J. Voss:
Approximations to the Stochastic Burgers Equation.
Journal of Nonlinear Science, vol. 12, no. 6,
pp. 897–920, 2011.

DOI:10.1007/s00332-011-9104-3, arXiv:1005.4438, more…

And today I received an email notice that a paper we submitted in 2009(!) is finally going into print now:

- M. Hairer, A.M. Stuart and J. Voss:
Sampling Conditioned Hypoelliptic Diffusions.
Annals of Applied Probability, vol. 21, no. 2,
pp. 669–698, 2011.

DOI:10.1214/10-AAP708, arXiv:0908.0162, more…

2011-03-09

We are currently advertising a new lectureship in the statistics department of the University of Leeds. So, if you are a PostDoc and want to become a colleague of mine, have a look at the job advert. The closing date for applications is 31st March 2011.

2010-11-29

I am happy to announce the release of Jvqplot version 0.1.

Jvqplot is a data plotting program, resembling a simplified version of gnuplot. It is very simple to use, but it can only plot simple data files. The format and scaling of the plot is automatically chosen and cannot be changed. One of the main features of jvqplot is that the plot is automatically updated whenever the data in the input file changes. More information can be found on the Jvqplot homepage.

- jvqplot version 0.2, 2012-04-09
empty lines in data files separate data sets, clean up the program sources

- archive: jvqplot-0.2.tar.gz (167KB)

signature: jvqplot-0.2.tar.gz.asc

sha1: cebb02849f6743babf6cf2a52a1098328ad374f6

md5: 9f4422716e0882b7de19731a47be7976

- jvqplot version 0.1, 2010-11-29 (experimental version)
first public release

- archive: jvqplot-0.1.tar.gz (95KB)

signature: jvqplot-0.1.tar.gz.asc

sha1: 025b1e96eff5f8f57c8d78f30a612c15d36e6780

md5: e6a19835d89be82d192c4821f164a849


This is the first public release of jvqplot. Any feedback about the program would be most welcome.

2010-11-27

Since I don't watch TV programmes and also don't own a TV, I do not need a TV license. The TV licensing people seem to find this just as difficult to believe as their German counterparts did, so they keep hassling me about it.

Their latest letter seems to allow me to declare online that I don't need a license. Unfortunately, the web form quickly forces me to choose one of the following possibilities:

- Only watch pre-recorded programmes (DVDs/Videos)
- Only watch non live programmes via a Computer (e.g. Catch up services on iPlayer)
- Only watch non live programmes via a TV (e.g. Catch up services on iPlayer)
- Only use TV for games / consoles
- Covered by another licence
- I have no equipment

The last option seems like a possible choice for me, except for the fact that my computer probably qualifies as "equipment". So which option am I meant to choose?

2010-09-28

I just discovered the (1991) article What Every Computer Scientist Should Know About Floating-Point Arithmetic by David Goldberg. Very useful reading!

2010-09-16

I am happy to announce the release of Wisent version 0.6.1. Wisent is a parser generator for Python. The new release fixes a problem (introduced in version 0.6) where comments and multi-line strings in grammar files were broken. You can download the program from the wisent homepage.

2010-09-15

I am slightly embarrassed by the fact that I've been caught out by the birthday paradox yesterday. The encounter went as follows:

While testing a new random number generator by analysing a list of 1000 generated standard normally distributed random numbers, I discovered that the list contained one of the numbers twice! This is suspicious, because this event has probability 0 in theory. After an (unsuccessful) hunt for bugs in my program, I finally found the following explanation.

The program prints the numbers using a C command like

printf("%f\n", normal(0, 1));

By default, the `%f` format string outputs numbers with six digits after the decimal point:

```
-0.641062
1.116142
1.417036
0.337435
-0.310383
...
```

Most of the numbers will lie between -2 and 2, i.e. they are concentrated in a set of about 4 million possible values. A quick check reveals that 1000 independent uniform draws out of a set of this size contain a repeated value with probability of more than 10%, and for the normal distribution the probability will be even higher because the numbers are more concentrated around 0. Thus, seeing a number twice is something which will actually happen from time to time and is no indication that the program is malfunctioning!
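
The rough estimate is easy to check numerically. The following sketch (my addition, using the crude model from the text: 1000 uniform draws from 4 million equally likely values) computes the exact birthday-problem collision probability:

```python
import math

n = 4000000  # rough number of six-decimal-digit values between -2 and 2
k = 1000     # number of generated random numbers

# probability that all k draws from n equally likely values are distinct,
# computed in log space to avoid underflow in the long product
log_p_distinct = sum(math.log(1 - i / n) for i in range(k))
p_collision = 1 - math.exp(log_p_distinct)

print(p_collision)  # about 0.117, i.e. more than 10%
```

The true probability for normally distributed values is somewhat higher still, since the draws are not uniform.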

2010-08-18

Just now, my (not very fancy) phone showed the following corrupted message to me:

I'm a bit at a loss about what is going on. If you have any idea, I would be happy to receive hints via email. Do I need to worry?

2010-08-04

The program DSSP is used to determine the secondary structure of a protein, taking the three-dimensional coordinates of its atoms as the input. Bio3d is a library for the statistical software package R which makes it easier to analyse protein structure; as part of this, bio3d contains an interface to DSSP (in the function `dssp`). Since I had a bit of trouble using this interface, here are some hints.

- Get the DSSP program from the DSSP webpage. For academic use, the program is free of charge, but you need to fill in a license agreement.
- Rename the executable to dssp (mine was called DSSP_MAC.EXE after I had downloaded it).
- Move the file to the directory where it will live. For my system (and for the examples below) I chose /usr/local/bin/. The name of this directory is not allowed to contain white space.
- Call the `dssp` function with this directory as the `exepath` argument. The value of the `exepath` argument must end in `/` (on Linux/Unix/MacOS) or `\` (on Microsoft Windows). Also note that every `\` in an R string constant needs to be doubled, e.g. on Microsoft Windows you will probably need to write something like `exepath="C:\\path\\to\\file\\"`.

Example. In R one can now use the following commands.

```
library("bio3d")
pdb <- read.pdb("12as")
x <- dssp(pdb, exepath="/usr/local/bin/")
```

Then the secondary structure information of the protein 12AS can be accessed as follows:

```
> x$helix
$start
  1   2   3   4   5   6   7   8   9  10  11
  5  76 130 170 182 258 277 297 320 271 310

$end
  1   2   3   4   5   6   7   8   9  10  11
 27  83 155 176 193 268 283 305 325 275 312

$length
  1   2   3   4   5   6   7   8   9  10  11
 23   8  26   7  12  11   7   9   6   5   3

$chain
 [1] "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A"
```

This tells us that the first helix reaches from residue 5 to residue 27 (both inclusive).

2010-08-03

I learned about the different kinds of grains long ago, when I was in primary school. Since then I have somehow managed to confuse myself about which grain is which. Here's to unconfusion! (All information and pictures are from Wikipedia; follow the links for details.)

name | picture | comments |
---|---|---|
wheat / Weizen (Triticum aestivum, Triticum durum, etc.) | | Used for bread, flour, couscous, beer, biofuel, … Needs good soil and climate, several species of wheat are used, main source of vegetable protein in human food. |
barley / Gerste (Hordeum vulgare) | | Used for beer, whisky, animal feeding, … One of the first cultivated crops (together with einkorn and emmer). |
oat / Hafer (Avena sativa) | | Used for animal feeding (e.g. horses), oat flakes, porridge, … Can be grown in regions with relatively cold, wet summers. |
rye / Roggen (Secale cereale) | | Used for bread, animal feeding, … Can grow on relatively poor soil and in colder climates. |

2010-06-28

In the last few days I finished my first ever browser-based game: a puzzle, based on the classical fifteen puzzle. You can play the game here and download the source code from the program's homepage.

2010-06-14

2010-05-24

Today, Martin and I finally finished our paper "Approximations to the Stochastic Burgers Equation". A preprint can be downloaded below, comments are very welcome.

- M. Hairer and J. Voss:
Approximations to the Stochastic Burgers Equation.
Journal of Nonlinear Science, vol. 12, no. 6,
pp. 897–920, 2011.

DOI:10.1007/s00332-011-9104-3, arXiv:1005.4438, more…

2010-04-29

It is of course well known that water fluoridation is a Communist conspiracy to sap and impurify all of our precious bodily fluids. Still, I am surprised to learn that one of the political parties running in the UK's upcoming general elections uses their opposition to water fluoridation as a campaign topic.

Can you guess which party this is? (for the answer click here)

2010-04-23

Only one day after we submitted the revised version of our paper Sampling Conditioned Hypoelliptic Diffusions (see my previous blog post), the article was accepted! It will appear in the Annals of Applied Probability.

- M. Hairer, A.M. Stuart and J. Voss:
Sampling Conditioned Hypoelliptic Diffusions.
Annals of Applied Probability, vol. 21, no. 2,
pp. 669–698, 2011.

DOI:10.1214/10-AAP708, arXiv:0908.0162, more…

2010-04-20

In order to address the referee's comments, we have prepared an updated version of the following article:

- M. Hairer, A.M. Stuart and J. Voss:
Sampling Conditioned Hypoelliptic Diffusions.
Annals of Applied Probability, vol. 21, no. 2,
pp. 669–698, 2011.

DOI:10.1214/10-AAP708, arXiv:0908.0162, more…

2010-04-13

I put up the slides from my talk at the Maths2010 conference in Edinburgh. Since I had only 15 minutes, the talk is nearly free of contents, but in compensation there are many nice pictures. I'm particularly fond of the torus on page 8 (which was surprisingly difficult to generate).

2010-03-25

This is a (some weeks old) photo of all twelve of me on Ilkley Moor.

2010-03-21

2010-02-03

Yesterday I learned how to speed up the vector operation
*y* := *y* + α*x* by using
SSE2 instructions.

The function I was trying to speed up was as follows:

```
static void daxpy0(int n, double a, const double *x, double *y)
{
    int i;

    for (i=0; i<n; ++i) {
        y[i] += a * x[i];
    }
}
```

Using SSE2 instructions, operating on two double precision numbers at a time, this can be written as follows:

```
#include <emmintrin.h>

static void daxpy1(int n, double a, const double *x, double *y)
{
    __m128d aa, xx, yy;
    int i;

    aa = _mm_set1_pd(a);             /* store a in both registers */
    for (i=0; i+1<n; i+=2) {
        xx = _mm_load_pd(x+i);       /* load two elements from x ... */
        xx = _mm_mul_pd(aa, xx);     /* ... and multiply both by a */
        yy = _mm_load_pd(y+i);       /* load two elements from y ... */
        yy = _mm_add_pd(yy, xx);     /* ... and add */
        _mm_store_pd(y+i, yy);       /* write back both elements of y */
    }
    if (i < n) {
        y[i] += a * x[i];            /* handle last element if n is odd */
    }
}
```

This code requires that the vectors `x` and `y` are 16-byte aligned (otherwise a segmentation fault will occur). This assumption holds, for example, for memory blocks allocated by `malloc` on 64-bit Linux and MacOS X systems. Also, obviously, this only works on CPUs which support the SSE2 instruction set. A description of the SSE2 instructions used here can be found in the Intrinsics Reference of the Intel C++ Compiler Documentation (it also seems to apply to the GNU C compiler; direct link here).

For comparison, I also tried the `daxpy` function provided by BLAS:

```
static void daxpy2(int n, double a, const double *x, double *y)
{
    extern void daxpy_(int *Np, double *DAp, const double *X, int *INCXp,
                       double *Y, int *INCYp);
    int inc = 1;

    daxpy_(&n, &a, x, &inc, y, &inc);
}
```

Details about this approach are described on my page about linear algebra packages on Linux.

Results. I tried the three functions using two vectors of length 2000. The following table gives the execution time for a million calls to `daxpy` (on a newish quad-core Linux machine):

method | function | time [s] |
---|---|---|
direct | `daxpy0` | 1.52 |
SSE2 | `daxpy1` | 0.91 |
BLAS | `daxpy2` | 2.47 |

As can be seen, the SSE2 code in `daxpy1` is fastest; compared to the naive implementation `daxpy0` it takes 40% off the execution time! For some reason, the BLAS version seems to be very slow; the results on my MacOS machine are similar. Currently I have no explanation for this effect.

2010-01-20

Recently I learned how to tunnel http traffic (e.g. web surfing) over an ssh connection. The effect of this is that you can browse the web on one computer A, say, but for the web servers you are visiting it will look like your requests originate from a different host B. You need to be able to log into host B via ssh for this to work.

There are several situations where such tunneling is useful:

- If your own computer A is on a network which filters, changes, blocks or monitors your http traffic, you can use an ssh tunnel to work around the resulting restrictions.
- If you are on an unencrypted wireless network, you can use tunnelling to prevent your neighbours from being able to read all of your web traffic.
- If you work from home, you can use tunnelling to access online journals (which usually work only from university addresses), university internal web pages, etc. as if you were directly sitting at your work machine B.

Setting up a tunnel is done in two steps:

- Create a tunnel using the following command:
ssh -D 8080 -f -q -N login@host

You will need to replace login and host with your login details. The machine you type this command on is machine A in the description above, and host is machine B. This command will start an ssh process which will run in the background and will act as a tunnel to forward the web traffic.

- Set up your web browser to use a SOCKS web proxy on host `localhost` and port `8080`. Both SOCKS4 and SOCKS5 should work. This will tell the web browser to connect to the local end of the ssh tunnel.

2009-12-14

Over the weekend I finished a helper program which makes it easy to distribute a number of programs over the available CPUs on a multi-core system. You can find the program on the Parallel homepage.

2009-12-01

I just found somebody's blog post stating 8 Reasons Normal People Should Juggle and I agree very much with his reasons.

2009-10-11

Yesterday, I completed version 0.6 of my Python parser generator Wisent. The new version allows grammar files to be UTF-8 encoded (originally, only ASCII characters could be used in grammar files), adds a CSS parser as an additional example, and fixes some minor bugs. You can download the program from the Wisent homepage.

2009-10-01

Yesterday I finished a revised version of my article about exponential Tauberian theorems. The main change is that, as it transpired, my proof could be trivially modified to give a more general statement. Many thanks to the (anonymous) referee for pointing this out.

- J. Voss:
Upper and Lower Bounds in Exponential Tauberian Theorems.
Tbilisi Mathematical Journal, vol. 2, pp. 41–50, 2009.

link, arXiv:0908.0642, more…

2009-09-25

Today, one of our papers was accepted:

- A. Voss, J. Voss and K.C. Klauer:
Separating Response-Execution Bias from Decision Bias: Arguments for an Additional Parameter in Ratcliff's Diffusion Model.
British Journal of Mathematical and Statistical Psychology, vol. 63,
no. 3, pp. 539–555, 2010.

DOI:10.1348/000711009X477581, preprint:pdf, more…

2009-09-13

Just in case this is useful for anybody: here is an implementation of the HSV to RGB colorspace conversion in Python:

```
from __future__ import division

def hsv2rgb(h, s, v):
    hi = int(h//60 % 6)
    f = h/60 - h//60
    p = v * (1-s)
    q = v * (1-f*s)
    t = v * (1-(1-f)*s)
    return [(v,t,p), (q,v,p), (p,v,t), (p,q,v), (t,p,v), (v,p,q)][hi]
```

Update. In the meantime I learned that the Python standard library has a built-in version of this function in the colorsys module.
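
As a quick sanity check (my addition, not part of the original post), the function can be compared against `colorsys.hsv_to_rgb`; note that `colorsys` expects the hue as a fraction of a full turn, while the function above takes it in degrees:

```python
import colorsys

def hsv2rgb(h, s, v):
    # h in degrees, s and v in [0, 1]
    hi = int(h//60 % 6)
    f = h/60 - h//60
    p = v * (1-s)
    q = v * (1-f*s)
    t = v * (1-(1-f)*s)
    return [(v,t,p), (q,v,p), (p,v,t), (p,q,v), (t,p,v), (v,p,q)][hi]

# compare a few sample colours with the standard library version
for h, s, v in [(0, 1.0, 1.0), (200, 0.5, 0.8), (359, 0.25, 0.6)]:
    expected = colorsys.hsv_to_rgb(h/360, s, v)
    got = hsv2rgb(h, s, v)
    assert all(abs(a - b) < 1e-9 for a, b in zip(got, expected))
```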

2009-08-03

Today, I finally submitted the SPDE-paper I wrote with Martin Hairer and Andrew Stuart. This paper took a long time to complete (and it changed from an applied maths paper into a pure maths paper in the process). You can download a pre-print here; comments are very welcome:

- M. Hairer, A.M. Stuart and J. Voss:
Sampling Conditioned Hypoelliptic Diffusions.
Annals of Applied Probability, vol. 21, no. 2,
pp. 669–698, 2011.

DOI:10.1214/10-AAP708, arXiv:0908.0162, more…

2009-07-30

A while ago, Andreas and I submitted another paper. Here is a pre-print:

A. Voss, J. Voss and K.C. Klauer:
Separating Response-Execution Bias from Decision Bias: Arguments for an Additional Parameter in Ratcliff's Diffusion Model.
British Journal of Mathematical and Statistical Psychology, vol. 63,
no. 3, pp. 539–555, 2010.

DOI:10.1348/000711009X477581, preprint:pdf, more…

2009-07-15

Over the week-end I completed the work on release 0.5 of my Python parser generator Wisent. You can now download the program from the Wisent homepage. The main news is the addition of a users' manual.

2009-07-15

I just found out that Amazon sells Uranium ore. Not sure whether I should get some …

2009-07-09

Today I discovered by chance the quirksmode.org page. Somebody there has spent lots and lots of time to work out the differences between different web-browsers, in particular in CSS and JavaScript support. Very useful!!!

2009-06-17

Some days ago I found this strange story about two men trying to smuggle bonds worth 134 billion US dollars across the Italian-Swiss border. Or maybe it was fake bonds? Everything about this case seems quite mysterious to me.

Media coverage is strangely spotty, but summaries are published by the online editions of the Times and the German newspaper Die Welt (which I don't normally read). Of course there are also lots of conspiracy theories.

Update: It transpires that the bonds were fakes.

2009-06-02

A sentence from the BBC web page:

Although passengers survived a landing on the Hudson River in New York in January — it is rarely successful, especially in the middle of an ocean the size of the Atlantic.

2009-04-26

I added a few photos from New York to my Photo album:

2009-04-26

Recently I submitted a new paper:

J. Voss:
Upper and Lower Bounds in Exponential Tauberian Theorems.
Tbilisi Mathematical Journal, vol. 2, pp. 41–50, 2009.

link, arXiv:0908.0642, more…

This is another leftover bit from my PhD thesis. Comments are most welcome!

2009-04-02

Since 1st April 2009 I'm a lecturer at the statistics department of the University of Leeds.

2009-03-16

Recently we submitted a book chapter for the upcoming Oxford Handbook of Nonlinear Filtering:

M. Hairer, A.M. Stuart and J. Voss:
Signal Processing Problems on Function Space: Bayesian Formulation, Stochastic PDEs and Effective MCMC Methods.
Pages 833–873 in The Oxford Handbook of Nonlinear Filtering,
Dan Crisan and Boris Rozovsky (editors), Oxford University Press, 2011.

link, preprint:pdf, more…

2009-02-03

I like to use the matplotlib Python library to create figures for use in my articles. Unfortunately, the matplotlib default settings seem to be optimised for screen-viewing, so creating pictures for inclusion in a scientific paper needs some fiddling. Here is an example script which sets up things for such a figure.

```
#! /usr/bin/env python

import matplotlib
matplotlib.use("PS")
from pylab import *

rc("font", family="serif")
rc("font", size=10)

width = 4.5
height = 1.4
rc("figure.subplot", left=(22/72.27)/width)
rc("figure.subplot", right=(width-10/72.27)/width)
rc("figure.subplot", bottom=(14/72.27)/height)
rc("figure.subplot", top=(height-7/72.27)/height)
figure(figsize=(width, height))

x = loadtxt("x.dat")
plot(x[:,0], x[:,1], "k-")
savefig("figure.eps")
```

The script performs the following steps:

- The `matplotlib.use` statement tells matplotlib to use one of the non-interactive backends. Without this statement, the script would need a connection to an X-windows server to run.
- The two `rc("font", ...)` statements set the font family and size for the labels on the axes to match the font used in the article's text (matplotlib uses a 12pt sans-serif font by default).
- The `rc("figure.subplot", ...)` settings and the `figsize` argument to the `figure` call set the image geometry. Since the text on journal pages is typically about 5in wide, a total figure width of 4.5in is often a good value. The margins set by the `rc` statements have to be given as fractions of the image width/height and must be big enough to contain the axis labels. A bit of experimentation helps to find good values; here we use a 14pt margin at the bottom (10pt font height plus some space), a 22pt left margin (depends on the number of digits in the *y*-axis labels), 7pt for the top margin (space for half of the top-most *y*-axis label, and then some more) and 10pt to the right (enough space for half of the right-most *x*-axis label).
- The colour code `"k"` in the plot command sets the plot colour to black. The matplotlib default is blue, but some journals charge for the use of colour in articles (and it is also nice if you can print things on a b/w printer).
- The `savefig` command creates an encapsulated PostScript file. This format is good for inclusion in TeX documents. The file can be converted to PDF using the epstopdf command as needed.

2008-12-31

Today I found the British government's guidelines for computing the CO2 emission for different modes of transportation. I like the careful explanation of the authors' methodology.

The results are summarised in the table below. All data is for 2008, and I rounded and simplified the values to make them easier to digest. You can find the true results in the report. The values for cars (marked with an asterisk) are given in g/km, you need to divide them by the number of passengers in the car.

mode of transportation | CO2 emission [g / passenger km] | reference |
---|---|---|
national rail | 60 | table 21 (page 25) |
london underground | 65 | table 21 (page 25) |
trams etc. | 78 | table 21 (page 25) |
Eurostar | 18 | table 21 (page 25) |
bus | 107 | table 17 (page 21) |
coach | 29 | table 17 (page 21) |
petrol car | 207* | table 11 (page 18) |
diesel car | 198* | table 11 (page 18) |
motor cycle | 106* | table 18 (page 22) |
domestic flights | 175 | table 3 (page 9) |
short-haul flights | 98 | table 3 (page 9) |
long-haul flights | 111 | table 3 (page 9) |

Update (2010-04-13). DEFRA reorganised their web page, so the above link does no longer work. Their reports can now be found on the Greenhouse gas (GHG) conversion factor methodology papers page. I believe that the 2008 report there is identical to what I used for the table.

2008-12-20

It is Christmas very soon, now. Why don't you get me a Christmas present from my Amazon wishlist?

2008-12-20

I released version 0.8 of my implementation of the voting algorithm used by the Debian project. The new version should be functionally equivalent to version 0.7, but the source code is greatly cleaned up. Many thanks to Neil Schemenauer for providing patches! You can download the program from my page about the Debian voting system.

2008-10-30

Some photos from the North Carolina state fair are now on my photo album:

2008-09-23

Last Saturday I finally managed to visit Durham. Some photos are on my photo album:

My favourite is the last one: it shows a mechanical knight guarding a railway crossing.

2008-09-21

The common log format of the Apache web server contains information about page views on a web page. Today I figured out how this format can be parsed in a Python program.

Typical log entries look like this (I added line breaks for better readability):

```
66.249.71.138 - - [20/Sep/2008:10:09:09 +0100]
    "GET / HTTP/1.1" 200 4235 "-"
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

67.15.143.7 - - [20/Sep/2008:13:35:06 +0100]
    "GET //errors.php?error=http://www.rooksgambit.org/thebigsystem//sistem.txt? HTTP/1.1"
    404 2272 "-" "libwww-perl/5.79"
```

The first step of parsing these lines is easy: the regular expression library lets us break a line into the individual fields:

```
import re
parts = [
r'(?P<host>\S+)', # host %h
r'\S+', # indent %l (unused)
r'(?P<user>\S+)', # user %u
r'\[(?P<time>.+)\]', # time %t
r'"(?P<request>.+)"', # request "%r"
r'(?P<status>[0-9]+)', # status %>s
r'(?P<size>\S+)', # size %b (careful, can be '-')
r'"(?P<referer>.*)"', # referer "%{Referer}i"
r'"(?P<agent>.*)"', # user agent "%{User-agent}i"
]
pattern = re.compile(r'\s+'.join(parts)+r'\s*\Z')
line = ... a line from the log file ...
m = pattern.match(line)
res = m.groupdict()
```

This results in a Python dictionary, containing the individual fields as strings.

The second step is to convert the field values from strings to Python data types and to remove the oddities of the log file format. Most fields are easy:

```
if res["user"] == "-":
    res["user"] = None
res["status"] = int(res["status"])
if res["size"] == "-":
    res["size"] = 0
else:
    res["size"] = int(res["size"])
if res["referer"] == "-":
    res["referer"] = None
```

Converting the time value into a Python datetime object is surprisingly messy, mainly because of the timezone information. The best way I have found so far is the following:

```
import time, datetime

class Timezone(datetime.tzinfo):
    def __init__(self, name="+0000"):
        self.name = name
        seconds = int(name[:-2])*3600 + int(name[-2:])*60
        self.offset = datetime.timedelta(seconds=seconds)
    def utcoffset(self, dt):
        return self.offset
    def dst(self, dt):
        return datetime.timedelta(0)
    def tzname(self, dt):
        return self.name

tt = time.strptime(res["time"][:-6], "%d/%b/%Y:%H:%M:%S")
tt = list(tt[:6]) + [0, Timezone(res["time"][-5:])]
res["time"] = datetime.datetime(*tt)
```

If you know of any better way of doing this, please let me know. Hints are very welcome!
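
Putting the fragments above together into one self-contained script (the sample line is the Googlebot request from the beginning of this post), the whole pipeline looks like this:

```python
import re
import time, datetime

parts = [
    r'(?P<host>\S+)',       # host %h
    r'\S+',                 # indent %l (unused)
    r'(?P<user>\S+)',       # user %u
    r'\[(?P<time>.+)\]',    # time %t
    r'"(?P<request>.+)"',   # request "%r"
    r'(?P<status>[0-9]+)',  # status %>s
    r'(?P<size>\S+)',       # size %b (careful, can be '-')
    r'"(?P<referer>.*)"',   # referer "%{Referer}i"
    r'"(?P<agent>.*)"',     # user agent "%{User-agent}i"
]
pattern = re.compile(r'\s+'.join(parts) + r'\s*\Z')

class Timezone(datetime.tzinfo):
    def __init__(self, name="+0000"):
        self.name = name
        seconds = int(name[:-2])*3600 + int(name[-2:])*60
        self.offset = datetime.timedelta(seconds=seconds)
    def utcoffset(self, dt):
        return self.offset
    def dst(self, dt):
        return datetime.timedelta(0)
    def tzname(self, dt):
        return self.name

line = ('66.249.71.138 - - [20/Sep/2008:10:09:09 +0100] "GET / HTTP/1.1" '
        '200 4235 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
        '+http://www.google.com/bot.html)"')
res = pattern.match(line).groupdict()

# convert the field values from strings to Python data types
if res["user"] == "-":
    res["user"] = None
res["status"] = int(res["status"])
res["size"] = 0 if res["size"] == "-" else int(res["size"])
if res["referer"] == "-":
    res["referer"] = None
tt = time.strptime(res["time"][:-6], "%d/%b/%Y:%H:%M:%S")
tt = list(tt[:6]) + [0, Timezone(res["time"][-5:])]
res["time"] = datetime.datetime(*tt)

print(res["host"], res["status"], res["size"], res["time"].isoformat())
```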

2008-09-01

After many delays, the paper containing the main result of my PhD thesis was finally published today:

- J. Voss:
Large Deviations for One Dimensional Diffusions with a Strong Drift.
Electronic Journal of Probability, vol. 13, no. 53,
pp. 1479–1526, 2008.

link, preprint:pdf, more…

This is more theoretical than the things I currently do, but I believe that it is a good paper.

Abstract. We derive a large deviation principle
which describes the behaviour of a diffusion process with additive noise
under the influence of a strong drift. Our main result is a large
deviation theorem for the distribution of the end-point of a
one-dimensional diffusion with drift θ*b* where *b* is a
drift function and θ a real number, when θ converges
to ∞. It transpires that the problem is governed by a rate
function which consists of two parts: one contribution comes from the
Freidlin-Wentzell theorem whereas a second term reflects the cost for a
Brownian motion to stay near an equilibrium point of the drift over long
periods of time.

2008-08-21

Mark Reid wrote a very nice review article about the current (very strong) evidence for the existence of a supermassive black hole at the centre of the Milky Way. Definitely worth a read! (Found via the KFC blog.)

2008-07-23

Today I discovered the very exciting wordle tool at wordle.net. It can convert texts into word clouds like the following:

2008-06-24

A nice MacOS X feature:

```
voss@soup [~] osascript -e 'tell app "ARDAgent" to do shell script "whoami"'
root
```

2008-05-29

Today I discovered, by chance, the web page A Review of the Universe - Structures, Evolutions, Observations, and Theories. I have not yet read it all but the parts I looked at seem quite nice.

2008-05-28

Stephen Dolan provides a very nice page which can find the shortest link between any two Wikipedia articles. The results are sometimes surprising: for example, can you guess how to get from Pleistocene to Microprocessor in three clicks?

2008-05-08

This week-end we had an exciting trip to Blenheim Palace.

2008-05-06

In a recent questionnaire, our IT services asked the following interesting question: Please rank the following in your priority order (1 = Highest, 2 = Medium, 3 = Lowest)

- Guaranteed access to the network during normal working hours
- Access to the network at anytime (24 x 7)
- The maximum bandwidth possible

A university with network access during "working hours" only (presumably their working hours, not mine) is an interesting thought. And I wonder whether "guaranteed access" differs from "access", or whether the first option is just a subset of the second option.

2008-03-27

I finally got around to putting some photos from my trip to Aachen onto my web page. Some weeks late, but here they are:

2008-03-22

Our article

- A. Voss and J. Voss:
A Fast Numerical Algorithm for the Estimation of Diffusion-Model Parameters.
Journal of Mathematical Psychology, vol. 52, pp. 1–9, 2008.

DOI:10.1016/j.jmp.2007.09.005, preprint:pdf, more…

is now in print.

2008-03-21

Today I completed version 0.4 of the Wisent parser generator for Python. The program is now efficient enough to deal with grammars of the size of the C grammar.

I spent some effort on generating useful error messages for grammar conflicts. For example the usual "if-else conflict" in the C grammar is reported as follows:

```
cg.wi:164:61: shift-reduce conflict: the input
cg.wi:164:61:   IDENTIFIER { if ( IDENTIFIER ) ;.else ...
cg.wi:164:61: can be shifted using the production rule
cg.wi:164:61:   selection_statement: if ( expression ) statement.else statement;
cg.wi:164:25: or can be reduced to
cg.wi:164:25:   IDENTIFIER { selection_statement.else ...
cg.wi:164:25: using the production rule
cg.wi:164:25:   selection_statement: if ( expression ) statement;
cg.wi: 1 conflict, aborting ...
```

Now you can even tell Wisent to ignore this conflict by placing an `!` before the offending `else`.

2008-03-08

While trying to add the capability to create LALR(1) parsers to Wisent, I came across the following paper:

- David Pager,
A practical general method for constructing LR(k) parsers.

Acta Informatica, volume 7 (1977), number 3, pages 249–268.

This article is nicely written and I hope that using his algorithm I can teach Wisent to create efficient LR(1) parsers, thus removing the need for LALR(1) support.

2008-03-01

Today I published the first version of a parser generator for Python. Currently it only generates LR(1) parsers, i.e. it is not very useful for big grammars. I hope to add support for LALR(1) parsers in the future.

One problem is the choice of a project name: I came up with Wisent since this fits together well with Bison. But unfortunately I am not the first one to think so; there are at least two other parser generators which also use the name Wisent, and most likely I will rename my program. Suggestions for a better name, preferably with Monty Python connotations, are very welcome.

2008-02-24

I put my go ranking list online. Currently it covers some games played in the common room of the Warwick mathematics department. You can find the list on the secret go portal.

2008-02-20

There is a lamp powered by gravity. The energy comes from a weight slowly sliding down a rail: you recharge the lamp by moving the weight back up. The motion is then converted to electricity and used to power the LEDs which create the light. I want one of these lamps!!!

Update. A quick calculation shows that this lamp cannot possibly work: the best possible efficiency for generating light is approximately 683 lumen per watt (real efficiencies are much lower), so to get the advertised 600-800 lumens over a period of 4 hours one would need more than 12650 joule. Since lifting a weight of 1 kg by 1 meter only provides 9.81 joule, the lamp, even assuming 100% efficiency, will never be able to generate enough electricity to power the LEDs.
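
The arithmetic can be checked in a few lines, using the figures quoted above:

```python
# Check of the lamp calculation, using the figures from the text.
lumens = 600                # lower end of the advertised output
efficacy = 683              # theoretical maximum, lumen per watt
hours = 4

watts = lumens / efficacy               # power needed: about 0.88 W
energy_needed = watts * hours * 3600    # in joule, over four hours

# energy stored by lifting a 1 kg weight by 1 meter
energy_stored = 1.0 * 9.81 * 1.0        # m * g * h, in joule

print(round(energy_needed))             # -> 12650
print(energy_needed / energy_stored)    # shortfall factor of ~1290
```

Even at the physically impossible efficiency of 683 lm/W, the weight would have to be lifted well over a thousand times to light the lamp for one evening.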

2008-01-23

Understanding Art for Geeks is a very nice picture collection on Flickr.

2008-01-21

During the Christmas holidays I carved an ancient Egyptian hippo from
soapstone. It turned out to be quite a happy animal. The photo shows the
almost finished statuette, only some polishing is left to do.

My photo page has a picture of a very early stage.

2008-01-15

We prepared another fast-dm release over the Christmas holidays: release 29 fixes a bug where, for fixed *z*, large values of *s*_{z} were not correctly recognised.

2007-12-11

The previous fast-dm version (release 27) introduced an unfortunate bug. We released a fixed version today. As a nice side effect, the program is now a bit faster for some parameter constellations (big *s*_{t0} and high precision). If you are using an older version of fast-dm, you should upgrade to the new release 28.

2007-12-02

According to an article in the Times, it seems that the United States' government believes it to be legal to kidnap anybody whom it suspects of a crime. I find this scary!

2007-11-22

Andreas and I finished release 27 of our program fast-dm for parameter estimation in Ratcliff's diffusion model (in Psychology). You can download the new release from my fast-dm web page.

2007-11-21

I had missed this until now: in Leeds a man created a terrorist threat by falling into a coma on a bus. Most exciting!

2007-11-13

If *M*'s a complete metric space

(and non-empty), it's always the case:

If *f*'s a contraction

Then, under its action,

Exactly one point stays in place.
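
The limerick is, of course, Banach's fixed-point theorem. A small numerical illustration (my own example, not part of the poem): cos is a contraction on [0, 1], so iterating it converges to its unique fixed point from any starting value:

```python
import math

def iterate_to_fixed_point(f, x, steps=100):
    """Apply f repeatedly; for a contraction on a complete metric
    space, Banach's theorem guarantees convergence to the unique
    fixed point, independent of the starting value x."""
    for _ in range(steps):
        x = f(x)
    return x

x_star = iterate_to_fixed_point(math.cos, 0.0)
# x_star is close to 0.739085..., the unique solution of cos(x) = x
```

Exactly one point stays in place, as promised.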

2007-11-12

Today I completed another article, which will eventually appear in the conference proceedings for Heinrich v. Weizsäcker's birthday conference:

- M. Hairer, A.M. Stuart and J. Voss:
Sampling Conditioned Diffusions.
Pages 159–186 in Trends in Stochastic Analysis,
Cambridge University Press,
vol. 353 of London Mathematical Society Lecture Note Series, 2009.

link, preprint:pdf, more…

This is a (hopefully) very nice and readable summary of our project to use SPDEs to construct MCMC methods in infinite dimensional spaces.

2007-11-11

Yesterday I learned of the existence of flash cookies (aka local shared objects) which somehow had evaded my attention until now. Flash cookies seem to be very similar to the usual HTTP cookies, except that they can store more information (up to 100kb by default, instead of 4kb for HTTP cookies).

Flash cookies are independent of the browser's normal cookie settings and are not blocked by tools like CookieSafe. It seems that the only way to view flash cookies is by visiting Adobe's Settings Manager web page.

2007-11-04

Today I spent some time looking through my spam folder. Since the beginning of this year I have received 47587 spam messages but only 38281 ham (i.e. good) messages. This means that 55% of my mails are now spam.

Not surprisingly, the spam messages came from a much wider set of hosts than the ham messages: 43673 hosts sent only spam, 3034 hosts sent only ham, and 145 hosts sent both types of messages. This effect can also be seen in my old map of the internet.

The worst offender is the host with the IP address 80.253.80.28. It sent 49 spam emails to me this year, the first one on 2007-02-23 ("Bitte Geld abholen !!", i.e. "please collect money!!"), the last one so far on 2007-11-02 ("Bitte zurück Rufen", i.e. "please call back"). It is sad that this spammer manages to use the same address for ten months in a row for what is, in my eyes, a criminal activity without being stopped.

2007-10-23

I started reading Ulrich Drepper's excellent series What Every Programmer Should Know About Memory. It is published by Linux Weekly News in weekly installments. Until now, the following parts have appeared:

- Commodity Hardware Today
- CPU Caches
- Virtual Memory
- NUMA Support
- What Programmers Can Do — Cache Optimization
- What Programmers Can Do — Multi-Threaded Optimizations
- Memory Performance Tools
- Future Technologies
- Appendices and Bibliography

The author shows, for example, how the naive matrix-matrix multiplication

    for (i = 0; i < N; ++i)
        for (j = 0; j < N; ++j)
            for (k = 0; k < N; ++k)
                res[i][j] += A[i][k] * B[k][j];

can be improved to run 10 times faster for big matrices!
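
The biggest single win comes from loop blocking (tiling): the matrices are processed in small square tiles that fit into the CPU cache, so each loaded element is reused many times before it is evicted. The access pattern can be sketched in Python; this only shows that the reordering computes the same result, the actual speedup needs C and cache-sized tiles:

```python
def matmul_blocked(A, B, n, bs=2):
    """Blocked matrix multiplication: same arithmetic as the naive
    triple loop, but performed tile by tile so that, in C, each
    bs-by-bs tile of B stays in cache while it is reused."""
    res = [[0] * n for _ in range(n)]
    for ii in range(0, n, bs):                  # loop over tiles
        for kk in range(0, n, bs):
            for jj in range(0, n, bs):
                for i in range(ii, min(ii + bs, n)):   # inside a tile
                    for k in range(kk, min(kk + bs, n)):
                        a = A[i][k]             # reused across the j loop
                        for j in range(jj, min(jj + bs, n)):
                            res[i][j] += a * B[k][j]
    return res
```

The sums are merely reordered, so the result is identical to the naive version; the gain in C comes entirely from the friendlier memory-access pattern.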

2007-10-17

Recently I learned how to compute the distribution of the number
*N* of visits in zero of a one-dimensional, symmetric, simple random
walk (*X*_{n}), conditioned on
*X*_{2n}=0. There are two ways to compute this.

Plan A. Using the nice equality

*P*(*X*_{1} ≠ 0, …, *X*_{2n} ≠ 0) = *P*(*X*_{2n} = 0),
which I learned long ago as a student, we can see that there is a
bijection between all bridge paths and all paths which don't hit zero.
(The equality is proved by constructing a bijection between the
complements of these sets, using the reflection principle.) We can apply
this to bridge paths with at least *k* zeros, mapping the path
segment starting at the *k*th zero to a path which does not hit zero.
This results in a bijection between all bridge paths having at least
*k* zeros and all paths having exactly *k* zeros. If you happen
to know this number, you are done. Otherwise you can use

Plan B. Consider the following map: for each bridge path with at least *k* zeros, flip the last *k* excursions so that they are all positive and then remove the last step of each of these excursions. This procedure defines a map from paths of length 2*n* with at least *k* zeros which end at zero to paths of length 2*n*−*k* (we removed *k* steps) which end at level *k* (all removed steps were downward). It transpires that this map is surjective and every image has exactly 2^{k} pre-images. Thus the number of bridge paths with at least *k* zeros is

2^{k} C(2*n*−*k*, *n*),

where C(·, ·) denotes a binomial coefficient; dividing by the total number C(2*n*, *n*) of bridge paths gives the required probability

*P*(*N* ≥ *k* | *X*_{2n} = 0) = 2^{k} C(2*n*−*k*, *n*) / C(2*n*, *n*).
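
The counting above is easy to sanity-check by brute-force enumeration of all bridge paths for small *n* (a quick check, not part of the argument):

```python
from itertools import product
from math import comb

def bridge_zero_counts(n):
    """For every path of length 2n with X_{2n} = 0, count its
    visits to zero (at times 2, 4, ..., 2n)."""
    counts = []
    for steps in product((1, -1), repeat=2 * n):
        if sum(steps) != 0:
            continue                    # not a bridge path
        pos, zeros = 0, 0
        for s in steps:
            pos += s
            zeros += (pos == 0)
        counts.append(zeros)
    return counts

n = 5
counts = bridge_zero_counts(n)
for k in range(1, n + 1):
    at_least_k = sum(1 for c in counts if c >= k)
    assert at_least_k == 2 ** k * comb(2 * n - k, n)
```

For *n* = 5 the enumeration confirms that the number of bridge paths with at least *k* zeros equals 2^{k} C(2*n*−*k*, *n*) for every *k*.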

2007-09-29

For my birthday I built myself a new kitchen shelf, starting with timber from Homebase. At the back it has specially crafted gaps so that the wires and pipes from my heating can pass through. Say hello to my new shelf!

2007-09-24

Another paper I wrote together with my brother Andreas has finally been accepted:

- A. Voss and J. Voss:
A Fast Numerical Algorithm for the Estimation of Diffusion-Model Parameters.
Journal of Mathematical Psychology, vol. 52, pp. 1–9, 2008.

DOI:10.1016/j.jmp.2007.09.005, preprint:pdf, more…

This paper describes the mathematical foundations which underlie our fast-dm program.

Older entries can be found on the next page …

Copyright © 2017, Jochen Voss. All content on this website (including text, pictures, and any other original works), unless otherwise noted, is licensed under a Creative Commons Attribution-Share Alike 3.0 License.