LaTeX: calling a macro for every line of input

Recently I spent some time working out how to execute a macro repeatedly, once for every line of text in a LaTeX environment. Since the solution is a bit tricky and I found it difficult to find answers on the web, here is a summary of what I learned.

Step 1. To prevent TeX from concatenating input lines before we get access to them, we can use the \obeylines macro, which makes the end-of-line character active. This allows us to define a macro whose argument is everything up to the end of the line:

\def\doline#1{line found: `#1'\par}
{\obeylines
\gdef\getline#1
{\doline{#1}}}

{\obeylines
\getline This is the \textbf{first} line.
This is the second line.}


Note that \obeylines must be in effect both while we define the \getline macro and while the text is scanned. The outer pairs of braces confine the effect of \obeylines. To make \getline visible outside these braces, we use \gdef to define the macro globally. The output of the above TeX code is

line found: ‘This is the first line.’
This is the second line.
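
The reason \obeylines has to be in effect during the definition is that it makes the end-of-line character (^^M) active, and it is this active character that delimits the argument of \getline. As a minimal sketch of the same idea, assuming we would rather not rely on a literal line break inside the definition, we can change the category code by hand and write the delimiter explicitly as ^^M. The lines inside the group then end with %, so that no stray active line ends get picked up:

\begingroup
\catcode`\^^M=\active %
\gdef\getline#1^^M{\doline{#1}}%
\endgroup

This should define the same \getline as above. The text being scanned still needs \obeylines, so that its line ends are active as well.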


Step 2. We would like the macro to be called for every line instead of just the first one. This can be achieved by putting a call to the macro itself (without arguments) at the end of its replacement text; the looping version below is called \getlines. The only complication is that we need to stop this recursion once the end of the region of interest has been reached:

\def\marker{END}
{\obeylines
\gdef\getlines#1
{\def\text{#1}%
\ifx\text\marker \let\next\empty
\else \doline{#1}\let\next\getlines \fi
\next}}

{\obeylines
\getlines This is the \textbf{first} line.
This is the second line.
END
}


The \ifx command is used to test whether the matched line is the same as \marker. Until we find the marker, we put a new call to \getlines at the end of the \getlines replacement text, thereby looping over all lines until the marker is found. For this to work, END needs to be on a line of its own; therefore we have to move the closing brace down one line. The TeX code above leads to the following output.

line found: ‘This is the first line.’
line found: ‘This is the second line.’
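
Since \getlines hands every line to \doline, changing what happens per line only requires a different \doline. As a small illustration (a sketch, not part of the code above: the counter name lineno is made up for this example), the following complete document should prefix each line with a running number:

\documentclass{article}
% lineno is a hypothetical counter name chosen for this sketch
\newcounter{lineno}
\def\doline#1{\stepcounter{lineno}\thelineno: #1\par}
\def\marker{END}
{\obeylines
\gdef\getlines#1
{\def\text{#1}%
\ifx\text\marker \let\next\empty
\else \doline{#1}\let\next\getlines \fi
\next}}
\begin{document}
{\obeylines
\getlines This is the \textbf{first} line.
This is the second line.
END
}
\end{document}

Because \stepcounter acts globally, the numbering carries on if \getlines is used a second time in the same document.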


Step 3. We are now ready to pack the commands from above into the definition of a LaTeX environment:

\def\marker{\end{dolines}}
{\obeylines
\gdef\getlines#1
{\def\text{#1}%
\ifx\text\marker \let\next\text
\else \doline{#1}\let\next\getlines \fi
\next}}
\newenvironment{dolines}{\begingroup\obeylines\getlines}%
{\endgroup}

\begin{dolines}
This is the \textbf{first} line.
This is the second line.
\end{dolines}


Here we had to change the definition of \getlines so that \end{dolines} is included in the replacement text when the \ifx test is true; otherwise it would have been swallowed by \getlines. The resulting output is, a bit surprisingly, as follows:

line found: ‘’
line found: ‘This is the first line.’
line found: ‘This is the second line.’


The extra empty line at the beginning is caused by the \getlines macro matching the (empty) text between \begin{dolines} and the end of the line. The \end{dolines} must be on a line of its own for this environment to work.
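
If the empty first match is unwanted, one possible way around it is to let a separate macro consume the rest of the \begin{dolines} line before the loop starts. The following complete document is a sketch of that idea; \skipfirst is a made-up helper name and not part of the code above:

\documentclass{article}
\def\doline#1{line found: `#1'\par}
\def\marker{\end{dolines}}
% \skipfirst (hypothetical helper) discards everything up to the
% first line end and only then starts the \getlines loop
{\obeylines
\gdef\getlines#1
{\def\text{#1}%
\ifx\text\marker \let\next\text
\else \doline{#1}\let\next\getlines \fi
\next}%
\gdef\skipfirst#1
{\getlines}}
\newenvironment{dolines}{\begingroup\obeylines\skipfirst}%
{\endgroup}
\begin{document}
\begin{dolines}
This is the \textbf{first} line.
This is the second line.
\end{dolines}
\end{document}

With this change only the two text lines should be passed to \doline. The price is that any text following \begin{dolines} on the same line is silently dropped.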

Step 4. If we want to prevent processing of the TeX commands inside the dolines environment, we can do so by changing the category codes of special characters like \ to 12 (other). A complication with this plan is that \ifx also compares category codes. Therefore, when we try to detect the end of our environment while special characters are switched off, we need to define \marker as the string \end{dolines}, but with the category codes of all special characters (the backslash and the curly braces) set to 12. This can be done by using the following, somewhat contorted sequence of commands:

\begingroup
\catcode`|=0 \catcode`[=1 \catcode`]=2
\catcode`\{=12 \catcode`\}=12 \catcode`\\=12
|gdef|marker[\end{dolines}]
|endgroup
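
To see the \ifx behaviour in isolation, here is a minimal sketch with the made-up macro names \first and \second. Both macros expand to the single character z, but the two copies of z carry different category codes. The fragment is meant to be placed in the body of a document:

\def\first{z}% z with its usual category code 11 (letter)
\begingroup
\catcode`\z=12 % inside this group, z is an "other" character
\gdef\second{z}% this z is stored with category code 12
\endgroup
\ifx\first\second equal\else different\fi

This typesets "different": the two replacement texts look the same on paper, but \ifx compares tokens, and a token consists of a character code together with a category code. The same mismatch would occur between a \marker defined with normal category codes and a line read while the special characters are switched off.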


Two more changes are required to make this work: first, we need to disable all special characters at the beginning of the environment:

\let\do=\@makeother\dospecials


If you want to use this command outside a style file, you will need to turn @ into a letter by bracketing your definitions with

\makeatletter
...
\makeatother


Secondly, inside \getlines, the expansion of \text still has the category codes as found in the input. Therefore, if the input was read with special characters switched off, we cannot write \let\next\text to get the closing \end{dolines} (since the category codes of the backslash and the curly braces would be wrong). Instead, we need to use a definition like \def\next{\end{dolines}}.

Example. Using the techniques explained above, we can define a replacement for the LaTeX comment environment (contained in the verbatim package) as follows:

\documentclass{article}

\makeatletter

\begingroup
\catcode`|=0 \catcode`[=1 \catcode`]=2
\catcode`\{=12 \catcode`\}=12 \catcode`\\=12
|gdef|marker[\end{comment}]
|endgroup

{\obeylines
\gdef\getlines#1
{\def\text{#1}%
\ifx\text\marker \def\next{\end{comment}}
\else \let\next\getlines \fi
\next}}
\newenvironment{comment}{\begingroup
\let\do=\@makeother\dospecials \obeylines\getlines}%
{\endgroup}

\makeatother

\begin{document}

\begin{comment}
Everything here will be ignored.
\Invalid LaTeX code and incomplete constructs like
\begin{itemize}
without the closing end are no problem.
\end{comment}

\end{document}

This is an excerpt from Jochen's blog.