Tag Archives: writing

Convert LaTeX to Word

A big problem with writing in LaTeX is collaborating with colleagues who don’t use it. One option is to generate a Word .docx version and use the comments and track changes features in Word / LibreOffice. This does require manually copying the changes back to LaTeX, so it isn’t quite as nice as using latexdiff (see earlier post), but it is slightly easier than adding comments to a PDF.

The best program I’ve found for converting LaTeX to Word is the open source (GPL) command line tool, pandoc (http://pandoc.org/).

Basic usage is quite straightforward:

pandoc latex_document.tex -o latex_document_word_version.docx

The conversion isn’t perfect: figures and tables can get a bit mangled, but it does a good job with the text.

Pandoc can convert between many different formats, including from Markdown and reStructuredText (commonly used for software documentation), so it is worth having installed.
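As a rough sketch of converting other formats the same way, the helper below runs pandoc on each file it is given and writes a .docx next to it. The function name to_docx, the PANDOC variable and the file names are my own inventions for illustration; only the `pandoc input -o output` form comes from the usage above, and pandoc works out the formats from the file extensions.

```shell
#!/bin/sh
# to_docx: hypothetical helper, not part of pandoc.
# Converts each input file to a .docx with the same base name.
# PANDOC can be set to another command (e.g. echo) for a dry run.
to_docx() {
    for src in "$@"; do
        docx="${src%.*}.docx"    # swap the last extension for .docx
        "${PANDOC:-pandoc}" "$src" -o "$docx" || return 1
    done
}

# Example call (hypothetical file names):
# to_docx notes.md README.rst
```

Setting PANDOC=echo before calling the function prints the commands instead of running them, which is a cheap way to check the output names first.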

Tracking changes in a LaTeX document

One of the problems people often have with using LaTeX for collaborative writing is that it is difficult to track changes in a document the way you can in Word.

As .tex files are text documents, version control such as Git or Mercurial can be used to keep track of changes in the LaTeX source. However, looking at differences in .tex files is not as easy as having a formatted copy of the document with the changes marked, particularly for people not used to LaTeX.

The latexdiff command can be used to create a LaTeX document with the changes marked, from which a PDF can be created showing where the text has changed. For example:

[Image: example latexdiff output]

Basic usage is:

latexdiff originaldoc.tex changeddoc.tex > changes.tex

By default the command will print everything to the terminal so the output needs to be redirected (using >) to a file.
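If the .tex file is already tracked in Git, the older version does not need to be kept around by hand. The sketch below pulls an earlier revision out of the repository and feeds it to latexdiff. The function name diff_against_rev and the output file names are hypothetical, and it assumes you run it from the directory containing the .tex file.

```shell
#!/bin/sh
# diff_against_rev: hypothetical helper, not part of latexdiff or Git.
#   $1 = any Git revision (e.g. HEAD~1 or a tag name)
#   $2 = a tracked .tex file in the current directory
diff_against_rev() {
    rev="$1"
    file="$2"
    base="${file%.tex}"

    # Write the old version of the file to <base>_old.tex
    git show "${rev}:./${file}" > "${base}_old.tex" || return 1

    # Mark up the differences between the old and current versions
    latexdiff "${base}_old.tex" "$file" > "${base}_changes.tex"
}

# Example call (hypothetical file name):
# diff_against_rev HEAD~1 paper.tex
```

The resulting paper_changes.tex can then be compiled to a PDF in the usual way.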

When writing papers we commonly use a shell script to generate the changes .tex file, create a PDF and remove the temporary files:

# Create diff file
latexdiff --exclude-textcmd "section,subsection,subsubsection" \
   201403_GOBIA_RSGISLib_RIOS_origional.tex \
   201403_GOBIA_RSGISLib_RIOS.tex > \
   201403_GOBIA_RSGISLib_RIOS_changes_temp.tex

# Create PDF
pdflatex 201403_GOBIA_RSGISLib_RIOS_changes_temp.tex
bibtex 201403_GOBIA_RSGISLib_RIOS_changes_temp
pdflatex 201403_GOBIA_RSGISLib_RIOS_changes_temp.tex
pdflatex 201403_GOBIA_RSGISLib_RIOS_changes_temp.tex

# Rename PDF and remove temporary files.
mv 201403_GOBIA_RSGISLib_RIOS_changes_temp.pdf 201403_GOBIA_RSGISLib_RIOS_changes.pdf
rm 201403_GOBIA_RSGISLib_RIOS_changes_temp.*
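
To avoid repeating the long file names, the same steps could be wrapped in a function that takes the file names as arguments. This is only a sketch: the function name make_changes_pdf and its three arguments are my own, and it assumes latexdiff, pdflatex and bibtex are on the path.

```shell
#!/bin/sh
# make_changes_pdf: hypothetical wrapper around the steps above.
#   $1 = original .tex file
#   $2 = changed .tex file
#   $3 = base name for the final PDF (no extension)
make_changes_pdf() {
    old="$1"
    new="$2"
    out="$3"
    tmp="${out}_temp"

    # Create diff file
    latexdiff --exclude-textcmd "section,subsection,subsubsection" \
        "$old" "$new" > "${tmp}.tex" || return 1

    # Create PDF (bibtex may complain if there are no citations, so
    # its exit status is not treated as fatal here)
    pdflatex "${tmp}.tex" || return 1
    bibtex "$tmp"
    pdflatex "${tmp}.tex" || return 1
    pdflatex "${tmp}.tex" || return 1

    # Rename PDF, then remove the remaining temporary files
    mv "${tmp}.pdf" "${out}.pdf" && rm -f "${tmp}".*
}

# Example call (hypothetical file names):
# make_changes_pdf paper_original.tex paper.tex paper_changes
```

Because the temporary files all share the `_temp` suffix, the final rm only touches files this function created and leaves the two input files alone.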

Note that changes inside some LaTeX commands can cause problems when creating the PDF, producing an error along these lines:

! Argument of \UL@word has an extra }.
    <inserted text>
                \par 
l.292 ...\DIFaddbegin \DIFadd{Methods}\DIFaddend }
                                                  \label{sec:method}

To stop this happening you can exclude particular commands (e.g., section) using the ‘--exclude-textcmd’ flag. (Thanks to this post on Stack Exchange for the tip.)

(slightly) Easier writing in Word

Writing is a part of any research position and it’s likely at least some of that writing will be in Microsoft Word. Without wanting to start a LaTeX vs. Word debate (my preference is LaTeX), there will be times when you won’t have the choice and will have to use Word. Here are two tips I recently learnt that might make your life a little easier.

Modify the styles so they look good and use them

In Word 2011 for Mac:

  1. Select Format > Style.
  2. Select list ‘All styles’.
  3. Modify ‘Normal’, ‘Heading 1’, ‘Heading 2’, ‘Heading 3’ and ‘Caption’ (you can modify more if necessary but these should be sufficient for now).
  4. Whenever you have a heading or subheading, apply the relevant style (they are available in the ribbon) – this way Word knows the document structure (like when using LaTeX tags).
  5. You can generate a table of contents using Insert > Index and Tables…

If someone’s already made the effort to modify the styles, make sure you use them.

Use captions and cross references

  1. Right click on a figure or table and select ‘Insert caption’.
  2. Type in your caption (or just hit OK and type in the main document). If the formatting isn’t to your liking, change the style for ‘Caption’ (don’t just reformat the one caption).
  3. When you need to reference the figure / table select Insert > Cross Reference.
  4. Select the reference type and select Insert reference to ‘Only label and number’.

As a side note, for colleagues using Linux (or who don’t have Office), sending round a PDF with the Word document is helpful to show how it should be formatted. If you’re editing in LibreOffice / OpenOffice, despite a lot of work on improving compatibility, chances are not all the formatting will be preserved. My advice is to make a copy and only edit the text, making sure you track the changes (Changes > Record in LibreOffice), and let your colleagues know you didn’t use Word. Any figures can be sent separately. If you’re using a slightly different version of Word, editing a copy is good practice anyway. LaTeX, by contrast, is plain text, so you can edit it in whatever program you want. However, I did say I didn’t want to start a LaTeX vs. Word debate…