Sometimes when I'm learning I like to print source code on paper, because I find it easier to read and nicer to annotate. I had a rummage online to see if anyone had come up with a nice way to generate PDFs of source code, and built on what I found to produce this Bash script. The script searches for source code in the current directory and its subdirectories, and uses the typesetting software LaTeX to create a PDF of the contents with syntax highlighting.
Summary and Defaults
The script is interactive, and will prompt you on the following points:
- Choice of document title (for the front page). The default is to use the name of the current directory.
- Choice of file extensions (defaults to .h, .cpp and .qml files). This is useful for omitting boilerplate stuff you're not interested in like Makefiles, and also helps reduce the chances of accidentally including binary files (which would cause LaTeX to throw a load of errors). You should provide a list of one or more extensions, with each item in the list separated by a single space.
- Choice of whether to place .h files in front of .cpp files (defaults to yes). The source files are sorted alphabetically before inclusion into the PDF, which means that without this option header files appear after cpp files, which is inconvenient.
- Option to review the files that will be included, and exit cleanly if there is a problem with the list.
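To make the extension prompt more concrete, here is a minimal sketch of how a space-separated list such as "h cpp qml" could be turned into the kind of regular expression that find's -regex option accepts. The function name ext_regex is hypothetical, and the sketch uses sed rather than the Bash substring replacement found in the script itself. It assumes GNU find, whose default regex syntax treats \| as alternation.

```shell
# Hypothetical helper: turn "h cpp qml" into .*\.\(h\|cpp\|qml\)
ext_regex() {
    # Replace each run of spaces with an escaped pipe (\|).
    alts=$(printf '%s' "$1" | sed 's/  */\\|/g')
    # Wrap the alternatives so the regex matches a whole path ending in one of them.
    printf '.*\\.\\(%s\\)\n' "$alts"
}

ext_regex "h cpp qml"
```

The result can be passed straight to find, for example find . -type f -regex "$(ext_regex 'h cpp qml')".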
The script then creates a LaTeX source file: it writes a header (which determines the document's formatting) into a temporary .tex file, followed by the contents of each source file in its own section, so that each file starts on a new page with a title. The source code itself is included as LaTeX listings, which enable syntax highlighting.
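As a rough illustration of that per-file step, the sketch below prints the three LaTeX lines that get appended for each source file. The function name tex_section is hypothetical. The underscore escaping is an addition that is not part of the script; it is included because a bare underscore in a \section title makes LaTeX report an error. Other special characters (such as & or #) are not handled here.

```shell
# Hypothetical helper: print the LaTeX lines for one source file.
tex_section() {
    # Escape underscores for the section title only (an addition, not in the script).
    section_title=$(printf '%s' "$1" | sed 's/_/\\_/g')
    printf '%s\n' '\newpage'                                 # start on a new page
    printf '\\section{%s}\n' "$section_title"                # title for this file
    printf '\\lstinputlisting[style=customasm]{%s}\n' "$1"   # the highlighted source
}

tex_section "src/main_window.cpp"
```

Inside a loop over the file list, the output would be appended to the temporary file, for example tex_section "$i" >> "$tex_file".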
Installation
If you don't have LaTeX installed, you need to install it first. On Debian derivatives such as Ubuntu or Raspbian:
sudo apt-get update
sudo apt-get install texlive-latex-base texlive-latex-extra
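Before installing anything, it may be worth checking whether pdflatex is already available. A small sketch using the standard command -v lookup (the helper name have_cmd is hypothetical):

```shell
# Hypothetical helper: succeed if the named command can be found on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd pdflatex; then
    echo "pdflatex is installed"
else
    echo "pdflatex not found: install LaTeX first"
fi
```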
You can choose any name you like for the script, but I named it src2pdf. If you don't have a bin subdirectory for scripts in your home directory, create one:
mkdir ~/bin
Create a file and copy and paste the source (CTRL+SHIFT+V to paste in most terminal emulators):
nano ~/bin/src2pdf
Make the script executable:
chmod +x ~/bin/src2pdf
If you want to add the bin directory to your path (so you can type src2pdf instead of ~/bin/src2pdf), append this line to your ~/.bashrc:
PATH=$PATH:~/bin
And reload the settings:
source ~/.bashrc
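If ~/.bashrc is sourced more than once, the PATH line above appends ~/bin again each time. One possible guard, sketched here with a hypothetical add_to_path function, only appends the directory when it is not already listed:

```shell
# Hypothetical helper: append a directory to PATH only if it is not already there.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$PATH:$1" ;;        # append once
    esac
}

add_to_path "$HOME/bin"
```

Wrapping PATH in colons before matching means the first and last entries are compared in the same way as the ones in the middle.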
The Script
To run the script, just change directory into the folder where the source code is found and then run the src2pdf command.
#!/usr/bin/env bash

# CREATE PDF FROM SOURCE CODE
# original source https://superuser.com/a/601412/151431
# source code file names must not contain spaces

read -p "Please type the document title (blank to use ${PWD##*/}) : " answer
if [[ $answer == "" ]]; then
    title=${PWD##*/}
else
    title=$answer
fi

# if output files already exist, delete them
if [ -f ./tmp.aux ] || [ -f ./tmp.log ] || [ -f ./tmp.out ] || [ -f ./tmp.pdf ] || [ -f ./tmp.toc ] ; then
    echo "Removing old output files..."
    rm ./tmp.*
fi

tex_file=$(mktemp) ## Random temp file name
if [ $? -ne 0 ]; then
    echo "ERROR: failed to create temporary file"
    exit 1;
fi

# DOCUMENT HEADER
cat<<EOF >$tex_file   ## Print the tex file header
\batchmode
\documentclass[titlepage,twoside]{article}
%\usepackage{showframe}
%\usepackage[inner=2cm,outer=4cm]{geometry}
%\usepackage[]{geometry}
\usepackage[inner=2.5cm,outer=2.5cm,bottom=2.5cm]{geometry}
\usepackage{listings}
\usepackage[usenames,dvipsnames]{color}  %% Allow color names
\lstdefinestyle{customasm}{
  belowcaptionskip=1\baselineskip,
  xleftmargin=\parindent,
  language=C++,   %% Change this to whatever you write in
  breaklines=true, %% Wrap long lines
  basicstyle=\footnotesize\ttfamily,
  commentstyle=\itshape\color{Gray},
  stringstyle=\color{Black},
  keywordstyle=\bfseries\color{OliveGreen},
  identifierstyle=\color{blue},
  %xleftmargin=-8em,
}
\usepackage[colorlinks=true,linkcolor=blue]{hyperref}
\begin{document}
\title{$title}
\author{Sam Hobbs}
\maketitle
\pagenumbering{roman}
\tableofcontents
\newpage
\setcounter{page}{1}
\pagenumbering{arabic}
EOF

###############

# ask the user which file extensions to include
read -p "Provide a space separated list of extensions to include (default is 'h cpp qml') : " answer
if [[ $answer == "" ]]; then
    answer="h cpp qml"
fi

# replace spaces with double escaped pipe using substring replacement http://www.tldp.org/LDP/abs/html/parameter-substitution.html
extensions="${answer// /\\|}"

###############

# FINDING FILES TO INCLUDE
# inline comments http://stackoverflow.com/questions/2524367/inline-comments-for-bash#2524617
# not all of the conditions below are necessary now that the regex for c++ files has been added, but they don't harm

filesarray=(
$(
find . `# find files in the current directory` \
    -type f `# must be regular files` \
    -regex ".*\.\($extensions\)" `# only files with the chosen extensions (.h, .cpp and .qml) by default` \
    ! -regex ".*/\..*" `# exclude hidden directories - anything slash dot anything (Emacs regex on whole path https://www.emacswiki.org/emacs/RegularExpression)` \
    ! -name ".*" `# not hidden files` \
    ! -name "*~" `# don't include backup files` \
    ! -name 'src2pdf' `# not this file if it's in the current directory`
))

###############

# sort the array https://stackoverflow.com/questions/7442417/how-to-sort-an-array-in-bash#11789688
# internal field separator $IFS https://bash.cyberciti.biz/guide/$IFS
IFS=$'\n' filesarray=($(sort <<<"${filesarray[*]}"))
unset IFS

###############

read -p "Re-order files to place header files in front of cpp files? (y/n) : " answer
if [[ ! $answer == "n" ]] && [[ ! $answer == "N" ]] ; then
    echo "Re-ordering files..."

    # if this element is a .cpp file, check the next element to see if it is a matching .h file
    # if it is, swap the order of the two elements
    re="^(.*)\.cpp$"
    # this element is ${filesarray[$i]}, next element is ${filesarray[$i+1]}
    for (( i=0; i<=$(( ${#filesarray[@]} -1 )); i++ ))
    do
        # if the element is a .cpp file, check the next element to see if it is a matching .h file
        if [[ ${filesarray[$i]} =~ $re ]]; then
            header=${BASH_REMATCH[1]}
            header+=".h"
            if [[ ${filesarray[$i+1]} == $header ]]; then
                # replace the next element in the array with the current element
                filesarray[$i+1]=${filesarray[$i]}
                # replace the current element in the array with $header
                filesarray[$i]=$header
            fi
        fi
    done
fi

###############

# Change ./foo/bar.src to foo/bar.src
IFS=$'\n' filesarray=($(sed 's/^\..//' <<<"${filesarray[*]}"))
unset IFS

###############

read -p "Review files found? (y/n) : " answer
if [[ $answer == "y" ]] || [[ $answer == "Y" ]] ; then
    echo "The following files will be included in the document..."
    for i in "${filesarray[@]}"
    do
        echo $i
    done

    # allow the user to abort
    read -p "Proceed? (y/n) : " answer
    if [[ $answer == "n" ]] || [[ $answer == "N" ]] ; then
        exit 0
    fi
fi

###############

# create a .tex file with each section on its own page
echo "Creating tex file..."
for i in "${filesarray[@]}"
do
    echo "\newpage" >> $tex_file # start each section on a new page
    echo "\section{$i}" >> $tex_file # create a section for each source file
    echo "\lstinputlisting[style=customasm]{$i}" >>$tex_file # place the contents of each file in a listing
done
echo "\end{document}" >> $tex_file

###############

# run pdflatex twice to produce TOC
echo "Creating pdf..."
echo
pdflatex $tex_file -output-directory .
if [ $? -ne 0 ]; then
    echo "ERROR: pdflatex command failed on first run, refer to tmp.log for more information"
    exit 1;
fi
pdflatex $tex_file -output-directory .
if [ $? -ne 0 ]; then
    echo "ERROR: pdflatex command failed on second run, refer to tmp.log for more information"
    exit 1;
fi

###############

echo "Renaming output files..."
# quote the name so that titles containing spaces work
mv tmp.pdf "$title.pdf"
echo "Cleaning up..."
rm ./tmp.*
echo "Done, output file is $title.pdf in this directory"
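The header re-ordering step can also be tried on its own. The sketch below is a stand-alone approximation that reads file names on standard input and uses sort and awk instead of the script's Bash array loop. The function name headers_first is hypothetical, and LC_ALL=C is set only to make the sort order predictable.

```shell
# Hypothetical helper: sort names, then print foo.h before foo.cpp when the two are adjacent.
headers_first() {
    LC_ALL=C sort | awk '
        held != "" {
            # A .cpp file is waiting: is this line its matching header?
            if ($0 == (stem ".h")) { print $0; print held; held = ""; next }
            print held; held = ""
        }
        /\.cpp$/ { held = $0; stem = substr($0, 1, length($0) - 4); next }
        { print }
        END { if (held != "") print held }
    '
}

printf '%s\n' ./b.cpp ./a.h ./a.cpp ./c.qml | headers_first
```

As in the script, a header is only moved when it sorts directly after its .cpp file, so b.cpp (which has no header here) stays where it is.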
If you want to change the appearance of the document (e.g. modify the margin widths), you can modify the LaTeX header in the script. Questions/comments/improvements? Let me know in the comments.
Comments
Link
Hey there, I'm the author of the Super User answer you based the script on. First off, thank you for linking back to my answer, and I'm very glad you find it useful! However, it would be better if you could link directly to the answer itself (https://superuser.com/a/601412/151431) instead of to the question the answer was posted on. That makes it easier to find and the link I gave you is also sure to work even if the question's title is changed.
Thanks for that, out of
Someone left me a comment under that answer alerting me to this post. They hadn't seen the link so they thought you may have been doing something wrong (which you absolutely are not). And the URLs should work actually, yes, even if the title changes, it's just better to link directly to the answer itself instead of to the question. But really not a big deal either way. I just thought I'd mention it since I saw this post.
Ah, that makes sense. Well
Like
Hey Sam. You need a Thumbs Up / Like engine on your pages. You have some great content on your site!
Cheers. :)
src2pdf
Nice script, thank you. I added this so I can also put LaTeX comments within my C++ source code comments:
\lstset{
escapechar=@,showstringspaces=false
}
i.e. in the cpp file:
/**@ Write a program that ... \sqrt{x} @*/