## 5.1 Project planning

Good programmers embarking on a complex project will rarely just start typing code. Instead, they will plan the steps needed to complete the task as efficiently as possible: “smart preparation minimizes work” (Berkun 2005). Although search engines are useful for identifying the appropriate strategy, the trial-and-error approach (typing code at random and Googling the inevitable error messages) is usually highly inefficient. Strategic thinking is necessary.

The best place to start is often with a pen and paper. Project planning is a non-linear, open-ended and creative process that is not always well-suited to the linear logic of computing. Planning simply involves thinking about the project’s aims in the context of available resources (e.g. computational resources and programming skill), the project’s scope, timelines and suitable software (i.e. R packages, covered in the next section). Minutes spent before a single line of code is written have the potential to save hours later on. There are many excellent guides available that will help you develop a project plan.

Once a project overview has been devised and stored, either in mind (for small projects, if you trust that as a storage medium!) or in writing, a plan with a timeline can be drawn up. An up-to-date visualisation of this plan can be a powerful reminder to yourself and collaborators of progress on the project so far. More importantly, the timeline provides an overview of what needs to be done next. Setting start dates and deadlines for each task will help prioritise the work and ensure you are on track. Breaking a large project into smaller chunks is highly recommended, making huge, complex tasks more achievable and modular (PMBoK 2000). ‘Chunking’ the work will also make collaboration easier, as we shall see in Chapter 5.

The tasks that a project should be split into will depend on the nature of the work. The phases illustrated in Figure 5.1 represent a rough starting point, not a template, and the ‘programming’ phase will usually need to be split into at least ‘data tidying’, ‘processing’, and ‘visualisation’.
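As a rough illustration of tracking start dates and deadlines, the base R sketch below stores a handful of tasks in a data frame and works out how many days remain for each. The task names, dates, durations and the fixed reference date are all invented for the example; in practice `Sys.Date()` would probably replace the hard-coded `today`.

```r
# Hypothetical tasks with start dates and durations (in days)
plan <- data.frame(
  task     = c("Data tidying", "Processing", "Visualisation"),
  start    = as.Date(c("2016-01-11", "2016-01-25", "2016-02-15")),
  duration = c(14, 21, 10),
  stringsAsFactors = FALSE
)

# Deadline = start date + duration
plan$deadline <- plan$start + plan$duration

# Days left relative to a fixed (assumed) reference date
today <- as.Date("2016-02-01")
plan$days_left <- as.numeric(plan$deadline - today)

# Show the most urgent tasks first; negative days_left means overdue
plan[order(plan$deadline), ]
```

A table like this is no substitute for a proper plan, but it can give a quick view of which chunk of work needs attention next.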

A more rigorous (but potentially onerous) way to plan a project is to divide the work into a series of objectives and track their progress throughout the project’s duration. One way to check whether an objective is appropriate for action and review is to use the SMART criteria:

• Specific: is the objective clearly defined and self-contained?
• Measurable: is there a clear indication of its completion?
• Attainable: can the target be achieved?
• Realistic: have sufficient resources been allocated to the task?
• Time-bound: is there an associated completion date or milestone?

If the answer to each of these questions is ‘yes’, the task is likely to be suitable to include in the project’s plan. Note that this does not mean all project plans need to be uniform. A project plan can take many forms, including a short document, a Gantt chart (see Figure 5.2) or simply a clear vision of the project’s steps in mind.
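The SMART check can also be written down as a small table. The sketch below, in base R, records a yes/no answer to each of the five questions for a few objectives and flags those that pass all of them. The objectives and their answers are hypothetical and only illustrate the idea.

```r
# Hypothetical objectives scored against the five SMART criteria
objectives <- data.frame(
  objective  = c("Tidy survey data", "Improve the model", "Draft summary plots"),
  specific   = c(TRUE, FALSE, TRUE),
  measurable = c(TRUE, FALSE, TRUE),
  attainable = c(TRUE, TRUE, TRUE),
  realistic  = c(TRUE, TRUE, TRUE),
  time_bound = c(TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

criteria <- c("specific", "measurable", "attainable", "realistic", "time_bound")

# An objective passes only if the answer to every question is 'yes'
objectives$smart <- rowSums(objectives[criteria]) == length(criteria)

objectives[c("objective", "smart")]
```

Here only the first objective meets every criterion; the other two would need a clearer definition or a completion date before going into the plan.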

A number of R packages can assist with this process of formalising and visualising the project plan, including:

• plan provides basic tools to create burndown charts (which concisely show whether a project is on-time or not) and Gantt charts.

• plotrix, a general purpose plotting package, provides basic Gantt chart plotting functionality. See example(gantt.chart) for details.

• DiagrammeR, a new package for creating network graphs and other schematic diagrams in R, provides an R interface to simple flow-chart file formats such as mermaid and Graphviz.

The small example below (which provides the basis for creating charts like Figure 5.2) illustrates how DiagrammeR can take simple text inputs to create informative, up-to-date Gantt charts. Such charts can greatly help with the planning and task management of long and complex R projects, as long as they do not take away valuable programming time from core project objectives.

```r
library("DiagrammeR") # load the necessary package

# define the Gantt chart and plot the result (not shown)
mermaid("gantt
Section Initiation
Planning           :a1, 2016-01-01, 10d
Data processing    :after a1  , 30d")
```

In the above code, gantt defines the subsequent data layout. Section refers to the project’s section (useful for large projects, with milestones) and each new line refers to a discrete task. Planning, for example, has the code a1, begins on the first day of 2016 and lasts for 10 days. Data processing starts once a1 has finished and lasts for 30 days. See knsv.github.io/mermaid/gantt.html for more detailed documentation.
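For projects with many tasks it may be more convenient to generate this text than to type it by hand. As a minimal sketch, the base R code below assembles a mermaid Gantt definition from a small data frame. The data frame, its column names and the second task id (a2) are assumptions made for illustration; only the layout of each task line follows the example above.

```r
# Hypothetical task table: one row per task
tasks <- data.frame(
  name  = c("Planning", "Data processing"),
  id    = c("a1", "a2"),
  start = c("2016-01-01", "after a1"),  # a date, or 'after <id>'
  days  = c(10L, 30L),
  stringsAsFactors = FALSE
)

# One line per task, in the form "<name> :<id>, <start>, <n>d"
task_lines <- sprintf("%s :%s, %s, %dd",
                      tasks$name, tasks$id, tasks$start, tasks$days)

# Prepend the chart type and section heading, one item per line
gantt_spec <- paste(c("gantt", "Section Initiation", task_lines),
                    collapse = "\n")

cat(gantt_spec, "\n")
```

The resulting string could then be passed to mermaid() from DiagrammeR to draw the chart, so updating the plan becomes a matter of editing rows in the data frame.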

### 5.1.1 Exercises

1. What are the three most important work ‘chunks’ of your current R project?

2. What is the meaning of ‘SMART’ objectives?

3. Run the DiagrammeR code chunk in this section (the call to mermaid()) to see the output.

4. Bonus exercise: modify the code to create a very basic Gantt chart of an R project you are working on.