Logging

Logging, like all of the topics discussed, is a large discussion. To begin with, one can legitimately ask why one has to log anything at all? Logging can be used to your advantage in many ways. In the previous section we discussed error handling. When a program encounters a problem that prevents further execution it will stop with an error. It is common sense to capture that error message in order to investigate the source after the fact. Another prominent use case, especially in the context of data analysis, is the use of logs to collect, transport and even store data. Good code needs a careful consideration of how to use logging.

In general

Each log message has an associated log level.

Level	Meaning
DEBUG	Very verbose - useful when developing a program
INFO	Confirms that something is working as expected
WARNING	Program is still running, but something unexpected happened or might soon happen
ERROR	A problem has caused a failure in the code execution
CRITICAL	The program has crashed completely

When specifying log messages in your code, you should think about what level to set for the message. This will become less abstract in the examples below.

Python

Before you can start logging you have to create a logger. When you create a logger you will also configure the logger to output messages in a file, for example, at a certain level. You can do the configuration through code or through a configuration file.

Here's an example of setting up logging to the console in code.

import logging

def init_logging():
    logger.setLevel(logging.INFO)
    console_handler = logging.StreamHandler()
    console_handler.setLevel(logging.INFO)
    formatter = logging.Formatter("%(asctime)s: %(name)s: %(levelname)s - %(message)s")
    console_handler.setFormatter(formatter)
    logger.addHandler(console_handler)


logger = logging.getLogger("mylogger")
init_logging()

def do_a_thing():
    logger.info("Going to do something now")
    print "Hello there, now you are printing in between log statements"
    logger.info("Yay! This is fun!")

do_a_thing()

If you run put this in a script my_script.py and run it python my_script.py you should see something like:

2013-12-09 12:46:03,171: mylogger: INFO - Going to do something now
Hello there, now you are printing in between log statements
2013-12-09 12:46:03,171: mylogger: INFO - Yay! This is fun!

In practice you also want to configure a file handler so that log messages are sent to a file too. If you're doing this through code then just modify the init_logging function.

def init_logging():
    logger.setLevel(logging.INFO)
    console_handler = logging.StreamHandler()
    console_handler.setLevel(logging.INFO)
    file_handler = logging.FileHandler("the_log_file.log")
    file_handler.setLevel(logging.INFO)
    formatter = logging.Formatter("%(asctime)s: %(name)s: %(levelname)s - %(message)s")
    console_handler.setFormatter(formatter)
    file_handler.setFormatter(formatter)
    logger.addHandler(console_handler)
    logger.addHandler(file_handler)

If you rerun the code you will see that the print statement does not end up in the file. Doing all the config in code becomes a little clumsy and difficult to manage once you want different logging configurations in different execution environments. You might want to set the level to DEBUG on your own machine, but INFO in your staging and production environments. You might also want different names and locations for log files, or not log to a file at all in either one of your environments. Then it makes much more sense to have configuration files that specify your loggers.

Let's redo the second case where we log to the console and to a file - this time using a configuration file to set up the logger (you can skip ahead and read about configuration if you feel the need).

The configuration file:

version: 1
formatters:
  simple:
    format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
handlers:
  console:
    class: logging.StreamHandler
    level: INFO
    formatter: simple
    stream: ext://sys.stdout
  file:
    class: logging.FileHandler
    level: INFO
    formatter: simple
    filename: the_log_file.log
loggers:
  mylogger:
    level: INFO
    handlers: [console, file]
    propagate: no
root:
  level: INFO
  handlers: []
disable_existing_loggers: False

The corresponding python code:

import logging
import logging.config
import yaml

def init_logging_file():
    with open("logg_conf.yaml") as conf:
        log_config = yaml.load(conf)
    logging.config.dictConfig(log_config)

logger = logging.getLogger("mylogger")
init_logging_file()

def do_a_thing():
    logger.info("Going to do something now")
    print "Hello there, now you're printing in between log statements"
    logger.info("Yay! This is fun!")

do_a_thing()

If you save the config in a file called logg_conf.yaml and the python code in a file called logging_example.py - both in the same directory - and then do python logging_example.py you should see the exact same behaviour as in the previous example. Except this time the code is much cleaner and more maintainable.

The python standard library tutorial on logging is an excellent reference if you want more information.

R

The same general principles discussed above apply in the case of R. There are several choices for packages to do logging in R, but one of my favourites is rlogging. I like it for its simplicity.

R has three builtin commands called message(), warning() and stop() - each can be used to send output to the console. The rlogging package uses these existing functions to write pretty output to log files. message() is linked to INFO level. warning() is linked to both INFO and WARNING levels. And stop() is linked to INFO, WARNING and STOP levels.

Let's see it in action.

# load it 'silently'
suppressMessages(library(rlogging))
SetLogFile('mylogfile.log')

do_something <- function() {
  message('this will be in your log file :)')
  warning('I think something will soon go wrong...')
  stop('The end of the world is upon us - exiting.')
}

do_something()

If you put the code into a file called rlog.r and do Rscript --vanilla rlog.r, and then less mylogfile.log you should see something like this:

[2013-12-09 14:38:28] [INFO] this will be in your log file :)
[2013-12-09 14:38:28] [WARN] I think something will soon go wrong...
[2013-12-09 14:38:28] [STOP] The end of the world is upon us - exiting.

Plain and simple. Then it is just a matter of using the message(), warning() and stop() functions cleverly throughout your code. that will probably involve some trial and error.

Want to add to the wiki?

Please drop me a mail

Logging

In general

Python

R

Want to add to the wiki?

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Contents

Getting started

Software fundamentals

Working with data

Visualisation

Tools

Programming resources

Clone this wiki locally