Happy Structured Logging in R

構造化ログ入門.R

atusy

Atusy


Publication


Why logging?

To record or help identifying (potential) issues and enhance improvements.

The logger package helps you.


Why not print?

Avoid reinvention with print, unless intentionally.


Generate analyzable logs

{"time":"2024-12-03 22:38:17","level":"INFO","ns":"japanr2024","message":"Started session","session_id":"01JE6BWEWP0BAAFGPZTZ9GBNM6"}
{"time":"2024-12-03 22:38:17","level":"INFO","ns":"japanr2024","message":"Received request","session_id":"01JE6BWEWP0BAAFGPZTZ9GBNM6","trace_id":"01JE6BWEXF5F6HDD4YFZBP157F"}
{"time":"2024-12-03 22:38:17","level":"DEBUG","ns":"japanr2024","message":"Received request parameters","value":{"locations":"tokyo, london"},"session_id":"01JE6BWEWP0BAAFGPZTZ9GBNM6","trace_id":"01JE6BWEXF5F6HDD4YFZBP157F"}
{"time":"2024-12-03 22:38:17","level":"DEBUG","ns":"japanr2024","message":"Locations parsed","value":["tokyo","london"],"session_id":"01JE6BWEWP0BAAFGPZTZ9GBNM6","trace_id":"01JE6BWEXF5F6HDD4YFZBP157F"}
{"time":"2024-12-03 22:38:17","level":"DEBUG","ns":"japanr2024","message":"Processing location","value":"tokyo","span_id":"01JE6BWEXSW65DD1EC858XMZQ3","session_id":"01JE6BWEWP0BAAFGPZTZ9GBNM6","trace_id":"01JE6BWEXF5F6HDD4YFZBP157F"}
{"time":"2024-12-03 22:38:17","level":"DEBUG","ns":"japanr2024","message":"Plotting weather forecast","span_id":"01JE6BWEXSW65DD1EC858XMZQ3","session_id":"01JE6BWEWP0BAAFGPZTZ9GBNM6","trace_id":"01JE6BWEXF5F6HDD4YFZBP157F"}

R-equivalent of JSON Lines

Named list of lines.

list(time = "2024-12-03 22:38:17", level = "INFO", ns = "japanr2024", message = "Started session", session_id = "01JE6BWEWP0BAAFGPZTZ9GBNM6")
list(time = "2024-12-03 22:38:17", level = "INFO", ns = "japanr2024", message = "Received request", session_id = "01JE6BWEWP0BAAFGPZTZ9GBNM6", trace_id = "01JE6BWEXF5F6HDD4YFZBP157F")
list(time = "2024-12-03 22:38:17", level = "DEBUG", ns = "japanr2024", message = "Received request parameters", value = {locations = "tokyo, london"}, session_id = "01JE6BWEWP0BAAFGPZTZ9GBNM6", trace_id = "01JE6BWEXF5F6HDD4YFZBP157F")
list(time = "2024-12-03 22:38:17", level = "DEBUG", ns = "japanr2024", message = "Locations parsed", value = ["tokyo", "london"], session_id = "01JE6BWEWP0BAAFGPZTZ9GBNM6", trace_id = "01JE6BWEXF5F6HDD4YFZBP157F")
list(time = "2024-12-03 22:38:17", level = "DEBUG", ns = "japanr2024", message = "Processing location", value = "tokyo", span_id = "01JE6BWEXSW65DD1EC858XMZQ3", session_id = "01JE6BWEWP0BAAFGPZTZ9GBNM6", trace_id = "01JE6BWEXF5F6HDD4YFZBP157F")
list(time = "2024-12-03 22:38:17", level = "DEBUG", ns = "japanr2024", message = "Plotting weather forecast", span_id = "01JE6BWEXSW65DD1EC858XMZQ3", session_id = "01JE6BWEWP0BAAFGPZTZ9GBNM6", trace_id = "01JE6BWEXF5F6HDD4YFZBP157F")

Threshold by log level

logger::log_threshold("INFO") # DEBUG logs are ignored

logger::log_info(message = "Program started.")

params <- list(x = 1)
logger::log_debug(message = "Received parameters", value = params)

Control output destinations


The more about analyzable logs

Previous sections introduced some key features of logging.

Let’s dive into the content of logs.


What is the preferred format?

Typically JSON Lines.

{"time":"2024-12-04 21:07:28","level":"INFO","message":"Program started."}
{"time":"2024-12-04 21:07:28","level":"WARN","message":"Some warning."}

Then, packages parse them and outputs data frame (e.g., jsonlite, yyjsonr, duckdb, …).

time level message
2024-12-04 21:07:28 INFO Program started.
2024-12-04 21:07:28 WARN Some warning.

What about human-readable format?

The log requires self-maintained parsers…

2024-12-04 21:07:28 [INFO] Program started
2024-12-04 21:07:28 [WARN] Some warning.

Can we use both?

YES!

But you need advanced configurations.


How to design log schema?

key type desc example
time string timestamp 2024-12-03 22:38:17
level string log level INFO
ns string name of package, project, … japanr2024
message string static message Received request
value any value related to the message list(input = "tokyo")
session_id string unique per access 01JE6BWEWP0BAAFGPZTZ9GBNM6
trace_id string unique per request 01JE6BWEXF5F6HDD4YFZBP157F
span_id string unique per task 01JE6BWEXSW65DD1EC858XMZQ3
context list (Optional) additional context list(...)

Use static message and dynamic value

key type desc example
time string timestamp 2024-12-03 22:38:17
level string log level INFO
ns string name of package, project, … japanr2024
message string static message Received request
value any value related to the message list(input = "tokyo")
session_id string unique per access 01JE6BWEWP0BAAFGPZTZ9GBNM6
trace_id string unique per request 01JE6BWEXF5F6HDD4YFZBP157F
span_id string unique per task 01JE6BWEXSW65DD1EC858XMZQ3
context list (Optional) additional context list(...)

Use static message and dynamic value

# BAD: dynamic message
x <- 1
logger::log_info(message = glue::glue("x is {x}"))
{"time":"2024-12-07 14:08:20","level":"INFO","message":"x is 1"}
# GOOD: static message and dynamic value are tidyverse-friendly
logger::log_info(message = "x is defined", value = x)
{"time":"2024-12-07 14:08:20","level":"INFO","message":"x is defined","value":1}

Cautions for dynamic value


Add identifiers for traceability

key type desc example
time string timestamp 2024-12-03 22:38:17
level string log level INFO
ns string name of package, project, … japanr2024
message string static message Received request
value any value related to the message list(input = "tokyo")
session_id string unique per access 01JE6BWEWP0BAAFGPZTZ9GBNM6
trace_id string unique per request 01JE6BWEXF5F6HDD4YFZBP157F
span_id string unique per task 01JE6BWEXSW65DD1EC858XMZQ3
context list (Optional) additional context list(...)

Add identifiers for traceability

These keywords typically appears in observable engineering.

Identifiers are often generated by ULID or UUIDv4.


Add context for additional information

key type desc example
time string timestamp 2024-12-03 22:38:17
level string log level INFO
ns string name of package, project, … japanr2024
message string static message Received request
value any value related to the message list(input = "tokyo")
session_id string unique per access 01JE6BWEWP0BAAFGPZTZ9GBNM6
trace_id string unique per request 01JE6BWEXF5F6HDD4YFZBP157F
span_id string unique per task 01JE6BWEXSW65DD1EC858XMZQ3
context list (Optional) additional context list(...)

Add context for additional information

Optional data that helps analyzing the log.

For example, parent-child relationship between logs.

{
  ...,
  "span_id": "01JDNN0EK0715J048SHYW7XZ2V",
  "context": {
    "parent_span_id": "01JDNN0F76GR0398VJ4MMGBCY7",
  }
}

Recap of introduction to logging


How to log with logger?

# Configure
logger::log_appender(logger::appender_stdout)
logger::log_formatter(logger::formatter_json)
logger::log_layout(logger::layout_json_parser(c("time", "level", "ns")))
# Log
logger::log_info(message = "Program started.")
{"time":"2024-12-07 14:08:20","level":"INFO","ns":"global","message":"Program started."}

logger in package


Demo of logging in a Shiny app

https://github.com/atusy/japanr-2024-logging

JAPANR2024_LOG_LEVEL=INFO R --no-save -e 'devtools::load_all(); run()'
JAPANR2024_LOG_LEVEL=DEBUG R --no-save -e 'devtools::load_all(); run()'

Examples of log analysis

logdata <- system.file("extdata/japanr2024.jsonl", package = "japanr2024") |>
  yyjsonr::read_ndjson_file() |>
  dplyr::mutate(time = clock::date_time_parse(time, "Asia/Tokyo")) |>
  dplyr::glimpse()
#> Rows: 184
#> Columns: 8
#> $ time       <dttm> 2024-12-03 22:38:17, 2024-12-03 22:38:17, 2024-12-03 22:38…
#> $ level      <chr> "INFO", "INFO", "DEBUG", "DEBUG", "DEBUG", "DEBUG", "DEBUG"…
#> $ ns         <chr> "japanr2024", "japanr2024", "japanr2024", "japanr2024", "ja…
#> $ message    <chr> "Started session", "Received request", "Received request pa…
#> $ session_id <chr> "01JE6BWEWP0BAAFGPZTZ9GBNM6", "01JE6BWEWP0BAAFGPZTZ9GBNM6",…
#> $ trace_id   <chr> NA, "01JE6BWEXF5F6HDD4YFZBP157F", "01JE6BWEXF5F6HDD4YFZBP15…
#> $ value      <list> <NULL>, <NULL>, ["tokyo, london"], <"tokyo", "london">, "t…
#> $ span_id    <chr> NA, NA, NA, NA, "01JE6BWEXSW65DD1EC858XMZQ3", "01JE6BWEXSW6…

Error count

logdata |>
  dplyr::summarize(
    time = min(time),
    errors = sum(level == "ERROR"),
    .by = "trace_id"
  ) |>
  ggplot2::ggplot() +
  ggplot2::aes(time, errors) +
  ggplot2::geom_point() +
  ggplot2::scale_x_datetime(name = "Requested at", date_breaks = "10 sec")

Oh no! Many requests suffer errors!


Response time

logdata |>
  dplyr::reframe(
    start = min(time),
    duration = max(time) - start,
    locations = lengths(value[message == "Locations parsed"]),
    .by = "trace_id"
  ) |>
  ggplot2::ggplot() +
  ggplot2::aes(start, duration, color = locations) +
  ggplot2::geom_point() +
  ggplot2::scale_x_datetime(name = "Requested at", date_breaks = "10 sec") +
  ggplot2::scale_y_continuous(name = "Response time") +
  NULL

Hmm… users have to wait for several seconds…


Deep dive into errors and slow responses.

logdata |>
  dplyr::filter(!is.na(span_id)) |>
  dplyr::summarize(
    start = min(time),
    end = max(time),
    location = unlist(value[message == "Processing location"]),
    error = "ERROR" %in% level,
    .by = c("trace_id", "span_id")
  ) |>
  dplyr::mutate(
    location = forcats::fct_relevel(location, rev(unique(location)))
  ) |>
  ggplot2::ggplot() +
  ggplot2::aes(
    x = start, xend = end,
    y = location, yend = location,
    color = error,
  ) +
  ggplot2::geom_point() +
  ggplot2::geom_segment() +
  ggplot2::facet_grid(rows = "trace_id", scales = "free_y") +
  ggplot2::theme(
    strip.text.y = ggplot2::element_blank(),
    axis.title.x = ggplot2::element_blank()
  ) +
  ggplot2::scale_x_datetime(name = "", date_breaks = "10 sec")


Insights from the analysis


Ideas to solve the issues


Memoising slow functions

# Memoised versions of openmeteo functions
geocode <- memoise::memoise(openmeteo::geocode)
weather_forecast <- memoise::memoise(
  openmeteo::weather_forecast,
  # Invalidate cache daily
  hash = function(x) rlang::hash(list(x, Sys.Date()))
)

# Efficient forecast function
forecast <- function(x) {
  g <- geocode(x)
  weather_forecast(c(g$latitude, g$longitude), hourly = "temperature_2m")
}

Handling errors with try-catch

geocode <- function(x) {
  tryCatch(
    openmeteo::geocode(x),
    error = function(e) {
      if (e$message == "No matches found") {
        logger::log_debug("No matches found for geocode", value = x)
        return(NULL)
      }
      stop(e)
    }
  )
}

ENJOY HAPPY STRUCTURED LOGGING!