To record or help identifying (potential) issues and enhance improvements.
The logger package helps you.
print
-debugging is not suitable for long-term maintenance.print
-logging lacks some features
Avoid reinvention with print
, unless intentionally.
{"time":"2024-12-03 22:38:17","level":"INFO","ns":"japanr2024","message":"Started session","session_id":"01JE6BWEWP0BAAFGPZTZ9GBNM6"}
{"time":"2024-12-03 22:38:17","level":"INFO","ns":"japanr2024","message":"Received request","session_id":"01JE6BWEWP0BAAFGPZTZ9GBNM6","trace_id":"01JE6BWEXF5F6HDD4YFZBP157F"}
{"time":"2024-12-03 22:38:17","level":"DEBUG","ns":"japanr2024","message":"Received request parameters","value":{"locations":"tokyo, london"},"session_id":"01JE6BWEWP0BAAFGPZTZ9GBNM6","trace_id":"01JE6BWEXF5F6HDD4YFZBP157F"}
{"time":"2024-12-03 22:38:17","level":"DEBUG","ns":"japanr2024","message":"Locations parsed","value":["tokyo","london"],"session_id":"01JE6BWEWP0BAAFGPZTZ9GBNM6","trace_id":"01JE6BWEXF5F6HDD4YFZBP157F"}
{"time":"2024-12-03 22:38:17","level":"DEBUG","ns":"japanr2024","message":"Processing location","value":"tokyo","span_id":"01JE6BWEXSW65DD1EC858XMZQ3","session_id":"01JE6BWEWP0BAAFGPZTZ9GBNM6","trace_id":"01JE6BWEXF5F6HDD4YFZBP157F"}
{"time":"2024-12-03 22:38:17","level":"DEBUG","ns":"japanr2024","message":"Plotting weather forecast","span_id":"01JE6BWEXSW65DD1EC858XMZQ3","session_id":"01JE6BWEWP0BAAFGPZTZ9GBNM6","trace_id":"01JE6BWEXF5F6HDD4YFZBP157F"}
Named list of lines.
list(time = "2024-12-03 22:38:17", level = "INFO", ns = "japanr2024", message = "Started session", session_id = "01JE6BWEWP0BAAFGPZTZ9GBNM6")
list(time = "2024-12-03 22:38:17", level = "INFO", ns = "japanr2024", message = "Received request", session_id = "01JE6BWEWP0BAAFGPZTZ9GBNM6", trace_id = "01JE6BWEXF5F6HDD4YFZBP157F")
list(time = "2024-12-03 22:38:17", level = "DEBUG", ns = "japanr2024", message = "Received request parameters", value = {locations = "tokyo, london"}, session_id = "01JE6BWEWP0BAAFGPZTZ9GBNM6", trace_id = "01JE6BWEXF5F6HDD4YFZBP157F")
list(time = "2024-12-03 22:38:17", level = "DEBUG", ns = "japanr2024", message = "Locations parsed", value = ["tokyo", "london"], session_id = "01JE6BWEWP0BAAFGPZTZ9GBNM6", trace_id = "01JE6BWEXF5F6HDD4YFZBP157F")
list(time = "2024-12-03 22:38:17", level = "DEBUG", ns = "japanr2024", message = "Processing location", value = "tokyo", span_id = "01JE6BWEXSW65DD1EC858XMZQ3", session_id = "01JE6BWEWP0BAAFGPZTZ9GBNM6", trace_id = "01JE6BWEXF5F6HDD4YFZBP157F")
list(time = "2024-12-03 22:38:17", level = "DEBUG", ns = "japanr2024", message = "Plotting weather forecast", span_id = "01JE6BWEXSW65DD1EC858XMZQ3", session_id = "01JE6BWEWP0BAAFGPZTZ9GBNM6", trace_id = "01JE6BWEXF5F6HDD4YFZBP157F")
logger::log_threshold("INFO") # DEBUG logs are ignored
logger::log_info(message = "Program started.")
params <- list(x = 1)
logger::log_debug(message = "Received parameters", value = params)
Previous sections introduced some key features of logging.
Let’s dive into the content of logs.
Typically JSON Lines.
{"time":"2024-12-04 21:07:28","level":"INFO","message":"Program started."}
{"time":"2024-12-04 21:07:28","level":"WARN","message":"Some warning."}
Then, packages parse them and outputs data frame (e.g., jsonlite, yyjsonr, duckdb, …).
time | level | message |
---|---|---|
2024-12-04 21:07:28 | INFO | Program started. |
2024-12-04 21:07:28 | WARN | Some warning. |
The log requires self-maintained parsers…
2024-12-04 21:07:28 [INFO] Program started
2024-12-04 21:07:28 [WARN] Some warning.
YES!
But you need advanced configurations.
key | type | desc | example |
---|---|---|---|
time | string | timestamp | 2024-12-03 22:38:17 |
level | string | log level | INFO |
ns | string | name of package, project, … | japanr2024 |
message | string | static message | Received request |
value | any | value related to the message | list(input = "tokyo") |
session_id | string | unique per access | 01JE6BWEWP0BAAFGPZTZ9GBNM6 |
trace_id | string | unique per request | 01JE6BWEXF5F6HDD4YFZBP157F |
span_id | string | unique per task | 01JE6BWEXSW65DD1EC858XMZQ3 |
context | list | (Optional) additional context | list(...) |
key | type | desc | example |
---|---|---|---|
time | string | timestamp | 2024-12-03 22:38:17 |
level | string | log level | INFO |
ns | string | name of package, project, … | japanr2024 |
message | string | static message | Received request |
value | any | value related to the message | list(input = "tokyo") |
session_id | string | unique per access | 01JE6BWEWP0BAAFGPZTZ9GBNM6 |
trace_id | string | unique per request | 01JE6BWEXF5F6HDD4YFZBP157F |
span_id | string | unique per task | 01JE6BWEXSW65DD1EC858XMZQ3 |
context | list | (Optional) additional context | list(...) |
# BAD: dynamic message
x <- 1
logger::log_info(message = glue::glue("x is {x}"))
{"time":"2024-12-07 14:08:20","level":"INFO","message":"x is 1"}
# GOOD: static message and dynamic value are tidyverse-friendly
logger::log_info(message = "x is defined", value = x)
{"time":"2024-12-07 14:08:20","level":"INFO","message":"x is defined","value":1}
key | type | desc | example |
---|---|---|---|
time | string | timestamp | 2024-12-03 22:38:17 |
level | string | log level | INFO |
ns | string | name of package, project, … | japanr2024 |
message | string | static message | Received request |
value | any | value related to the message | list(input = "tokyo") |
session_id | string | unique per access | 01JE6BWEWP0BAAFGPZTZ9GBNM6 |
trace_id | string | unique per request | 01JE6BWEXF5F6HDD4YFZBP157F |
span_id | string | unique per task | 01JE6BWEXSW65DD1EC858XMZQ3 |
context | list | (Optional) additional context | list(...) |
These keywords typically appears in observable engineering.
Identifiers are often generated by ULID or UUIDv4.
key | type | desc | example |
---|---|---|---|
time | string | timestamp | 2024-12-03 22:38:17 |
level | string | log level | INFO |
ns | string | name of package, project, … | japanr2024 |
message | string | static message | Received request |
value | any | value related to the message | list(input = "tokyo") |
session_id | string | unique per access | 01JE6BWEWP0BAAFGPZTZ9GBNM6 |
trace_id | string | unique per request | 01JE6BWEXF5F6HDD4YFZBP157F |
span_id | string | unique per task | 01JE6BWEXSW65DD1EC858XMZQ3 |
context | list | (Optional) additional context | list(...) |
Optional data that helps analyzing the log.
For example, parent-child relationship between logs.
{
...,
"span_id": "01JDNN0EK0715J048SHYW7XZ2V",
"context": {
"parent_span_id": "01JDNN0F76GR0398VJ4MMGBCY7",
}
}
# Configure
logger::log_appender(logger::appender_stdout)
logger::log_formatter(logger::formatter_json)
logger::log_layout(logger::layout_json_parser(c("time", "level", "ns")))
# Log
logger::log_info(message = "Program started.")
{"time":"2024-12-07 14:08:20","level":"INFO","ns":"global","message":"Program started."}
Key points:
.onLoad()
Reference materials:
https://github.com/atusy/japanr-2024-logging
JAPANR2024_LOG_LEVEL=INFO R --no-save -e 'devtools::load_all(); run()'
JAPANR2024_LOG_LEVEL=DEBUG R --no-save -e 'devtools::load_all(); run()'
logdata <- system.file("extdata/japanr2024.jsonl", package = "japanr2024") |>
yyjsonr::read_ndjson_file() |>
dplyr::mutate(time = clock::date_time_parse(time, "Asia/Tokyo")) |>
dplyr::glimpse()
#> Rows: 184
#> Columns: 8
#> $ time <dttm> 2024-12-03 22:38:17, 2024-12-03 22:38:17, 2024-12-03 22:38…
#> $ level <chr> "INFO", "INFO", "DEBUG", "DEBUG", "DEBUG", "DEBUG", "DEBUG"…
#> $ ns <chr> "japanr2024", "japanr2024", "japanr2024", "japanr2024", "ja…
#> $ message <chr> "Started session", "Received request", "Received request pa…
#> $ session_id <chr> "01JE6BWEWP0BAAFGPZTZ9GBNM6", "01JE6BWEWP0BAAFGPZTZ9GBNM6",…
#> $ trace_id <chr> NA, "01JE6BWEXF5F6HDD4YFZBP157F", "01JE6BWEXF5F6HDD4YFZBP15…
#> $ value <list> <NULL>, <NULL>, ["tokyo, london"], <"tokyo", "london">, "t…
#> $ span_id <chr> NA, NA, NA, NA, "01JE6BWEXSW65DD1EC858XMZQ3", "01JE6BWEXSW6…
logdata |>
dplyr::summarize(
time = min(time),
errors = sum(level == "ERROR"),
.by = "trace_id"
) |>
ggplot2::ggplot() +
ggplot2::aes(time, errors) +
ggplot2::geom_point() +
ggplot2::scale_x_datetime(name = "Requested at", date_breaks = "10 sec")
Oh no! Many requests suffer errors!
logdata |>
dplyr::reframe(
start = min(time),
duration = max(time) - start,
locations = lengths(value[message == "Locations parsed"]),
.by = "trace_id"
) |>
ggplot2::ggplot() +
ggplot2::aes(start, duration, color = locations) +
ggplot2::geom_point() +
ggplot2::scale_x_datetime(name = "Requested at", date_breaks = "10 sec") +
ggplot2::scale_y_continuous(name = "Response time") +
NULL
Hmm… users have to wait for several seconds…
logdata |>
dplyr::filter(!is.na(span_id)) |>
dplyr::summarize(
start = min(time),
end = max(time),
location = unlist(value[message == "Processing location"]),
error = "ERROR" %in% level,
.by = c("trace_id", "span_id")
) |>
dplyr::mutate(
location = forcats::fct_relevel(location, rev(unique(location)))
) |>
ggplot2::ggplot() +
ggplot2::aes(
x = start, xend = end,
y = location, yend = location,
color = error,
) +
ggplot2::geom_point() +
ggplot2::geom_segment() +
ggplot2::facet_grid(rows = "trace_id", scales = "free_y") +
ggplot2::theme(
strip.text.y = ggplot2::element_blank(),
axis.title.x = ggplot2::element_blank()
) +
ggplot2::scale_x_datetime(name = "", date_breaks = "10 sec")
# Memoised versions of openmeteo functions
geocode <- memoise::memoise(openmeteo::geocode)
weather_forecast <- memoise::memoise(
openmeteo::weather_forecast,
# Invalidate cache daily
hash = function(x) rlang::hash(list(x, Sys.Date()))
)
# Efficient forecast function
forecast <- function(x) {
g <- geocode(x)
weather_forecast(c(g$latitude, g$longitude), hourly = "temperature_2m")
}
geocode <- function(x) {
tryCatch(
openmeteo::geocode(x),
error = function(e) {
if (e$message == "No matches found") {
logger::log_debug("No matches found for geocode", value = x)
return(NULL)
}
stop(e)
}
)
}