This vignette covers common issues you may encounter when using dbverse packages and how to resolve them.

Database Lock Errors

Error: “Could not set lock on file”

Error: Could not set lock on file "path/to/database.duckdb": 
Conflicting lock is held in /path/to/R (PID 12345) by user...

Cause: DuckDB uses file-level locking to ensure data integrity. Only one process can write to a database file at a time. This error occurs when:

  • Another R session is connected to the same database
  • A previous R session didn’t disconnect properly before closing
  • Multiple scripts are trying to access the same database simultaneously

Solutions:

  1. Kill the conflicting process (the PID is shown in the error message; an R alternative follows this list):

    # Replace 12345 with the actual PID from the error
    kill 12345
  2. Restart RStudio to close all R processes

  3. Properly disconnect before closing sessions (a reusable pattern follows this list):

    # Always disconnect when done
    proj$disconnect()
    
    # Or for direct DBI connections:
    DBI::dbDisconnect(con, shutdown = TRUE)
  4. Use read-only mode for concurrent access:

    # Multiple processes can read simultaneously
    con <- DBI::dbConnect(duckdb::duckdb(), dbdir = "path/to/db", read_only = TRUE)
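
If you prefer to stay inside R rather than a shell, base R's tools::pskill() sends the same termination signal as kill. A minimal sketch; replace 12345 with the PID reported in the error:

# Send the default termination signal to the conflicting process from within R
tools::pskill(12345)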
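
To make solution 3 routine, you can wrap database work in a helper that registers the disconnect up front, so the lock is released even if an error interrupts the work. A minimal sketch using plain DBI; with_duckdb, the path, and the query are illustrative placeholders:

# The connection is closed (and the lock released) no matter how the function exits
with_duckdb <- function(dbdir, f) {
  con <- DBI::dbConnect(duckdb::duckdb(), dbdir = dbdir)
  on.exit(DBI::dbDisconnect(con, shutdown = TRUE), add = TRUE)
  f(con)
}

result <- with_duckdb("path/to/database.duckdb", function(con) {
  DBI::dbGetQuery(con, "SELECT 42 AS answer")
})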

Connection Issues

Error: “No active or cached connection available”

Cause: The dbProject connection has been closed or was never established.

Solution: Reconnect to the database:

proj$reconnect()
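
If you are working with a plain DBI connection rather than a dbProject, you can test whether the handle is still alive before querying and reconnect only when needed. A minimal sketch; the path is a placeholder:

# Reconnect only if no valid handle exists
if (!exists("con") || !DBI::dbIsValid(con)) {
  con <- DBI::dbConnect(duckdb::duckdb(), dbdir = "path/to/database.duckdb")
}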

Objects showing “Error: Column references …” after restart

Cause: Lazy table references become stale after R restarts because the connection is lost.

Solution: Use dbProject for automatic reconnection:

# Create project (saves connection info)
proj <- dbProject$new(path = "my_project", dbdir = "data.duckdb")

# After restart, just reload
proj <- dbProject$new(path = "my_project", dbdir = "data.duckdb")
proj$reconnect()

# Pinned objects are automatically reconnected
my_data <- proj$pin_read("my_table")

Memory Issues

Large datasets causing R to crash

Cause: Accidentally collecting large lazy tables into memory.

Prevention:

  1. Avoid collect() on large tables (if unsure of a table's size, see the row-count sketch after this list):

    # BAD - loads entire table into memory
    big_df <- big_table |> collect()
    
    # GOOD - keep lazy, push computation to the database, and collect only what's needed
    result <- big_table |> 
      filter(gene == "BRCA1") |>
      collect()
  2. Use pin_write() to materialize intermediates to disk:

    # Materialize to database, not memory
    filtered_data <- proj$pin_write(x = lazy_filtered, name = "filtered")
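
If you are not sure whether a result is small enough to collect, count its rows in the database first. A minimal sketch; big_table stands for any lazy table, and the one-million-row threshold is an arbitrary placeholder:

library(dplyr)

# Only the single row count travels to R, not the data
n_rows <- big_table |>
  summarise(n = n()) |>
  collect() |>
  pull(n)

# Collect only if the result is comfortably small
if (n_rows < 1e6) {
  result <- big_table |> collect()
}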

Package-Specific Issues

dbMatrix: Slow operations after many transformations

Cause: Each lazy operation extends the underlying SQL query rather than executing it, so long transformation chains accumulate deeply nested queries that become slow to plan and run.

Solution: Use dplyr::compute() or pin_write() to materialize intermediate results:

# Materialize after expensive operations
complex_result <- my_dbMatrix |>
  some_expensive_transform() |> # e.g. a function containing several SQL joins
  dplyr::compute(name = "materialized", temporary = FALSE)
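
To gauge how complex a lazy chain has become before materializing it, you can print the SQL it would generate. This assumes the object behaves like a standard dbplyr lazy table; dbMatrix internals may differ:

# Deeply nested generated SQL is a sign it is time to compute()
my_dbMatrix |>
  some_expensive_transform() |>
  dplyr::show_query()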

Getting Help

If you encounter issues not covered here:

  1. Check out the specific dbverse package on our GitHub organization.
  2. Ensure you’re using the latest dbverse package versions.
  3. Include a minimal reproducible example when reporting issues.