This repository is an instruction page and template collection for building high-quality replication packages for social science research projects. It is designed to be read by humans and by coding agents such as Codex or Claude Code before they prepare, audit, or repair a replication package.
The guide and templates assume an R-based workflow, with master.R, R scripts, and session_info.log as the default examples. The underlying standard is not limited to R. If a project uses Stata, Python, Julia, MATLAB, or another toolchain, users can ask an agentic AI to read this guide and prepare an analogous replication package with the appropriate single entry point, logs, software-environment record, and figure/table crosswalk.
The lightweight templates in this repository illustrate the recommended package designs:
- `templates/README_TEMPLATE.md`: a copyable starting point for a project’s one and only `README.md`.
- `templates/compact/`: compact project structure for smaller projects.
- `templates/build-analyze/`: larger project structure with separate `build/` and `analyze/` stages.
- `examples/horiuchi_tago/`: a finished compact replication package example.
- `rules.dropboxignore`: a downloadable Dropbox-root ignore file for R/RStudio local and session files.

The structure templates intentionally contain only folder structure and essential example files such as `.Rproj`, `.gitignore`, `README.md`, `master.R`, script stubs, and logging helpers. They do not include full replication packages or large data files.
The example package shows what a completed compact package can look like after applying the guide. It includes a real README.md, master.R, numbered scripts, logs, generated figures and outputs, and a paper-source consistency note.
Use standard Markdown as the authoritative instruction format. Markdown is easiest for agents to read, easiest to host on GitHub or another static site, and does not require R to render. Each replication package should commit one and only one README file: README.md.
The public guide is available at:
https://yhoriuchi.github.io/replication-package-guide/
When using Codex, Claude Code, or another coding agent, give the agent this URL and ask it to read the guide before changing any files. The GitHub repository is available at:
https://github.com/yhoriuchi/replication-package-guide
Keep this Markdown file as the authoritative source so the public page, repository, agents, and human readers use the same instructions.
Before asking an AI agent to polish a replication package for publication, clean the project as much as possible yourself. AI is useful for checking, reorganizing, documenting, and catching inconsistencies, but it should not be treated as a substitute for the author’s judgment about which files, scripts, data sources, and results are actually part of the replication record.
At minimum, remove clearly obsolete files, label exploratory scripts, identify the scripts that generate reported results, gather the paper source files when available, and decide which data can legally be shared. The cleaner the starting point, the more reliable the AI-assisted audit will be.
For a new replication package:
- Copy `templates/compact/` or `templates/build-analyze/` into the new project and use its included `README.md` as the starting README.
- Alternatively, copy `templates/README_TEMPLATE.md` to the project root as `README.md`.
- Edit `README.md`, replacing all placeholder text with project-specific documentation.
- Run `source("master.R")` from a fresh R session.
- Confirm that `session_info.log` and one log per public script were created.
- Update `README.md`.

To see a concrete finished package, inspect `examples/horiuchi_tago/`. It is included as an example, not as a template to copy blindly.
When preparing a replication package with Codex, Claude Code, or another coding agent, set the agent’s working directory so it can see both the manuscript source files and the replication package.
For Overleaf users, one practical workflow is to use Overleaf’s Dropbox integration and create the replication package inside, or immediately beside, the synced Overleaf project folder. This lets the agent inspect the manuscript source and the replication package in one workspace.
Recommended layout for an Overleaf project with a compact replication package:
```text
Dropbox/Apps/Overleaf/[Paper Title]/
|-- main.tex
|-- sections/
|-- references.bib
|-- figures/        # manuscript-ready figures used by LaTeX
|-- tables/         # manuscript-ready tables used by LaTeX
`-- r/              # replication package or future replication package
    |-- README.md
    |-- .gitignore
    |-- master.R
    |-- project.Rproj
    |-- scripts/
    |-- data/
    |-- logs/
    |-- output/
    |-- figures/    # generated replication figures, if kept separately
    `-- tables/     # generated replication tables, if kept separately
```
This example shows the compact structure. For larger projects, use the same r/ root folder but follow the build/analyze structure described below.
In this layout, figures/ and tables/ at the Overleaf project root are the manuscript-ready files included by LaTeX and submitted with the paper. The r/ folder contains the reproducible workflow. It can later become the core of the public replication package. If R generates a table that is then manually edited for publication, keep the generated version in r/tables/ or r/output/, keep the edited manuscript-ready version in root-level tables/, and document that relationship in the README crosswalk. When assembling the public archive, make sure the final package includes or clearly traces the manuscript-ready files as well as the generated source files.
Keep local and session files out of both Git and Dropbox sync. Use two ignore files because Git and Dropbox are separate systems:
- `rules.dropboxignore` belongs in the root Dropbox folder, such as `~/Dropbox/rules.dropboxignore`. It tells Dropbox what not to upload or sync anywhere under that Dropbox folder.
- `.gitignore` belongs inside each replication package or Git repository, such as `Dropbox/Apps/Overleaf/[Paper Title]/r/.gitignore`. It tells Git what not to commit for that package.

The simplest setup is to download or copy this repository’s `rules.dropboxignore` file into the root Dropbox folder, then copy the same base rules into the `.gitignore` file inside each `r/` replication package. If an `r/.gitignore` already exists, append these rules rather than replacing project-specific rules.
Suggested base rules for both rules.dropboxignore and project-level .gitignore files:
```text
# R and RStudio local/session files
**/.Rproj.user/
**/.Rhistory
**/.RData
**/.Ruserdata

# macOS and temporary files
**/.DS_Store
**/*.tmp
**/*.temp
**/*.bak

# package/cache folders
**/renv/library/
**/renv/staging/
**/*_cache/
**/*_files/

# R graphics leftovers
**/Rplots.pdf
```
Dropbox ignore rules apply only going forward. Files that already synced may need to be removed and recreated after rules.dropboxignore is added. Git ignore rules also do not automatically untrack files that were already committed; after checking carefully, remove such files from Git tracking with git rm --cached [file].
Do not ignore the entire r/ folder if the agent needs to inspect it or if it will become the replication package. Ignore only machine-specific caches, histories, package libraries, and temporary files. The .Rproj file, scripts, public data, generated logs, and reproducibility metadata such as renv.lock should usually remain visible. See Dropbox’s help page on preventing files from syncing for Dropbox-specific details.
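To see which local or session files are already present before relying on the ignore rules, a small scan can help. The following is a minimal sketch, not part of the templates: `find_local_files()` is a hypothetical helper, and its patterns cover only the R, RStudio, macOS, and temporary-file rules listed above, not the cache folders.

```r
# Hypothetical helper (not part of this repository's templates):
# list files under `root` that the base ignore rules are meant to exclude.
find_local_files <- function(root = ".") {
  files <- list.files(root, recursive = TRUE, all.files = TRUE, full.names = FALSE)
  patterns <- c(
    "(^|/)\\.Rproj\\.user/",   # RStudio project state
    "(^|/)\\.Rhistory$",
    "(^|/)\\.RData$",
    "(^|/)\\.Ruserdata$",
    "(^|/)\\.DS_Store$",       # macOS folder metadata
    "\\.(tmp|temp|bak)$",      # temporary and backup files
    "(^|/)Rplots\\.pdf$"       # R graphics leftovers
  )
  files[grepl(paste(patterns, collapse = "|"), files)]
}

# Example: scan the current project root.
print(find_local_files("."))
```

Any path it reports is a candidate for cleanup or, if already committed, for `git rm --cached` after a careful check.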
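This integration makes the most important final check much easier: consistency between the paper and the replication package. The agent should verify that every manuscript-ready figure and table in `figures/` or `tables/` matches the corresponding R-generated output, and should flag any copied, renamed, manually edited, stale, or unmatched file.

Recommended agent request: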
Please inspect both the R replication package and the Overleaf/LaTeX source. Check whether every manuscript-ready figure and table used by Overleaf matches the corresponding R-generated figure or table, and flag any copied, renamed, manually edited, stale, or unmatched file.
For public release, include paper source files only when appropriate and permitted. If the paper source cannot be included in the public archive, use it during preparation for the consistency check and document in README.md that the manuscript source was checked against the replication outputs.
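Part of this consistency check can be approximated mechanically. The sketch below assumes that manuscript-ready files and generated files share file names and sit in two flat folders; `compare_outputs()` is a hypothetical helper, not something the guide or templates provide. It compares MD5 checksums with `tools::md5sum()`.

```r
# Hypothetical helper: compare manuscript-ready files with generated files
# of the same name. Assumes flat folders and matching file names.
compare_outputs <- function(manuscript_dir, generated_dir) {
  list_plain_files <- function(dir) {
    f <- list.files(dir)
    f[!dir.exists(file.path(dir, f))]  # drop subdirectories
  }
  manuscript <- list_plain_files(manuscript_dir)
  generated <- list_plain_files(generated_dir)

  status <- vapply(manuscript, function(f) {
    if (!f %in% generated) return("unmatched")
    a <- unname(tools::md5sum(file.path(manuscript_dir, f)))
    b <- unname(tools::md5sum(file.path(generated_dir, f)))
    if (identical(a, b)) "identical" else "differs"
  }, character(1))

  data.frame(file = manuscript, status = unname(status), stringsAsFactors = FALSE)
}

# Example call, using the Overleaf layout described earlier:
# compare_outputs("figures", "r/figures")
```

A `differs` status is a prompt to inspect the pair, not proof of a problem: a manually edited table or a regenerated PDF with a new timestamp will also differ byte for byte.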
Use these copy-paste prompts with Codex, Claude Code, or another coding agent. Each prompt assumes the agent can read this guide and inspect the project files. Use one prompt at a time so the task is specific and easy to verify.
Please read the Replication Package Guide before making changes. Then inspect my project and prepare a complete replication package. If the paper source files are available in the working directory, also check consistency between the paper and the replication outputs.
First decide whether the project should use the compact structure or the build/analyze structure. Use the compact structure when the project is small and all public inputs can be shared directly. Use the build/analyze structure when data construction is complex, uses restricted sources, involves scraping/APIs, or produces analysis-ready datasets that should be treated as the public replication inputs.
Every replication package must include master.R, script-specific log files, session_info.log, a self-contained README, and a complete crosswalk for all figures and tables reported in the paper or appendix. Check that every figure, table, and in-text numerical claim in the paper can be traced to the replication package, including estimates, standard errors, p-values, sample sizes, sampling dates, completion times, response rates, and descriptive statistics. Do not use absolute paths. Do not require manual steps unless they are documented as unavoidable.
Use templates/compact/ or templates/build-analyze/ as the starting structure. Use the selected template's README.md, or templates/README_TEMPLATE.md, as the starting point for README.md. Replace all placeholder text with project-specific documentation.
Please read the Replication Package Guide, then inspect all public R scripts in this project. Add or repair per-script logging so every public script writes a matching log file to logs/ or analyze/logs/.
Each log should record the script name, start and end time, important row counts, sample sizes, reported estimates or test results, warnings, and any other numbers reported in the paper.
Use the project's existing logging style if one exists. Do not change substantive analysis code unless needed to make logging reliable. After editing, run the public replication path and confirm that every public script produces its expected log file.
Please read the Replication Package Guide and prepare or repair the project's single authoritative README.md. Use templates/README_TEMPLATE.md as the model.
Use the guide's standard section order: Description; How To Run; Folder Tree; Files Included In This Package; Data Sources And Restrictions; Build Stage if needed; Analysis Stage; Paper Source And Consistency Checks; Figures And Tables; Software Requirements; Session Information; Recommended Citation; Last Verified.
The `## Figures And Tables` crosswalk must use `### Manuscript` and `### Appendix` subsections. Add one `####` entry for each individual figure or table number used in the manuscript or appendix, even when multiple entries are produced by the same script. Each entry should contain exactly these five fields, in this order: `Output`, `Script`, `Log`, `LaTeX Label`, and `Notes`. Do not put LaTeX labels in the `####` heading. If a field lists multiple files or labels, use indented sub-bullets instead of inline comma- or semicolon-separated paths.
Do not create additional README files. Do not include README.html or README.pdf in the repository. Embedded figure/table previews are optional; the crosswalk is required.
Please read the Replication Package Guide, then audit the project for files and code that should not be in the public replication package.
Identify and remove temporary files, caches, old exploratory outputs, obsolete scripts, unused helper functions, personal files, absolute-path artifacts, and generated files that can be recreated by scripts. Keep source data, public scripts, documentation, final outputs, logs, and files needed to reproduce results.
Before deleting anything substantial, list what you plan to remove and why. Do not remove raw data, analysis-ready public inputs, manuscript source files, or scripts needed for reported results unless I explicitly approve.
Please read the Replication Package Guide, then compare the paper source files with the replication package outputs and logs.
Check every figure, table, and in-text numerical claim in the paper and appendix, including estimates, standard errors, p-values, confidence intervals, sample sizes, sampling dates, field dates, completion times, response rates, missing-data counts, and descriptive statistics.
For each reported item, verify that the value in the paper matches a script, log file, generated table, or generated figure. Report any mismatch with the paper source location, the replication source location, the paper value, and the replication value. Do not silently change paper text or analysis code; explain the discrepancy first.
Please read the Replication Package Guide, then review the public replication scripts for coding errors that could affect reported results.
Focus on data filtering, merges, joins, recoding, factor levels, missing-data handling, weights, clustered or robust standard errors, random seeds, model formulas, multiple-testing adjustments, output paths, and whether scripts run in the documented order from a clean R session.
Prioritize bugs, reproducibility risks, and missing tests or logs. Report findings with file paths, line references, severity, and suggested fixes. Make fixes only when they are clearly safe and within the replication package standard.
Please read the Replication Package Guide, then inspect the Overleaf/LaTeX source files and compare them with the replication package.
Check figure references, table references, labels, captions, file paths, appendix numbering, citations to results, and all in-text numerical claims. Verify that the manuscript points to the correct manuscript-ready figures and tables, that those files match the corresponding R-generated outputs or documented manual edits, and that the reported values match logs or generated outputs.
Report likely reporting errors with the TeX file path, label or nearby text, the value or reference in the paper, the corresponding replication source, and a recommended correction. Do not rewrite the manuscript unless I explicitly ask you to make the edits.
Please read the Replication Package Guide and perform a final pre-release audit of this replication package.
Verify that source("master.R") runs from a fresh R session; every public script creates a log; session_info.log exists; README.md is the only README file; the figure/table crosswalk is complete; paper source consistency has been checked when source files are available; no absolute paths, personal files, caches, or temporary files remain; and restricted data are documented.
Return a concise release-readiness report with pass/fail items, remaining risks, and exact files that need attention.
A replication package is successful when a reader can unzip it, open the project root, run one command, and see exactly how the reported results were produced.
The package should satisfy these requirements:
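- A single entry point: `master.R`.
- One log file per public script.
- A `README.md` that explains the package, the workflow, the required software, and every figure/table output.
- A `session_info.log` file from a successful full run.

Use the compact structure for small or medium projects when all public inputs can be shared directly.

Use the `build/` and `analyze/` structure for larger projects when data construction is complex, uses restricted sources, involves scraping or APIs, or produces analysis-ready datasets that should be treated as the public replication inputs.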
When uncertain, choose the simpler structure unless the build stage creates real complexity for users.
Recommended for smaller packages. See templates/compact/ for a lightweight starter version.
```text
r/
|-- README.md
|-- .gitignore
|-- master.R
|-- project.Rproj        # optional but recommended
|-- session_info.log
|
|-- data/
|   `-- public input data
|
|-- documents/
|   |-- paper/
|   |-- questionnaires/
|   `-- other supporting documents
|
|-- scripts/
|   |-- 01_prepare_data.R
|   |-- 02_analyze_main_results.R
|   `-- 03_make_figures_tables.R
|
|-- functions/           # optional helper functions
|
|-- figures/
|   `-- generated figures
|
|-- tables/
|   `-- generated tables
|
|-- output/
|   `-- intermediate reproducible objects
|
`-- logs/
    `-- one log per script
```
The compact structure should still include logs/. A functions/ folder is optional, but recommended when multiple scripts reuse the same helpers.
Recommended for larger packages. See templates/build-analyze/ for a lightweight starter version.
```text
r/
|-- README.md
|-- .gitignore
|-- master.R
|-- project.Rproj        # optional but recommended
|-- session_info.log
|
|-- build/
|   |-- data/
|   |   `-- raw or received inputs, when distributable
|   |-- documents/
|   |   `-- source documentation and data provenance files
|   |-- scripts/
|   |   `-- scripts that create analysis-ready data
|   |-- output/
|   |   |-- analysis_ready/
|   |   `-- other build outputs
|   `-- logs/            # use if build scripts are public and runnable
|
`-- analyze/
    |-- scripts/
    |-- functions/
    |-- figures/
    |-- tables/
    |-- output/
    `-- logs/
```
The build/ stage constructs analysis-ready datasets. The analyze/ stage produces the manuscript and appendix results. The public replication workflow should normally run from build/output/analysis_ready/ into analyze/.
If the build stage depends on restricted data, do not force users to run it. Keep the build scripts for transparency, remove restricted inputs, include the analysis-ready public files when legally permitted, and explain the limitation in README.md.
The README is the user’s map. It should be complete enough that a reader can understand and verify the package without opening every script.
Every replication README should cover the sections in the standard order given in this guide, including how to run the package through `master.R`.

Use exactly one `README.md` regardless of package size. Do not commit generated `README.html` or `README.pdf` files unless an archive or journal specifically requires them.
Use this section order unless a project-specific reason makes another order clearer:
- `## Description`
- `## How To Run`
- `## Folder Tree`
- `## Files Included In This Package`
- `## Data Sources And Restrictions`
- `## Build Stage`, for build/analyze packages only
- `## Analysis Stage`
- `## Paper Source And Consistency Checks`
- `## Figures And Tables`
- `## Software Requirements`
- `## Session Information`
- `## Recommended Citation`
- `## Last Verified`

Commit only one README-style documentation file: `README.md`. If an archive or journal requires HTML or PDF documentation, generate those files from `README.md` at release time and make clear that `README.md` remains the source.
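The standard section order can be checked against a draft README with a few lines of R. This is a minimal sketch under two assumptions: `check_section_order()` is a hypothetical helper, and the README contains no fenced examples whose lines begin with `## `.

```r
# Section names taken from the standard order in this guide.
standard_sections <- c(
  "Description", "How To Run", "Folder Tree", "Files Included In This Package",
  "Data Sources And Restrictions", "Build Stage", "Analysis Stage",
  "Paper Source And Consistency Checks", "Figures And Tables",
  "Software Requirements", "Session Information", "Recommended Citation",
  "Last Verified"
)

# Hypothetical helper: report missing sections and whether the sections
# that are present appear in the expected order.
check_section_order <- function(readme_lines, expected = standard_sections) {
  headings <- sub("^## ", "", grep("^## ", readme_lines, value = TRUE))
  present <- expected[expected %in% headings]
  list(
    missing = setdiff(expected, headings),
    in_order = identical(headings[headings %in% expected], present)
  )
}

# Example call:
# check_section_order(readLines("README.md", warn = FALSE))
```

For compact packages, `Build Stage` will appear under `missing`, which is expected.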
Use Markdown lists rather than a wide table for data sources. Long file paths and restriction notes are easier to scan when each source gets its own entry.
```markdown
## Data Sources And Restrictions

- **[Source name]**
  - Location: `[path/to/file-or-folder]`
  - Redistributable: Yes/No
  - Notes: [License, access, provenance, or other source notes.]
```
If any source is restricted, proprietary, confidential, licensed, or otherwise non-redistributable, explain in the same entry:
```markdown
- **[Restricted source name]**
  - Location: `[restricted/path-or-description]`
  - Redistributable: No
  - Restriction: [Why the source cannot be redistributed.]
  - Scripts using this source: `[path/to/script.R]`
  - Public replacement: `[path/to/analysis_ready_file]`
  - Licensed rebuild: [Whether licensed users can rebuild the data.]
  - Public reproducibility: [Whether published results can be reproduced without access to the restricted source.]
```
The README must include a paper-order crosswalk that maps every reported manuscript and appendix figure/table to its output file, script, log, LaTeX Label, and notes. Embedded previews are optional; they are not a substitute for the crosswalk.
Use this structure:
```markdown
## Figures And Tables

### Manuscript

#### Figure 1

- Output:
  - `figures/main_effect.pdf`
- Script:
  - `scripts/02_analyze_main_results.R`
- Log:
  - `logs/02_analyze_main_results.log`
- LaTeX Label:
  - `fig:main_effect`
- Notes: main treatment effect.

#### Table 1

- Output:
  - `tables/main_results.tex`
  - `tables/main_results.csv`
- Script:
  - `scripts/03_make_tables.R`
- Log:
  - `logs/03_make_tables.log`
- LaTeX Label:
  - `tab:main_results`
- Notes: manuscript-ready LaTeX table and machine-readable CSV generated from the same model output.

### Appendix

#### Figure A.1

- Output:
  - `figures/appendix_balance.pdf`
- Script:
  - `scripts/04_appendix_checks.R`
- Log:
  - `logs/04_appendix_checks.log`
- LaTeX Label:
  - `fig:appendix_balance`
- Notes: appendix balance check.
```
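Because each crosswalk entry has a fixed shape, the five-field requirement can be checked automatically. The sketch below is one possible check, not the guide's own tooling: `check_crosswalk_fields()` is a hypothetical helper that assumes field names appear as top-level `- Field:` bullets and that file paths sit in indented sub-bullets.

```r
# Hypothetical helper: return the headings of crosswalk entries whose
# top-level fields are not exactly Output, Script, Log, LaTeX Label, Notes.
check_crosswalk_fields <- function(readme_lines) {
  required <- c("Output", "Script", "Log", "LaTeX Label", "Notes")
  entry_start <- grep("^#### ", readme_lines)
  if (length(entry_start) == 0) return(character())

  # An entry ends at the next heading of any level, or at the end of the file.
  heading <- grep("^#{1,6} ", readme_lines)
  problems <- character()
  for (start in entry_start) {
    later <- heading[heading > start]
    end <- if (length(later) > 0) later[1] - 1 else length(readme_lines)
    body <- if (end > start) readme_lines[(start + 1):end] else character()
    # Only unindented bullets such as "- Output:" count as fields.
    top <- grep("^- [^:]+:", body, value = TRUE)
    fields <- sub("^- ([^:]+):.*$", "\\1", top)
    if (!identical(fields, required)) {
      problems <- c(problems, sub("^#### ", "", readme_lines[start]))
    }
  }
  problems
}

# Example call:
# check_crosswalk_fields(readLines("README.md", warn = FALSE))
```

An empty result means every `####` entry has the five fields in the required order; it does not confirm that the listed files exist.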
Formatting rules:
- Use `## Figures And Tables` as the section heading.
- Use `### Manuscript` and `### Appendix` as subsections. If the paper calls the appendix “Supplementary Materials,” still use `### Appendix` unless journal requirements make another title clearer.
- Add one `####` entry for every individual figure or table number used in the manuscript or appendix. Use the manuscript number as the heading, such as `#### Figure 1`, `#### Table 2`, `#### Figure A.1`, or `#### Table S.3`.
- Do not combine several figures or tables under one `####` heading. If a single R script generates two figures, create two separate `####` entries and repeat the same script and log paths in both entries.
- Do not put LaTeX labels or descriptions in the `####` heading. Put descriptive context in `Notes` and put labels in `LaTeX Label`.
- Each `####` entry should contain exactly five fields, in this order: `Output`, `Script`, `Log`, `LaTeX Label`, and `Notes`.
- Use `No output file`, `No code`, or `Not applicable` when an item is conceptual, hand-made, or retained from the manuscript source rather than generated by public scripts.

Use script names that make the execution order and purpose obvious:
00_list_inputs.R
01_prepare_data.R
02_estimate_main_results.R
03_make_figures.R
04_make_tables.R
05_robustness_checks.R
Rules:
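- Move scripts that are not needed for reported results to `scripts/not_in_paper/` or `scripts/archive/` and explain that they are not required.

Every public script should write a log file. Logs are not just debugging artifacts; they are part of the replication record.

Each log should record the script name, start and end time, important row counts, sample sizes, reported estimates or test results, warnings, and any other numbers reported in the paper.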
The log filename should match the script filename:
scripts/02_estimate_main_results.R
logs/02_estimate_main_results.log
For a build/analyze package:
analyze/scripts/02_estimate_main_results.R
analyze/logs/02_estimate_main_results.log
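Since log names mirror script names, missing logs are easy to detect after a full run. A minimal sketch, assuming public scripts sit directly in the script folder and end in `.R`; `missing_logs()` is a hypothetical helper name.

```r
# Hypothetical helper: list expected log files that do not exist.
# Only scripts directly inside `script_dir` are checked, so subfolders
# such as scripts/not_in_paper/ are skipped.
missing_logs <- function(script_dir = "scripts", log_dir = "logs") {
  scripts <- list.files(script_dir, pattern = "\\.R$")
  expected <- sub("\\.R$", ".log", scripts)
  expected[!file.exists(file.path(log_dir, expected))]
}

# Compact package:
# missing_logs("scripts", "logs")
# Build/analyze package:
# missing_logs("analyze/scripts", "analyze/logs")
```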
Place a logging helper in functions/logging.R for compact packages or analyze/functions/logging.R for build/analyze packages.
```r
start_script_log <- function(script_name, log_dir = "logs") {
  dir.create(log_dir, recursive = TRUE, showWarnings = FALSE)
  log_file <- file.path(log_dir, paste0(script_name, ".log"))
  sink(log_file, split = TRUE)
  cat("############################################################\n")
  cat("Script:", paste0(script_name, ".R"), "\n")
  cat("Started:", format(Sys.time(), "%Y-%m-%d %H:%M:%S %Z"), "\n")
  cat("############################################################\n\n")
  invisible(log_file)
}

end_script_log <- function() {
  cat("\n############################################################\n")
  cat("Ended:", format(Sys.time(), "%Y-%m-%d %H:%M:%S %Z"), "\n")
  cat("############################################################\n")
  cat("\n--- warnings() at end of script ---\n")
  w <- warnings()
  if (is.null(w) || length(w) == 0) {
    cat("None\n")
  } else {
    print(w)
  }
  while (sink.number() > 0) sink()
}
```
Use this pattern in each public script:
```r
source("functions/logging.R")
start_script_log("02_estimate_main_results")

tryCatch({
  # Script body goes here.
}, error = function(e) {
  cat("\nERROR:", conditionMessage(e), "\n")
  stop(e)
}, finally = {
  end_script_log()
})
```
For build/analyze packages, adjust paths:
```r
source("analyze/functions/logging.R")
start_script_log("02_estimate_main_results", log_dir = "analyze/logs")

tryCatch({
  # Script body goes here.
}, error = function(e) {
  cat("\nERROR:", conditionMessage(e), "\n")
  stop(e)
}, finally = {
  end_script_log()
})
```
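Inside the script body, the numbers that the paper reports should be printed so they land in the log. One way to keep those lines uniform is a small labeling helper. This is a sketch only: `log_value()` is a hypothetical addition to the logging helpers, and the built-in `mtcars` data stands in for project data.

```r
# Hypothetical helper: print a labeled value so it is easy to find in the log.
log_value <- function(label, value) {
  cat(sprintf("%-30s %s\n", paste0(label, ":"), paste(format(value), collapse = ", ")))
  invisible(value)
}

# Illustrative use inside a script body (mtcars is a stand-in dataset).
log_value("Rows in analysis sample", nrow(mtcars))
fit <- lm(mpg ~ wt, data = mtcars)
log_value("Coefficient on wt", round(coef(fit)[["wt"]], 3))
log_value("Observations used", nobs(fit))
```

Because output is written with `cat()`, these lines are captured by the `sink()` opened in `start_script_log()`.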
Every package should include master.R. It is the reproducibility entry point and should run the full public replication path from a clean R session.
The master script should run every public script in the documented order and write `session_info.log` after a successful run.

Suggested skeleton:
```r
# Master file
# Run from the project root after restarting R.

start_time <- Sys.time()
stopifnot(sink.number() == 0)

safe_source <- function(file) {
  if (!file.exists(file)) stop("File not found: ", file)
  cat("\n============================================================\n")
  cat("Running:", file, "\n")
  cat("============================================================\n")
  source(file, echo = FALSE, print.eval = FALSE)
}

update_readme_environment <- function(readme = "README.md") {
  if (!file.exists(readme)) return(invisible(FALSE))

  session <- sessionInfo()
  r_version <- paste("R version", paste(R.version$major, R.version$minor, sep = "."))
  operating_system <- session$running
  if (length(operating_system) == 0 || is.na(operating_system[1]) || !nzchar(operating_system[1])) {
    sys <- Sys.info()
    operating_system <- paste(sys[["sysname"]], sys[["release"]])
  }

  environment_block <- c(
    "### Computing Environment",
    "",
    paste("Software:", r_version),
    paste("Platform:", session$platform),
    paste("Computer Operating System:", operating_system)
  )

  lines <- readLines(readme, warn = FALSE)
  existing <- which(grepl("^#{1,6} Computing Environment$", lines))

  trim_blank_edges <- function(x) {
    if (length(x) == 0) return(x)
    nonblank <- which(nzchar(x))
    if (length(nonblank) == 0) return(character())
    x[seq.int(min(nonblank), max(nonblank))]
  }

  if (length(existing) > 0) {
    start <- existing[1]
    following_heading <- which(seq_along(lines) > start & grepl("^#{1,6} ", lines))
    end <- if (length(following_heading) > 0) following_heading[1] - 1 else length(lines)
    existing_block <- if (start < end) lines[(start + 1):end] else character()
    extra_environment_lines <- existing_block[
      !grepl("^(Software:|Platform:|Computer Operating System:)", existing_block)
    ]
    extra_environment_lines <- trim_blank_edges(extra_environment_lines)
    if (length(extra_environment_lines) > 0) {
      environment_block <- c(environment_block, "", extra_environment_lines)
    }
    environment_block <- c(environment_block, "")
    before <- if (start > 1) lines[seq_len(start - 1)] else character()
    after <- if (end < length(lines)) lines[(end + 1):length(lines)] else character()
    lines <- c(before, environment_block, after)
  } else {
    environment_block <- c(environment_block, "")
    session_heading <- which(grepl("^## .*Session Information$", lines))
    if (length(session_heading) > 0) {
      following_section <- which(seq_along(lines) > session_heading[1] & grepl("^## ", lines))
      insert_before <- if (length(following_section) > 0) following_section[1] else length(lines) + 1
      before <- if (insert_before > 1) lines[seq_len(insert_before - 1)] else character()
      after <- if (insert_before <= length(lines)) lines[insert_before:length(lines)] else character()
      if (length(before) > 0 && nzchar(tail(before, 1))) before <- c(before, "")
      lines <- c(before, environment_block, after)
    } else {
      if (length(lines) > 0 && nzchar(tail(lines, 1))) lines <- c(lines, "")
      lines <- c(
        lines,
        "## Session Information",
        "",
        "The file `session_info.log` records the R version, platform, loaded packages, and runtime from a successful full run.",
        "",
        environment_block
      )
    }
  }

  writeLines(lines, readme, useBytes = TRUE)
  invisible(TRUE)
}

scripts <- c(
  "scripts/01_prepare_data.R",
  "scripts/02_estimate_main_results.R",
  "scripts/03_make_figures.R",
  "scripts/04_make_tables.R"
)

for (script in scripts) {
  safe_source(script)
}

end_time <- Sys.time()

while (sink.number() > 0) sink()
sink("session_info.log", split = FALSE)
cat("Run Time\n")
cat("Started: ", format(start_time, "%Y-%m-%d %H:%M:%S"), "\n", sep = "")
cat("Ended: ", format(end_time, "%Y-%m-%d %H:%M:%S"), "\n", sep = "")
cat("Elapsed: ", format(end_time - start_time), "\n\n", sep = "")
cat("Session Information\n")
print(sessionInfo())
sink()

update_readme_environment("README.md")
```
For large packages, master.R should normally run the public analysis path only:
```r
scripts <- c(
  "analyze/scripts/00_list_inputs.R",
  "analyze/scripts/01_estimate_main_results.R",
  "analyze/scripts/02_make_figures.R",
  "analyze/scripts/03_make_tables.R"
)
```
If a small public subset of the build stage can be rebuilt, master.R may check for missing required files and rebuild only those public inputs. Do not require restricted data in the public run.
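The check for missing required files could look like the sketch below. `check_required_inputs()` is a hypothetical helper, and the file path in the example call is a placeholder rather than a file from the templates.

```r
# Hypothetical helper: stop early with a clear message if any required
# public input is missing, before the analysis scripts run.
check_required_inputs <- function(paths) {
  missing <- paths[!file.exists(paths)]
  if (length(missing) > 0) {
    stop(
      "Missing required input file(s):\n  ",
      paste(missing, collapse = "\n  "),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Example call near the top of master.R (placeholder file name):
# check_required_inputs("build/output/analysis_ready/analysis_data.csv")
```

Failing before the first script runs gives users one complete list of what is missing instead of an error partway through the analysis.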
Use clear data layers:
- `data/`: public raw or received data for compact packages.
- `build/data/`: raw or received data for large packages.
- `build/output/analysis_ready/`: authoritative analysis inputs for large public replication packages.
- `output/` or `analyze/output/`: reproducible intermediate objects.
- `figures/` and `tables/`: final generated results.

Data rules:
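- Save reproducible intermediate objects in `output/`.
- Keep supporting documents in `documents/`.
- Call `set.seed()` before simulations, bootstraps, random splits, random forests, MCMC starts, or any stochastic procedure.

If any source cannot be redistributed, document it inside `## Data Sources And Restrictions` using the restricted-source fields described above. The entry should explain why the source cannot be redistributed, which scripts use it, which public file replaces it, whether licensed users can rebuild the data, and whether the published results can be reproduced without access to the restricted source.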
The public package should be designed so that users can reproduce published results without restricted access whenever legally and ethically possible.
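Returning to the seeding rule for stochastic procedures: the sketch below shows a seeded bootstrap whose result is identical on every run. The seed value and the simulated data are illustrative assumptions, not values from any project.

```r
# Fix the seed before any random draw so the bootstrap is reproducible.
set.seed(20240101)  # illustrative seed; any fixed integer works

x <- rnorm(200)  # simulated stand-in for project data
boot_means <- replicate(500, mean(sample(x, replace = TRUE)))
boot_se <- sd(boot_means)

# Print the result so it is captured in the script log.
cat("Bootstrap SE of the mean:", format(boot_se, digits = 4), "\n")
```

Setting the seed once near the top of each script that draws random numbers, rather than only in `master.R`, keeps each script reproducible when it is run on its own.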
When manuscript source files are available, treat them as part of the working context for package preparation. This is especially useful for Overleaf projects synced through Dropbox, because the paper source files and the R replication package can be inspected together.
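The final consistency pass should check that every figure, table, and in-text numerical claim in the paper and appendix matches a script, log file, generated table, or generated figure in the replication package.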
If the paper source files are not included in the public replication archive, state in README.md whether they were used during preparation for consistency checks.
At minimum, include a short computing environment summary in README.md and session_info.log from a successful full run. The template master.R files automatically refresh the software, platform, and operating-system lines after writing session_info.log.
Suggested README format:
```markdown
### Computing Environment

Software: R version [version]
Platform: [R platform]
Computer Operating System: [operating system and version]
Additional details: [RAM, processor/GPU, external tools, or other project-specific requirements when relevant.]
```
The values should come from the run that produced session_info.log. For example, use the R version, Platform, and Running under lines printed by sessionInfo().
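For authors filling in these lines by hand, the three values can be printed directly from the session. A minimal sketch using base R only; the variable name `environment_lines` is arbitrary.

```r
# Collect the three environment lines from the current R session.
session <- sessionInfo()
environment_lines <- c(
  paste("Software:", R.version.string),
  paste("Platform:", session$platform),
  paste("Computer Operating System:", session$running)
)
writeLines(environment_lines)
```

Run this in the same session that produced `session_info.log` so the README and the log describe the same environment.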
Some projects should report additional computing details when they affect reproducibility or runtime, such as RAM, CPU/GPU, external command-line tools, licensed software, high-performance-computing settings, or non-R language versions.
For stronger reproducibility, also consider `renv.lock` for R package versions.

Do not assume the user has the same local folder structure. Avoid `setwd()` to personal paths.
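A simple text scan can surface most absolute paths and `setwd()` calls before release. The following is a heuristic sketch: `find_absolute_paths()` is a hypothetical helper, and its pattern (quoted strings starting with `/`, `~/`, or a drive letter, plus any `setwd(`) can miss cases or flag harmless lines.

```r
# Hypothetical helper: flag lines in R scripts that look like absolute
# paths or setwd() calls. Returns NULL when nothing is flagged.
find_absolute_paths <- function(script_dir = "scripts") {
  files <- list.files(script_dir, pattern = "\\.[Rr]$", recursive = TRUE, full.names = TRUE)
  # A quote followed by "/x", "~/", or "C:/" style prefixes, or a setwd( call.
  pattern <- "[\"'](/[^\"']|~/|[A-Za-z]:[/\\\\])|setwd\\("
  hits <- lapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    idx <- grep(pattern, lines)
    if (length(idx) == 0) return(NULL)
    data.frame(file = f, line = idx, text = trimws(lines[idx]), stringsAsFactors = FALSE)
  })
  do.call(rbind, hits)
}

# Example call:
# find_absolute_paths("scripts")
```

Review each flagged line by hand; the scan narrows the search but does not replace reading the scripts.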
Before releasing a replication package, verify:
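- `source("master.R")` runs from a fresh R session.
- `session_info.log` exists and comes from a successful full run.
- Every crosswalk item that is not generated by public scripts has an explicit `No output file`, `No code`, or `Not applicable` entry.

When an agent prepares a package, its sequence should include these steps:

- Run `master.R` from a fresh R session.
- Confirm that `session_info.log` was created.

For a compact package: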
README.md
.gitignore
master.R
session_info.log
data/
documents/
scripts/
functions/
figures/
tables/
output/
logs/
For a large package:
README.md
.gitignore
master.R
session_info.log
build/
analyze/
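These two file lists can be turned into a quick presence check. The sketch below is an assumption-laden convenience, not part of the templates: `audit_package()` is a hypothetical helper, it treats every listed entry as required (including the optional `functions/` folder), and it also reports README-style files other than `README.md`.

```r
# Hypothetical helper: report listed entries that are missing and any
# README-style files other than README.md at the package root.
audit_package <- function(root = ".", structure = c("compact", "build-analyze")) {
  structure <- match.arg(structure)
  common <- c("README.md", ".gitignore", "master.R", "session_info.log")
  required <- if (structure == "compact") {
    c(common, "data", "documents", "scripts", "functions",
      "figures", "tables", "output", "logs")
  } else {
    c(common, "build", "analyze")
  }
  present <- file.exists(file.path(root, required))
  readmes <- list.files(root, pattern = "^README", ignore.case = TRUE)
  list(
    missing = required[!present],
    extra_readmes = setdiff(readmes, "README.md")
  )
}

# Example call from the package root:
# audit_package(".", structure = "compact")
```

Two empty vectors mean the top-level layout matches the list; content checks such as the crosswalk and logs still need their own review.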
The final archive should feel boring in the best way: obvious structure, one command to run, traceable outputs, and no surprises.