126 Reproducibility Failure Modes

This page lists every pattern ValiChord checks for when evaluating a research repository. The detectors span dependency management, path handling, data documentation, randomness, platform sensitivity, human subjects, and more.

CRITICAL: Blocks reproduction outright
SIGNIFICANT: Likely to cause failure
LOW CONFIDENCE: Worth reviewing

A (CRITICAL)
No README file found
B (SIGNIFICANT)
Unpinned or missing dependency file
C (SIGNIFICANT)
Absolute paths that only work on the researcher's machine
D (SIGNIFICANT)
No execution order or entry point documented
E (SIGNIFICANT)
Data files present but no data documentation or codebook
F (SIGNIFICANT)
Undocumented stochasticity — random seeds missing or unset
G (LOW CONFIDENCE)
README exists but missing critical sections (install, execution, outputs)
H (LOW CONFIDENCE)
Version numbers hardcoded in source code rather than requirements file
I (LOW CONFIDENCE)
Intermediate files committed but no script to regenerate them
J (SIGNIFICANT)
Notebooks with unclear or non-linear execution order
K (SIGNIFICANT)
Compute environment (OS, RAM, GPU) not documented
L (SIGNIFICANT)
Code references files that appear to be missing from the repository
M (SIGNIFICANT)
Multiple or conflicting Python versions referenced
N (SIGNIFICANT)
No licence file — reuse rights unclear
O (SIGNIFICANT)
No committed outputs to compare validator results against
P (SIGNIFICANT)
Pre-registration mentioned but no link or document provided
Q (LOW CONFIDENCE)
Configuration files missing or undocumented
R (LOW CONFIDENCE)
Statistical test assumptions not documented in README
S (LOW CONFIDENCE)
Key software packages used but not cited
T (LOW CONFIDENCE)
No tests present for analysis code
U (CRITICAL)
Undocumented environment variables or credentials referenced in code
V (SIGNIFICANT)
No virtual environment specification (conda env, venv, renv)
W (CRITICAL)
Git LFS pointer files present without LFS setup instructions
X (LOW CONFIDENCE)
No containerisation or environment isolation (Docker, Singularity)
Y (LOW CONFIDENCE)
Data files present but no source, provenance, or download instructions
Z (SIGNIFICANT)
No commit hash or version tag recorded in README
3D (SIGNIFICANT)
3D mesh files present but no viewer software documented
AA (SIGNIFICANT)
Figures committed but no figure-generation script present
AB (SIGNIFICANT)
Parallel code present with no determinism controls
AC (SIGNIFICANT)
Deprecated functions used — likely to break in current package versions
AD (SIGNIFICANT)
No .gitignore — sensitive or generated files may be committed
AE (SIGNIFICANT)
Multiple languages used without integration documentation
AF (LOW CONFIDENCE)
Output file format not documented
AG (CRITICAL)
API keys or tokens hardcoded in source files
AH (LOW CONFIDENCE)
No changelog or version history
AI (LOW CONFIDENCE)
Excessive print-debugging suggests unfinished or exploratory code
AJ (LOW CONFIDENCE)
Sample sizes or thresholds hardcoded in code without explanation
AK (LOW CONFIDENCE)
External URLs referenced that may become unavailable
AL (SIGNIFICANT)
Potential personal or sensitive data indicators in code or filenames
AM (SIGNIFICANT)
Complex multi-step pipeline with no automation script
AN (LOW CONFIDENCE)
Large blocks of commented-out code left in scripts
AO (SIGNIFICANT)
R-specific issues: missing sessionInfo(), no seed, no renv
AP (SIGNIFICANT)
Stata-specific issues: no version declaration or log file
AQ (LOW CONFIDENCE)
Large trained model files committed — distribution and versioning risk
AR (LOW CONFIDENCE)
File encoding issues — non-UTF-8 content may break cross-platform reads
AS (SIGNIFICANT)
Runtime network calls with no offline fallback or archived copy
AT (SIGNIFICANT)
Database dependency — connection string or DB schema not documented
AU (SIGNIFICANT)
Cloud storage dependency (S3, GCS, Azure Blob) with no access instructions
AV (LOW CONFIDENCE)
Hardcoded dates make code time-sensitive or year-dependent
AW (LOW CONFIDENCE)
No DOI or persistent identifier in documentation
AX (SIGNIFICANT)
Dockerfile has reproducibility issues (latest tag, no requirements file)
AY (LOW CONFIDENCE)
CI/CD workflow file present but no instructions for local reproduction
AZ (LOW CONFIDENCE)
Figures saved as raster formats — vector preferred for publication quality
BA (LOW CONFIDENCE)
Data files present but no checksums to verify download integrity
BB (SIGNIFICANT)
Shell scripts not marked executable — will fail without chmod
BC (LOW CONFIDENCE)
Mixed line endings across files (CRLF and LF)
BD (LOW CONFIDENCE)
No contact information or author email in README
BE (SIGNIFICANT)
Compiled Python bytecode (.pyc files) committed to repository
BF (SIGNIFICANT)
Jupyter notebook outputs missing — results not visible without running
BG (LOW CONFIDENCE)
No acknowledgements section — funding or data sources unclear
BH (LOW CONFIDENCE)
Nested archive files create zip-bomb risk for automated validators
BI (LOW CONFIDENCE)
Unicode characters in file paths — may break on some operating systems
BJ (CRITICAL)
Encrypted or high-entropy data files — content cannot be validated
BK (SIGNIFICANT)
System clock dependency — results vary by run date
BL (CRITICAL)
Code depends on full git history — fails with shallow clone
BM (LOW CONFIDENCE)
No CITATION.cff machine-readable citation file
BN (LOW CONFIDENCE)
Codebook column names don't match column names in data files
BP (LOW CONFIDENCE)
Licence only in README — no separate LICENSE file
BR (CRITICAL)
Credentials or secrets exposed in committed files
BS (LOW CONFIDENCE)
Archive or backup copies of scripts committed alongside originals
BT (LOW CONFIDENCE)
Spaces in file or directory names — causes shell quoting failures
BU (SIGNIFICANT)
conda environment.yml mixes channels without strict channel priority
BV (SIGNIFICANT)
Shell pipeline script has no error handling (set -e missing)
BW (SIGNIFICANT)
Code file is effectively empty — likely a stub or failed upload
BX (SIGNIFICANT)
Pluto notebook has empty manifest — package versions not locked
BY (SIGNIFICANT)
Julia repo has Project.toml but no Manifest.toml — versions unresolved
BZ (SIGNIFICANT)
MATLAB .mat file saved in v7.3 (HDF5) format — version compatibility risk
CA (SIGNIFICANT)
Script referenced in README does not exist in the repository
CB (SIGNIFICANT)
Snakemake workflow has no per-rule environment isolation
CC (SIGNIFICANT)
README mentions external tools on PATH with no version specified
CD (SIGNIFICANT)
Dockerfile has RUN pip install before COPY — build cache will fail
CE (SIGNIFICANT)
devtools::install_github() calls without commit or tag pin
CF (LOW CONFIDENCE)
Jupyter notebook has committed cell outputs — may contain sensitive data
CG (SIGNIFICANT)
requirements.txt contains git+ URLs or unpinned version constraints
CH (SIGNIFICANT)
R source() calls reference files not present in the repository
CI (SIGNIFICANT)
Code fetches live data at runtime with no local archived copy
CJ (SIGNIFICANT)
README references config files not present in the repository
CK (LOW CONFIDENCE)
Multiple README files with conflicting instructions
CL (SIGNIFICANT)
BiocManager::install() called without version= argument
CM (SIGNIFICANT)
Nextflow processes lack container or conda environment directives
CN (SIGNIFICANT)
requirements.txt contains known incompatible package version combinations
CO (SIGNIFICANT)
MATLAB code uses undocumented internal functions (likely to change)
CP (SIGNIFICANT)
Python 2 syntax in a Python 3 repository
CQ (SIGNIFICANT)
Julia script calls Pkg.add() at runtime with no Project.toml
CR (SIGNIFICANT)
Shell script has Windows CRLF line endings — will fail on Linux/macOS
CS (SIGNIFICANT)
Model binary loaded via pickle — version-sensitive and a security risk
CU (SIGNIFICANT)
conda environment.yml has unpinned or loosely-pinned packages
CV (SIGNIFICANT)
Code uses CUDA with no CPU fallback — fails without a GPU
CW (SIGNIFICANT)
R script uses reticulate to call Python — cross-language environment not documented
CX (SIGNIFICANT)
Python packages require system C/C++ libraries not in Dockerfile
CY (SIGNIFICANT)
README documents a checksum but no code verifies it at runtime
CZ (SIGNIFICANT)
Dockerfile uses an end-of-life Python base image
DA (SIGNIFICANT)
spaCy/NLTK models loaded but not downloaded in Dockerfile
DB (SIGNIFICANT)
Repository is a Shiny app — interactive verification steps not documented
DC (SIGNIFICANT)
Repo contains multiple independent sub-projects presented as a single pipeline
DD (LOW CONFIDENCE)
OS-specific shell commands contradict cross-platform claims
DE (SIGNIFICANT)
PyTorch seeds set but torch.use_deterministic_algorithms() absent
DF (SIGNIFICANT)
README references external data URL but no fetch script or checksum provided
DG (SIGNIFICANT)
Pipeline requires manual or GUI pre-processing steps not documented in README
DUP (SIGNIFICANT)
Data files with identical content — likely accidental duplicates
DZ (SIGNIFICANT)
Repository file tree is duplicated — likely a double-zipped archive
EP (SIGNIFICANT)
Data-only deposit with no extraction methodology documented
FD (SIGNIFICANT)
Duplicate-format file pairs — same data stored in multiple formats
FL (SIGNIFICANT)
Filenames with stems longer than 64 characters — Windows path limit risk
FW (SIGNIFICANT)
Figures embedded in Word documents rather than exported as image files
HS (SIGNIFICANT)
Data files may contain human subjects data — anonymisation not documented
IC (LOW CONFIDENCE)
Inconsistent file extension casing (e.g. .R and .r both present)
IC2 (LOW CONFIDENCE)
Inconsistent spacing conventions across filenames
ND (CRITICAL)
No data files found — cannot reproduce results without data
NN (LOW CONFIDENCE)
Non-sequential numbering gap in a file series
NX (SIGNIFICANT)
Files with no extension — type and purpose unclear
NZ (SIGNIFICANT)
Archive files nested inside the deposit — packaging anti-pattern
PAP (LOW CONFIDENCE)
Pre-registration document present but no link to public registry
SP (LOW CONFIDENCE)
Specialist or proprietary software required (SPSS, Stata, MATLAB, ArcGIS…)
TV (LOW CONFIDENCE)
Inconsistent zero-padding in numbered filename sequences
UE (SIGNIFICANT)
Unicode characters in filenames — may fail on Windows or certain tools
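
The entries above are one-line descriptions, so it may help to see what checks of this kind could look like in practice. The following is a minimal Python sketch of three of them: absolute paths that only work on the researcher's machine, unpinned or git+ entries in requirements.txt, and CRLF line endings in shell scripts. It is not ValiChord's implementation; the function names and regular expressions are hypothetical and deliberately rough, so expect both false positives and misses.

```python
import re

# Hypothetical, illustrative patterns; not taken from ValiChord itself.
# Quoted literals starting with /home/ or /Users/, or with a drive letter.
ABSOLUTE_PATH = re.compile(
    r"""["'](/(?:home|Users)/[^"']+|[A-Za-z]:[\\/][^"']+)["']"""
)
# A requirement counts as pinned only if it starts with "name==version".
PINNED = re.compile(r"^[A-Za-z0-9_.\-\[\]]+\s*==\s*[^=\s]+")


def find_absolute_paths(source: str) -> list[str]:
    """Return quoted literals that look like machine-specific absolute paths."""
    return ABSOLUTE_PATH.findall(source)


def find_unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that are git+ URLs or lack an exact '==' pin."""
    flagged = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comment lines, and pip options such as "-r other.txt".
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        if line.startswith("git+") or not PINNED.match(line):
            flagged.append(line)
    return flagged


def has_crlf(data: bytes) -> bool:
    """Return True if the file content contains Windows CRLF line endings."""
    return b"\r\n" in data
```

Each function takes file content rather than a path, which keeps the checks easy to test in isolation; a real scanner would also need to walk the repository, decide which files each check applies to, and attach a severity level to every finding.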