ValiChord — 126 Reproducibility Failure Modes

A [CRITICAL]

No README file found

B [SIGNIFICANT]

Unpinned or missing dependency file

C [SIGNIFICANT]

Absolute paths that only work on the researcher's machine
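
A check for this failure mode could be sketched as follows. This is an illustrative heuristic, not ValiChord's actual detector: the function name `find_absolute_paths` and the regular expression are assumptions, and the pattern only covers quoted strings that start with `/home/`, `/Users/`, or a Windows drive letter.

```python
import re

# Hypothetical pattern: quoted Unix home-directory paths or Windows drive paths.
ABSOLUTE_PATH = re.compile(
    r"""["'](/(?:home|Users)/[^"']+|[A-Za-z]:[\\/][^"']+)["']"""
)


def find_absolute_paths(source: str) -> list[str]:
    """Return quoted strings that look like machine-specific absolute paths."""
    return ABSOLUTE_PATH.findall(source)
```

Paths that are relative to the repository root, such as `results/fig.png`, are not flagged by this sketch.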

D [SIGNIFICANT]

No execution order or entry point documented

E [SIGNIFICANT]

Data files present but no data documentation or codebook

F [SIGNIFICANT]

Undocumented stochasticity — random seeds missing or unset
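
One way to avoid this in Python is to pass an explicit seed to a local random generator. The sketch below uses only the standard library; `simulate` and its parameters are hypothetical names chosen for illustration.

```python
import random


def simulate(n: int, seed: int) -> list[float]:
    """Draw n standard-normal samples from a generator seeded explicitly."""
    rng = random.Random(seed)  # local generator, so global state is untouched
    return [rng.gauss(0.0, 1.0) for _ in range(n)]
```

Recording the seed alongside the results (for example in the README or a config file) lets a validator rerun the analysis and obtain the same draws on the same Python version.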

G [LOW CONFIDENCE]

README exists but missing critical sections (install, execution, outputs)

H [LOW CONFIDENCE]

Version numbers hardcoded in source code rather than requirements file

I [LOW CONFIDENCE]

Intermediate files committed but no script to regenerate them

J [SIGNIFICANT]

Notebooks with unclear or non-linear execution order

K [SIGNIFICANT]

Compute environment (OS, RAM, GPU) not documented
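
Part of this information can be captured automatically. The sketch below records what the Python standard library exposes; the function name `describe_environment` and the choice of fields are assumptions. RAM and GPU details are not available from the standard library and would need to be recorded by other means.

```python
import os
import platform
import sys


def describe_environment() -> dict:
    """Collect basic facts about the machine and interpreter running the code."""
    return {
        "os": platform.platform(),          # OS name, release, and architecture
        "machine": platform.machine(),      # hardware identifier, e.g. x86_64
        "python": sys.version.split()[0],   # interpreter version string
        "cpu_count": os.cpu_count(),        # logical CPUs, or None if unknown
    }
```

Writing this dictionary to a file next to the outputs gives validators a record of where the results were produced.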

L [SIGNIFICANT]

Code references files that appear to be missing from the repository

M [SIGNIFICANT]

Multiple or conflicting Python versions referenced

N [SIGNIFICANT]

No licence file — reuse rights unclear

O [SIGNIFICANT]

No committed outputs to compare validator results against

P [SIGNIFICANT]

Pre-registration mentioned but no link or document provided

Q [LOW CONFIDENCE]

Configuration files missing or undocumented

R [LOW CONFIDENCE]

Statistical test assumptions not documented in README

S [LOW CONFIDENCE]

Key software packages used but not cited

T [LOW CONFIDENCE]

No tests present for analysis code

U [CRITICAL]

Undocumented environment variables or credentials referenced in code

V [SIGNIFICANT]

No virtual environment specification (conda env, venv, renv)

W [CRITICAL]

Git LFS pointer files present without LFS setup instructions
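
Git LFS pointer files are small text files whose first line is `version https://git-lfs.github.com/spec/v1`, so they can be recognised from their content. The sketch below is one possible check, not ValiChord's implementation; `is_lfs_pointer` is a hypothetical name.

```python
# First line of every Git LFS pointer file, per the Git LFS pointer format.
LFS_SIGNATURE = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(content: bytes) -> bool:
    """Return True if the file content starts with the Git LFS pointer header."""
    return content.startswith(LFS_SIGNATURE)
```

A file that passes this check holds only a reference to the real data, which has to be fetched separately before the analysis can run.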

X [LOW CONFIDENCE]

No containerisation or environment isolation (Docker, Singularity)

Y [LOW CONFIDENCE]

Data files present but no source, provenance, or download instructions

Z [SIGNIFICANT]

No commit hash or version tag recorded in README

3D [SIGNIFICANT]

3D mesh files present but no viewer software documented

AA [SIGNIFICANT]

Figures committed but no figure-generation script present

AB [SIGNIFICANT]

Parallel code present with no determinism controls

AC [SIGNIFICANT]

Deprecated functions used — likely to break in current package versions

AD [SIGNIFICANT]

No .gitignore — sensitive or generated files may be committed

AE [SIGNIFICANT]

Multiple languages used without integration documentation

AF [LOW CONFIDENCE]

Output file format not documented

AG [CRITICAL]

API keys or tokens hardcoded in source files

AH [LOW CONFIDENCE]

No changelog or version history

AI [LOW CONFIDENCE]

Excessive print-debugging suggests unfinished or exploratory code

AJ [LOW CONFIDENCE]

Sample sizes or thresholds hardcoded in code without explanation

AK [LOW CONFIDENCE]

External URLs referenced that may become unavailable

AL [SIGNIFICANT]

Potential personal or sensitive data indicators in code or filenames

AM [SIGNIFICANT]

Complex multi-step pipeline with no automation script

AN [LOW CONFIDENCE]

Large blocks of commented-out code left in scripts

AO [SIGNIFICANT]

R-specific issues: missing sessionInfo(), no seed, no renv

AP [SIGNIFICANT]

Stata-specific issues: no version declaration or log file

AQ [LOW CONFIDENCE]

Large trained model files committed — distribution and versioning risk

AR [LOW CONFIDENCE]

File encoding issues — non-UTF-8 content may break cross-platform reads
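
A simple way to test for this is to attempt a strict UTF-8 decode of the file's bytes. The helper name `is_utf8` below is an assumption made for illustration.

```python
def is_utf8(content: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        content.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

For example, the word "café" saved as Latin-1 contains the byte `0xE9`, which is not valid UTF-8 on its own, so the check fails for that file.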

AS [SIGNIFICANT]

Runtime network calls with no offline fallback or archived copy

AT [SIGNIFICANT]

Database dependency — connection string or DB schema not documented

AU [SIGNIFICANT]

Cloud storage dependency (S3, GCS, Azure Blob) with no access instructions

AV [LOW CONFIDENCE]

Hardcoded dates make code time-sensitive or year-dependent

AW [LOW CONFIDENCE]

No DOI or persistent identifier in documentation

AX [SIGNIFICANT]

Dockerfile has reproducibility issues (latest tag, no requirements file)

AY [LOW CONFIDENCE]

CI/CD workflow file present but no instructions for local reproduction

AZ [LOW CONFIDENCE]

Figures saved as raster formats — vector preferred for publication quality

BA [LOW CONFIDENCE]

Data files present but no checksums to verify download integrity
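
Publishing a checksum for each data file, and checking it before analysis, addresses this. The sketch below uses SHA-256 from the standard library; the function names `sha256_of` and `verify` are hypothetical, and the choice of SHA-256 is an assumption rather than a requirement stated here.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large data files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with the published value, ignoring hex case."""
    return sha256_of(path) == expected_hex.lower()
```

Calling `verify` at the start of a pipeline also turns a checksum that is merely documented into one that is enforced at runtime.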

BB [SIGNIFICANT]

Shell scripts not marked executable — will fail without chmod

BC [LOW CONFIDENCE]

Mixed line endings across files (CRLF and LF)

BD [LOW CONFIDENCE]

No contact information or author email in README

BE [SIGNIFICANT]

Compiled Python bytecode (.pyc files) committed to repository

BF [SIGNIFICANT]

Jupyter notebook outputs missing — results not visible without running

BG [LOW CONFIDENCE]

No acknowledgements section — funding or data sources unclear

BH [LOW CONFIDENCE]

Nested archive files create zip-bomb risk for automated validators

BI [LOW CONFIDENCE]

Unicode characters in file paths — may break on some operating systems

BJ [CRITICAL]

Encrypted or high-entropy data files — content cannot be validated
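
"High-entropy" can be made concrete with the Shannon entropy of the byte distribution, H = -Σ p_i log2(p_i), which ranges from 0 bits per byte (a single repeated byte) to 8 bits per byte (all 256 values equally likely). The sketch below computes it; the function name is an assumption, and any cut-off for flagging a file would be a separate, hypothetical choice.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )
```

Encrypted and compressed files both tend to score close to 8, so a high value alone does not distinguish between them.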

BK [SIGNIFICANT]

System clock dependency — results vary by run date
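
A common remedy is to pass the reference date in as a parameter rather than reading the clock inside the analysis. In the sketch below, `days_in_study`, `enrolled`, and `as_of` are hypothetical names used only to illustrate the pattern.

```python
from datetime import date
from typing import Optional


def days_in_study(enrolled: date, as_of: Optional[date] = None) -> int:
    """Days between enrolment and a reference date.

    The clock is consulted only when no reference date is supplied, so a
    published analysis can fix as_of and give the same answer on any run date.
    """
    reference = as_of if as_of is not None else date.today()
    return (reference - enrolled).days
```

With `as_of` recorded in the repository, a rerun years later reproduces the original figure instead of drifting with the calendar.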

BL [CRITICAL]

Code depends on full git history — fails with shallow clone

BM [LOW CONFIDENCE]

No CITATION.cff machine-readable citation file

BN [LOW CONFIDENCE]

Codebook column names don't match column names in data files

BP [LOW CONFIDENCE]

Licence only in README — no separate LICENSE file

BR [CRITICAL]

Credentials or secrets exposed in committed files

BS [LOW CONFIDENCE]

Archive or backup copies of scripts committed alongside originals

BT [LOW CONFIDENCE]

Spaces in file or directory names — causes shell quoting failures

BU [SIGNIFICANT]

conda environment.yml mixes channels without strict channel priority

BV [SIGNIFICANT]

Shell pipeline script has no error handling (set -e missing)

BW [SIGNIFICANT]

Code file is effectively empty — likely a stub or failed upload

BX [SIGNIFICANT]

Pluto notebook has empty manifest — package versions not locked

BY [SIGNIFICANT]

Julia repo has Project.toml but no Manifest.toml — versions unresolved

BZ [SIGNIFICANT]

MATLAB .mat file saved in v7.3 (HDF5) format — version compatibility risk

CA [SIGNIFICANT]

Script referenced in README does not exist in the repository

CB [SIGNIFICANT]

Snakemake workflow has no per-rule environment isolation

CC [SIGNIFICANT]

README mentions external tools on PATH with no version specified

CD [SIGNIFICANT]

Dockerfile has RUN pip install before COPY: files copied in later steps (for example a requirements file) are not yet available to pip

CE [SIGNIFICANT]

devtools::install_github() calls without commit or tag pin

CF [LOW CONFIDENCE]

Jupyter notebook has committed cell outputs — may contain sensitive data

CG [SIGNIFICANT]

requirements.txt contains git+ URLs or unpinned version constraints
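
A rough check for this could look like the sketch below. It is a simplified heuristic rather than a full parser of the pip requirements format: the name `unpinned_requirements` and the `PINNED` pattern are assumptions, and only exact `==` pins are treated as pinned.

```python
import re

# Hypothetical rule: a requirement counts as pinned only if it uses "name==version".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s]+")


def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that use git+ URLs or lack an exact == pin."""
    flagged = []
    for raw in text.splitlines():
        # Drop comments: a "#" at line start or preceded by whitespace.
        line = re.sub(r"(^|\s)#.*$", "", raw).strip()
        if not line:
            continue
        if "git+" in line:
            flagged.append(line)  # VCS URLs track a moving target unless pinned
            continue
        if line.startswith("-"):
            continue  # pip options such as -r or --index-url
        if not PINNED.match(line):
            flagged.append(line)
    return flagged
```

Lines such as `pandas>=1.0` or a bare `scipy` are flagged, while `numpy==1.26.4` passes.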

CH [SIGNIFICANT]

R source() calls reference files not present in the repository

CI [SIGNIFICANT]

Code fetches live data at runtime with no local archived copy

CJ [SIGNIFICANT]

README references config files not present in the repository

CK [LOW CONFIDENCE]

Multiple README files with conflicting instructions

CL [SIGNIFICANT]

BiocManager::install() called without version= argument

CM [SIGNIFICANT]

Nextflow processes lack container or conda environment directives

CN [SIGNIFICANT]

requirements.txt contains known incompatible package version combinations

CO [SIGNIFICANT]

MATLAB code uses undocumented internal functions (likely to change)

CP [SIGNIFICANT]

Python 2 syntax in a Python 3 repository

CQ [SIGNIFICANT]

Julia script calls Pkg.add() at runtime with no Project.toml

CR [SIGNIFICANT]

Shell script has Windows CRLF line endings — will fail on Linux/macOS
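
Detecting and normalising CRLF line endings takes only a few lines. The helper names `has_crlf` and `to_lf` below are hypothetical.

```python
def has_crlf(content: bytes) -> bool:
    """Return True if the file contains any Windows-style CRLF line ending."""
    return b"\r\n" in content


def to_lf(content: bytes) -> bytes:
    """Convert CRLF line endings to LF, leaving existing LF endings alone."""
    return content.replace(b"\r\n", b"\n")
```

Working on bytes rather than text avoids Python's own newline translation, which would otherwise hide the problem when the file is read in text mode.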

CS [SIGNIFICANT]

Model binary loaded via pickle — version-sensitive and a security risk

CU [SIGNIFICANT]

conda environment.yml has unpinned or loosely-pinned packages

CV [SIGNIFICANT]

Code uses CUDA with no CPU fallback — fails without a GPU

CW [SIGNIFICANT]

R script uses reticulate to call Python — cross-language environment not documented

CX [SIGNIFICANT]

Python packages require system C/C++ libraries not in Dockerfile

CY [SIGNIFICANT]

README documents a checksum but no code verifies it at runtime

CZ [SIGNIFICANT]

Dockerfile uses an end-of-life Python base image

DA [SIGNIFICANT]

spaCy/NLTK models loaded but not downloaded in Dockerfile

DB [SIGNIFICANT]

Repository is a Shiny app — interactive verification steps not documented

DC [SIGNIFICANT]

Repo contains multiple independent sub-projects presented as a single pipeline

DD [LOW CONFIDENCE]

OS-specific shell commands contradict cross-platform claims

DE [SIGNIFICANT]

PyTorch seeds set but torch.use_deterministic_algorithms() absent

DF [SIGNIFICANT]

README references external data URL but no fetch script or checksum provided

DG [SIGNIFICANT]

Pipeline requires manual or GUI pre-processing steps not documented in README

DUP [SIGNIFICANT]

Data files with identical content — likely accidental duplicates
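
Identical content can be found by grouping files on a content hash. The sketch below takes a mapping from filename to bytes so that it stays self-contained; `duplicate_groups` is a hypothetical name, and a real check would read the files from disk.

```python
import hashlib
from collections import defaultdict


def duplicate_groups(files: dict[str, bytes]) -> list[list[str]]:
    """Group filenames whose contents have the same SHA-256 digest."""
    by_digest = defaultdict(list)
    for name, content in files.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(name)
    # Keep only digests shared by two or more files.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]
```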

DZ [SIGNIFICANT]

Repository file tree is duplicated — likely a double-zipped archive

EP [SIGNIFICANT]

Data-only deposit with no extraction methodology documented

FD [SIGNIFICANT]

Duplicate-format file pairs — same data stored in multiple formats

FL [SIGNIFICANT]

Filenames with stems longer than 64 characters — Windows path limit risk

FW [SIGNIFICANT]

Figures embedded in Word documents rather than exported as image files

HS [SIGNIFICANT]

Data files may contain human subjects data — anonymisation not documented

IC [LOW CONFIDENCE]

Inconsistent file extension casing (e.g. .R and .r both present)
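
This can be checked by grouping extensions case-insensitively and reporting any that appear in more than one spelling. The function name `mixed_case_extensions` below is an assumption.

```python
from collections import defaultdict
from pathlib import PurePath


def mixed_case_extensions(filenames: list[str]) -> dict[str, list[str]]:
    """Map each lowercased extension to its spellings when more than one occurs."""
    variants = defaultdict(set)
    for name in filenames:
        suffix = PurePath(name).suffix
        if suffix:
            variants[suffix.lower()].add(suffix)
    return {ext: sorted(seen) for ext, seen in variants.items() if len(seen) > 1}
```

The distinction matters on case-sensitive file systems, where a glob for `*.r` does not match `analysis.R`.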

IC2 [LOW CONFIDENCE]

Inconsistent spacing conventions across filenames

ND [CRITICAL]

No data files found — cannot reproduce results without data

NN [LOW CONFIDENCE]

Non-sequential numbering gap in a file series
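
A gap can be found by extracting the trailing number from each file stem and listing the integers missing between the smallest and largest. The sketch below assumes the series number is the last run of digits before the extension; `missing_numbers` is a hypothetical name.

```python
import re


def missing_numbers(filenames: list[str]) -> list[int]:
    """Return the integers absent from a numbered file series."""
    present = set()
    for name in filenames:
        stem = name.rsplit(".", 1)[0]
        match = re.search(r"(\d+)$", stem)  # trailing digits of the stem
        if match:
            present.add(int(match.group(1)))
    if not present:
        return []
    return [n for n in range(min(present), max(present) + 1) if n not in present]
```

A gap is not always an error (a subject may have been excluded), which is consistent with the low-confidence rating of this item.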

NX [SIGNIFICANT]

Files with no extension — type and purpose unclear

NZ [SIGNIFICANT]

Archive files nested inside the deposit — packaging anti-pattern

PAP [LOW CONFIDENCE]

Pre-registration document present but no link to public registry

SP [LOW CONFIDENCE]

Specialist or proprietary software required (SPSS, Stata, MATLAB, ArcGIS…)

TV [LOW CONFIDENCE]

Inconsistent zero-padding in numbered filename sequences
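
One possible rule for this check: a series is inconsistent if at least one number carries a leading zero and the numbers do not all have the same width. That rule and the name `has_inconsistent_padding` are assumptions made for this sketch, which reads the trailing digits of each file stem.

```python
import re


def has_inconsistent_padding(filenames: list[str]) -> bool:
    """Return True if some numbers are zero-padded but widths differ."""
    numbers = []
    for name in filenames:
        stem = name.rsplit(".", 1)[0]
        match = re.search(r"(\d+)$", stem)  # trailing digits of the stem
        if match:
            numbers.append(match.group(1))
    widths = {len(number) for number in numbers}
    padded = any(len(n) > 1 and n.startswith("0") for n in numbers)
    return padded and len(widths) > 1
```

Under this rule `run_01, run_2, run_03` is flagged, while a fully unpadded series such as `run_9, run_10` is not.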

UE [SIGNIFICANT]

Unicode characters in filenames — may fail on Windows or certain tools