docs: incorporate some AI guidance for contributors

rtyler · rtyler · commit 5c91cbfb6a3f · 2026-01-27T06:03:41.000-08:00
This is what @thisisnic suggested via apache/arrow#48952 and what has been discussed on the Apache Arrow developers mailing list. I think this is good guidance to start with for AI contributions. Signed-off-by: R. Tyler Croy <rtyler@brokenco.de>
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -8,6 +8,44 @@ If you want to start contributing, first look at our good first issues: https://
 
 If you want to contribute something more substantial, see our "Projects seeking contributors" section on our roadmap: https://github.com/delta-io/delta-rs/issues/1128
 
+## AI-generated code
+
+We recognise that AI coding assistants are now a regular part of many
+developers' workflows and can improve productivity. Thoughtful use of these
+tools can be beneficial, but AI-generated PRs can sometimes lead to
+undesirable additional maintainer burden. PRs that appear to be fully
+generated by AI with little to no engagement from the author may be closed
+without further review.
+
+Human-generated mistakes tend to be easier to spot and reason about, and
+code review is intended to be a collaborative learning experience that
+benefits both submitter and reviewer. When a PR appears to have been
+generated without much engagement from the submitter, reviewers with access
+to AI tools could more efficiently generate the code directly, and since
+the submitter is not likely to learn from the review process, their time is
+more productively spent researching and reporting on the issue.
+
+We are not opposed to the use of AI tools in generating PRs, but recommend
+the following:
+
+* Only submit a PR if you are able to debug and own the changes yourself -
+  review all generated code to understand every detail. [Apache DataFusion has a useful explanation of **why fully AI-generated PRs without understanding are not helpful**](https://datafusion.apache.org/contributor-guide/index.html#why-fully-ai-generated-prs-without-understanding-are-not-helpful).
+* Match the style and conventions used in the rest of the codebase, including
+  PR titles and descriptions
+* Be upfront about AI usage and summarise what was AI-generated
+* If there are parts you don't fully understand, leave comments on your own PR
+  explaining what steps you took to verify correctness
+* Watch for AI's tendency to generate overly verbose comments, unnecessary
+  test cases, and incorrect fixes
+* Break down large PRs into smaller ones to make review easier
+
+PR authors are also responsible for disclosing any copyrighted materials in
+submitted contributions. See the [Apache Software
+Foundation's](https://apache.org) [guidance on AI-generated
+code](https://www.apache.org/legal/generative-tooling.html) for further
+information on licensing considerations.
+
+
 ## Claiming an issue
 
 If you want to claim an issue to work on, you can write the word `take` as a comment in it and you will be automatically assigned.