Add AI policy #13
| 1. The contributor must specify which parts of the code have been generated by AI. | ||
| 1. The contributor must declare that they have reviewed all code generated by AI and that this code passes the test suite locally. | ||
| 1. AI can be used to generate code but all other parts of the contributing process (including the process of opening a PR and communicating with the maintainers) must be done by a human. | ||
| 1. All user-facing parts of a contribution (e.g., documenting a new function argument, adding a vignette, editing the changelog) must be done by a human. |
This is an interesting point. Documentation and the changelog are probably the most tedious and easiest to replace with AI. I don't recall what our reasoning was in the meeting, but I would personally push back on this.
I don't have more details from the meeting either, so we could soften this. I personally prefer docs written by hand: when I have to explain a function to someone, I notice the things I've missed and the potential weaknesses of its implementation. Giving this task to AI gets rid of that moment of reflection. And as a user, when I can clearly see that documentation was written by AI, I feel like the developer doesn't care.
That said, not all docs require the same thought process, so some could be delegated to AI (still reviewed by us anyway). If we follow the Diátaxis framework (which I think was mentioned in the original WP notes), then the reference pages are supposed to be quite dry and straight to the point, so these might be written by AI and reviewed by us.
Happy to discuss this more; this is just a draft, of course.
I think there is a big difference here between generating these things and using them as-is versus using them as a rough starting draft and extensively proofing and modifying them by hand. I don't think we should necessarily ban AI use here, but I also see that if we permit it, people could get lazy and just submit pure AI content that they have not carefully considered.
| 1. All user-facing parts of a contribution (e.g., documenting a new function argument, adding a vignette, editing the changelog) must be done by a human. | ||
| 1. If you wish to include context from an interaction with AI in your comments, it must be in a quote block (e.g., using >) and disclosed as such. It must be accompanied by human commentary explaining the relevance and implications of the context. Do not share long snippets. | ||
| Not following these rules may lead to a PR being closed. |
I feel like there should be a harsher punishment for repeated offences.
Something like two offences leading to a ban from the GitHub repo? Apparently you can specify the ban duration: https://docs.github.com/en/communities/maintaining-your-safety-on-github/blocking-a-user-from-your-organization
I agree that repeat offenders should be banned - or at least that we should give ourselves the ability to ban repeat offenders if necessary.
|
Just plopping this here for reference: https://github.com/melissawm/open-source-ai-contribution-policies |
I like Zulip's one a lot, especially the section "Using AI for communication": https://github.com/zulip/zulip/blob/main/CONTRIBUTING.md#ai-use-policy-and-guidelines Their rule of thumb for PR description also works nicely with AI-generated docs:
The coding part of their policy has some requirements that are a bit too much IMO, such as splitting commits. |
bethany-j-allen left a comment
Thanks for drafting @etiennebacher! Here are some initial thoughts, happy to discuss in more detail.
| @@ -0,0 +1,16 @@ | |||
| ### Github | |||
I think this should open with a brief paragraph about the ethical basis for needing an AI policy, e.g.
- We are an organisation which aims to uphold high ethical standards (also see Code of Conduct)
- We acknowledge the potential usefulness and growing prevalence of AI, but wish to see it used only where reasonable and appropriate
- Using AI is not a substitute for human thought, and anyone submitting AI-generated content must carefully check and take responsibility for this content
I personally also dislike the environmental repercussions of AI use but understand if not everyone agrees and/or we feel this does not need to be mentioned here.
We also discussed at length in the meeting how pervasive AI use could be a barrier to learning, and that helping people to learn is one of our major goals. Not sure whether this could also be worth mentioning here.
| ### Github | ||
| We accept contributions generated by AI as long as they comply with the following criteria: |
By AI I guess we actually mean LLMs here? Perhaps we should be specific.
This is vague and perhaps doesn't belong in the policy, but I wonder whether we should also ask which specific AI tools were used. Perhaps we could ask for this in the PR description.
| We accept contributions generated by AI as long as they comply with the following criteria: | ||
| 1. The contributor must specify which parts of the code have been generated by AI. |
"parts" could be interpreted ambiguously here. Is this intended to mean e.g. code blocks?
Do we also want the wording to ensure that any AI use is reported, e.g. "has been generated with the help of AI"?
Is there a specific point in the pipeline where this must happen? E.g. "must specify at the point of submission".
| We accept contributions generated by AI as long as they comply with the following criteria: | ||
| 1. The contributor must specify which parts of the code have been generated by AI. | ||
| 1. The contributor must declare that they have reviewed all code generated by AI and that this code passes the test suite locally. |
Passing tests is good, but perhaps we could rephrase this to specify that generated code should also successfully serve its described/intended purpose; tests do not necessarily determine this.
| 1. The contributor must specify which parts of the code have been generated by AI. | ||
| 1. The contributor must declare that they have reviewed all code generated by AI and that this code passes the test suite locally. | ||
| 1. AI can be used to generate code but all other parts of the contributing process (including the process of opening a PR and communicating with the maintainers) must be done by a human. |
I think we discussed this in the meeting, but this suggests that we are banning people from using translation tools when communicating with us, which only adds a barrier to contributing. Did we have a clear justification for this? If we do want to maintain this ban, then I think we should explain in more detail here why we insist on people not using translation tools when communicating with us.
Part of #11