GitHub will use Copilot interaction data, including inputs, outputs, and code snippets, to train its AI models starting April 24 unless users opt out.
GitHub has announced it will begin using Copilot interaction data to train its AI models starting April 24, 2026, unless users take action to opt out. The change affects how the company handles the vast amounts of code and conversations generated through its popular AI coding assistant.
The data collection will cover the inputs developers type into Copilot, the AI-generated outputs it produces, and any code snippets contained in either. This represents a significant expansion of GitHub's data usage practices, as the company has historically been more restrictive about how it handles user-generated code and interactions.
GitHub says the training data will help improve the performance and capabilities of its AI models, though it hasn't specified which models will benefit or how the data will be used beyond general training purposes. The company is giving users a one-month notice period before the changes take effect.
To opt out, users must navigate to their GitHub account settings and disable the data sharing option before the April 24 deadline. GitHub has not indicated whether there will be any functional differences for users who opt out, such as reduced Copilot capabilities or feature limitations.
The announcement comes amid growing scrutiny of how AI companies collect and use training data. GitHub's decision to make this change opt-out rather than opt-in has drawn criticism from some developers who argue that code and development work should remain private by default.
GitHub Copilot, launched in 2021, has become one of the most widely adopted AI coding tools, with millions of developers using it to generate code suggestions, complete functions, and assist with software development tasks. The service was originally built on OpenAI's Codex model and has since expanded to support multiple programming languages and development environments.
This move by GitHub reflects a broader trend in the AI industry where companies are increasingly looking to user interaction data as a valuable resource for improving their models. Similar practices have been adopted by other AI companies, though GitHub's position as a platform for developers' code makes this particular case more sensitive.
For developers concerned about privacy or intellectual property, the April 24 deadline provides a clear timeframe to review their Copilot usage and decide whether to continue using the service under the new terms. Those who do not take action will have their interaction data included in GitHub's AI training datasets by default.
The change highlights the ongoing tension between the rapid advancement of AI capabilities and user privacy expectations, particularly in the software development community where code often contains proprietary information or represents significant intellectual property.
