Apps developed by engineers who rely on code-generating AI systems are more prone to security vulnerabilities, a new study suggests.
Code-generating AI systems, which use machine learning algorithms to suggest lines of code and entire functions based on the context of existing code, have become increasingly popular among software engineers.
However, a recent study by researchers affiliated with Stanford University found that such AI-suggested code, when used in developing apps, may introduce security vulnerabilities. The study focused on Codex, OpenAI's AI code-generation system.
The research team gave 47 developers with varying levels of programming expertise access to Codex and asked them to solve security-related problems in several programming languages.
The researchers found that, compared with participants in a control group, those with access to Codex were more likely to write code that was insecure from a cybersecurity standpoint. The participants with access to Codex were also more likely to believe their insecure answers were secure.
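To illustrate the kind of gap the study describes, here is a hypothetical sketch — not taken from the study's actual tasks — contrasting an insecure password-hashing approach that a code assistant might plausibly suggest with a more secure alternative, using only Python's standard library:

```python
import hashlib
import hmac
import os

def hash_password_insecure(password: str) -> str:
    # Insecure: unsalted MD5 is extremely fast to brute-force
    # and vulnerable to precomputed rainbow-table attacks.
    return hashlib.md5(password.encode()).hexdigest()

def hash_password_secure(password: str) -> tuple[bytes, bytes]:
    # More secure: a random per-password salt plus a deliberately
    # slow key-derivation function (PBKDF2 with many iterations).
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(candidate, digest)
```

Both versions "work" in the sense of producing a hash, which is exactly why an inexperienced developer might accept the first suggestion without noticing the difference.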
Code-Generating Systems Aren’t Entirely Useless
Megha Shrivastava, the study's co-author and a Stanford postgraduate student, emphasized that the findings don't constitute a full indictment of Codex or other code-generation programs.
For one thing, the participants lacked the security expertise that might have helped them spot more subtle vulnerabilities. Shrivastava believes code-generation systems are well suited to low-risk tasks, and that their suggestions can be improved through fine-tuning.
Shrivastava added that companies that train their own in-house code generators may fare better. Such a system, she says, can be trained extensively on internal code and is better able to produce output that aligns with the company's security and coding standards.
Neil Perry, the study's lead co-author and a Stanford Ph.D. candidate, says that code-generating systems aren't yet developed enough to replace human developers.
He also cautions that developers who use code-generating systems for tasks outside their area of expertise should be especially careful, and encourages even expert developers to double-check both the context and the output of a code generator when using one to speed up work on a project.
Perry also acknowledges that AI code-generation assistants are exciting tools, and many developers are eager to adopt them. But he cautions that these tools raise questions about how reliably they can be used. Experts also note that potential security vulnerabilities aren't the only flaw of code-generating AI systems.
A large portion of Codex's training code is under restrictive licenses. Users have prompted GitHub Copilot, which is built on Codex, to reproduce code from the game Quake, snippets from personal codebases, and sample code from books. As a result, many legal experts say Copilot could put developers and companies at risk of copyright infringement.
GitHub, however, tries to mitigate this risk with mechanisms that refine user prompts to make them more secure, much like a supervisor reviewing and editing rough drafts of code.
Cryptography library developers may also need to ensure that their default settings are secure, because code-generation systems tend to rely on those default values, which aren't always free of vulnerabilities.
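As a hypothetical illustration of this defaults problem (not an example drawn from the study), Python's convenient `random` module is not cryptographically secure, while the `secrets` module is designed for security-sensitive values — yet a code generator echoing common patterns might reach for the former:

```python
import random
import secrets

def make_token_insecure(nbytes: int = 16) -> str:
    # random uses a deterministic PRNG (Mersenne Twister): its output
    # can be predicted after observing enough values, so it must not
    # be used for session tokens, password resets, or keys.
    return "".join(f"{random.randrange(256):02x}" for _ in range(nbytes))

def make_token_secure(nbytes: int = 16) -> str:
    # secrets draws from the operating system's CSPRNG and is
    # intended for exactly this kind of security-sensitive use.
    return secrets.token_hex(nbytes)
```

The two functions return identically shaped hex strings, so nothing about the output warns the developer that one default choice is unsafe.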
As code-generating AI systems become more widely available and relied upon, it will be important for companies and developers to weigh their risks and benefits carefully.