Models trained on larger pools of data outside of permissively licensed open source code can provide superior performance, but enterprises using them run the risk of running afoul of IP and copyright violations, Tabnine president Peter Guagenti said. The Code Provenance and Attribution capability addresses this tradeoff and increases productivity while not sacrificing compliance, according to Guagenti. And, with copyright law for using AI-generated content still unsettled, Tabnine’s proactive stance aims to reduce the risk of IP infringement when enterprises use models such as Anthropic’s Claude, OpenAI’s GPT-4o, and Cohere’s Command R+ for software development.
The Code Provenance and Attribution capability supports software development activities including code generation, code fixing, generating test cases, and implementing Jira issues. Future plans include allowing users to identify specific repositories, such as those maintained by competitors, for generated code checks. Tabnine also plans to add a censorship capability to allow administrators to remove matching code before it is displayed to the developer.