AI Copyright Cases Spotlight Key Discovery Practice Issues

By Phil Favro, Contributing Author for HaystackID

Federal courts have frequently emphasized the importance of reasonable limits in the discovery process. From the Chief Justice of the United States to district and magistrate judges across the country, the judiciary has embraced the need to keep discovery within manageable bounds. This generally means that discovery—while ostensibly broad—does not exceed the constraints of reasonableness under the Federal Rules of Civil Procedure (FRCP). In today’s litigation environment, where even small cases can involve large volumes of ESI, courts are increasingly prepared to circumscribe discovery requests to more readily ensure that they do not result in undue burdens or unreasonable delays. As the Honorable Jeffrey Cole frequently observed during his approximately 20 years on the bench: “All good things, including discovery, must come to an end.”

This maxim is particularly appropriate given the direction from recent case law involving copyright infringement claims over the use of artificial intelligence (AI) platforms. In an order from the In re Google Generative AI Copyright Litigation matter, along with a decision from Onan v. Databricks, Inc., courts have restricted parties from taking additional depositions or seeking document discovery near or after the close of discovery. Google Generative AI and Onan represent a broader trend demonstrating that courts will allow parties to seek discoverable information as long as it is done within the bounds of the litigation schedule. These cases additionally spotlight key discovery practice issues, such as the need for early custodian identification, expedited data analysis using the latest discovery search technologies, and proactive follow-up with litigation adversaries to obtain requested discovery.

Google Generative AI: Untimely, Cumulative, and Disproportionate Discovery

In Google Generative AI, which involves claims that Google used copyrighted works without authorization to train its large language models, the court addressed several discovery requests over a period of months. In a February 2026 decision, Magistrate Judge Susan van Keulen addressed three discovery requests the plaintiffs lodged in the final days before the discovery cutoff. In denying these requests, the court spotlighted a “fatal flaw” underlying the plaintiffs’ requests for relief: seeking a considerable expansion of discovery without substantiating the basis for those requests.

Additional Depositions

The court first addressed the plaintiffs’ request for six additional depositions. Among the individuals the plaintiffs sought to depose were Google CEO Sundar Pichai, along with other company executives. In denying the plaintiffs’ request, the court found that the plaintiffs had not made the requisite showing of “unique first-hand knowledge” required for executive (or apex) witnesses like Pichai. Nor had the plaintiffs demonstrated that testimony from any of the proposed deponents would be unique, rather than cumulative or duplicative of information already provided by other custodians and witnesses. Judge van Keulen was particularly critical of the timing, reasoning that a 50% increase in depositions on the eve of the discovery cutoff would create scheduling issues. Moreover, the court observed that the plaintiffs “raised the need for additional depositions for months, without seeking leave from the court,” negating any argument that the need was newly discovered.

Additional Document Custodians

The plaintiffs also sought to add six new document custodians just 18 days before the close of fact discovery, a request that would have increased the custodian count by approximately 25%. The court denied this request as well, finding that it was neither proportional under the circumstances nor feasible under the current discovery schedule. Furthermore, Judge van Keulen reasoned that the requested discovery was not timely, particularly since the plaintiffs had known about the custodians at issue for months and failed to seek relief from the court until the close of discovery.

Non-Custodial Discovery

Similarly, Judge van Keulen rejected the plaintiffs’ request for a “massive search for non-custodial documents.” The request contemplated Google searching “six specific document repository systems, as well as ‘any other shared data repository system’” for information responsive to certain of the plaintiffs’ document requests. As an initial matter, the court indicated that the plaintiffs neglected to satisfy a key procedural issue, i.e., showing that Google had somehow failed to provide responsive information for the document requests at issue. More significantly, though, Judge van Keulen reasoned that the plaintiffs had not supported their request for such a “massive undertaking,” particularly at the close of discovery: “the size of the request at this late hour is simply neither proportional to the needs of the litigation nor feasible on the current case schedule.”

Onan: Balancing Cutoffs with Proportionality

The court in Onan similarly denied a request for document discovery where the parties seeking discovery had failed to demonstrate that it was justified under the circumstances. In Onan, which (like Google Generative AI) involves infringement claims arising from defendants’ alleged use of copyrighted works to train their large language models, plaintiffs sought to add a new document custodian after the close of discovery, arguing that the custodian had “unique” information about their claims. While acknowledging the custodian might have “unique documents” relevant to the claims in the matter, the court found the requested discovery could very well have been duplicative of other information that the defendants already produced to the plaintiffs. Moreover, the court found the request was untimely, particularly since the plaintiffs had delayed seeking the custodian’s documents for months. While rejecting that discovery, the court nonetheless permitted limited discovery compelling the production of third-party licensing agreements, along with requiring defendants to respond to nine additional interrogatories.

Prioritize Data and Custodians, Expedite Data Analysis, and Follow Up Proactively

Google Generative AI and Onan provide a clear roadmap for practitioners navigating complex ESI disputes. Requesting parties should prioritize the identification of data sources, including custodians and non-custodial repositories, at the very outset of the case. Obtaining this information early allows for a more defensible discovery plan and militates against what Google Generative AI termed the “fatal flaw” of seeking relief on the eve of discovery cutoff. Both Google Generative AI and Onan teach that courts will safeguard the discovery schedule against late-breaking custodial requests if the information justifying that request was available months beforehand.

Another practical takeaway from Google Generative AI and Onan is the need for counsel and clients to quickly review information once it’s produced in discovery. Indeed, counsel should promptly analyze produced information to determine what further discovery, whether custodial or non-custodial, may be required. As Google Generative AI and Onan make clear, courts are increasingly wary of litigants who delay digesting information only to seek last-minute extensions. To bridge the gap between production and analysis, legal teams should leverage effective, enabling technologies to expedite their review of produced information. Traditional methods like early case assessment, data clustering, and technology-assisted review (TAR) remain vital for managing high volumes of ESI. Furthermore, teams may consider human-assisted review (HAR), where AI-powered analysis uses natural language prompts to identify key information, perhaps more efficiently than search terms or other traditional search techniques.

A third practical lesson from Google Generative AI and Onan is the importance of proactive follow-up. In both cases, the courts rejected discovery where the requesting parties neglected to follow up for months on known discovery issues. Diligent follow-up with litigation adversaries and, where needed, timely motions for relief are the steps litigants can typically take to ensure that discovery is obtained before the court brings the process to its inevitable end. Following this approach, while supporting positions with hard evidence substantiating why the requested information should be produced, can help litigants ultimately obtain the discovery they seek.

Practitioners who follow these proactive methods will likely find more success than failure in obtaining sought-after information in discovery. In contrast, counsel who disregard these practices and unreasonably wait until the close of discovery to seek oral or written discovery do so at their peril. Google Generative AI and Onan demonstrate that courts increasingly view diligence and timeliness as prerequisites for judicial relief.

About Phil Favro

Phil Favro is the founder of Favro Law PLLC, where he counsels clients on ESI, AI, and discovery issues and serves as a special master, mediator, and expert witness. Phil is nationally recognized for his expertise on ESI, discovery, and information governance, with courts acknowledging his credentials. See, e.g., Oakley v. MSG Networks, Inc., No. 17-CV-6903 (RJS), 2025 WL 2061665 (S.D.N.Y. July 23, 2025). This background makes Phil particularly well-suited to counsel clients and advise courts on information-related issues. As a special master, Phil is acclaimed for his collaborative approach, working with parties to find stipulated solutions to complex issues. For disputes that require adjudication, he is renowned for the clarity and vigor of his written dispositions, which are available on legal search engines.

About HaystackID®

HaystackID® solves complex data challenges related to legal, compliance, regulatory, and cyber requirements. Core offerings include Global Advisory, Cybersecurity, Core Intelligence AI™, and ReviewRight® Global Managed Review, supported by its unified CoreFlex™ service interface. Recognized globally by industry leaders, including Chambers, Gartner, IDC, and Legaltech News, HaystackID helps corporations and legal practices manage data gravity, where information demands action, and workflow gravity, where critical requirements demand coordinated expertise, delivering innovative solutions with a continual focus on security, privacy, and integrity. Learn more at HaystackID.com.

Assisted by GAI and LLM technologies.

SOURCE: HaystackID
