There is another question regarding the use of Jane Friedman's name.
Is the author in question really named Jane Friedman? I'm not sure whether a trademark can force someone to stop using their own name, but if they're using some other name, particularly that of a famous author, that seems much more problematic. (I'm not a lawyer, either.) The other Stephen King didn't get in legal trouble for using his own name to market horror novels, but eventually, Amazon forced him to add a middle initial to avoid confusion, something it should have done much faster. I'm not sure whether that was because the well-known Stephen King had trademarked his name.
With regard to AL Hawke's comment, we're in somewhat new territory on copyright. There's no question Friedman's books are copyrighted. That means that AI developers have to be able to claim fair use in order to get away with using her material, and that of other authors, in training. If, as Friedman asserts in the article, developers obtained copies of her work by scraping pirate sites, then it's going to be very hard for them to claim fair use. As I pointed out in another thread, to be on 100% solid ground, they'd need to have bought copies of all the books they used. Likely, they'd also have had to access the content in some way that didn't violate the DMCA.
All we know about the database process is that the material comes from the "publicly accessible internet." Pirate sites that actually have books could be publicly accessible, at least in theory. But that wouldn't make them legitimate data sources. Clearly, authors are not posting their own whole novels online.
Could an author prove that AI developers used pirate copies, or even that they used the author's particular work? Maybe. If an author could prompt the AI to spit out part of one of their works (which has happened at least once), then yes, at least as far as proving the book was used. Developers could be called to testify under oath about sources. The databases could even be subpoenaed.
The companies would doubtless try to block such a move on the basis that the information is proprietary. But how can a company claim that data it scraped off the "publicly accessible internet" is proprietary information?
Friedman may not be suing over copyright, but some writers are. It will be interesting to see how that plays out.