"Artificial intelligence is a tool for stealing intellectual property from the person who created it."
Tom Sancton, author
I refer readers back to the post by Tom Sancton a final time.
We all use intellectual property created by others. And we didn’t pay for it.
Sancton is the victim of a blatant rip-off of his book The Bettencourt Affair. His Guest Post of January 2 reported that Amazon has for sale a 50-page "Summary of Tom Sancton's The Bettencourt Affair" alongside Sancton's own book.
Buy the real 419-page book for $14.19 or the summary for $3.99. The summary's publisher used artificial intelligence programs to "read" Sancton's book and condense it, taking care not to lift any passage word-for-word. With AI, it could be done quickly and cheaply. We see the risk to content creators. It commodifies their work. It destroys the business model that allows writers and other content creators to make a living.
It is theft. Clever theft, and therefore legal, for now at least.
AI is a great disrupter because in many situations it is useful, free, easy, and adequate. Most people will not feel bad about buying the Summary if they see the two books side by side at Amazon, because buying that "CliffsNotes" version won't feel like receiving stolen property. A book shopper has no way to know what arrangements the Summary's publisher made with Sancton, the source. (Answer: none. The book was simply uploaded, converted by AI, and sold as what it is, a competing product built from Sancton's material.)
I expect quick adoption of AI programs on home computers. People will install and use AI tools because in so many settings they seem benign. A person who robs a bank knows they are stealing. Shoplifters, too. But a person who uses information scraped out of the internet by a search-engine crawler or by an AI tool likely won't get that triggering feeling of guilt. They are using the information, not stealing it, not exactly. They will be receiving stolen property, or at least property converted and used without permission -- and not realize it.
I use Wikipedia from time to time for casual information. When did Sgt. Pepper get released? (1967.) Did William Howard Taft carry Oregon in 1908? (Yes.) Using Wiki is quick, easy, and free. I send Wikipedia $50 a couple of times a year in gratitude. What I am not doing, though, is rewarding the original research that became the basis for the Wiki article. That information got read, synthesized, and summarized by the people (not AI, not yet) who wrote the article, and they did not compensate the people whose information they used. There was no mechanism for that.
There needs to be one. If reliable information is free or can be scraped up and re-used effortlessly, then we will stop getting reliable information. We will get propaganda, infomercials, and advertising instead. Gresham's Law of money has an analogue in news and information: bad money drives out good. Bad, sloppy information and misinformation will circulate because there will be a business model for it; people have something to sell you, and it can be produced very inexpensively. Well-curated information will struggle to exist. It is expensive, and it won't compete well against a stolen version of itself.
I recognize that people hate toll roads. I hate them. I don't like paywalls on websites, either. But we are at a brief moment when Artificial Intelligence tools are still new enough that perhaps Congress can overcome resistance from tech companies, aggregators, and businesses built around information theft. I suspect that a sustainable future will involve a system of fees for use of material and it will be bundled with payments from AI companies and search engines. I expect it to be cumbersome and a nuisance. I expect to end up paying for some things I now get for free. I suspect there will be public resistance. People don't like to pay for things they got accustomed to getting for free.
I welcome a micropayment system. There will either be a marketplace for information, or there will be mass piracy. We are better off with a marketplace.
The New York Times is suing OpenAI for copyright infringement. I am sure the NYT is well lawyered. The Times has many examples where AI has lifted large sections of the NYT’s copyrighted materials, far more than the fair use permitted by copyright laws. This case should set some boundaries going forward.
Yahoo Finance describes the New York Times copyright lawsuit as follows:
“OpenAI's copyright conundrum
Today's Takeaway is by Hamza Shaban, Senior Reporter.
It's not stealing if it's innovating.
That's one prickly way of describing the position of AI companies that rely on the internet's copyrighted works to inspire their models.
This week, OpenAI, the company behind the culture-shifting AI chatbot ChatGPT, elaborated on its public case for rethinking intellectual property in the age of AI.
This week, in response to the New York Times' copyright infringement lawsuit against it and Microsoft (MSFT), OpenAI sought to clarify its business and motives, writing in a blog post: "Training AI models using publicly available internet materials is fair use, as supported by long-standing and widely accepted precedents. We view this principle as fair to creators, necessary for innovators, and critical for US competitiveness."
In a submission responding to an inquiry of the UK Parliament late last year, the company wrote: "Because copyright today covers virtually every sort of human expression — including blogposts, photographs, forum posts, scraps of software code, and government documents — it would be impossible to train today's leading AI models without using copyrighted materials."
And what makes OpenAI's arguments interesting and consequential is the novelty of the debate.
It's unclear to what extent existing copyright law speaks to AI, and the process of ingesting existing material to train powerful models that aim to generate and capture new types of value.
But in a tech industry move that by now seems familiar, AI companies are acting as if their permissive interpretation of the law is the natural mode of engagement, and that restrictions don't apply to them until they are proven wrong.
The maneuver resembles social media companies dodging accountability from real moderation responsibilities, while reaping the rewards of publishing other people's content. It also brings to mind the early days of ride-sharing and the gig economy, when popular apps rushed to claim market share while operating in a legal void.
And with both industries continuing to thrive while the law remains unsettled, AI companies must ask: Why tread lightly when inevitability is on your side?