
Whose data is it anyway?

Isn’t it time that Big Tech started treating user data as intellectual property (IP), affording it the same protections as the companies’ own data and inventions? Notions of inequality extend far beyond just the economic.


In the fall of 2022, Sam Altman’s company, OpenAI, sprang ChatGPT, a large language model-based AI product said to be capable of automating 40% of the white-collar jobs in the financial, legal and medical sectors, upon an unsuspecting public. Since then, large audio, video and image AI models have also been introduced into the marketplace. Little was said about how the company had obtained the data to train its model from millions of websites across the globe, Wikipedia prominent among them, without informing the data owners or obtaining their permission. This brazen theft of data came to light only after several well-known authors and the New York Times filed lawsuits alleging gross copyright violations.

While Altman is being feted by the glitterati of the business and technology world for showing them how to extract maximum value for their money-making ventures using his company’s AI software, tens of thousands of people across the world are being laid off because of automation.

Between 2005 and 2020, Google and Facebook became enormously wealthy and powerful by collecting your data for free and using it to sell advertising. Like OpenAI, Google also obtained a significant amount of information, again free of cost, by digitising a vast body of material printed since the invention of the Gutenberg press.

Whether it is Google’s language software or Microsoft’s photo captioning system, these companies are heavily reliant on user feedback -- free, of course -- to improve their products. Here are two examples.

If you use Google’s voice recognition system for dictation or transcription, the system asks you to correct its mistakes. The company developed the software but you get to fix it! As for Google’s translation software, which handles over a hundred languages, the company did not hire scores of linguists across the globe to get the translations right; it relied on raw user data and subsequent user feedback to train its AI models. The initial data set for training the neural network was obtained by scanning and storing dual-language dictionaries. Simple, huh?

If you were to post an uncaptioned picture depicting yourself and two of your friends at some beach, Microsoft attaches a generic label ‘three people at a beach’ to the picture and ‘invites’ you to provide a more descriptive caption. You fall for this gimmick and provide details – who the three people are, which beach, what occasion and when the picture was taken – adding to the treasure trove of information the company already has on you.

Statistics compiled on internet usage patterns worldwide indicate that 80% of the information in the digital universe is created by individuals through their phone calls, emails, online financial transactions, postings on social media, and audio and video recordings.

Isn’t it time that Big Tech started treating user data as intellectual property (IP), affording it the same protections as the companies’ own data and inventions? Notions of inequality extend far beyond just the economic.

Roughly 30 years ago, the internet became commonly available across the globe. It was initially conceived as an open marketplace (‘digital universe’) where websites, large and small, could compete on equal terms and information and ideas could flow unfettered. This idealistic version of the internet would have required internet service providers (ISPs), public and private, to adopt and enforce strict net neutrality rules. Under these rules, an ISP would have had to treat all websites, apps and other online services equally, neither favouring any site nor speeding up traffic to it, whether or not the site paid a fee.

Sadly, the immense popularity of Facebook and Google has resulted in ISPs entering into sweetheart deals with these companies, thereby calling into question the very notion of net neutrality. The same is going to happen with OpenAI. If you have gotten used to downloading hundreds of apps and streaming videos for free, don’t be too surprised when companies start charging fees.

It is time for you, dear reader, to take control of your data or take to the streets. Your livelihood may depend on it.

Published 28 January 2024, 01:02 IST
