Some of us can still remember the good ol’ days of the “dotcom” boom back in ’99 to ’01. It was like a frickin’ explosion of websites catering to every person’s whim and fancy. You wanted something specific? Boom, there was a website for that! All you had to do was remember the name. But then things started to change. We saw the rise of directories like Yahoo and AOL, where websites were fighting for a prime spot and ratings. It was like a friggin’ competition, man.
But you know what happened next? Those directories got replaced by search engines, and Google emerged as the freakin’ ruler of them all. They had this kickass crawling bot that would visit every damn website, follow all the links, and copy all the juicy data. Then they processed that data and gave you a list of the most relevant websites based on your search. It was like magic, dude. You didn’t have to remember the domain names anymore, just type in some keywords and bam! You got yourself a link.
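To make that crawl, copy, and look-up loop a bit more concrete, here's a toy sketch in Python. Heads up: everything in it is made up for illustration. `FAKE_WEB`, `crawl`, and `search` are hypothetical names, the "web" is just an in-memory dictionary instead of real HTTP requests, and there is no relevance ranking at all, so this is nowhere near how Google actually does it. It only shows the basic idea: visit a page, follow its links, index the words, then answer a keyword query.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML. A real crawler would fetch pages over HTTP.
FAKE_WEB = {
    "a.example/": '<a href="b.example/">news</a> breaking news today',
    "b.example/": '<a href="a.example/">home</a> <a href="c.example/">cars</a> used cars',
    "c.example/": "classic cars and news",
}


class LinkAndTextParser(HTMLParser):
    """Collects outgoing links and lowercase words from one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.words.extend(data.lower().split())


def crawl(start, web):
    """Breadth-first crawl from `start`; returns an inverted index word -> set of URLs."""
    index = {}
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        html = web.get(url)
        if html is None:  # unknown page: nothing to copy
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        parser.close()
        for word in parser.words:
            index.setdefault(word, set()).add(url)
        for link in parser.links:
            if link not in seen:  # follow each link only once
                seen.add(link)
                queue.append(link)
    return index


def search(index, keyword):
    """Return the URLs containing `keyword`, sorted alphabetically (no ranking)."""
    return sorted(index.get(keyword.lower(), set()))


index = crawl("a.example/", FAKE_WEB)
print(search(index, "cars"))  # ['b.example/', 'c.example/']
```

Notice that the user never types a domain name: the keyword goes in, the links come out. The hard part a real search engine adds on top is deciding which of those links is the most relevant.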
But here’s the thing, this transition didn’t come without its problems, man. Publishers and newsmakers were scratching their heads, trying to figure out what to do. On one hand, they wanted their websites to be ranked on top of the search results. They wanted to be searchable and optimized for those search engines, you know? But on the other hand, they were losing out on advertising revenue and visibility. People weren’t seeing those front page ads anymore because they were taken straight to the relevant links. It was a tough dilemma, man.
Most publishers and newsmakers decided to adapt, though. They embraced the new reality of search and made their content fully searchable. They even let the crawling bots get their hands on premium paywalled content so it would have a higher chance of getting picked up. This opened up a whole new world, man. User-generated content started popping up in the form of blogs and contributor networks. Suddenly, this new media was competing with traditional media for traffic and advertising revenue. Meanwhile, the search engines were raking in the big bucks.
And let me tell ya, this shift in power and advertising revenue has had a huge impact on publisher valuations, man. Like, Forbes, one of the big boys, was acquired for $415 million in 2016. And now it’s expected to change hands again for around $800 million! That’s some serious cash, man. Fortune was acquired for a measly $150 million, and Jeff Bezos scooped up the Washington Post for just $250 million. But even the biggest of those deals is nothing compared to the market cap of Google, which is a whopping $1.66 trillion. Facebook and Twitter are also sittin’ pretty with market caps of $760 billion and $44 billion, respectively.
So you see, these tech giants became the big dogs. They became the traffic aggregators and took a big bite out of the advertising revenue. And you wanna know who suffered? The content creators, man. The folks who actually fund professional journalism lost out on that sweet ad dough.
This unfair redistribution of ad revenue led publishers to focus on search engine optimization (SEO), my friend. They started coming up with flashier titles and catering to what the consumers wanted, rather than focusing on balanced and professional reporting. It’s like they were trying to get on the search engine’s good side, you know? Some governments even stepped in to enforce fair distribution of advertising resources. Canada, for example, introduced a bill that requires online giants to share ad revenue with publishers. But of course, the search engines and social networks ain’t too happy about that, man.
Now let’s talk about this new dilemma that publishers are facing. You ready? So there’s this thing called ChatGPT, and it’s a freakin’ awesome chatbot built on a large language model. Rumor has it that it was trained on data from Microsoft Bing’s crawling bot. OpenAI even revealed their own web crawler, called GPTBot. And guess what, some publishers and creators started blocking that bot to protect their content. They don’t want their stuff used to train AI models without their say-so, man.
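So how does a publisher actually block that bot? The usual way is a `robots.txt` file, and OpenAI's crawler identifies itself with the user-agent token `GPTBot`. Here's a minimal sketch using Python's standard `urllib.robotparser` to check what a given set of rules would allow. The rules in `ROBOTS_TXT`, the `is_allowed` helper, and the `publisher.example` URL are all assumptions for illustration, not any real publisher's setup.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out OpenAI's crawler, leave everyone else alone.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


article = "https://publisher.example/premium/article"
print(is_allowed(ROBOTS_TXT, "GPTBot", article))     # False
print(is_allowed(ROBOTS_TXT, "Googlebot", article))  # True
```

Keep in mind that `robots.txt` is just a polite request. It only works if the crawler chooses to respect it, and it does nothing about content that was already copied before the rule went up.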
You see, these transformer-based Large Language Models (LLMs), like GPT-4, are so damn good that they can outperform humans in many tasks. And I’m not just talking about simple tasks, man. We’re talking analytical tasks, bro. But here’s the thing, most high-quality publishers like Forbes.com and Nature Publishing Group have banned the use of generative tools for content creation. They’re keeping things tight, man.
I wrote an article about this before, and let me tell ya, publishers with a ton of proprietary content are in a sweet spot. They could develop their own trustworthy chat bots or license their content to generative AI companies. But they gotta be careful. If they let their content be crawled and processed by those generative AI bots without proper watermarking and copyright notices, they could lose their advantage. But at the same time, if they don’t let the bots crawl their stuff, their content might not get accessed as much. It’s a tough call, man.
Oh, and here’s a fun fact: a lot of the published content has already been crawled by search engines, and it may already have been used for training generative AI systems, man. Google went all out and digitized books and crawled the whole freaking internet. So the question is, did those leading AI players already use that content for training their LLMs? We gotta do some massive tests to find out, man.
Here’s what we gotta remember, though. If publishers start allowing the use of generative tools, we might see a decline in the quality of content. And I’m not just talking about spammy stuff, man. We’re talking about serious publishers losing out on advertising revenue. They need that dough to maintain their high editorial standards. That’s where lawmakers might need to step in, you know? They gotta make sure independent media is supported and professional journalism is encouraged. Otherwise, we might end up with a polluted internet full of AI-generated content. And that ain’t good for nobody, man.