Can ChatGPT crawl live data from URLs? – No, but this is often not necessary.

Christoph C. Cemper

Some users would like OpenAI’s well-known ChatGPT to crawl a web page live whenever a URL is entered and to include that data in the result. Unfortunately, that is (still) a utopia.

The name already says it: “GPT” stands for Generative Pre-trained Transformer, a static language model that is trained in advance. The “pre-trained” is right there in the name, and it rules out individual crawling by the model.

Countless reports on the innovative AI chat tool, as well as OpenAI’s own documentation, have made it more than clear that the data on which GPT-3.5 (the basis of ChatGPT) was trained contains a version of the Internet up to the end of 2021.

So, for this use case, neither the large language model nor ChatGPT’s web interface is suitable to replace an SEO tool such as Google Search Console, which fetches a current version of a page within a few seconds at the push of a button.

But that is not necessary either.

Why should ChatGPT crawl live?

Of course, search engine optimizers (SEOs) want to use the most up-to-date data possible when analyzing their content or that of their competitors. But ChatGPT is not an SEO crawling tool and never claimed to be.

The idea is to use the most up-to-date content possible from a URL, and then build your own content based on that. Various content tools offer measurement methods like keyword density or the more advanced TF*IDF (a well-known concept from the 70s that is used in many text search engines) to inspire copywriters to create more comprehensive articles, which should then also rank better in search engines. This is sometimes also referred to as holistic content.
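To make the TF*IDF idea a bit more tangible, here is a minimal Python sketch. It uses one common textbook variant (term frequency multiplied by the logarithm of the inverse document frequency); content tools may well weight things differently. The three example texts and the function name `tf_idf` are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical mini corpus, invented for illustration only.
docs = [
    "folding mattress for camping",
    "folding mattress for guests",
    "gas oven for camping",
]
scores = tf_idf(docs)
```

A word like “for”, which occurs in every text, gets a score of 0, while a word that occurs in only one text scores highest there.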

As it happens, large language models like GPT-3.5, which is used in ChatGPT, already contain particularly holistic content. After all, the model was trained (to put it simply) on a relatively complete crawl of the Internet. That still does not make it a TF*IDF tool for text optimization, however.

“Fine-tuning” is the technical term for “relearning” content and is possible with other OpenAI offerings, but not (yet) with ChatGPT. Fine-tuning the language model on new content can help improve the quality of the model and the output of the AI. However, the effort involved in training the language model has to be paid for with considerably higher fees. As a free tool, ChatGPT does not offer this option at the moment and only responds with data up to the end of 2021.

Why enter the URL in ChatGPT?

In many respects, we often don’t understand why a language model responds the way it does. This is true even for the creators and operators of AI systems such as ChatGPT.

If you include an existing, established URL in a prompt, the model “infers” relevant aspects for it (the technical term is “inference”), i.e. it generates what is statistically likely, just as all of its text output is only generated as statistically likely.

So, by entering the URL into the language model, you can set certain anchor points, which perhaps refer to an old version of the content at that URL. Or perhaps only the relevant words are extracted from the URL itself. Of course, this depends a lot on how “speaking” (i.e. descriptive and human-readable) the URL is.

URLs with speaking names like https://www.my-camping-store.de/gas-ovens naturally work better in prompts than cryptic URLs like https://www.coolstuff.com/c12/p422.

Speaking URLs already worked better in the Google search engine 20 years ago, so why shouldn’t more specifics help with a modern language model like ChatGPT?
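Which words a speaking URL actually carries can be shown with a few lines of Python. This is only a sketch of the readable words contained in the two example URLs above; it does not describe how ChatGPT processes URLs internally. The function name `url_words` and the small ignore list are assumptions made for this example.

```python
import re
from urllib.parse import urlparse

# Assumption: generic host parts that carry no topical meaning.
IGNORED = {"www", "com"}


def url_words(url):
    """Split host and path of a URL into lowercase alphabetic words."""
    parsed = urlparse(url)
    text = f"{parsed.netloc} {parsed.path}".lower()
    # Split on everything that is not a letter; drop very short fragments.
    words = re.split(r"[^a-z]+", text)
    return [w for w in words if len(w) > 2 and w not in IGNORED]


print(url_words("https://www.my-camping-store.de/gas-ovens"))
print(url_words("https://www.coolstuff.com/c12/p422"))
```

The first URL yields several topical words (camping, store, gas, ovens), the second only the shop name. That is all the text a prompt gets from the URL itself.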

In any case, specifying URLs works surprisingly well for creating pretty good articles with the “Outrank Article” prompt template from AIPRM for SEO. As is so often the case, the result can then be greatly improved by subsequent prompts.

Sometimes ChatGPT itself immediately points out that it cannot crawl the Internet.

Will ChatGPT replace Google Search?

It turns out that with clever prompts, you can generate amazingly good content that matches what has existed in the past. Most subjects, like folding mattresses, cover a limited number of topics and change little over time.

If I ask ChatGPT for the January 2023 washing machine test winner, I’m using the wrong tool.

The desire for crawling and timeliness perhaps comes from the fact that for weeks people have been wondering quite excitedly whether ChatGPT could be the Google killer. But if you understand how such a language model is built, and how outdated the content in it is, then that doesn’t make sense either.

A language model is not a search engine, is not an SEO tool, and, in its current form, is certainly not a Google replacement for finding up-to-date content.

However, a language model with a knowledge cutoff at the end of 2021 is very handy for generating a concrete and very comprehensive (“holistic”) answer when the content does not need to be up to date. It is very well suited for the basics of software development, for tinkering with small applications, and for programming tutorials. For the latest version of the configuration files for the Caddy web server, it is unsuitable.

Seen this way, many reports about the possible replacement of Google, including those from renowned sources, look very uninformed from today’s perspective and, above all, sensationalist. I recommend that interested readers look into the PaLM model, a Google language model based on over 500 billion parameters, compared with the 175 billion of GPT-3.

Google PaLM is apparently far superior to GPT-3/GPT-3.5, but unfortunately it is not freely available. Google, whose business model is based on language, has been working on “artificial intelligence” and language models like GPT-3 for a decade.

*Diagram from https://lifearchitect.ai/iq-testing-ai/

Can ChatGPT still help me create good content?

ChatGPT’s output is controlled solely by the prompt, i.e. the input command. The better the prompt, the better the output. The more context the language model gets, the better.
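What “more context” can look like in practice is sketched below in Python: a task is combined with a few pieces of context that you supply yourself, and the result is pasted into ChatGPT as one prompt. The function name `build_prompt`, the layout of the prompt, and the example texts are assumptions for illustration, not a fixed recipe.

```python
def build_prompt(task, context_snippets):
    """Combine a task with self-supplied context into one prompt string."""
    lines = [task.strip(), "", "Context:"]
    # Number the snippets so later prompts can refer back to them.
    for i, snippet in enumerate(context_snippets, start=1):
        lines.append(f"{i}. {snippet.strip()}")
    return "\n".join(lines)


# Hypothetical example values.
prompt = build_prompt(
    "Write an outline for an article about folding mattresses.",
    ["Target audience: campers", "Tone: practical, no hype"],
)
print(prompt)
```

The same helper also works for text copied by hand from a current web page, which is the only way to get live content into the prompt.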

Typing a URL into the prompt does not trigger crawling of that URL, but crawling is neither necessary for producing good content nor was it ever promised. When it is “revealed” after almost two months that no crawling takes place, I am a bit surprised.

Some users still swear that crawling takes place because the inference results are simply that good. But then, of course, you have to ask yourself: for which prompt? On the topic of “folding mattresses”, not so much happened in 2022 that relearning of the language model would have been necessary.

But any attempt to get current content, such as the latest sports results, from the static language model is bound to fail.

*This text was created using only my human labor and a cup of coffee. The comma errors were fixed by LanguageTool. The image of the robot having to think while typing was created by Midjourney.

