
Why aren’t all web pages indexed in search engines?


Saturday, 6 April, 2019

When people talk about the internet, they’re usually referring to the World Wide Web.

Invented by Sir Tim Berners-Lee in 1989 and opened to the public in 1991, the web is best known for its publicly visible pages, often found at addresses with a www prefix. This is the surface web.

In other words, it’s intended to appear within page results in search engine indexes.

When you enter a search term into Google or Bing, the search engine looks it up in an index built by automated crawlers, which follow links from page to page and analyse what they find, then returns the most relevant results.
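For the curious, the lookup described above can be sketched in a few lines of Python. This is only a toy illustration of the general idea of an index, not how Google or Bing actually work; the page addresses, the text and the function names `build_index` and `search` are all made up for the example.

```python
from collections import defaultdict

# Hypothetical pages standing in for crawled content (not real sites).
PAGES = {
    "example.com/broadband": "compare fast broadband deals",
    "example.com/router": "how to set up a broadband router",
    "example.com/tv": "compare tv packages",
}


def build_index(pages):
    """Map each word to the set of page addresses that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return pages containing every word in the query, sorted by address."""
    words = query.lower().split()
    if not words:
        return []
    # Keep only the pages that appear under every query word.
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


INDEX = build_index(PAGES)
print(search(INDEX, "broadband"))
print(search(INDEX, "compare broadband"))
```

A page that was never crawled never makes it into `INDEX`, so no search term can bring it up. That is essentially what it means for content to sit outside a search engine index.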

However, this isn’t the only content hosted on the internet.

Indeed, the phrase ‘surface web’ is rather fitting, since there’s far more material in the internet’s depths than on its surface.

Deep, dark and rather scary

Content not included in search engine indexes falls into two categories:

The deep web. This is online data which isn’t intended to be publicly accessible.

There are numerous reasons why that might be the case:

  1. Product databases. When you’re looking to buy an item, you want to see a clean and tidy webpage featuring a product description, some photos and availability data.

    You don’t want to see the complex databases responsible for confirming stock levels of different sizes and colours.

  2. Websites in development. If you’ve ever created a website using a platform like WordPress or Wix, you’ll know the satisfaction of hitting the Publish button.

    Until that moment, the website will be hidden from search engines because it’s incomplete. It’ll stay that way until a user makes it live.

  3. Intranets. Many companies have dedicated web portals, enabling employees to log in and view information, share documents or communicate with each other.

    Intranets host sensitive corporate data like internal reports or private messages between colleagues, which shouldn’t be publicly visible in a Google search.

  4. Financial platforms. Imagine a scenario where confidential online banking information was displayed in third party search results, and anyone could access it.

    Banking pages sit behind a secure login, which search engine crawlers can’t pass, so personal data never reaches a public index. Some also open in a dedicated window, minus the usual bookmarks and browser bar options.

  5. Archived data. Companies generate huge amounts of information, and it may be confusing or inappropriate if older data is visible.

    Archived material is still hosted online where relevant individuals may view it, but it shouldn’t be published alongside contemporary webpage data.
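How does a site tell search engines to keep out? The examples above don't name a mechanism, but one widely used convention is a robots.txt file, which lists the sections crawlers are asked not to visit. The sketch below uses Python's standard `urllib.robotparser` module to check a made-up robots.txt; the paths, the `ExampleBot` crawler name and the `may_crawl` helper are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary site; the paths are invented.
ROBOTS_TXT = """\
User-agent: *
Disallow: /staging/
Disallow: /intranet/
"""


def may_crawl(robots_txt, url, user_agent="ExampleBot"):
    """Return True if the robots.txt rules allow this crawler to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(may_crawl(ROBOTS_TXT, "https://example.com/products/shoes"))
print(may_crawl(ROBOTS_TXT, "https://example.com/staging/new-homepage"))
```

Bear in mind that robots.txt is only a polite request, which well-behaved crawlers honour. Genuinely private material, such as intranets and online banking, relies on logins rather than on this file.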

The dark web. While the deep web provides the underpinnings for searchable internet pages, the dark web is a rather different entity.

Typically reached through the privacy-focused Tor Browser, which makes it much harder for third parties to track individual user activity, dark web material tends to be illicit or dangerous.

Beyond the reach of regulatory spotlights, the internet’s worst secrets are stored. This is the natural home of extreme pornography websites and drug dealing marketplaces.

Websites are located at web addresses comprising lengthy strings of random alphanumeric characters, ending with a .onion suffix (Tor stands for The Onion Router).

The dark web is a perilous place for the unwary to tread. Payment is generally made using hard-to-trace cryptocurrencies, and much of what’s advertised is fraudulent.

Paying two bitcoin to a self-proclaimed assassin is unlikely to result in your nemesis being gunned down in an alley. And you won’t be able to claim a refund, either.

It’s easy to appreciate why Google and Bing feel such material doesn’t deserve mainstream publicity.

Instead of search engine indexes, dark webpage addresses tend to be published on bulletin boards, which are often out of date and consequently inaccurate in their page descriptions.

Unless you know what you’re doing, the dark web is best avoided entirely.

Neil Cumins


Neil is our resident tech expert. He's written guides on loads of broadband head-scratchers and is determined to solve all your technology problems!
