Why aren’t all web pages indexed in search engines?

Saturday, 6 April, 2019

When people talk about the internet, they’re usually referring to the World Wide Web.

Invented by Sir Tim Berners-Lee in 1989 and opened to the public in 1991, the web gives us the www prefix still seen in many website addresses. The publicly reachable part of it is known as the surface web.

In other words, it’s content that’s intended to appear in search engine results.

When you enter a search term into Google or Bing, the algorithms powering these search engine indexes will return relevant results based on webpage analysis.
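To make the idea of an index concrete, here is a minimal sketch of an inverted index, a common structure that maps each word to the pages containing it. The page addresses, page text and function names are made up for illustration, and real search engines such as Google or Bing do far more than this (ranking, link analysis and so on).

```python
from collections import defaultdict

# Hypothetical pages standing in for crawled web content.
PAGES = {
    "example.com/broadband": "compare fibre broadband deals",
    "example.com/speed": "test your broadband speed",
    "example.com/tv": "compare tv packages",
}


def build_index(pages):
    """Map each word to the set of page addresses that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(PAGES)
print(sorted(search(index, "compare broadband")))
```

A page that was never crawled never enters the index, so no search term can return it. That is the position all deep and dark web content is in.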

However, this isn’t the only content hosted on the internet.

Indeed, the phrase ‘surface web’ is rather fitting, since there’s far more material in the internet’s depths than on its surface.

Deep, dark and rather scary

Content not included in search engine indexes falls into two categories:

The deep web. This is online data which isn’t intended to be publicly accessible.

There are numerous reasons why that might be the case:

  1. Product databases. When you’re looking to buy an item, you want to see a clean and tidy webpage featuring a product description, some photos and availability data.

    You don’t want to see the complex databases responsible for confirming stock levels of different sizes and colours.

  2. Websites in development. If you’ve ever created a website using a platform like WordPress or Wix, you’ll know the satisfaction of hitting the Publish button.

    Until that moment, the website will be hidden from search engines because it’s incomplete. It’ll stay that way until a user makes it live.

  3. Intranets. Many companies have dedicated web portals, enabling employees to log in and view information, share documents or communicate with each other.

    Intranets host sensitive corporate data like internal reports or private messages between colleagues, which shouldn’t be publicly visible in a Google search.

  4. Financial platforms. Imagine a scenario where confidential online banking information was displayed in third party search results, and anyone could access it.

    That’s why online banking pages sit behind a login, and often open in a stripped-back window (minus the usual bookmarks and browser bar options). It keeps personal data away from public view and out of search results.

  5. Archived data. Companies generate huge amounts of information, and it may be confusing or inappropriate if older data is visible.

    Archived material is still hosted online where relevant individuals may view it, but it shouldn’t be published alongside contemporary webpage data.
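As an illustration of how a site can ask search engines to stay away from unfinished pages (point 2 above), here is a minimal sketch using a robots.txt file and Python's standard urllib.robotparser module. The /staging/ folder and example.com addresses are hypothetical, and this is only one possible mechanism, not necessarily the one WordPress or Wix uses. A robots.txt file is a polite request rather than a lock, so sensitive material such as intranets and banking data relies on logins instead.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site with an unfinished section:
# every crawler is asked to stay out of the /staging/ folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /staging/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved crawler checks the rules before fetching a page.
print(parser.can_fetch("Googlebot", "https://example.com/staging/home"))
print(parser.can_fetch("Googlebot", "https://example.com/products"))
```

The first check reports that the staging page is off limits, while the second shows the public product page may be crawled.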

The dark web. While the deep web provides the underpinnings for searchable internet pages, the dark web is a rather different entity.

Typically reached through the privacy-focused Tor Browser, which makes it much harder for third parties to track individual user activity, dark web material tends to be illicit or dangerous.

Beyond the reach of regulators, this is where the internet’s worst secrets are stored. It’s the natural home of extreme pornography websites and drug dealing marketplaces.

Websites are located at web addresses comprising lengthy strings of random alphanumeric characters, ending with a .onion suffix (Tor stands for The Onion Router).
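The address format described above can be sketched as a simple pattern check. This assumes the layout used by current Tor onion services, which is 56 characters drawn from the letters a to z and the digits 2 to 7, followed by .onion; older, shorter addresses also existed. The function name and sample address are invented for illustration.

```python
import re

# Assumed format: 56 characters from a-z and 2-7, then ".onion".
ONION_PATTERN = re.compile(r"[a-z2-7]{56}\.onion")


def looks_like_onion_address(hostname):
    """Check the shape of a hostname only; this says nothing about whether a site exists."""
    return ONION_PATTERN.fullmatch(hostname.lower()) is not None


made_up = "a" * 56 + ".onion"  # placeholder string, not a real site
print(looks_like_onion_address(made_up))
print(looks_like_onion_address("www.example.com"))
```

The check only confirms that a string has the right shape. It cannot tell you whether the address is live, safe or genuine.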

The dark web is a perilous place for the unwary to tread. Payment is generally made using cryptocurrencies that are hard to trace, and much of what’s advertised is fraudulent.

Paying two bitcoin to a self-proclaimed assassin is unlikely to result in your nemesis being gunned down in an alley. And you won’t be able to claim a refund, either.

It’s easy to appreciate why Google and Bing feel such material doesn’t deserve mainstream publicity.

Instead of appearing in search engine indexes, dark web addresses tend to be published on bulletin boards, whose listings are often out of date and inaccurate in their page descriptions.

Unless you know what you’re doing, the dark web is best avoided entirely.

By: Neil Cumins

Neil is our resident tech expert. He's written guides on loads of broadband head-scratchers and is determined to solve all your technology problems!
