5 Ways to Use the Wayback Machine for SEO

Sometimes a simple tool can give you incredibly powerful insights.

The Wayback Machine is one such tool.

The Wayback Machine takes historical snapshots of web pages and stores them in its public database. Anyone can use the Wayback Machine to view previous versions of pages or entire sites.

Here are five clever ways you can use the Wayback Machine for SEO.

1. Find legacy URLs from old versions of the site

One of the most useful ways to use the Wayback Machine is to find historical URLs that have never been redirected.

The Wayback Machine collects information about your site over time. So it could have access to URL data from 10+ years ago.

This is especially important for websites that have been around for a long time. The person who managed the site years ago has likely changed companies or roles since then, and may not have followed SEO best practices during website migrations.

The Wayback Machine can be a savior here. You can quickly find old URLs that have never been redirected to live versions.

For example, the 2003 Bose Headphones page (http://www.bose.com/products/headphones/) was never redirected:

Using the Wayback Machine, it is easy to discover legacy versions of key content from previous versions of the site. You can then find URLs to redirect that you probably wouldn’t have discovered otherwise.

Do you want to take this to the next level? Read Patrick Stox’s article on using the Wayback Machine API to find historical redirects. By querying the API, you can mass export legacy URLs. This can be much more effective for larger websites.
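As a rough sketch of that kind of bulk export, the Python below builds a query against the Wayback Machine's CDX search endpoint and parses its JSON output. The helper names (build_cdx_query, parse_cdx_rows, legacy_urls) and the live_urls input are assumptions made for illustration and do not come from Patrick Stox's article. The network fetch is left out so the parsing can be tried on a saved response.

```python
import json
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain: str, limit: int = 1000) -> str:
    """Build a CDX query URL listing unique archived URLs under a domain."""
    params = {
        "url": f"{domain}/*",   # trailing wildcard: every path under the domain
        "output": "json",       # JSON array of rows; the first row is the header
        "fl": "original,timestamp,statuscode",
        "collapse": "urlkey",   # keep one row per unique URL
        "limit": str(limit),
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def parse_cdx_rows(payload: str) -> list:
    """Turn the JSON payload into a list of dicts keyed by the header row."""
    rows = json.loads(payload)
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


def legacy_urls(records: list, live_urls: set) -> list:
    """Return archived URLs that are missing from the live site's URL list.

    live_urls is a hypothetical input, e.g. URLs from a crawl of the current site.
    """
    return sorted({record["original"] for record in records} - live_urls)
```

Fetching the generated URL (for example with urllib.request.urlopen) and passing the response body to parse_cdx_rows gives one record per unique archived URL. Comparing those against a crawl of the live site surfaces candidates for redirects.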

2. Find previous page content

Website content changes over time for a variety of reasons (e.g., SEO, CRO, website migration, or highlighting different aspects of a product). Any content change carries inherent risk, especially a significant one.

This is where the Wayback Machine comes into play.

If you see large ranking losses after changing content, you can use the Wayback Machine to view previous versions of the affected pages. Restoring the content to its original version could help your content regain lost visibility.

For example, NYMag’s “Best Pillows For Neck Pain” article has lost organic visibility since mid-2020 for terms like “neck pillow”. This has resulted in organic traffic loss over time.

Comparing the page with early 2020, we see that they have changed the content since then. The 2020 version included a quote from a chiropractor from the American Chiropractic Association in the introduction and kept the products above the fold.

However, in the current version, they added more content to the introduction, pushed the products below the fold, and moved the quote from the American Chiropractic Association to the bottom of the page.

While this may not be the only cause of the ranking losses, reviewing the content from when the page ranked best could guide them in restoring parts of the older version to see whether visibility improves.
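One way to locate the capture nearest a given date is the Internet Archive's availability endpoint. The following is a minimal sketch, assuming the JSON shape that endpoint returns (an "archived_snapshots" object containing a "closest" entry). The function names are hypothetical and the HTTP request itself is omitted.

```python
import json
from typing import Optional
from urllib.parse import urlencode

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url: str, timestamp: str) -> str:
    """Build a query for the capture closest to timestamp (e.g. YYYYMMDD)."""
    params = {"url": page_url, "timestamp": timestamp}
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(payload: str) -> Optional[str]:
    """Return the closest capture's URL, or None if nothing is archived."""
    data = json.loads(payload)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None
```

Requesting a date from the period when the page ranked well and opening the returned capture gives a version to compare against the live page.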

3. Find old robots.txt files

Another great use of the Wayback Machine is to check how your robots.txt has changed from previous versions. This can be especially useful during website migration if your robots.txt file has changed and you do not have a version of the original file.

Fortunately, the Wayback Machine crawls a lot of robots.txt files. Just look at how many times the IBM robots.txt file was crawled in 2012:

Using this, you can analyze how robots.txt has changed over time. For example, IBM’s robots.txt file looks completely different from what it used to. Here is the file in 2012:

Looking at the current robots.txt file on the site, you can see that the directives have changed:

Using the Wayback Machine can be an extremely effective way to find older versions of your robots.txt file. This is especially useful if the file was lost during a website migration.
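Once an archived copy and the live copy of robots.txt are both saved as text, the comparison can be automated. This is a small sketch using Python's standard difflib module. The function name changed_directives is an assumption for illustration, and the contents you pass in would come from your own files.

```python
import difflib


def changed_directives(old_text: str, new_text: str):
    """Return (added, removed) lines between two robots.txt versions."""
    added, removed = [], []
    for line in difflib.ndiff(old_text.splitlines(), new_text.splitlines()):
        text = line[2:].strip()
        if not text:
            continue  # ignore blank lines
        if line.startswith("+ "):
            added.append(text)      # present only in the new file
        elif line.startswith("- "):
            removed.append(text)    # present only in the old file
    return added, removed
```

Lines that appear in the removed list after a migration are the ones to double-check, since a dropped Disallow rule can expose sections of the site that were previously blocked.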

4. What sections competitors are adding to their pages

Websites in competitive spaces routinely add or update content. For your top-priority keywords, your competitors are likely to make frequent updates to their pages in an attempt to improve their visibility. It can be difficult to track these changes.

Fortunately, the Wayback Machine allows you to see what updates competitors are making to their content.

For example, we can use the Wayback Machine to view Serious Eats’ best cast iron page on June 27, 2021:

Looking at the page today, we can immediately see that they have made some dramatic changes to the page:

Reviewing the existing page, we can see that they have:

This is extremely valuable information to have while conducting a competitive analysis. These changes can now inform the editing strategy we are applying to our own page.

Determining the content differences can be difficult because it requires manual review. However, tools like Diffchecker make it easy to spot the content changes.
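For a quick first pass before a full text diff, comparing only the headings of two saved versions can show which sections were added or dropped. The sketch below uses Python's built-in html.parser. The class and function names are hypothetical, and it assumes sections are marked up with h1 to h3 tags, which will not hold for every site.

```python
from html.parser import HTMLParser

HEADING_TAGS = ("h1", "h2", "h3")


class HeadingCollector(HTMLParser):
    """Collect the text of h1-h3 headings in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._current = None  # text pieces of the heading being read, if any

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._current = []

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._current is not None:
            text = " ".join("".join(self._current).split())
            if text:
                self.headings.append(text)
            self._current = None


def extract_headings(html: str) -> list:
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings


def section_changes(old_html: str, new_html: str):
    """Return (added, removed) headings between an archived and a current page."""
    old, new = extract_headings(old_html), extract_headings(new_html)
    added = [h for h in new if h not in old]
    removed = [h for h in old if h not in new]
    return added, removed
```

Feeding it the HTML of an archived capture and of the live page returns the new section titles, which can then be reviewed by hand or in a diff tool.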

5. How frequently competitors are updating content

Use the Wayback Machine to determine how often competitors update content.

This is especially useful if you are in a SERP landscape where content freshness is important for visibility.

For example, take CNET’s page on the best Android phone of 2022. At the top of the article, you can see the timestamp for when the article was last updated:

As technology moves extremely fast, freshness is probably important for terms like “best android phones” because the products change frequently. That’s why we may want to explore how often we need to update our own content to stay competitive.

Using the Wayback Machine, we can build a timeline of how often CNET updates these articles. Starting from the timestamp shown on the page, we can look for the most recent Wayback Machine capture that predates it. For example, to find the update that occurred before March 5, 2022, we can check what the March 2, 2022 version looked like.

By repeating this process, we can develop a timeline for how often CNET updates this page:

Based on the data, it is safe to say that CNET is updating this article on a monthly basis. We may want to apply the same update rate to our content in order to stay competitive with CNET.
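Once the "last updated" dates have been collected from successive captures, the cadence is simple arithmetic. Here is a minimal sketch with hypothetical function names. Apart from March 5, 2022, which the example above mentions, the dates in the usage comment are invented placeholders and not CNET's actual update history.

```python
from datetime import date
from statistics import mean


def update_intervals(update_dates):
    """Return the gaps, in days, between consecutive distinct update dates."""
    ordered = sorted(set(update_dates))
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def average_interval(update_dates):
    """Return the mean gap in days, or None with fewer than two dates."""
    gaps = update_intervals(update_dates)
    return mean(gaps) if gaps else None


# Placeholder dates for illustration only:
# update_intervals([date(2022, 1, 4), date(2022, 2, 3), date(2022, 3, 5)])
# gives [30, 30], i.e. roughly monthly updates.
```

An average near 30 days would support a monthly refresh schedule, while a wide spread in the gaps suggests updates are driven by events such as product launches.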

Back to the Wayback

In a world where the web is always changing, the Wayback Machine is invaluable.

You can use this tool in several ways to recover lost information and gather insights into the direction of competing strategies.

Make sure the Wayback Machine is in your SEO toolkit.

Opinions expressed in this article are those of the guest author and not necessarily those of Search Engine Land.

About The Author

Chris Long is the vice president of marketing at Go Fish Digital. Chris works with unique issues and advanced search situations to help his clients improve organic traffic through a deep understanding of Google’s algorithm and web technology. Chris is a contributor to Moz, Search Engine Land and The Next Web. He is also a speaker at industry conferences such as SMX East and the State Of Search. You can connect with him on Twitter and LinkedIn.

How do you use Webcitation?

How can I use WebCite® as a reader? Simply click on the WebCite® link provided by publishers or citing authors in their WebCite® enhanced references to retrieve the archived document in case the original URL stops working, or to see what the citing author saw when he cited the URL.

How do you use WebCite? WebCite is an on-demand web archive service located at https://www.webcitation.org/…. Go to https://www.webcitation.org/archive.

  • Enter the URL of the webpage you want to archive in the “URL to Archive [url]” field.
  • Enter an email address in the “Your (citing author) Email [email]” field.

When did Yahoo buy GeoCities?

GeoCities, 1999: $3.7 billion. When Yahoo! bought GeoCities for $3.7 billion in 1999, CNN Money called it a move that would “consolidate Yahoo! as a winner in the online popularity contest.” History shows us otherwise. GeoCities was then the third most visited “website” on the Web.

Why was GeoCities closed? Yahoo Japan announced on October 1 that it would end its GeoCities service in March 2019, 22 years after its launch. The company said in a statement that it was difficult to sum up the reason for the closure in one word, but that profitability and technology issues were major factors.

What happened to my GeoCities website?

The web hosting company GeoCities was a model of this early internet era, but it ceased to exist in March 2019, almost 25 years after its creation in 1994. Yahoo Japan announced that it would close GeoCities.co.jp on March 31, 2019. Yahoo bought GeoCities in 1999 for $3.6 billion.

What replaced GeoCities?

With the end of GeoCities in the United States, Yahoo! no longer offered free web hosting, except in Japan, where the service lasted for ten more years. Yahoo! urged users to upgrade their accounts to the Yahoo! Web Hosting Service.

How can I find my old GeoCities website?

To start browsing the archive, navigate to the Neighborhoods page. Like the original GeoCities itself, the archive is categorized by the theme of each site. Clicking on Neighborhood will bring up an open directory page that has an index of all sites.

Is GeoCities archived?

GeoCities was one of the first places where the average person could make a website for free. The Geocities Gallery aims to archive these websites and restore them to working order, MIDI files and all. GeoCities was a web hosting service launched in 1994.

When did GeoCities go public?

GeoCities went public in 1998, during a period when it rose to unprecedented prominence as one of the top five sites on the Web. The years that followed all but embodied the bursting of the Internet bubble.

When did GeoCities start?

But GeoCities launched in 1995 (it was originally called Beverly Hills Internet), when there were only a few million people online.

What company bought GeoCities in 1999?

Yahoo! buys GeoCities – January 28, 1999. NEW YORK (CNNfn) – Yahoo! Inc. confirmed Thursday that it will buy GeoCities, a fast-growing web community, in a $3.6 billion deal that will strengthen Yahoo!’s standing as a front-runner in the online popularity contest.

How do I find my archived Internet?

Viewing archived websites. Go to https://web.archive.org in your browser. You can use the Wayback Machine to view older versions of websites on any computer, phone, or tablet. Enter the website you want to view.
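Capture addresses on web.archive.org follow a /web/<timestamp>/<original URL> pattern, with a 14-digit YYYYMMDDhhmmss timestamp, so a link to a page as it looked around a given date can also be built directly. The sketch below assumes that pattern; the helper names are made up for illustration, and the site may serve a capture from a nearby time if no capture exists for the exact one requested.

```python
import re
from datetime import datetime
from typing import Optional

WAYBACK_PREFIX = "https://web.archive.org/web"


def snapshot_url(page_url: str, when: datetime) -> str:
    """Build a Wayback Machine address for page_url around the time `when`."""
    return f"{WAYBACK_PREFIX}/{when.strftime('%Y%m%d%H%M%S')}/{page_url}"


def snapshot_time(wayback_url: str) -> Optional[datetime]:
    """Read the capture time back out of a Wayback Machine address."""
    match = re.search(r"/web/(\d{14})", wayback_url)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
```

Reading the timestamp back out of the address you land on shows how far the served capture is from the date you asked for.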

How long does the Wayback Machine take to load?

In 2014 there was a six-month delay between when a website was crawled and when it became available for viewing on the Wayback Machine. Currently, the delay is 3 to 10 hours.

Why doesn’t Wayback Machine load? Pages may fail to load completely when the Wayback Machine encounters a problem while rewriting the URLs on that page into their archived forms. The Wayback Machine rewrites countless links like this so that you can browse the many pages in an archived collection.

Is the Wayback Machine slow?

Many researchers and historians use it extensively to preserve digital artifacts. However, the Wayback Machine has some limitations: it can be very slow and unresponsive on many of the websites it has crawled.

Is the Wayback Machine illegal?

The Wayback Machine’s web archive is legitimate evidence that can be used in a lawsuit, a U.S. appellate court has ruled.

Is Internet Archive illegal?

In addition to web archives, the Internet Archive maintains extensive collections of digital media that are certified by the uploader to be in the public domain in the United States or licensed under terms that allow redistribution, such as Creative Commons licenses.

Is Wayback Machine trustworthy?

The Wayback Machine is a great tool for seeing how the web started and how it has evolved over time. But it is not a reliable tool for archiving web pages. Not all pages are captured by the Wayback Machine, and you have no control over which pages it captures.

Is using Wayback Machine legal?

Legal status: only the content creator can decide where their content is published or duplicated, so the Archive should delete pages from its system at the request of the creator. The exclusion policies for the Wayback Machine can be found in the FAQ section of the website.

Why does Wayback Machine take so long to load?

Why is the Wayback slow? Well, it’s a combination of multiple factors. The saved sites must be tracked to the server that has them, and then the webpage must be unpacked for you from compressed databases, and then returned. It just takes time.

Why is Web archive so slow?

One reason is that archive.org offers its services for free, and free services tend to attract many users. When many people use a website at once, bandwidth becomes limited and the site slows down. If you try to access the site on a Friday evening, expect archive.org to be slower.