Hi there, I’m Chris Berkley, a digital marketing consultant, and this is part 2 of my article about redirect chains. In part 1 I talked about what redirect chains are and their implications for SEO; in this article, part 2, I’ll talk about how to find them and what to do about them.
The first thing you have to do in order to find redirect chains is start with a group of old URLs. I have three strategies for finding these (really more like four), and I’ll walk you through them now. All four strategies use the paid version of Screaming Frog SEO Spider, and the goal in each case is the same: build a list of URLs, crawl them with Screaming Frog, and let Screaming Frog find the redirect chains for us.
Strategies one and two produce a list of URLs to feed into Screaming Frog, while strategies three and four use Screaming Frog itself to find those URLs directly. I’m going to use Phillymag.com as an example. It’s not a client of mine, but the site has been around a long time and probably has some redirect chains. Strategy one uses Google Analytics. I don’t actually have Google Analytics access for Phillymag.com, so I’ll show you how to do it on one of my own websites; the process is the same. In Google Analytics, go to Acquisition > Channels and select Organic Search. Once you’re in there, set the date range as far back as possible, make sure the primary dimension is Landing Page, then go all the way down to the bottom and set the number of rows shown to 5,000 so you capture as many pages as possible.
Then go up top and click ‘Export CSV.’ This exports a spreadsheet of all the old landing pages that received traffic at one time, and that’s probably going to contain a lot of old URLs. Strategy two also produces a list of URLs, but this time using a tool by Moz called Open Site Explorer. It’s free if you create an account with them, which is awesome, and it finds all the backlinks pointing to a particular website.
Copy Phillymag.com’s URL and put it in the search bar, and it will show you the backlinks. We want a version we can export, so make sure the target is set to ‘root domain’ so we get all the backlinks pointing to the site, keep the link source as ‘only external,’ and set the link type to ‘link equity’ so we only get links that are driving link equity. Then you can request a CSV and export a spreadsheet with a list of those URLs. Now we’re going to use Screaming Frog to crawl those lists of URLs, but first we need to adjust a couple of settings. First, under Mode, select List. Second, go to Configuration > Spider, and under the ‘Advanced’ tab check the box that says ‘Always Follow Redirects’ and click OK. Third, go to Configuration > Robots.txt settings and check the ‘Ignore robots.txt’ box.
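One wrinkle before you paste anything into List mode: Google Analytics exports landing pages as paths (like /contact/) rather than full URLs, so you’ll need to prepend the domain first. Here’s a minimal sketch of that step; the ‘Landing Page’ column name and sample rows are assumptions, since GA export headers can vary by report and version.

```python
import csv
import io

DOMAIN = "https://www.phillymag.com"  # the site you exported from

def paths_to_urls(csv_text, column="Landing Page"):
    """Turn a GA landing-page export into full, crawlable URLs.

    The column name is an assumption; check your own export's header row.
    """
    urls = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        path = row.get(column, "").strip()
        if path.startswith("/"):  # skips totals rows, blanks, etc.
            urls.append(DOMAIN + path)
    return urls

# Hypothetical two-row export for illustration:
sample = "Landing Page,Sessions\n/contact/,120\n/old-page,45\n"
urls = paths_to_urls(sample)
print(urls)
# ['https://www.phillymag.com/contact/', 'https://www.phillymag.com/old-page']
```

You can then paste the resulting list straight into Screaming Frog’s List mode. (The Moz export already contains full URLs, so it doesn’t need this step.)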
From there you can go into either of your lists, the export from Google Analytics or the export from Moz, and copy all of the Phillymag.com pages. Go into Screaming Frog, click Upload, then Paste, and it will enter them; click OK and it starts crawling. Now I’m going to pause here, show you strategies three and four, and then loop back around to what you do with the final output once Screaming Frog is done crawling. Strategy three uses Screaming Frog to crawl the site itself, which will find any redirect chains currently on the site. First, remember that we have to change those settings again: switch the mode from List back to Spider.
Then go to Configuration > Spider > Advanced and check ‘Always Follow Redirects,’ and go back to Configuration > Robots.txt settings and check ‘Ignore robots.txt.’ These are the same settings as before, but when you change from List mode to Spider mode you have to reset them, because Screaming Frog forgets your settings. Then grab the Phillymag.com URL, paste it into Screaming Frog, start crawling, and let it run until it finishes. Strategy four does something fairly similar, but it’s a little more complex. Go to the Wayback Machine at Archive.org/web and enter the Phillymag.com URL.
The Wayback Machine archives old web pages and URLs as they existed, so you can pick any of the older years, see what the website looked like at that time, crawl it, and pull the URLs from it. I already have the 2011 version loaded in another tab, and you can see its address contains the exact Phillymag.com URL.
Copy that, go back to Screaming Frog, and start a new crawl on the web.archive.org version of Phillymag.com. You can see it’s already starting to populate URLs. This method is a little more complex because you’ll need to export the list of URLs it finds and remove the web.archive.org part, so that you’re left with just the Phillymag.com URLs. It’s definitely the most tedious method, but it’s a good way to find older URLs if everything else fails. Once all your crawls have finished, whether you ran a crawl on a list or on the site itself, the next step is the same.
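That prefix-stripping step is easy to automate. Wayback Machine URLs typically look like https://web.archive.org/web/20110615000000/http://www.phillymag.com/page/, so removing everything through the timestamp segment recovers the original URL. A quick sketch (the example URL is hypothetical):

```python
import re

# Matches the archive prefix: scheme, web.archive.org/web/, a numeric
# timestamp, optional modifier flags like "if_", and the trailing slash.
WAYBACK_PREFIX = re.compile(r"^https?://web\.archive\.org/web/\d+[a-z_]*/")

def strip_wayback(url):
    """Remove the Wayback Machine prefix, returning the original URL."""
    return WAYBACK_PREFIX.sub("", url)

cleaned = strip_wayback(
    "https://web.archive.org/web/20110615000000/http://www.phillymag.com/news/"
)
print(cleaned)  # http://www.phillymag.com/news/
```

Run your exported list through this and you have a clean set of old URLs ready for List mode.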
Go to Reports > Redirect Chains, which exports a CSV of the redirect chains, and it will look like this. The first thing I’ll do is go up top and add a filter. You can see the third column here is ‘Redirect Loop’: in my last article I talked about redirect chains, and this covers redirect loops, which Screaming Frog will inherently check for. Sorting largest to smallest, I don’t see TRUE anywhere, so Screaming Frog found no redirect loops, and that’s good. Then I’ll sort by the number of redirects, largest to smallest. You can see we’ve got at least two URLs with chains of four redirects, some with three, and quite a few more with two.
You can ignore any rows with one redirect: that’s not a redirect chain, just a single redirect, and there’s nothing wrong there. One thing I’ll do to manipulate the data and surface any possible errors is add some conditional formatting. I’ll highlight cells equal to 302 in yellow, since a 302 does not pass as much link equity as a 301, and right off the bat we can see we have a 302. I’ll also highlight any 404 cells in red. This gives us an easy way to scan quickly and figure out how big a problem we might have, even outside of redirect chains. A redirect chain is already undesirable, but a chain that includes a 302 is even worse.
As we scroll down a little more, you can see we’ve got some 404s in here too. If you ping a single URL it might return a 301, but when you follow the chain you may find a 404 as the final destination. That basically covers everything we want to do with redirect chains; you can manipulate this data in any number of ways and run a few different reports to find all the redirect chains that exist. That wraps it up for redirect chains. If you have any questions, feel free to leave a comment. Thanks!
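If you’d rather script this triage than use conditional formatting, the same checks (ignore single redirects, confirm no loops, flag any 302 or 404 in a chain) are straightforward. A minimal sketch, assuming column names like ‘Number of Redirects’ and ‘Redirect Loop’ plus per-hop status columns; match these to your own export’s headers, which vary by Screaming Frog version:

```python
import csv
import io

def summarize_chains(csv_text, num_col="Number of Redirects",
                     loop_col="Redirect Loop"):
    """Triage a Screaming Frog redirect-chains export.

    Returns (chains, loops, flagged):
      chains  - rows with more than one redirect (true chains)
      loops   - rows Screaming Frog marked TRUE for a redirect loop
      flagged - chain rows containing a 302 or 404 in any hop column
    Column names are assumptions; check them against your export.
    """
    chains, loops, flagged = [], [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row.get(loop_col, "").strip().upper() == "TRUE":
            loops.append(row)
        if int(row.get(num_col, "0") or "0") > 1:
            chains.append(row)
            # A 302 passes less link equity than a 301; a 404 dead-ends the chain.
            if any(v.strip() in ("302", "404") for v in row.values()):
                flagged.append(row)
    return chains, loops, flagged

# Hypothetical three-row export for illustration:
sample = """Source,Number of Redirects,Redirect Loop,Status 1,Status 2
http://example.com/a,2,FALSE,301,301
http://example.com/b,1,FALSE,301,
http://example.com/c,3,FALSE,301,302
"""
chains, loops, flagged = summarize_chains(sample)
print(len(chains), len(loops), len(flagged))  # 2 0 1
```

Here /b drops out as a single redirect, and /c gets flagged for the 302 in its chain, mirroring the spreadsheet workflow above.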