The inspiration for this investigatory post came from a recent conversation I had with an ecommerce client, which went something like this:
~~~~~~~~~~~~~~~~~~~~~~~~~
Client: Hey Mark, wanted to get your read on this service that Company X pitched me on. They sent over www.exampleofriskystrategy.com as a model for what they could do for us - mind taking a look?
Me: Sure, no problem...(opens site)...hmm, I'm not seeing anything too extraordinary...
Client: Ah, you need to go to the footer and click on that 'Top Searches' link.
Me: Okay... (already feeling that this was not going to pass the sniff test).
(clicks link....)
Me: Woahhhhhh.....
~~~~~~~~~~~~~~~~~~~~~~~~~
What I opened up was a page with almost 8k links ... not 800, but a full page of 8,000 links.
Maybe not the best strategy to make an enormous HTML sitemap, but that's why people hire consultants like SEER, right?
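If you'd rather not count links by eye, here is a minimal sketch of how you could tally them from a page's source using only Python's standard library. The sample markup and the `/top-searches/...` paths are made up for illustration; you would feed in the saved HTML of whatever page you are checking.

```python
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Collects the href of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def count_links(html):
    """Return the number of <a href="..."> links found in an HTML string."""
    parser = LinkCounter()
    parser.feed(html)
    parser.close()
    return len(parser.hrefs)


# Hypothetical sample markup, not taken from any real site.
sample = (
    '<ul>'
    '<li><a href="/top-searches/watering-cans">watering cans</a></li>'
    '<li><a href="/top-searches/garden-hoses">garden hoses</a></li>'
    '<li><a name="anchor">no href, so not counted</a></li>'
    '</ul>'
)
print(count_links(sample))
```

A raw count won't tell you whether a page is good or bad on its own, but a number in the thousands is a decent prompt to keep digging.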
But it gets worse - this 'Top Searches' page had a different URL structure than the rest of the site. Sensing my question, the client said: "Yeah, as you can see they've got their own subfolder there so they can manage everything remotely."
At this point I inquired as to why they would need to manage this remotely, and then I learned that all Company X required was a product feed to auto-generate these links and the ~8k landing pages. Giving a third-party company access to your site and allowing the creation of 1000s of pages is hardly ever a good idea, but I'm a glass half full kind of guy, so I kept digging.
And the plot definitely thickened...
Clicking on one of these pages took me to what looked like a product category landing page...but something was off. It took me a few seconds to figure it out, but I finally realized that the product category page itself did not have any related products. To be clear, it would be like a gardening site having a link for 'watering cans', then having a landing page like www.gardeningsite.com/products/watering-cans - without any actual watering cans. I'm going to use the gardening example moving forward to keep things clear :)
To make matters worse, the ONLY time the keyword was mentioned on the page was in the h1 (and in the Title).
Ok - but if there aren't any watering can products on the page, and if 'watering cans' is not in any of the copy - what is the stuff on the page?!?
When I pulled a string of text from the copy on the product category page and dropped it in quotes into Google along with a site:companyx.com operator, I got 3.6 million results.
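For anyone who wants to repeat that check, here is a rough sketch of how the query can be put together. The function names and the placeholder snippet are my own inventions for illustration; the only real pieces are the exact-match quotes and the site: operator.

```python
from urllib.parse import urlencode


def duplicate_check_query(snippet, domain):
    """Build an exact-match query restricted to one domain.

    Inner double quotes are stripped so they don't break the exact-match
    phrase, and runs of whitespace are collapsed to single spaces.
    """
    cleaned = " ".join(snippet.replace('"', "").split())
    return f'"{cleaned}" site:{domain}'


def google_search_url(query):
    """Turn the query into a search URL you can paste into a browser."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Placeholder text: swap in a sentence copied from the page you are checking.
query = duplicate_check_query("some boilerplate sentence", "companyx.com")
print(query)
print(google_search_url(query))
```

The result count Google shows is only an estimate, so treat it as a rough signal of how widely the copy is duplicated rather than an exact page count.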
So what we've got are thousands (hundreds of thousands? millions?) of pages with no unique content, no real access point for users (save for a buried footer link), no relevant products - and as a kicker, the site actually has a regular watering cans page available through their nav.
This got me thinking...
~~~~~~~~~~~~~~~~~~~~~~~~~
Me: "What was the name of the company who contacted you again?"
Client: "Company X"
Me: "Right, gotcha"
CTRL+U...CTRL+F..."C..o...m...p...a"
~~~~~~~~~~~~~~~~~~~~~~~~~
Sure enough, tucked into the code was a tidy little footprint in the form of a tracking pixel with the URL for Company X. Presumably this is planted there so that they can pitch www.gardeningsite.com on a per visit basis, possibly even going as far as to say "Free Set-up, only pay for the visits we drive via optimized landing pages." There was also a separate Google Analytics UA code for Company X, which I'm assuming they're using to spot-check data. I then used SpyOnWeb to pull the UA code - 44 other domains popped up.
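The view-source hunt above can be roughed out in a few lines of Python if you have a lot of pages to check. This is only a sketch under some assumptions: the UA IDs, the `track.companyx.example` host, and the sample markup are all invented, and the regex just matches the general "UA-digits-digits" shape of a Google Analytics property ID. It looks for two things: UA codes anywhere in the source, and images, scripts, or iframes loaded from a host that isn't your own.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# General shape of a Google Analytics UA property ID, e.g. UA-1234567-1.
UA_PATTERN = re.compile(r"\bUA-\d{4,10}-\d{1,4}\b")


class ResourceCollector(HTMLParser):
    """Collects the src of every <img>, <script>, and <iframe> tag."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag in ("img", "script", "iframe"):
            src = dict(attrs).get("src")
            if src:
                self.urls.append(src)


def find_footprints(html, own_domain):
    """Return (UA IDs found, third-party hosts that resources load from)."""
    ua_ids = sorted(set(UA_PATTERN.findall(html)))

    collector = ResourceCollector()
    collector.feed(html)
    collector.close()

    third_party = set()
    for url in collector.urls:
        host = urlparse(url).hostname  # None for relative URLs
        if host and host != own_domain and not host.endswith("." + own_domain):
            third_party.add(host)
    return ua_ids, sorted(third_party)


# Hypothetical page source; every ID and host here is made up.
sample = """
<script>ga('create', 'UA-1111111-1', 'auto');</script>
<script>ga('create', 'UA-2222222-3', 'auto', 'vendor');</script>
<img src="https://track.companyx.example/pixel.gif" width="1" height="1">
<img src="/images/watering-can.jpg">
<script src="https://www.gardeningsite.com/js/app.js"></script>
"""
ua_ids, hosts = find_footprints(sample, "gardeningsite.com")
print(ua_ids)
print(hosts)
```

More than one UA code on a page, or a pixel served from a vendor's domain, isn't proof of anything by itself, but it tells you exactly what to plug into a tool like SpyOnWeb next.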
Obviously this would be a major concern if you were the owner of www.gardeningsite.com - right? Well, maybe not.
What we've found more and more is that many folks just are not aware of the risks involved with automating page creation with thin, duplicated content and scaling that strategy across thousands of pages with an easily trackable footprint (on 40+ other domains). Which is why I'm hoping this post illuminates some of the clear red flags associated with services like this.
~~~~~~~~~~~~~~~~~~~~~~~~~
Red Flag 1: You give us access, we do the rest - no resources required on your end.
Ecomm Manager's POV: Great, no need to shift Joe Programmer's schedule to handle these SEO pages.
SEO's POV: Bad idea. Rarely would we ever recommend giving access, for the simple reason that you won't always know what is happening on your site (case in point).
~~~~~~~~~~~~~~~~~~~~~~~~~
Red Flag 2: All we need is a product feed, we'll use that to create all the pages.
Ecomm Manager's POV: Done, easy - we've already got that set up.
SEO's POV: How is it that the product feed will generate unique content for 1000s of pages? Hint: It won't.
~~~~~~~~~~~~~~~~~~~~~~~~~
Red Flag 3: We'll only charge you for the visits you get.
Ecomm Manager's POV: Nice, no upfront cost, no need to reconfigure my budget, plus I can track the ROI easily.
SEO's POV: Right...only charged for traffic you get, until you're penalized and lose all your traffic.
~~~~~~~~~~~~~~~~~~~~~~~~~
Bottom line Red Flag: Hitting all of these keywords will be an easy, low cost, low resource, low risk endeavor.
Ecomm Manager's POV: Great!
SEO's POV: BS! (And that is putting it kindly).
~~~~~~~~~~~~~~~~~~~~~~~~~
In conclusion: as an agency it is our duty to vet these things and advise our clients (which we did). But for the many sites that don't have an SEO-savvy ecomm manager or a team of SEOs at their disposal, identifying the problems with these types of automated page creation tools can be a difficult task. Ultimately I think that the 'bottom line red flag' should always be used as a sniff test - and hopefully the outline above can serve as a quick guide for how to do some initial digging the next time you get pitched.
If you've had any experiences with similar services (good or bad) feel free to drop a comment in the notes section below. I'm curious to know how prevalent stuff like this really is - especially in light of the recent algo updates which seem like they were engineered to hit stuff like this head on.
* Note: Special thanks to @AnnieCushing, @WissamDandan, @aknecht and a few others who helped us find SpyOnWeb last night.