These are some basic steps which I feel every search engine optimizer must follow when optimizing a website from scratch. If these issues are resolved at the start, you will not face optimization problems later on.
Check for 404 pages/broken links on website

Broken links are poison for any website. If you have access to a crawler, the first thing to do is check for broken links on the website. This will give you a clear picture of the site's navigation structure as well as the links which are broken. To save time you can also ask the website owner for access to their Google Webmaster account, which will outline the website's crawl issues in full. Once you have the complete list of broken links, have them corrected as soon as possible. I have developed a free online 404 checker for this purpose 🙂 so please do use it.
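As a rough illustration of what a crawler does first, here is a minimal link-extraction sketch in Python (the helper names and sample HTML are my own, not from any particular tool); a real checker would then request each extracted URL and flag those that return a 404 status.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html):
    """Return all anchor hrefs found in an HTML document."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links

page = '<p><a href="/about">About</a> <a href="/old-page">Old page</a></p>'
print(extract_links(page))  # each of these would then be fetched and its status checked
```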
Enable search engine friendly URLs on website
If the website's URL structure is dynamic in nature, i.e. its URLs contain "?" and "&", then you should try to have search engine friendly URLs enabled on the website. The benefit of these URLs is that they are keyword rich and help those pages rank in search engine results. Most websites nowadays already have such URLs enabled, as support is available on both Linux and Windows hosting servers.
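For example, a dynamic URL like `product.php?id=42&name=Red+Running+Shoes` would typically be rewritten to something like `/products/red-running-shoes`. Here is a small sketch of the slug-generation step (the helper is my own illustration; it assumes the server-side rewrite rules are already in place):

```python
import re

def slugify(title):
    """Turn a page title into a lowercase, hyphen-separated, keyword-rich URL slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())  # collapse anything non-alphanumeric
    return slug.strip("-")

print(slugify("Red Running Shoes!"))  # red-running-shoes
```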
Check for duplicate pages on website and remove them

Check for duplicate content using Moz or PlagSpotter. Both are paid, though Moz offers a 30-day trial period which you can cancel once it ends. Both websites tell you if there are other web pages with the same content as yours, and help you take action. Most of the time, duplicate content is due to minor navigational issues in your website which can be resolved with some tweaks; however, these need to be dealt with, otherwise the website may never rank high in search engine results and may even get banned.
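A quick way to spot exact duplicates within your own site (before reaching for Moz or PlagSpotter) is to hash each page's normalized text. This is a sketch under the assumption that you already have the page bodies fetched; the function name and sample pages are mine.

```python
import hashlib
from collections import defaultdict

def find_duplicate_groups(pages):
    """Group URLs whose page text is identical after whitespace normalization.

    pages: dict mapping URL -> page text
    """
    groups = defaultdict(list)
    for url, body in pages.items():
        normalized = " ".join(body.split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return [urls for urls in groups.values() if len(urls) > 1]

pages = {
    "/shoes?sort=asc": "Red shoes in stock",
    "/shoes?sort=desc": "Red shoes  in stock",   # same content, only whitespace differs
    "/hats": "Blue hats in stock",
}
print(find_duplicate_groups(pages))
```

Groups like these are exactly the duplicate-URL sets a canonical tag (covered below) is meant to resolve.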
Implement the canonical tag on website
As per Google's definition:
“A canonical page is the preferred version of a set of pages with highly similar content”
Google provides this one-stop solution so that website owners can implement the canonical tag and tell Google and other search engines which version of a page should be given priority in search results. You can read more about canonical tags here and here. The best part of this solution is that it can resolve your duplicate page content issues very quickly.
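In practice the tag is a single link element in the &lt;head&gt; of each variant page, pointing at the preferred URL (example.com is a placeholder here):

```html
<link rel="canonical" href="https://www.example.com/shoes/red-running-shoes">
```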
Make sure that the meta tags are unique for each page on website

Most websites have nearly the same meta title, meta description and meta keywords on multiple pages, which leads search engines to treat these pages as duplicates. Because of this, search engines do not rank these pages highly and they rarely appear in search results. Try to write meta content that is as unique as possible and relevant to each web page; it will help your website in the long run.
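If you can export your meta titles (say, from a crawl), checking for duplicates is easy to script. A small sketch, with hypothetical sample data of my own:

```python
from collections import Counter

def duplicate_titles(pages):
    """Return meta titles shared by more than one URL.

    pages: dict mapping URL -> meta title
    """
    counts = Counter(pages.values())
    return {title for title, n in counts.items() if n > 1}

pages = {
    "/": "Acme Widgets",
    "/about": "Acme Widgets",          # duplicate title: should be made unique
    "/contact": "Contact Acme Widgets",
}
print(duplicate_titles(pages))  # {'Acme Widgets'}
```

The same check can be run over meta descriptions; any title or description that comes back from this function is a candidate for a rewrite.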
Sometimes Google does not display the right title of a web page in its search listings. This does not mean that Google has not indexed your website: if you look at the Google cache of the page, you will see that it has been indexed with the correct title.
This usually happens when Google feels that the page title is not descriptive enough of the content on the web page. Google also takes information from DMOZ, though this depends on whether your website is listed there. If it is listed on DMOZ, add the following tag in the <head> section of your web page.
<meta name="robots" content="NOODP">
This tells Googlebot and other robots to ignore the DMOZ content and use only the web page's own title for display in search listings.
Most directory and listing websites need to record how many times their listings have been viewed by visitors and how many visitors have clicked on them. Not only that, they need to discount the impressions and clicks of search engine crawlers and spam bots from the actual data, so that the figures are correct and they have a clear picture of the ROI they are generating.
The most obvious way would be to record all this information in a database, however this would not be ideal in the following cases:
1. If the same visitor refreshes the listing page several times, the data would not be accurate, as the code would run each time the page is refreshed.
2. The database size could grow at an exponential rate, and even if it remained manageable, the amount of redundant data would not help in identifying the correct ROI.
3. Members and investors of the website would not get the complete picture, which may increase frustration levels.
Enter Google Analytics!
For example, if you are trying to record impressions on a business listing page, you could add the following code:
_gaq.push(['_trackEvent', 'Business', 'Listing', 'Business Name here']);
If you are trying to record the number of clicks for that business, you can add the following line to your anchor tag (<a>):
_gaq.push(['_trackEvent', 'Business', 'Click', 'Business Name here']);
If you would like to see how to place this line in your anchor tag, see below.
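One way to wire it up is through the anchor's onclick attribute, using the same asynchronous (_gaq) syntax as above; the href and the business name here are placeholders:

```html
<a href="http://www.example.com/business-name-here"
   onclick="_gaq.push(['_trackEvent', 'Business', 'Click', 'Business Name here']);">
  Business Name here
</a>
```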
The best part is that Google discounts redundant clicks and impressions, as well as search engine crawlers and spam bots, from its data, so you get an accurate picture.
My company's SEO team lead came to me with an interesting problem and I thought I'd share it with everyone.
It so happened that his team was doing SEO-related work on an e-commerce website, and they were seeing a lot of the website's pages indexed in Google which contained the text
The page you're looking for doesn't exist. If you followed a link from elsewhere in the site, please contact us.
Return to home page
One look at the content on these pages and I assumed it was their custom 404 page; Google, however, thought otherwise. It was treating all these pages as duplicate content, and the website's ranking was suffering because of it.
As the website had not been developed by us, we contacted the website's technical team and the client and informed them of the issue. They came back to us a couple of days later saying that these pages were 404 pages and this was a non-issue for them. In short, they did not believe that their 404 pages could be the cause of the duplicate content problem.
I then decided to check the HTTP status being returned when anybody browsed these pages, and voila, we had our answer. All the so-called custom 404 pages returned a 200 status, meaning that as far as search engines were concerned they really existed. The web server should have sent a 404 status if they truly were 404 pages; this would have informed the bots, which could then treat those pages as non-existent, and the problem would have been resolved.
The pieces started to come together.
Google considered all these pages to be actual pages, and because the same content was displayed on all of them, the website was being penalized for duplicate content. The problem lay with the e-commerce software they were using to run the website: it did not send out a 404 status. It simply displayed a page saying the page was not found.
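To make the difference concrete, here is a small self-contained Python sketch (standard library only, names are my own): a toy server that serves a custom "not found" page the right way, with a real 404 status, plus a helper that reports the status code any URL actually returns, which is the same check that exposed the problem above.

```python
import http.server
import threading
import urllib.error
import urllib.request

class NotFoundHandler(http.server.BaseHTTPRequestHandler):
    """Serves a friendly 'page not found' body with a proper 404 status."""
    def do_GET(self):
        body = b"<h1>The page you're looking for doesn't exist.</h1>"
        self.send_response(404)  # the crucial part: 404, not 200
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep demo output quiet

def get_status(url):
    """Return the HTTP status code a URL responds with."""
    try:
        return urllib.request.urlopen(url).status
    except urllib.error.HTTPError as exc:
        return exc.code  # urlopen raises on 4xx/5xx; the code is what we want

if __name__ == "__main__":
    server = http.server.HTTPServer(("127.0.0.1", 0), NotFoundHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    print(get_status(f"http://127.0.0.1:{port}/missing-page"))  # 404
    server.shutdown()
```

Had the e-commerce software behaved like this handler, the bots would have dropped those URLs instead of indexing them as duplicates.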
This incident should be an eye-opener for website owners, SEOs and web developers alike, as a simple mistake by your software can bring down your website's rankings like ninepins.
Most of the time you will find that even when a website is live and running with no bugs or issues, not many people are buying the product or service you are offering. People are visiting your website, but for some reason the number of people converting is still low.
This is often a confusing situation, as you don't know what the exact problem is. Is the problem the website itself, or is it the traffic?
If you go to your SEO team and ask about website traffic, they will produce all the traffic reports, with the number of hits, pageviews and so on, and you won't be able to point the finger at them, as they have the facts on their side.
If you then ask the development team about website performance, they will point to website uptime, the absence of bugs or issues on the website, and zero complaints from customers. Again, you can't point the finger at them, as they too have the facts on their side.
What do you do?
You should do the following.
You will HAVE to go through your website and identify the pages you think have the highest bounce rate, i.e. the pages where customers leave your website. If you think the page(s) in question are not getting the right message across to the customer, you will need to make changes to them and experiment.
Google has provided the Website Optimizer tool to help you identify possible issues which you may be overlooking, or which the customer may have been having a hard time with.
You have to create a different URL for the page you are experimenting with. Google will then serve both the new and old versions of the same page and provide you with statistics on which page converted better after a certain period.
Google Website Optimizer helps you make an informed decision based on the results it provides, instead of blindly making the change in the first place and then ruing your loss if the change did not go according to plan.
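The split-testing idea itself can be sketched in a few lines: assign each visitor deterministically to one version of the page, then compare conversion rates per variant. The hashing scheme below is purely my own illustration, not how Google's tool worked internally.

```python
import hashlib

def assign_variant(visitor_id, variants=("original", "variation")):
    """Deterministically bucket a visitor into one page variant.

    Hashing the visitor id means the same visitor always sees the same version.
    """
    digest = int(hashlib.md5(visitor_id.encode("utf-8")).hexdigest(), 16)
    return variants[digest % len(variants)]

def conversion_rate(conversions, visitors):
    """Fraction of visitors to a variant who converted."""
    return conversions / visitors if visitors else 0.0

print(assign_variant("visitor-123"))   # stable across calls
print(conversion_rate(18, 600))        # 18 conversions out of 600 visitors -> 0.03
```

Once each variant has enough traffic, the variant with the higher conversion rate is the one you keep, which is essentially the decision the tool's report made for you.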
If you want to know when GoogleBot last visited your website, you can use any of the following ways to find out:
1. Go through the web logs which Apache/IIS generate for your website and scan for GoogleBot. A very crude way indeed.
2. If you have signed up for Google Sitemaps, you can log into your account and find out when GoogleBot visited your website. Not only that, you also get up-to-date figures on your website's pages indexed in Google, as well as any crawl errors.
3. If you don't have time for the above two, just go to GBotVisit. Enter your website URL and copy the code it generates into the footer of your website (it's up to you where you place it). Now you know in real time when GoogleBot visits your website.
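The "crude" log scan from option 1 can at least be automated. A sketch assuming Apache's combined log format, with made-up sample lines (it also assumes the log is in chronological order, so the last match is the most recent):

```python
def last_googlebot_visit(log_lines):
    """Return the timestamp of the most recent Googlebot request, or None."""
    latest = None
    for line in log_lines:
        if "Googlebot" in line:
            # Combined log format keeps the timestamp between square brackets.
            start = line.find("[") + 1
            end = line.find("]", start)
            if start > 0 and end > start:
                latest = line[start:end]
    return latest

log = [
    '1.2.3.4 - - [10/May/2011:06:12:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '66.249.66.1 - - [10/May/2011:07:30:44 +0000] "GET /about HTTP/1.1" 200 310 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]
print(last_googlebot_visit(log))  # 10/May/2011:07:30:44 +0000
```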