Saturday, April 24, 2010

Spider Crawl Data

So I've been blogging about page speed, loading time, or down-load time and how Google rates web sites. Well, what is up with Google's spider? Googlebot has been hitting my site pretty hard. Now as I look at the Crawl errors report I find 12 pages that Google indicates as 'unavailable' ~ maybe the server is getting overloaded.

Now I have to ask if Google is the one that's slowing down my site as they continually pull down page after page. AWSTATS indicates that Googlebot has used up 198.13MBytes of bandwidth so far this month. The Yahoo spider was the next worst offender with 125MB of bandwidth used.

Was this a Catch 22 situation? I was updating hundreds of pages to reduce down-load times by adding a new search bar. GoogleBot picked up on the page up-dates and tried to retrieve them all within a few days ultimately slowing down my server. What is up with that? It may well be that I'll have to wait a few weeks before the data comes back to normal.

I mean look at the change in the time it required to download a page; a low of 68ms all the way up to 1,608ms. And if that data is true then why does the site performance tab indicate a 3.7 second average load time while the crawl rate indicates a 186ms average load time [0.186 seconds] ~ that's a pretty big difference. I can only conclude that the page takes 0.186 seconds to download, but the JavaScript for Google Analytics and any pic file from Google Picasa take up the rest of the time.
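A quick sketch of that gap, using only the two averages quoted above. The variable names are made up for illustration, and the idea that the remainder is all scripts and pic files is my guess, not something Google reports.

```python
# Figures quoted in the post (Google Webmaster Tools, April 2010).
crawl_download_s = 0.186    # average time Googlebot spent downloading a page
site_performance_s = 3.7    # average load time shown on the Site Performance tab

# Time that is NOT explained by fetching the HTML file itself.
remainder_s = site_performance_s - crawl_download_s

# Fraction of the reported load time spent on the HTML download.
html_share = crawl_download_s / site_performance_s

print(f"Unexplained remainder: {remainder_s:.3f} s")
print(f"HTML share of load time: {html_share:.1%}")
```

So the HTML download itself is only a small slice of the 3.7 seconds, assuming the two reports measure the same pages.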

Friday, April 23, 2010

AMDs 6-Core Processor

Did I just read an article saying AMD would come out with a new six-core processor for a mere $200? I think it may be time to up-grade my PC and get a new one. What; I'll have to upgrade to the new MS OS, whatever that is. I'll have to go out and find out what people are saying about OS7 [Windows 7].

Windows 7
-- Improved performance with multi-core processors
-- Improved boot performance
-- Does not yet support USB 3.0 [available with SP1 late 2010]
Desktop
-- No USB 3.0 support yet [that I found]
-- PCI Express 3.0 has not yet been released [later this year]
-- Serial ATA revision 3.0 [no products found yet]
-- DDR3  [mainstream support]
Ok I guess this blog posting is pointless. I would like the 6-core processor, but not at any cost. My office applications wouldn't really see any advantage anyway. But more importantly I find no support for USB 3.0 or Serial ATA revision 3.0. I also don't want Windows 7 before the first service pack.

So why purchase a PC that will be out-of-date 6 months after I get it? By next year all USB drives will be USB 3.0. I'm not sure if I would wait for PCIe 3.0 support; I'd be waiting another year. So no new PC for me this year, I'd just end up giving it away 6 months later.

I did note that the AM3 socket processor style is out. ~
Six core AMD Processors; Phenom II X6 1055T and Phenom II X6 1090T

Thursday, April 22, 2010

Web Site Performance

I've got a few more weeks of data from Google's Site Performance. The data seems to indicate that the site speed is about the same as it was in the last posting. However now Google indicates I'm 7% slower, so I have to assume that the other sites being used as a comparison are getting faster, because it still indicates 3.7 seconds to download a page.


Their text; "On average, pages in your site take 3.7 seconds to load (updated on Apr 20, 2010). This is slower than 62% of sites. These estimates are of medium accuracy (between 100 and 1000 data points). The chart below shows how your site's average page load time has changed over the last few months. For your reference, it also shows the 20th percentile value across all sites, separating slow and fast load times.".

So far 833 pages have received the newer, smaller html code for Google's Search Bar, but I guess the 1k of text reduction per page doesn't seem to be helping any. Looks like this is a scrolling graph, as Nov has fallen off the end and been replaced by newer data in Apr. The Page Speed suggestions do not seem to be updating, as Google still shows the pages and suggestions as before.

Thursday, April 15, 2010

Web Site Speed Enhancements

So to follow up on 3 of the last four postings regarding page downloading times and so on, I'll go ahead and detail a few of the things I've been doing to the web site to speed up down-loads. Or really to decrease page down-load times, depending on whether the page received a change or not. I really can't make a single change that would affect the entire web site; each page is completely separate from another page.

So the new Google search bar is now on 684 pages. Each page that gets the new search code sees a reduction in html code or text of 1,480 characters [1,480 bytes] ~ I started replacing the search bar code a few days before I posted about it with Custom Search Bar. Just 1K Byte may not sound like a lot of data, but in fact it is when you consider how often these pages are down-loaded, perhaps 15,000 page requests per day. It's a big saving on server bandwidth [over time] and Google sees the page as 1K smaller too, which was the point. People really pay for server bandwidth by the month, so my reduction would be 400,000 pages x 1,480 Bytes per month, once all the pages get switched over. The second benefit is that the old search code from Google used an Engineering site logo which Google saw as another DNS look-up that was a drain on the page loading time [logo stored off-site]. So this change saves the site 1 DNS look-up and 1KB per page.
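To put a number on that, here is a minimal sketch of the monthly saving. It assumes every one of the 400,000 monthly page requests is served with the new search code and uses decimal megabytes; the function name is just for illustration.

```python
BYTES_SAVED_PER_PAGE = 1_480   # old search bar code minus the new code
PAGES_PER_MONTH = 400_000      # monthly page requests, figure from the post


def monthly_saving_bytes(pages: int, bytes_per_page: int) -> int:
    """Total bytes no longer sent in a month."""
    return pages * bytes_per_page


saving = monthly_saving_bytes(PAGES_PER_MONTH, BYTES_SAVED_PER_PAGE)
saving_mb = saving / 1_000_000  # decimal megabytes

print(f"{saving:,} bytes per month, about {saving_mb:.0f} MB")
```

That works out to roughly 592 MB a month under those assumptions.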

Using data from Feedburner [the blog feed] and Adsense [advertiser] I determined that there were not that many people reading the blog as a news feed. Plus the news feed was not generating any revenue, so I decided to remove the feedburner banners from the web site. Right, why publicize; the banners take up space, slow the page down and produce no income from the site. In addition the banners required 663 characters of html text and required an additional DNS look-up. The down side is that the banners were only on about 6 pages, so the savings is small, but those 6 pages should make the entire site appear faster [to a small degree].

Five gif files have been removed from the site, two were reinserted into this blog. The attached graphic shows monthly traffic to the web site for 2009. In addition to the page losing the graphic and seeing the size reduction this blog gets a link from the web site indicating the new location of the graphic. The 5 pages also no longer require another DNS look-up because the graphics were out on Google Picasa. Now the FAQ pages never received that many hits and the gif's were out on Picasa so my server sees no change. However Google will see the loss of a DNS look-up and the disappearance of five 80K Byte pic files.

In addition to removing those pic files I also reduced the size of another 14 gif files, saving between 20 and 30K Bytes per file. Yet another small change, but these files were local so the server will also see a reduction.

Any single change is small but the aggregate speed increase to the web site should help. I'll find out in a few weeks when Google up-dates the Site Performance report again. It's hard to tell, but this is Search Engine Optimization [SEO] because Google uses down-load time as part of its Page Rank algorithm.

Wednesday, April 14, 2010

Google now ranks pages by Speed

Seems hard to believe, but they do indicate that one of the 200 different criteria for Page Rank is page loading time. Recall that just a few posts ago I indicated that their own speed rating was complaining about their own Google products [Web Site Speed Performance] being used on my pages. Even back in 2007 I had complained about the (then new) W3C strict coding style with the Physical Page Size post, as even the smallest function required a large amount of coding.

Anyway most of the comments (including mine) to Google's blog posting were negative, and for good reason. How do you trade off page content with page loading speed? Many people mentioned Google Analytics code [tracking] or Adsense code [ads] as issues with loading time. However there were two comments I would like to bring over from two different posters [each from a different Google Blog];

"Doesn't this punish the small operator who has less control over their, usually shared, hosting? Or those in countries that have lesser infrastructure? At the same time, allowing bigger business to throw money at the speed problem and gain a better ranking?"

"With the recent court ruling with the FCC vs. Comcast speed might be tiered or throttled in the future. Is that a concern?"

Anyway it's already been said on those blogs. It's the Gmail Buzz debacle 1 month later, I just don't get it ~ as I remove Google products the rest of the night.

The attached graphic is daily stats for the engineering website for March 2010 [generated by AWSTATS]. Nice to look at, but the real reason it's here is because it was just deleted from my site ~ replaced with a link to this posting. It was a large graphic that required an additional DNS look-up [because it was located out on Google Picasa]. The next few posts will also contain one of these FAQ pics as they are moved off the site to increase the speed of those pages. The pics also get dumped from Picasa.

Monday, April 05, 2010

Blogging and Feed Stats

So it's been a while since I spoke about the benefits of blogging, so it must be time to address the issue again.

Much of this blog deals with web master stuff, SEO techniques and web analytics. But almost any topic is fair game.

The first reason is to bring in traffic to your web site [Engineering Buses]. This particular blog brings in 2 or 3 visitors a day to the web site. Sixty five percent of that incoming traffic is from new visitors. Now that may not sound like a lot of people, but it's still new traffic from people that may not have otherwise found my web site. So in a sense a blog is like free advertising. The other blog which only relates to new page additions to the web site brings in twice the number of visitors.

The second reason, at least for me, is that I can blog about any topic. So I can cover topics I would not otherwise address. Remember each blog posting is just like a web page, so I can blog and generate a new web page about any topic which wouldn't fit or relate to the web site. There are currently 528 postings in this blog and another 212 postings in the other blog ~ 740 additional web pages.

Of course you don't even have to visit the blog, you can read it as a blog feed. The attached graphic is the blog feed stats from people reading the feed generated by feedburner. You can access the feed by clicking on the rotating Feeds banner to the left.

Saturday, April 03, 2010

Web Site Speed Performance

I figured I would follow yesterday's posting regarding adding the new, smaller, script for the Google search bar with how Google sees my site when down-loading. Remember their web spider, GoogleBot, reads 500 of my pages every day. See a previous blog posting on Special crawl setting. So Google would know if my server was slow or not.

Here is what Google had to say; "Performance overview
On average, pages in your site take 3.7 seconds to load (updated on Mar 26, 2010). This is slower than 55% of sites. These estimates are of medium accuracy (between 100 and 1000 data points). The chart above shows how your site's average page load time has changed over the last few months. For your reference, it also shows the 20th percentile value across all sites, separating slow and fast load times."

Google Webmaster Tools gives a lot of page examples and what I could do to speed them up.
Their first suggestion is to 'Enable gzip compression' to reduce the page size. That's a nice idea but it makes working on the page a bit hard. Why don't I just save the 2k and continue to replace the Google search bar? I mean I am careful about up-loading large graphic files. In fact pic files that can't be reduced get uploaded to GoogleSites, and I only use a link to the file.
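For a rough feel of what gzip does to html text, here is a small sketch using Python's standard gzip module. The sample page is made up [a repeated table row], so the ratio it prints says nothing about my real pages. Also note that this kind of compression is normally applied by the web server as the page is sent, so the file being edited would stay plain text.

```python
import gzip

# Hypothetical sample page: repetitive markup, which is typical of html tables.
row = '<tr><td class="term">Bus</td><td>Definition text</td></tr>\n'
html = ("<html><body><table>\n" + row * 200 + "</table></body></html>\n").encode("utf-8")

compressed = gzip.compress(html)        # what would travel over the wire
restored = gzip.decompress(compressed)  # what the browser ends up with

print(f"original:   {len(html):,} bytes")
print(f"compressed: {len(compressed):,} bytes")
print(f"round trip identical: {restored == html}")
```

Real pages have less repetition than this sample, so expect a smaller reduction than the sketch shows.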

Their second suggestion is to 'Minimize DNS lookups'. Well guess what, the DNS lookups are being used to access Google products. There are three common look-ups that they are referring to.
A logo used with the old Google search bar, which gets removed as the new search bar replaces it.
The Google Analytics code that I use as the site counter, provided by Google.
Finally, Google is complaining about pic files that I'm storing on Google Picasa that it has to down load. I just posted about Google Picasa off-line the other day too and how I used Picasa to save bandwidth.

So there are three of the four things Google says are slowing down my site, and they're all Google products. Does that make any sense? The fourth, the compression issue, may not be an issue at all if the rest of the Google code on my page was a tad bit smaller. For example the new search bar code is much smaller than it's been over the last five years. The adsense code also got smaller a few years ago, but could be smaller still.

Friday, April 02, 2010

Custom Search Bar

I started to add a new Google search bar to the web site. The new version of the search bar replaces the current one already used on the site. I'm not really sure when Google came out with the new code. I'm also not real keen with the reduction in options, but that's another story.

The code for the search bar is only 430 characters, while the current version used contained 1,910 characters. That's a reduction of 1,480 characters per html page [depending on the search bar used]. So a character is one byte of text, or 8 bits of data. Eight bits x 1,480 bytes = 11,840 bits per page ~ that's the size of a pic file.
Say 10,000 page views a day x 11,000 bits, that's 110M bits per day, or about 14M Bytes per day.
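A small sketch of that conversion, keeping bits and bytes apart since they are easy to mix up. It assumes one byte per character [plain ASCII html] and the rough 10,000 page views a day used above.

```python
OLD_CHARS = 1_910            # characters in the old search bar code
NEW_CHARS = 430              # characters in the new search bar code
PAGE_VIEWS_PER_DAY = 10_000  # rough daily page views
BITS_PER_BYTE = 8

saved_bytes_per_page = OLD_CHARS - NEW_CHARS                   # one byte per character
saved_bits_per_page = saved_bytes_per_page * BITS_PER_BYTE

saved_bytes_per_day = saved_bytes_per_page * PAGE_VIEWS_PER_DAY
saved_bits_per_day = saved_bits_per_page * PAGE_VIEWS_PER_DAY

print(f"per page: {saved_bytes_per_page:,} bytes = {saved_bits_per_page:,} bits")
print(f"per day:  {saved_bytes_per_day / 1e6:.1f} MBytes = {saved_bits_per_day / 1e6:.1f} Mbits")
```

Without rounding 11,840 down to 11,000 the daily figure comes to 14.8 MBytes, or 118.4 Mbits.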

Now I've only started to change the search bar on a few pages so far, so I'll be changing the code the rest of the year. The pages with the search bar at the top center of the page keep the same location, while the Dictionary style pages with a side bar will have their search bar moved up to the top of the side bar. Although you should keep the search bar in the same place for all your pages so people can find it, these pages are having it moved to the top, just to the right.

I may be saving about 14MB of bandwidth/day [110M bits], but the pages are getting larger as the site grows. So in reality I will not see any reduction in bandwidth, maybe just keeping the bandwidth around the 13G Byte number. Take note of the last posting to see the increase in bandwidth over the last month. Remember it's nice to load up a page with graphics or whatever, but keep your visitors in mind and keep the page-loading time down. Otherwise you may have people clicking away because the page takes too long to load.

Today I'll add the new search bar to the few dozen pages that get down-loaded the most, that way I'll see an immediate reduction in bandwidth. The other pages can wait until they need some other up-date.
Graphic; US Coast Guard HH-65C helicopters.

Thursday, April 01, 2010

SEO stuff really works

The numbers are in from last month, so I guess I should post them. The numbers seem to be on the increase, more than I figured. Although I can always predict the out-come based on the average numbers ~ maybe 10,000 visits per day, and 5,000 on the week-ends. However I did not expect to see the large jump in page views, which is finally up to the numbers back in 2006. I guess the trick now is to keep the numbers up there~

How to read the data:


Server Bandwidth:
The lowest curve is server bandwidth and does not relate to the other numbers on the chart. The bandwidth is hovering around 148,000 [in the graph] but really equates to 14G Bytes as the numbers were changed to fit the graph. I track bandwidth just to make sure the server does not see a heavy load.

Unique Visits:
Are visits from a computer within a month, but any one computer is only counted one time. If any one computer returns for a second visit it's counted by the Visits curve.

Visits:
A site visit is registered each time a person visits the site within a month and each time the person returns to the site. Site Visits should always be equal to or greater than Unique Visits.

Page Views:
Are the number of pages a person views per month, regardless of how many times the visitor returns to the web site. Page Views should always be equal to or greater than Site Visits. Page views are really the only data point that is falling. Page Views is related to Bounce Rate, which is the percentage of visits in which a person views one page and then leaves the site.
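The four definitions above can be sketched with a tiny made-up log. The visitor ids and page counts are hypothetical, not my real traffic, and this is only my reading of the definitions, not how AWSTATS works internally.

```python
# Hypothetical sample month: one (visitor_id, pages_viewed) entry per visit.
visits_log = [
    ("a", 1),
    ("a", 4),   # visitor "a" came back for a second visit
    ("b", 1),
    ("c", 2),
    ("c", 1),
]

unique_visits = len({visitor for visitor, _ in visits_log})  # each computer counted once
visits = len(visits_log)                                     # every visit counted
page_views = sum(pages for _, pages in visits_log)           # every page counted

# Bounce rate: share of visits that viewed exactly one page.
bounces = sum(1 for _, pages in visits_log if pages == 1)
bounce_rate = bounces / visits

print(unique_visits, visits, page_views, f"{bounce_rate:.0%}")
```

With this sample the ordering holds as described: Unique Visits is no larger than Visits, and Visits is no larger than Page Views.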

Another way to see the same data, as site visits, or number of visits ~ so a comparison can be made year over year. This chart makes it easy to see that site visits are higher than any other month and any previous year.
2005 was the year I started to follow Search Engine Optimization [SEO] techniques. I guess the SEO stuff really works.