Posts Tagged ‘search engines’

Preserve domain name after redirect with mod_rewrite and some useful mod rewrite redirect and other examples – Redirect domain without changing URL

Friday, July 11th, 2014

redirect_domain_name_without_changing_url_apache_rewrite_rule_preventing_host_in_ip_mod_rewrite
If you're a webhosting company sysadmin, sooner or later you will be asked by application developer or some client to redirect from an Apache webserver to some other webserver / URL's IP, in a way that the IP gets preserved after the redirect.

I'm aware of two major ways to do the redirect on webserver level:

1. To redirect From Apache host A to Webserver on host B using ReverseProxy mod_proxy

2. To use Mod Rewrite to redirect all client requests on host A to host B.

There is quite a lot to be said and is said and written online on using mod_rewrite to redirect URLs.
So in this article I will not say nothing new but just present some basic scenarios on Redirecting with mod rewrite and some use cases.
Hope this examples, will help some colleague sys-admin to solve some his crazy boss redirection tasks 🙂 I'm saying crazy boss because I already worked for a  start-up company which was into internet marketing and the CEO has insane SEO ideas, often impossible to achieve …

a) Dynamic URL Redirect from Apache host A to host B without changing domain name in browser URL and keeping everything after the query in

Lets say you want to redirect incoming traffic to DomainA to DomainB keeping whole user browser request, i.e.

Redirect:

http://your-domainA.com/whole/a/lot/of/sub/directory/query.php


Passthe the whole request including /whole/a/lot/of/sub/directory/query.php

so when Apache redirects to redirect to:

http://your-domainB.com/whole/a/lot/of/sub/directory/query.php

In browser 
To do it with Mod_Rewrite either you have to add in .htaccess mod_rewite rules:

RewriteEngine On
RewriteCond %{HTTP_HOST} ^your-domainA.com [OR]
RewriteCond %{HTTP_HOST} ^http://your-domainA.com
RewriteRule ^(.*) http://your-domainB.com/$1 [P]

or include this somewhere in VirtualHost configuration of your domain
 

Above mod_rewrite will make any request to your-domainA.com to forward to your-domainB.com while preserving the hostname in browser URL bar to old domain http://your-domainA.com, however still contet will be served by http://your-domainB.com
 

http://yourdomainA.com/YOUR-CUSTOM-REQUEST-ADDRESS


to redirect to

http://yourdomainB.com/YOUR-CUSTOM-REQUEST-ADDRESS


WARNING !!  If you're concerned about your SEO well positioning in search Engines, be sure to never ever use such redirects. Making such redirects will cause two domains to show up duplicate content
and will make Search Engines to reduce your Google, Yahoo, Yandex etc. Pagerank !!

Besides that such, redirect will use mod_rewrite on each and every redirect so from performance stand point it is a CPU killer (for such redirect using native mod_proxy ProxyPass is much more efficient – on websites with hundred of thousands of requests daily using such redirects will cause you to spend your  hardware badly  …)

P.S. ! Mod_Rewrite and Proxy modules needs to be previously enabled
On Debian Linux, make sure following links are existing and pointing to proper existing files from /etc/apache2/mods-available/ to /etc/apache2/mods-enabled

debian:~#  ls -al /etc/apache2/mods-available/*proxy*
-rw-r–r– 1 root root  87 Jul 26  2011 /etc/apache2/mods-available/proxy_ajp.load
-rw-r–r– 1 root root 355 Jul 26  2011 /etc/apache2/mods-available/proxy_balancer.conf
-rw-r–r– 1 root root  97 Jul 26  2011 /etc/apache2/mods-available/proxy_balancer.load
-rw-r–r– 1 root root 803 Jul 26  2011 /etc/apache2/mods-available/proxy.conf
-rw-r–r– 1 root root  95 Jul 26  2011 /etc/apache2/mods-available/proxy_connect.load
-rw-r–r– 1 root root 141 Jul 26  2011 /etc/apache2/mods-available/proxy_ftp.conf
-rw-r–r– 1 root root  87 Jul 26  2011 /etc/apache2/mods-available/proxy_ftp.load
-rw-r–r– 1 root root  89 Jul 26  2011 /etc/apache2/mods-available/proxy_http.load
-rw-r–r– 1 root root  62 Jul 26  2011 /etc/apache2/mods-available/proxy.load
-rw-r–r– 1 root root  89 Jul 26  2011 /etc/apache2/mods-available/proxy_scgi.load

debian:/etc/apache2/mods-avaialble:~# ls *proxy*
proxy.conf@  proxy_connect.load@  proxy_http.load@  proxy.load@


If it is is not enabled to enable proxy support in Apache on Debian / Ubuntu Linux, either create the symbolic links as you see them from above paste or issue with root:
 

a2enmod proxy_http
a2enmod proxy

 

b) Redirect Main Domain requests to other Domain specific URL
 

RewriteEngine On
RewriteCond %{HTTP_HOST} ^your-domainA.com
RewriteRule ^(.*) http://your-domainB.com/YOUR-CUSTOM-URL [P]

Note that no matter what kind of subdirectory you request on http://your-domain.com (lets say you type in http://your-domainA.com/My-monkey-sucks ) it will get redirected to:

http://your-domainB.com/YOUR-CUSTOM-URL

Sometimes this is convenient for SEO, because it can make you to redirect any requests (including mistakenly typed requests by users or Bot Crawlers to real existing landing page).

c) Redirecting an IP address to a Domain Name

This probably a very rare thing to do as usually a Domain Name is redirected to an IP, however if you ever need to redirect IP to Domain Name:

RewriteCond %{HTTP_HOST} ^##.##.##.##
RewriteRule (.*) http://your-domainB.com/$1 [R=301,L]

Replace ## with digits of your IP address, the is used to escape the (.) – dots are normally interpreted by mod_rewrite.

d) Rewritting URL extensions from .htm to .php, doc to docx etc.

Lets say you're updating an old website with .htm or .html to serve .php files with same names as old .htmls use following rewrite rules:. Or all your old .doc files are converted and replaced with .docx and you need to make Apache redirect all .doc requests to .docx.
 

Options +FollowSymlinks
RewriteEngine on
RewriteRule ^(.*).html$ $1.php [NC]

Options +FollowSymlinks
RewriteEngine on
RewriteRule ^(.*).doc$ $1.docx [NC]

The [NC] flag at the end means "No Case", or "case-insensitive"; Meaning it will not matter whether files are requested with capital or small letters, they will just show files if file under requested name is matched.

Using such a redirect will not cause Apache to redirect old files .html, .htm, .doc and they will still be accessible again creating duplicate content which will have a negavite impact on Search Engine Optimization.

The better way to do old extensioned files redirect is by using:

Options +FollowSymlinks
RewriteEngine on
RewriteRule ^(.+).htm$ http://your-domainB.com/$1.php [R,NC]

[R] flag would tell make mod_rewrite send HTTP "MOVED TEMPORARILY" redirection, aka, "302" to browser. This would cause search engines and other spidering entities will automatically update their links to the new locations.

e) Grabbing content from URL with Mod Rewrite and passing it to another domain

Lets say you want zip files contained in directory files/ to be redirected from your current webserver on domainA to domainB's download.php script and be passed as argument to the script

Options +FollowSymlinks
RewriteEngine on
RewriteRule ^files/([^/]+)/([^/]+).zip https://www.pc-freak.net/download.php?section=$1&file=$2 [R,NC]


f) Shortening URLs with mod_rewrite

This is ueful If you have a long URL address accessible via some fuzzy long hard to remember URL address and you want to make it acessible via a shorter URL without phyisally moving the files within a short named directory, do:

Options +FollowSymlinks
RewriteEngine On
RewriteRule ^james-brown /james-brown/files/download/download.php

Above rule would make requests coming to http://your-domain.com/james-brown?file=my.zip be opened via http://mysite/public/james-brown/files/download/download.php?file=my.zip

g) Get rid of the www in your domain name

Nowdays many people are used to typing www.your-domain.com, if this annoys you and you want them not to see in served URLs the annoying www nonsense, use this:

Options +FollowSymlinks
RewriteEngine on
RewriteCond %{http_host} ^www.your-domain.com [NC]
RewriteRule ^(.*)$ http://your-domain.com/$1 [R=301,NC]

That's mostly some common uses of mod rewrite redirection, there are thousands of nice ones. If you know others, please share?


References and thanks to:

How to redirect domain without changing the URL

More .htaccess tips and tricks – part 2

 

 

Improve Website Apache Webserver SEO without Website source code moficitations with Google PageSpeed module on Debian, Ubuntu, CentOS, Fedora and SuSE Linux

Thursday, December 18th, 2014

Improve-website-apache-webserver-seo-without-website-source-code-modifications-with-Google-PageSpeed-Apache-module

For hosting companies and even personal website speed performance becomes increasingly important factor that gives higher and higher weight on overall PageRank and is one of the key things for Successful Site Search Engine Optimization (positioning) in Search Engines of a not specially SEO friendly crafted website.

Virtually all Google / Yahoo / Bing,  Yahoo  etc. Search Engines give better pagerank to websites which load faster and has little or no downtimes, for the reason a faster loading time of a website pages means better user experience and is indicator that the website is well maintained. 

Often websites deployed written for purpose of a business-es or just community CMS / Blog Website Open Source systems such as Joomla, Drupal and WordPress by default are not made to provide fantastic speed right after deploy without install of custom plugins and website tuning, i.e.:

  • Content size optimization (gzipping)
  • More efficient way to deliver CSS / Javascript (MinifyJS / CSS files into single ones
  • HTML optimization
  • Stripping (useful) page Comments
  • Adding <head> if missing on pages etc.

. Therefore as I said in many of my previous LAMP Optimization articles page  (opening) speed could make really Bad Users / Clients experience when the site grows too big or is badly optimized it gives degraded page speed times (often page loads 20 / 30 seconds waiting for the page to load!). Having Pages lagging on big information sites or EShos has both Ruining Company's Image on the market and quickly convinces the user to use another service from the already thosands available and thus drives out (potential) customers.

As Programming code maintainance and improvement is usually very costly, companies that want to save money or can't afford it (because of the shrinking budgets dictacted by the global economic crisis), the best thing to do is to ask your sysadmin to Squeeze the Best out of the WebService and Servers without major (Backend Code) infrastructural changes.

To  Speed up Apache and create Proper Page Caching without installing on server external PHP Caching modules such as Eaccelerator  / PHP APC caching and without
extra CMS modules
such as lets say WordPress W3 Total Cache there is Google Develop Apache Webserver external module – PageSpeed.

Here is Google Pagespeed Module overview :
 

PageSpeed speeds up your site and reduces page load time. This open-source webserver module automatically applies web performance best practices to pages and associated assets (CSS, JavaScript, images) without requiring that you modify your existing content or workflow.


What does Apache Google PageSpeed actually does?
 

  • Automatic website and asset optimization
  • Latest web optimization techniques
  • 40+ configurable optimization filters
  • Free, open-source, and frequently updated
  • Deployed by individual sites, hosting providers, CDNs


1. Install PageSpeed on Debian / Ubuntu, deb derivatives) Linux

a) Download and install module 

On 64 bit deb based Linux:

cd /usr/local/src
wget https://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-stable_current_amd64.deb 
dpkg -i mod-pagespeed-stable_current_amd64.deb
apt-get -f install


On 32 bit Linux:

cd /usr/local/src
wget https://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-stable_current_i386.deb
dpkg -i 
direct/mod-pagespeed-stable_current_i386.deb
apt-get -f install


b) Restart Apache
 

sudo /etc/init.d/apache2 restart

Important files and folders placed on server by deb installer are:

/usr/bin/pagespeed_js_minify – binary that does Javascript minification
/etc/apache2/mods-available/pagespeed.conf – Pagespeed config
/etc/apache2/mods-available/pagespeed.load – Load module directives in Apache
/etc/cron.daily/mod-pagespeed – mod_pagespeed cron script for checking and installing latest updates.
/var/cache/mod_pagespeed – Mod Pagespeed cahing folder (useful to install memcached to increase even further caching performance)
/var/log/pagespeed – Directory to store pagespeed log files

 

2. Install PageSpeed on (RPM based CentOS, Fedora, RHEL / SuSE Linux)


RPM 64 bit package install:
 

rpm -Uvh https://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_x86_64.rpm

 


32 bit pack version:
 

rpm -Uvh https://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-stable_current_i386.rpm


Modify pagespeed mod config 

Restart Apache

sudo /etc/init.d/httpd restart


Important config files and folders created during RPM install are:

  • /etc/cron.daily/mod-pagespeed : mod_pagespeed cron script for checking and installing latest updates.
  • /etc/httpd/conf.d/pagespeed.conf : The main configuration file for Apache.
  • /usr/lib/httpd/modules/mod_pagespeed.so : mod_pagespeed module for Apache.
  • /var/www/mod_pagespeed/cache : File caching direcotry for web sites.
  • /var/www/mod_pagespeed/files : File generate prefix for web sites.

3. Configuring Google PageSpeed module

 

To configure PageSpeed you can either edit the package installed bundled pagespeed.conf (/etc/apache2/mods-available/pagspeed.conf,  /etc/httpd/conf.d/pagespeed.conf) or insert configuration items inside Apache VirtualHosts config files or even if you need flexibility and you don't have straight access to Apache config files (on shared hosting servers where module is available) through .htaccess.
Anyways try to avoid adding pagespeed directives to .htaccess as it will be too slow and inefficient.

Configuration is managed by setting different so-called "Rewrite Levels". Default behavior is to use Level of "Corefilters.", a set of filters (module behavior configs) which according to Google is safe for use. PageSpeed Filters is a set of actions applied to Web Delivered files.

Default config setting is hence:
 

ModPagespeedRewriteLevel CoreFilters

Disabling default set of filters is done with:
 

ModPagespeedRewriteLevel PassThrough

"Corefilters" default filter set as of time of writting this article:
 

add_head
combine_css
convert_jpeg_to_progressive
convert_meta_tags
extend_cache
flatten_css_imports
inline_css
inline_import_to_link
inline_javascript
rewrite_css
rewrite_images
rewrite_javascript
rewrite_style_attributes_with_url

Complete documentation on Configuring PageSpeed Filters is here.

If caching is turned on, default PageSped caching is configured in /var/cache/mod_pagespeed/
Enabling someof the non-Corefilters that sometimes are useful for SEO (reduce of served / returned pagesize) are:
 

ModPagespeedEnableFilters pedantic,remove_comments

By default pagespeed does some things (such as inline_css, inline_javascript and rewrite_images (Optimize, removing Excess pixels).  My litle experience with pagespeed shows in some cases this could break websites), so I found for my case useful to disable some of the filters:

 

vim /etc/apache2/mods-available/pagespeed.conf

 

ModPagespeedDisableFilters rewrite_images,convert_jpeg_to_progressive,inline_css,inline_javascript

 

4. Testing if PageSpeed is Enabled pagespeed_admin

By default PageSpeed has Admin which by default is only allowed to be accessed from server localhost (127.0.0.1) to get basic statistics either install text browser like lynx / elinks or add more access IPs again in pagespeed config / vhosts pagespeed.conf include more Allow lines like below:

 

    <Location /pagespeed_admin>
        Order allow,deny
        Allow from localhost
        Allow from 127.0.0.1
        Allow from 192.168.1.1
        Allow from xxx.xxx.xxx.xxx

        #Allow from All
        SetHandler pagespeed_admin
    </Location>
    <Location /pagespeed_global_admin>
        Order allow,deny
        Allow from localhost
        Allow from 127.0.0.1

        Allow from 192.168.1.1
        Allow from xxx.xxx.xxx.xxx
        SetHandler pagespeed_global_admin
    </Location>

 

Once configured pagespeed_admin access it with favourite browser on:

http://127.0.0.1/pagespeed_admin
http://127.0.0.1/pagespeed_global_admin

improve-website-apache-webserver-seo-without-source-code-modifications-google-pagespeed_admin_panel

Other way to test it is enabled is by creating php file with good old <? phpinfo(); ?> – PHP stats enabled / disabled features code:

pagespeed-in-phpinfo-x-mod-pagespeed-output-screenshot-apache-webserver

I've also tested also pagespeed unstable release, but experienced some segmentation faults in both error.log and access.log so finally decided to keep using stable release.

PageSpeed is a great way to boost your server sites performance, however it comes on certain costs as expect your server CPU Load to jump drastically, (in my case it jumped more than twice), there are Linux servers where enabling the module could totally stone the servers, so before implementing the module on a Production system environment, always first test thouroughfully with loaded pagespeed on UAT (testing) environment with AB or Siege (Apache Benchmarking Tools).

How to set custom page titles in Joomla 1.5 manually for better SEO

Thursday, June 2nd, 2011

he Joomla CMS default behaviour is that Page titles of the Joomla Articles created are always set to the page Title assigned to each of the articles.

This is not very good behaviour in terms of SEO, as the page title of each link on the main page is different and there is no continuous repeating pattern in all of the joomla pages.
Everyone that has even basic idea of SEO knows that page titles are very important weight factor to make indexing inside Search Engines succesful.

There is a well know SEO rule which is the more reoccuring pattern one has in his page titles, more is stressed on the keywords contained in the title.
As I said for some weird reason Joomla has no common page Title for all my the created Article pages linked via the Main Menu*

Thus in order to improve this bad default Joomla SEO behaviour one has to change the default auto assigned titles for created pages, manually.

Two things are necessery to change each of the joomla already existing TITLES.

1. Go to each of the pages (.e.g. Home etc.) and change the Parameters System Page Title settings

After logging in with administrator in Joomla, navigate to Menus -> Main Menu*

Further on choose a menu item from all your existing items, let’s say Home and click on it.

On the left side below the Save, Apply, Close and Help buttons you will notice the menus:

Parameters (Basic), Parameters (Component), Parameters (System)

When clicked on Parameters (System) a submenu will appear:
Joomla Main Menu Parameters System Page Title better SEO

Above is a screenshot of the up-described Parameters (System) [Page Title] location

You need to change where it reads on the screenshot CHANGE THE TITLE HERE !!!!!! 😉

After entering your own desired page title go and save the article via the Apply or Save button (also visible in the screenshot).

Now as the custom Page Title is set, next step is to enable the custom Page Title for the respective Article in Article Manager

2. Enable custom Page Title for created pages in Joomla

Go to the Article Manager by following the menus:

Content -> Article Manager

Select the Article of which you want to change the Page Title to some custom text and click over it.

As the article opens for edit in an html editor, navigate to Parameters (Advanced) tab and therein change the Show Title from default setting value:
Use Global
to
Yes

Once again use the Save or Apply button to confirm the new settings and open your website in a new tab, try to browse and check the title of the articles parameters just edited. It should show up in the Title (page heading) the custom input Title.

Now repeat the same procedure for all pages (Articles), existing in Joomla to attune the Page Titles to some Google friendly strings and enjoy the better Search engine indexing which should likely follow.

How to add Apache 301 redirect to VirtualHost in Apache

Sunday, September 25th, 2011

I’ve had two domain names which were pointing to the same website content.
As one can read in any SEO guide around this is a really bad practice as search engines things automatically there is a duplicate site content and this has automatically a negative effect on the site pagerank.
To deal with situation where multiple domains are pointing to the same websites its suggested by many SEO specialists that a 301 redirect is created from all the domain websites to a single website domain which will open the actual website.

Making the 301 direct domain from the sample domain my-redirect-domain.com to www.mydomain.com can be done with a virtualhost dfefinition in either httpd.conf or with the respective file containing the domain virtualhost definitions:
Here is the exact VirtualHost code I use to make a 301 redirect.

<VirtualHost *>  ServerAdmin support@mydomain.com  ServerName my-redirected-domain.com
ServerAlias my-redirected-domain.com www.my-redirected-domain.com
RewriteEngine on RewriteRule ^/(.*) http://www.mydomain.com/$1 [L,R=301]
</VirtualHost>

After placing the VirtualHost redirect, an apache redirect is required.
Further on when a Gooogle or Yahoo Bot visits the website and does any request to my-redirect-domain.com or www.my-redirect-domain.com , they will be redirected with a 301 reuturned code to www.mydomain.com

This kind of redirect however can have a negative impact on the Apache CPU use (performance), especially if the my-redirect-domain.com is high traffic domain. This is because the redirect is done with mod_rewrite.

Therefore it might be better on high traffic domains to create the mod_rewrite redirect by using a vhost like:

<VirtualHost *>
ServerAdmin support@mydomain.com
ServerName my-redirected-domain.com
Redirect 301 / http://www.mydomain.com/
</VirtualHost>

The downside of using the Apache 301 redirect capabilities like in the above example is that any passed domain urls like let’s say http://www.my-redirected-domain.com/support/ would not be 301 redirected to http://www.mydomain.com/support/ but instead the redirect will be done straight to http://www.mydomain.com/

Recovering long lost website information (data) with wayback machine

Monday, May 9th, 2011

Wayback machine, see 2 years old website from cache service

I needed a handy way to recover some old data of an expired domain containing a website, with some really imprtant texts.
The domains has expired before one year and it was not renewed for the reason that it’s holder was not aware his website was gone. In the meantime somebody registered this domain as a way to generate ads profit from it the website was receiving about 500 to 1000 visitors per day.
Now I have the task to recover this website permanently lost from the internet data. I was not able to retrieve anything from the old domain name be contained via google cache, yahoo cache, bing etc.
It appears most of the search engines store a cached version of a crawled website for only 34 months. I’ve found also a search engine gigablast which was claimed to store crawled website data for 1 year, but unfortunately gigablast contained not any version of the website I was looking for.Luckily (thanks God) after a bit of head-banging there I found a website that helped me retrieve at least some parts from the old lost website.

The website which helped me is called WayBack Machine

The Wayback Machine , guys keeps website info snapshots of most of the domain names on the internet for a couple of years back, here is how wayback machine website describes its own provided services:

The Internet Archive's Wayback Machine puts the history of the World Wide Web at your fingertips.

Another handy feature wayback machine provides is checking out how certain websites looked like a couple of years before, let’s say you want to go back in the past and see how yahoo’s website looked like 2 years ago.

Just go to web.archive.org and type in yahoo and select a 2 years old website snapshot and enjoy 😉

It’s really funny how ridiculous many websites looked like just few years from now 😉