Posts Tagged ‘browser’

Improve wordpress admin password encryption authentication keys security with WordPress Unique Authentication Keys and Salts

Friday, October 9th, 2020

wordpress-improve-security-logo-linux

Having a WordPress blog or website with an administrator account accessed over a secured SSL channel is common nowadays. However, there are plenty of SSL encryption leaks already out there, and many of them are either slow to be patched or the hosting companies do not care enough to patch the libssl Linux libraries / webserver level on time. Taking that into consideration, many websites hosted on unmaintained, set-up-once and rarely updated Linux servers are still vulnerable, and if you paid for some shared hosting in the past where someone else hosted the website for you, it might happen that your WordPress installation is still living on one of those SSL-vulnerable hosts. In situations like that malicious hackers could break the SSL security up to some level, or, even if the SSL itself is solid, use a MITM (Man In The Middle) attack: simulating the SSID name of your well-secured and trusted WiFi network and redirecting your network traffic through a transparent SSL proxy while you connect to the WordPress Administrator Dashboard via https://your-domain.com/wp-admin. Once your traffic is passing through the malicious hax0r, even if you don't type the password every time (e.g. you have saved the password in the browser and the WordPress Admin Panel authentication is done via a cookie), the cookies generated and used by the WordPress site could easily be stolen by the vicious 1337 h4x0r, the hash reversed with an interceptor tool, and the attacker could log in to your WordPress …

Therefore, to improve the WordPress site security it is very important to have configured WordPress Unique Authentication Keys and Salts (also known as the WordPress security keys).

They are used by the WordPress installation so that every opened WordPress Blog / Site Admin session gets a uniquely generated key and salt, different from the default ones.

So what are the Authentication Unique Keys and Salts and why are they used?

Like with almost any other web application, when a PHP session to WordPress is opened, the code creates a number of cookies stored locally on your computer.

Two of the cookies created are called:

 wordpress_[hash]
wordpress_logged_in_[hash]

The first cookie is used only in the admin pages (WordPress dashboard), while the second cookie is used throughout WordPress to determine if you are logged in to WordPress or not. Note: [hash] is a random hashed value typically assigned to your session, so in reality the cookie would be named something like wordpress_ffc02f68bc9926448e9222893b6c29a9.

The WordPress session stores your authentication details (i.e. WordPress username and password) in both of the above-mentioned cookies.

The authentication details are hashed, hence it is almost impossible for anyone to reverse the hash and guess your password through a cookie should it be stolen. By "almost impossible" I mean that with today's computers it is practically unfeasible to do so.

WordPress security keys are made up of four authentication keys and four hashing salts (randomly generated data) that, when used together, add an extra layer of protection to your cookies and passwords.

The authentication details in these cookies are hashed using the random pattern specified in the WordPress security keys. I will not get into too much detail, but as you might have heard, in cryptography salts and keys are important – an in-depth explanation on salts in cryptography is available (here). A good read for those who want to know more about how salt-based authentication works can be found on stackexchange.

How to Set up Salt and Key Authentication on WordPress
 

To be used by WordPress, the salts and keys should be configured in wp-config.php; usually they look like so:

wordpress-website-blog-salts-keys-wp-config-screenshot-linux

!!! Note !!! When generating the definition strings (manually or via a random generator program), you have to use a random string value of more than 60 characters to prevent predictability.

The default on any newly installed WordPress website is to have the four _KEY and the four _SALT definitions set to unconfigured strings, looking something like:

default-WordPress-security-keys-and-salts-entries-in-wordPress-wp-config-php-file

Most people never take a look at wp-config.php, as only the Web GUI is used for any maintenance tasks, so there is a great chance that, unless you heard about it from some WordPress Security Expert forum or from a security plugin (such as WP Titan Anti Spam & Security) installed to report the WP KEY / SALT state, you might have never noticed it in the config.

There are 8 WordPress security keys in current WP Installs, but not all of them have been introduced at the same time.
Historically they were introduced in WP versions in below order:

WordPress 2.6: AUTH_KEY, SECURE_AUTH_KEY, LOGGED_IN_KEY
WordPress 2.7: NONCE_KEY
WordPress 3.0: AUTH_SALT, SECURE_AUTH_SALT, LOGGED_IN_SALT, NONCE_SALT

Setting custom randomly generated values is an easy task, as there is already an online WordPress security key random generator.
Visit it and you will get automatically generated random values which can be copied / pasted straight into your wp-config.php.
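If you prefer to generate the values from the command line instead of via a browser, the same thing can be done with curl against the official WordPress.org secret-key service; the output is the 8 ready-to-paste define() lines:

curl -s https://api.wordpress.org/secret-key/1.1/salt/

# or save them to a temporary file to edit / randomize further before pasting into wp-config.php
curl -s https://api.wordpress.org/secret-key/1.1/salt/ > /tmp/wp-new-salts.txt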

However, if you're paranoid about the guessability of the random generator algorithm, I would advise you to use the generator and then change some random characters yourself on each of the 8 lines; the end result in the configuration should be something similar to:

 

define('AUTH_KEY',         '|w+=W(od$V|^hy$F5w)g6O-:e[WI=NHY/!Ez@grd5=##!;jHle_vFPqz}D5|+87Q');
define('SECURE_AUTH_KEY',  'rGReh.<%QBJ{DP )p=BfYmp6fHmIG~ePeHC[MtDxZiZD;;_OMp`sVcKH:JAqe$dA');
define('LOGGED_IN_KEY',    '%v8mQ!)jYvzG(eCt>)bdr+Rpy5@t fTm5fb:o?@aVzDQw8T[w+aoQ{g0ZW`7F-44');
define('NONCE_KEY',        '$o9FfF{S@Z-(/F-.6fC/}+K 6-?V.XG#MU^s?4Z,4vQ)/~-[D.X0<+ly0W9L3,Pj');
define('AUTH_SALT',        ':]/2K1j(4I:DPJ`(,rK!qYt_~n8uSf>=4`{?LC]%%KWm6@j|aht@R.i*ZfgS4lsj');
define('SECURE_AUTH_SALT', 'XY{~:{P&P0Vw6^i44Op*nDeXd.Ec+|c=S~BYcH!^j39VNr#&FK~wq.3wZle_?oq-');
define('LOGGED_IN_SALT',   '8D|2+uKX;F!v~8-Va20=*d3nb#4|-fv0$ND~s=7>N|/-2]rk@F`DKVoh5Y5i,w*K');
define('NONCE_SALT',       'ho[<2C~z/:{ocwD{T-w+!+r2394xasz*N-V;_>AWDUaPEh`V4KO1,h&+c>c?jC$H');

 


Wordpress-auth-key-secure-auth-salt-Linux-wordpress-admin-security-hardening

Once the above defines are set, do not forget to remove or comment out the old default AUTH_KEY / SECURE_AUTH_KEY / LOGGED_IN_KEY / NONCE_KEY / AUTH_SALT / SECURE_AUTH_SALT / LOGGED_IN_SALT / NONCE_SALT entries.

The values are configured one time and never have to be changed; WordPress automatic core updates or installed WP plugins will not tamper with them over time.
You should never expose your privately generated keys to anyone, otherwise they could be used to hack your website.
It is also a good security practice to change these keys, especially if you suspect someone has somehow stolen your wp-config keys.
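If the site is managed from the shell, one quick way to rotate them is via WP-CLI; this is a sketch assuming the wp command is installed and run from the WordPress root directory:

# regenerate all 8 keys / salts in wp-config.php in one go
wp config shuffle-salts

# verify the new values landed in the config
grep -E "_(KEY|SALT)" wp-config.php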
 

Closure

Having the AUTH keys and salts properly configured is an essential step to improve your WordPress site security. Any time you have doubts about a hijacked browser session (or you have logged in to /wp-admin from an unsecured public computer where the site cookies could have been stolen), you should reset the keys / salts to new random values. Setting the auth keys is not a panacea: frequent WP core and plugin updates should still be made to keep your install secure, and frequent audits of the WP websites you own with a tool such as WPScan are essential to keep your WP website unhacked.

 

 

Block Web server overloading Bad Crawler Bots and Search Engine Spiders with .htaccess rules

Monday, September 18th, 2017

howto-block-webserver-overloading-bad-crawler-bots-spiders-with-htaccess-modrewrite-rules-file

In the last post, I've talked about the problem of Search Index Crawler Robots aggressively crawling websites and how to stop them (the article is here), explaining how to raise delays between Bot URL requests to the website and how to completely prohibit some bots from crawling with robots.txt.

As explained in that article, the consequence of too many badly written or aggressively behaving spiders is "server stoning" and therefore degraded Web Server performance, or even a short-time Denial of Service, depending on how well the initial server scaling was done.

Just for information, the bots we want to filter are not to be confused with the legitimate bots that drive real traffic to your website.

The 10 most popular WebCrawler bots as of the time of writing are:
 

1. GoogleBot (The Google Crawler bots, funnily bots become less active on Saturday and Sundays :))

2. BingBot (Bing.com Crawler bots)

3. SlurpBot (also famous as Yahoo! Slurp)

4. DuckDuckBot (The duckduckgo.com search engine crawler bot)

5. Baiduspider (The most famous Chinese search engine, used as a substitute for Google in China)

6. YandexBot (Russian Yandex Search engine crawler bots used in Russia as a substitute for Google )

7. Sogou Spider (a leading Chinese search engine launched in 2004)

8. Exabot (A French Search Engine, launched in 2000, crawler for ExaLead Search Engine)

9. FaceBot (Facebook External Hit; this crawler fetches a webpage once a user shares or pastes a link to a video, music, blog or whatever in chat to another user)

10. Alexa Crawler (ia_archiver is the web crawler for Amazon's Alexa Internet Rankings; Alexa is a great site to evaluate the approximate popularity of a page on the internet, and the Alexa SiteInfo page has historically been the Swiss Army knife for anyone wanting to quickly evaluate a webpage's approximate ranking compared to other pages)

The above legitimate bots are known to follow most if not all of the W3C – World Wide Web Consortium (W3.Org) standards and therefore respect the allowance or restriction commands given for a site in robots.txt, but unfortunately many of the so-called Bad Bots or mirroring scripts that are burning your Web Server CPU and Memory mentioned in the previous article follow the /robots.txt prescriptions only partially or not at all.

Hence, for the bots that do not respect robots.txt, the only way to get rid of most of the webspiders that are just loading your bandwidth and server hardware is to filter / block them with Apache rules (mod_setenvif / mod_rewrite) placed in a .htaccess file.

If not already existing, create a .htaccess file in the DocumentRoot of your website with whatever text editor, or create it on your Windows / Mac OS desktop and transfer it to the server via FTP / SecureFTP.

I prefer to do it directly on server with vim (text editor)

 

 

vim /var/www/sites/your-domain.com/.htaccess

 

RewriteEngine On

IndexIgnore .htaccess */.??* *~ *# */HEADER* */README* */_vti*

SetEnvIfNoCase User-Agent "^Black Hole” bad_bot
SetEnvIfNoCase User-Agent "^Titan bad_bot
SetEnvIfNoCase User-Agent "^WebStripper" bad_bot
SetEnvIfNoCase User-Agent "^NetMechanic" bad_bot
SetEnvIfNoCase User-Agent "^CherryPicker" bad_bot
SetEnvIfNoCase User-Agent "^EmailCollector" bad_bot
SetEnvIfNoCase User-Agent "^EmailSiphon" bad_bot
SetEnvIfNoCase User-Agent "^WebBandit" bad_bot
SetEnvIfNoCase User-Agent "^EmailWolf" bad_bot
SetEnvIfNoCase User-Agent "^ExtractorPro" bad_bot
SetEnvIfNoCase User-Agent "^CopyRightCheck" bad_bot
SetEnvIfNoCase User-Agent "^Crescent" bad_bot
SetEnvIfNoCase User-Agent "^Wget" bad_bot
SetEnvIfNoCase User-Agent "^SiteSnagger" bad_bot
SetEnvIfNoCase User-Agent "^ProWebWalker" bad_bot
SetEnvIfNoCase User-Agent "^CheeseBot" bad_bot
SetEnvIfNoCase User-Agent "^Teleport" bad_bot
SetEnvIfNoCase User-Agent "^TeleportPro" bad_bot
SetEnvIfNoCase User-Agent "^MIIxpc" bad_bot
SetEnvIfNoCase User-Agent "^Telesoft" bad_bot
SetEnvIfNoCase User-Agent "^Website Quester" bad_bot
SetEnvIfNoCase User-Agent "^WebZip" bad_bot
SetEnvIfNoCase User-Agent "^moget/2.1" bad_bot
SetEnvIfNoCase User-Agent "^WebZip/4.0" bad_bot
SetEnvIfNoCase User-Agent "^WebSauger" bad_bot
SetEnvIfNoCase User-Agent "^WebCopier" bad_bot
SetEnvIfNoCase User-Agent "^NetAnts" bad_bot
SetEnvIfNoCase User-Agent "^Mister PiX" bad_bot
SetEnvIfNoCase User-Agent "^WebAuto" bad_bot
SetEnvIfNoCase User-Agent "^TheNomad" bad_bot
SetEnvIfNoCase User-Agent "^WWW-Collector-E" bad_bot
SetEnvIfNoCase User-Agent "^RMA" bad_bot
SetEnvIfNoCase User-Agent "^libWeb/clsHTTP" bad_bot
SetEnvIfNoCase User-Agent "^asterias" bad_bot
SetEnvIfNoCase User-Agent "^httplib" bad_bot
SetEnvIfNoCase User-Agent "^turingos" bad_bot
SetEnvIfNoCase User-Agent "^spanner" bad_bot
SetEnvIfNoCase User-Agent "^InfoNaviRobot" bad_bot
SetEnvIfNoCase User-Agent "^Harvest/1.5" bad_bot
SetEnvIfNoCase User-Agent "Bullseye/1.0" bad_bot
SetEnvIfNoCase User-Agent "^Mozilla/4.0 (compatible; BullsEye; Windows 95)" bad_bot
SetEnvIfNoCase User-Agent "^Crescent Internet ToolPak HTTP OLE Control v.1.0" bad_bot
SetEnvIfNoCase User-Agent "^CherryPickerSE/1.0" bad_bot
SetEnvIfNoCase User-Agent "^CherryPicker /1.0" bad_bot
SetEnvIfNoCase User-Agent "^WebBandit/3.50" bad_bot
SetEnvIfNoCase User-Agent "^NICErsPRO" bad_bot
SetEnvIfNoCase User-Agent "^Microsoft URL Control – 5.01.4511" bad_bot
SetEnvIfNoCase User-Agent "^DittoSpyder" bad_bot
SetEnvIfNoCase User-Agent "^Foobot" bad_bot
SetEnvIfNoCase User-Agent "^WebmasterWorldForumBot" bad_bot
SetEnvIfNoCase User-Agent "^SpankBot" bad_bot
SetEnvIfNoCase User-Agent "^BotALot" bad_bot
SetEnvIfNoCase User-Agent "^lwp-trivial/1.34" bad_bot
SetEnvIfNoCase User-Agent "^lwp-trivial" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.6" bad_bot
SetEnvIfNoCase User-Agent "^BunnySlippers" bad_bot
SetEnvIfNoCase User-Agent "^Microsoft URL Control – 6.00.8169" bad_bot
SetEnvIfNoCase User-Agent "^URLy Warning" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.5.3" bad_bot
SetEnvIfNoCase User-Agent "^LinkWalker" bad_bot
SetEnvIfNoCase User-Agent "^cosmos" bad_bot
SetEnvIfNoCase User-Agent "^moget" bad_bot
SetEnvIfNoCase User-Agent "^hloader" bad_bot
SetEnvIfNoCase User-Agent "^humanlinks" bad_bot
SetEnvIfNoCase User-Agent "^LinkextractorPro" bad_bot
SetEnvIfNoCase User-Agent "^Offline Explorer" bad_bot
SetEnvIfNoCase User-Agent "^Mata Hari" bad_bot
SetEnvIfNoCase User-Agent "^LexiBot" bad_bot
SetEnvIfNoCase User-Agent "^Web Image Collector" bad_bot
SetEnvIfNoCase User-Agent "^The Intraformant" bad_bot
SetEnvIfNoCase User-Agent "^True_Robot/1.0" bad_bot
SetEnvIfNoCase User-Agent "^True_Robot" bad_bot
SetEnvIfNoCase User-Agent "^BlowFish/1.0" bad_bot
SetEnvIfNoCase User-Agent "^JennyBot" bad_bot
SetEnvIfNoCase User-Agent "^MIIxpc/4.2" bad_bot
SetEnvIfNoCase User-Agent "^BuiltBotTough" bad_bot
SetEnvIfNoCase User-Agent "^ProPowerBot/2.14" bad_bot
SetEnvIfNoCase User-Agent "^BackDoorBot/1.0" bad_bot
SetEnvIfNoCase User-Agent "^toCrawl/UrlDispatcher" bad_bot
SetEnvIfNoCase User-Agent "^WebEnhancer" bad_bot
SetEnvIfNoCase User-Agent "^TightTwatBot" bad_bot
SetEnvIfNoCase User-Agent "^suzuran" bad_bot
SetEnvIfNoCase User-Agent "^VCI WebViewer VCI WebViewer Win32" bad_bot
SetEnvIfNoCase User-Agent "^VCI" bad_bot
SetEnvIfNoCase User-Agent "^Szukacz/1.4" bad_bot
SetEnvIfNoCase User-Agent "^QueryN Metasearch" bad_bot
SetEnvIfNoCase User-Agent "^Openfind data gathere" bad_bot
SetEnvIfNoCase User-Agent "^Openfind" bad_bot
SetEnvIfNoCase User-Agent "^Xenu’s Link Sleuth 1.1c" bad_bot
SetEnvIfNoCase User-Agent "^Xenu’s" bad_bot
SetEnvIfNoCase User-Agent "^Zeus" bad_bot
SetEnvIfNoCase User-Agent "^RepoMonkey Bait & Tackle/v1.01" bad_bot
SetEnvIfNoCase User-Agent "^RepoMonkey" bad_bot
SetEnvIfNoCase User-Agent "^Zeus 32297 Webster Pro V2.9 Win32" bad_bot
SetEnvIfNoCase User-Agent "^Webster Pro" bad_bot
SetEnvIfNoCase User-Agent "^EroCrawler" bad_bot
SetEnvIfNoCase User-Agent "^LinkScan/8.1a Unix" bad_bot
SetEnvIfNoCase User-Agent "^Keyword Density/0.9" bad_bot
SetEnvIfNoCase User-Agent "^Kenjin Spider" bad_bot
SetEnvIfNoCase User-Agent "^Cegbfeieh" bad_bot

 

<Limit GET POST>
order allow,deny
allow from all
Deny from env=bad_bot
</Limit>

 


The above bad-bot prohibition rules have the RewriteEngine On directive included; however, for many websites this directive is already enabled directly in the VirtualHost section of the domain(s). If that is your case you can also remove RewriteEngine On from .htaccess and the prohibition rules will still work, since the actual blocking is done by the SetEnvIfNoCase and Limit directives.
The above rules are also perfectly suitable for WordPress-based websites / blogs in case you need to filter out obstructive spiders, and they will work on any website domain where .htaccess overrides are allowed.

Once you have implemented the above rules you will not need to restart Apache, as .htaccess is read dynamically on each client request to the Webserver.

2. Testing .htaccess Bad Bots Filtering Works as Expected


In order to test that the new bad-bot filtering configuration is working properly, there is a manual and somewhat more complicated way with lynx (the text browser), assuming you have shell access to a Linux / BSD / *nix computer, or run your own *NIX server / desktop.
 

Here is how:
 

 

lynx -useragent="Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)" -head -dump http://www.your-website-filtering-bad-bots.com/

 

 

Note that lynx will provide a warning such as:

Warning: User-Agent string does not contain "Lynx" or "L_y_n_x"!

Just ignore it and press enter to continue.
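A quicker, non-interactive alternative to lynx is curl: the -A flag fakes the User-Agent and -w prints just the HTTP status code, so a working filter should print 403 for a blacklisted agent (the hostname below is the same placeholder as above):

# request as one of the blacklisted agents, expect 403
curl -s -o /dev/null -w "%{http_code}\n" -A "Wget/1.6" http://www.your-website-filtering-bad-bots.com/

# request as a normal desktop browser, expect 200
curl -s -o /dev/null -w "%{http_code}\n" -A "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0" http://www.your-website-filtering-bad-bots.com/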

Two other use cases with lynx that I have historically used heavily are to pretend you're GoogleBot, in order to see how Google actually sees your website:
 

  • Pretend with Lynx You're GoogleBot

 

lynx -useragent="Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" -head -dump http://www.your-domain.com/

 

 

  • How to Pretend with Lynx Browser You are GoogleBot-Mobile

 

lynx -useragent="Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko) Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)" -head -dump http://www.your-domain.com/

 


Or, for the lazy ones that don't have Linux / *nix at their disposal, you can use the WannaBrowser website.

Wannabrowser is a web-based browser emulator which gives you the ability to change the User-Agent on each website request, so just set your User-Agent to any bot we just filtered, for example CheeseBot.

Once the .htaccess rules added earlier detect that your browser client is coming in with a prohibited user agent, you will immediately be filtered out and unable to access the website, with a message like:
 

HTTP/1.1 403 Forbidden

 

Since I've talked a lot about Index Bots, I think it is also worth mentioning three great websites that can give you a lot of up-to-date information on the exact user-agents returned by spiders, commonly known bot traits, as well as a currently updated list of the Bad Bots etc.

Bot and browser resources with information on user-agents, bad bots and odd crawler / bot specifics:

1. botreports.com
2. user-agents.org
3. useragentapi.com

 

An updated list with robot user-agents (crawler-user-agents) is also available on GitHub here, regularly updated by Caia Almeido.

There are also third-party plugins (modules) available for website platforms like WordPress / Joomla / Typo3 etc.

Besides the known good and bad bots listed on these websites, there are perhaps hundreds of others that might end up crawling your website and that might or might not need to be filtered. Therefore, before proceeding with any filtering steps, it is generally a good idea to monitor your HTTPD access.log / error.log, as mistakenly filtering the wrong bot might be a reason for website indexing problems.
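A quick way to see which user agents are actually hammering the server before deciding what to block is to summarize the access log; this is a minimal sketch assuming the default combined log format (adjust the log path to your distribution / vhost):

# print the 20 most frequent User-Agent strings seen in the access log
awk -F'"' '{print $6}' /var/log/apache2/access.log | sort | uniq -c | sort -rn | head -20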

Hope this article gives you some valuable information. Enjoy ! 🙂

 

Enable TLS 1.2 Internet Explorer / Make TLS 1.1 and TLS 1.2 web sites work on IE howto

Monday, August 1st, 2016

Internet-Explorer-cannot-display-the-webpage-IE-error
 

Some corporate websites and web tools, especially ones in DMZ-ed internal corporate networks, require encryption with the TLS 1.2 (Transport Layer Security cryptographic protocol) or at least the TLS 1.1 protocol, as the older SSL 3.0 / TLS 1.0 protocols are already considered insecure (prone to vulnerabilities).

Besides the TLS 1.2 browser requirement, some corporate tool web interfaces like Firewall Opening request tools etc. are often very limited in browser compatibility and built to only work with certain versions of Microsoft Internet Explorer, let's say IE (Internet Explorer) 11.

TLS 1.2 is supported across IE 8, 9, 10 and 11, but in the earlier versions it ships disabled by the default OS install, so sooner or later you might be forced to reconfigure your Internet Explorer to enable TLS 1.2 / 1.1.

For those unaware, the TLS (Transport Layer Security) protocol is, so to say, the next-generation encryption protocol after SSL (Secure Socket Layer); the terms TLS and SSL are used interchangeably when referring to encrypting traffic between point A and point B (host / device etc.) using a key and a specific cryptographic algorithm.
Historically TLS has been used more in Mail Servers, though as I said some web tools are starting to use TLS as a substitute for the SSL certificate browser encryption or even in conjunction with it.
For those who want to dig a little bit deeper into What is TLS? – read on technet here.

I had to enable TLS on IE and I guess sooner or later others will also need a way to enable TLS 1.2 on Internet Explorer, so here is how this is done:
 

Enable-Internet-Explorer-TLS1.2-TLS-1.1-internet-options-IE-screensho
 


    1. On the Internet Explorer Main Menu (press Alt + F to make menu field appear)
    Select Tools > Internet Options.

    2. In the Internet Options box, select the Advanced tab.

    3. In the Security category, uncheck Use SSL 3.0 (if necessary) and tick:

    Use TLS 1.0,
    Use TLS 1.1 and Use TLS 1.2 (if available).

    4. Click OK
   
     5. Finally, exit the browser and start IE again.

 

Once the browser is relaunched, the website URL that earlier showed the "Internet Explorer cannot display the webpage" / can't connect / missing website error message will start opening normally.

Note that TLS 1.2 and 1.1 are not supported in older Mozilla Firefox releases, though they are supported properly in current FF releases >= 42.

If you have a fresh new Firefox 42 (or later) browser and you want to make sure it really supports TLS 1.1 and TLS 1.2 encryption:

 

(1) In a new tab, type or paste about:config in the address bar and press Enter/Return. Click the button promising to be careful.

(2) In the search box above the list, type or paste TLS and pause while the list is filtered

(3) If the security.tls.version.max preference is bolded and "user set" to a value other than 3, right-click > Reset the preference to restore the default value of 3

(4) If the security.tls.version.min preference is bolded and "user set" to a value other than 1, right-click > Reset the preference to restore the default value of 1

The values for these preferences mean:

1 => TLS 1.0
2 => TLS 1.1
3 => TLS 1.2


To get more concrete and thorough information on the exact TLS / SSL cipher suites and protocol details supported by your browser, check this link.
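If you have shell access to any Linux box, you can also check from the command line which TLS versions the remote corporate site actually accepts, using openssl s_client (the hostname is a placeholder):

# force a TLS 1.2, then a TLS 1.1 handshake and print the negotiated protocol / cipher
openssl s_client -connect your-corporate-site.com:443 -tls1_2 < /dev/null 2>/dev/null | grep -E 'Protocol|Cipher'
openssl s_client -connect your-corporate-site.com:443 -tls1_1 < /dev/null 2>/dev/null | grep -E 'Protocol|Cipher'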


N.B. ! SSL 3.0 is disabled by default in many of the latest browser versions such as Opera, Safari etc. in order to address the POODLE SSL / TLS cryptographic protocol vulnerability.

Improve Apache Load Balancing with mod_cluster – Apaches to Tomcats Application servers Get Better Load Balancing

Thursday, March 31st, 2016

improve-apache-load-balancing-with-mod_cluster-apaches-to-tomcats-application-servers-get-better-load-balancing-mod_cluster-logo


Earlier I've blogged on how to set up Apache to serve as a Load Balancer for 2, 3, 4, etc. Tomcat / other backend application servers with mod_proxy and mod_proxy_balancer. Though the default Apache-provided mod_proxy_balancer works fine most of the time, if you want more precise and sophisticated balancing with better load distribution you will probably want to install and use mod_cluster instead.

 

So what is Mod_Cluster and why use it instead of Apache proxy_balancer ?
 

Mod_cluster is an innovative Apache module for HTTP load balancing and proxying. It implements a communication channel between the load balancer and back-end nodes to make better load-balancing decisions and redistribute loads more evenly.

Why use mod_cluster instead of a traditional load balancer such as Apache's mod_balancer and mod_proxy or even a high-performance hardware balancer?

Thanks to its unique back-end communication channel, mod_cluster takes into account back-end servers' loads, and thus provides better and more precise load balancing tailored for JBoss and Tomcat servers. Mod_cluster also knows when an application is undeployed, and does not forward requests for its context (URL path) until its redeployment. And mod_cluster is easy to implement, use, and configure, requiring minimal configuration on the front-end Apache server and on the back-end servers.
 


So what is the advantage of mod_cluster vs mod proxy_balancer ?

Well, here are a few things that turn the scales in favour of mod_cluster:

 

  •     advertises its presence via multicast so that workers can join without any configuration
     
  •     workers will report their available contexts
     
  •     mod_cluster will create proxies for these contexts automatically
     
  •     if you want to, you can still fine-tune this behaviour, e.g. so that .gif images are served from httpd and not from workers…
     
  •     most importantly: unlike pure mod_proxy or mod_jk, mod_cluster knows exactly how much load there is on each node because nodes are reporting their load back to the balancer via special messages
     
  •     default communication goes over AJP, you can use HTTP and HTTPS

 

1. How to install mod_cluster on Linux ?


You can use mod_cluster either with JBoss or Tomcat back-end servers. We'll install and configure mod_cluster with Tomcat under CentOS; using it with JBoss or on other Linux distributions is a similar process. I'll assume you already have at least one front-end Apache server and a few back-end Tomcat servers installed.

To install mod_cluster, first download the latest mod_cluster httpd binaries. Make sure to select the correct package for your hardware architecture – 32- or 64-bit.
Unpack the archive to create four new Apache module files: mod_advertise.so, mod_manager.so, mod_proxy_cluster.so, and mod_slotmem.so. We won't need mod_advertise.so; it advertises the location of the load balancer through multicast packets, but we will use a static address on each back-end server.

Copy the other three .so files to the default Apache modules directory (/etc/httpd/modules/ for CentOS).
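For example, assuming the unpacked mod_cluster binaries are sitting in the current directory, the copy boils down to something like:

cp -p mod_manager.so mod_proxy_cluster.so mod_slotmem.so /etc/httpd/modules/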
Before loading the new modules in Apache you have to remove the default proxy balancer module (mod_proxy_balancer.so) because it is not compatible with mod_cluster.

Edit the Apache configuration file (/etc/httpd/conf/httpd.conf) and remove the line

 

LoadModule proxy_balancer_module modules/mod_proxy_balancer.so

 


Create a new configuration file and give it a name such as /etc/httpd/conf.d/mod_cluster.conf. Use it to load mod_cluster's modules:

 

 

 

LoadModule slotmem_module modules/mod_slotmem.so
LoadModule manager_module modules/mod_manager.so
LoadModule proxy_cluster_module modules/mod_proxy_cluster.so

In the same file add the rest of the settings you'll need for mod_cluster, i.e. the listening port, the access permissions and the VirtualHost section, something like:

Listen 192.168.180.150:9999
<VirtualHost 192.168.180.150:9999>
<Directory />
Order deny,allow
Deny from all
Allow from 192.168
</Directory>
ManagerBalancerName mymodcluster
EnableMCPMReceive
</VirtualHost>

ProxyPass / balancer://mymodcluster/
ProxyPassReverse / balancer://mymodcluster/


The above directives create a new virtual host listening on port 9999 on the Apache server you want to use for load balancing, on which the load balancer will receive information from the back-end application servers. In this example, the virtual host is listening on IP address 192.168.180.150, and for security reasons it allows connections only from the 192.168.0.0/16 network.
The directive ManagerBalancerName defines the name of the cluster – mymodcluster in this example. The directive EnableMCPMReceive allows the back-end servers to send updates to the load balancer. The standard ProxyPass and ProxyPassReverse directives instruct Apache to proxy all requests to the mymodcluster balancer.
That's all you need for a minimal configuration of mod_cluster on the Apache load balancer. At next server restart Apache will automatically load the file mod_cluster.conf from the /etc/httpd/conf.d directory. To learn about more options that might be useful in specific scenarios, check mod_cluster's documentation.

While you're changing Apache configuration, you should probably set the log level in Apache to debug when you're getting started with mod_cluster, so that you can trace the communication between the front- and the back-end servers and troubleshoot problems more easily. To do so, edit Apache's configuration file and add the line LogLevel debug , then restart Apache.
 

2. How to set up Tomcat appserver for mod_cluster ?
 

Mod_cluster works with Tomcat versions 6, 7 and 8; to set up the Tomcat back ends you have to deploy a few JAR files and make a change in Tomcat's server.xml configuration file.
The necessary JAR files extend Tomcat's default functionality so that it can communicate with the proxy load balancer. You can download the JAR file archive by clicking on "Java bundles" on the mod_cluster download page. It will be saved under the name mod_cluster-parent-1.2.6.Final-bin.tar.gz.

Create a new directory such as /root/java_bundles and extract the files from mod_cluster-parent-1.2.6.Final-bin.tar.gz there. Inside the directory /root/java_bundles/JBossWeb-Tomcat/lib/ you will find all the necessary JAR files for Tomcat, including two Tomcat version-specific JAR files – mod_cluster-container-tomcat6-1.2.6.Final.jar for Tomcat 6 and mod_cluster-container-tomcat7-1.2.6.Final.jar for Tomcat 7. Delete the one that does not correspond to your Tomcat version.

Copy all the files from /root/java_bundles/JBossWeb-Tomcat/lib/ to your Tomcat lib directory – thus if you have installed Tomcat in

/srv/tomcat

run the command:

 

cp -rpf /root/java_bundles/JBossWeb-Tomcat/lib/* /srv/tomcat/lib/

 

Then edit your Tomcat's server.xml file

/srv/tomcat/conf/server.xml.


After the default listeners add the following line:

 

<listener classname="org.jboss.modcluster.container.catalina.standalone.ModClusterListener" proxylist="192.168.204.203:9999"> </listener>



This instructs Tomcat to send its mod_cluster-related information to IP 192.168.180.150 on TCP port 9999, which is what we set up as Apache's dedicated vhost for mod_cluster.
While that's enough for a basic mod_cluster setup, you should also configure a unique, intuitive JVM route value on each Tomcat instance so that you can easily differentiate the nodes later. To do so, edit the server.xml file and extend the Engine property to contain a jvmRoute, like this:
 


 

<engine defaulthost="localhost" jvmroute="node2" name="Catalina"></engine>


Assign a different value, such as node2, to each Tomcat instance. Then restart Tomcat so that these settings take effect.

To confirm that everything is working as expected and that the Tomcat instance connects to the load balancer, grep Tomcat's log for the string "modcluster" (case-insensitive). You should see output similar to:

Mar 29, 2016 10:05:00 AM org.jboss.modcluster.ModClusterService init
INFO: MODCLUSTER000001: Initializing mod_cluster ${project.version}
Mar 29, 2016 10:05:17 AM org.jboss.modcluster.ModClusterService connectionEstablished
INFO: MODCLUSTER000012: Catalina connector will use /192.168.180.150


This shows that mod_cluster has been successfully initialized and that it will use the connector for 192.168.180.150, the IP address configured for the main listener.
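For reference, the grep mentioned above might look like this for a Tomcat installed under /srv/tomcat as in our example (the exact log file name may differ on your setup):

grep -i modcluster /srv/tomcat/logs/catalina.out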
Also check Apache's error log. You should see confirmation about the properly working back-end server:

[Tue Mar 29 10:05:00 2013] [debug] proxy_util.c(2026): proxy: ajp: has acquired connection for (192.168.204.204)
[Tue Mar 29 10:05:00 2013] [debug] proxy_util.c(2082): proxy: connecting ajp://192.168.180.150:8009/ to  192.168.180.150:8009
[Tue Mar 29 10:05:00 2013] [debug] proxy_util.c(2209): proxy: connected / to  192.168.180.150:8009
[Tue Mar 29 10:05:00 2013] [debug] mod_proxy_cluster.c(1366): proxy_cluster_try_pingpong: connected to backend
[Tue Mar 29 10:05:00 2013] [debug] mod_proxy_cluster.c(1089): ajp_cping_cpong: Done
[Tue Mar 29 10:05:00 2013] [debug] proxy_util.c(2044): proxy: ajp: has released connection for (192.168.180.150)


This Apache error log shows that an AJP connection with 192.168.204.204 was successfully established and confirms the working state of the node, then shows that the load balancer closed the connection after the successful attempt.

You can start testing by opening in a browser the example servlet SessionExample, which is available in a default installation of Tomcat.
Access this servlet through a browser at the URL http://balancer_address/examples/servlets/servlet/SessionExample. In your browser you should see first a session ID that contains the name of the back-end node that is serving your request – for instance, Session ID: 5D90CB3C0AA05CB5FE13121E4B23E670.node2.

Next, through the servlet's web form, create different session attributes. If you have a properly working load balancer with sticky sessions you should always (that is, until your current browser session expires) access the same node, with the previously created session attributes still available.

To test further to confirm load balancing is in place, at the same time open the same servlet from another browser. You should be redirected to another back-end server where you can conduct a similar session test.
As you can see, mod_cluster is easy to use and configure. Give it a try to address sporadic single-back-end overloads that cause overall application slowdowns.

Configuring server running both OpenSSHD and Apache to be accessed via HTTPS

Friday, December 18th, 2009

I wanted to make this machine accessible with a simple browser both for me and for others,
and I had been thinking about configuring this on pc-freak for some time.
It took me a while until I found a program that does this for me, but luckily I finally found it.
It's called webshell and it works pretty well. Check out the WebShell home page for download and more info on it. I've successfully installed it on FreeBSD 7.2.
All that is needed for the program to operate is Python 2.3 or higher and Python OpenSSL (the latter is optional,
however most people would want the service running over SSL, and in that case it is mandatory).
On my FreeBSD box I had to install:
the port/package py26-openssl and subversion (a prerequisite in order to download the source via svn).
It is also necessary to modify webshell.py and change the shebang location pointing to python;
in FreeBSD that is:

#!/usr/local/bin/python and not #!/usr/bin/python

as in Linux.
Then I copied the downloaded source to /usr/local/web-shell/webshell and added a record to rc.local:

# echo "/usr/local/web-shell/webshell.py -d" >> /etc/rc.local

The last thing I did was manually start the daemon with:
# /usr/local/web-shell/webshell.py -d

Tadam, it's up and running; accessing it is as simple as pointing the browser
to the domain name or IP on which the python service is running.
Currently the running webshell for pc-freak can be accessed via

How to make a mirror of website on GNU / Linux with wget / Few tips on wget site mirroring

Wednesday, February 22nd, 2012

how-to-make-mirror-of-website-on-linux-wget

Everyone who has used Linux is probably familiar with wget, or has used this handy console download tool at least a thousand times. Still, not so many desktop GNU / Linux users, like Ubuntu and Fedora users, have tried using wget for something more than single-file downloads.
Actually wget is not as popular as it used to be in the earlier Linux days. I've noticed the tendency for newer Linux users to prefer curl (I don't know why).

With all that said, I'm sure there are plenty of Linux users curious how a website mirror can be made with wget.
This article will briefly suggest a few ways to do website mirroring on Linux / BSD, as wget is available on both free operating systems.

1. Most Simple exact mirror copy of website

The most basic use of wget's mirror capabilities is via wget's -m / --mirror argument:

# wget -m http://website-to-mirror.com/sub-directory/

Creating a mirror like this is not a very good practice, as the links of the mirrored pages will still point to the external URLs. In other words, the link URLs will not point to your local copy, and therefore if you're not connected to the internet and try to browse random links of the webpage you will end up with many links which do not open because you have no internet connection.

2. Mirroring with rewriting links to point locally and a delay between page downloads

Making a mirror with wget can put a heavy load on the remote server, as it fetches the files as quickly as the bandwidth allows. On busy servers rapid downloads with wget can significantly degrade the server response time, and on some highly loaded servers it can even cause the server to hang completely.
Hence, mirroring pages with wget without explicitly setting a delay between each page download could be considered by the remote server as a kind of DoS (denial of service) attack. Some site administrators have even set firewall rules or configured web server modules like Apache mod_security which filter requests from IPs doing too-frequent HTTP GET / POST requests to the web server.
To make wget wait 10 seconds between each mirrored page download, use:

# wget -mk -w 10 -np --random-wait http://website-to-mirror.com/sub-directory/

The -mk stands for -m / --mirror and -k, the shortcut argument for --convert-links (make links point locally); --random-wait tells wget to randomly vary the wait time around the value given with -w (10 seconds here) between page download requests.

3. Mirror / retrieve website sub directory ignoring robots.txt "mirror restrictions"

Some websites have a robots.txt which restricts content download with clients like wget and curl, or even completely prohibits crawlers from downloading their website pages.

/robots.txt restrictions are not a problem as wget has an option to disable robots.txt checking when downloading.
Getting around the robots.txt restrictions with wget is possible through the -e robots=off option.
For instance, if you want to make a local mirror copy of a whole sub-directory with all links, with a delay of 10 seconds between each consecutive page request, without reading the robots.txt allow/forbid rules at all:

# wget -mk -w 10 -np -e robots=off --random-wait http://website-to-mirror.com/sub-directory/

4. Mirror a website which prohibits download managers like flashget, getright, go!zilla etc.

Sometimes when you try to use wget to make a mirror copy of an entire site domain subdirectory or the root site domain, you get an error similar to:

Sorry, but the download manager you are using to view this site is not supported.
We do not support use of such download managers as flashget, go!zilla, or getright

This message is produced by the site's dynamic generation language (PHP / ASP / JSP etc.), as the website code is written to check the User-Agent sent by the browser.
wget's default User-Agent sent to the remote webserver is:
Wget/1.11.4

As this is not a common desktop browser user agent, many webmasters configure their websites to only accept the well-known established desktop browser user agents sent by client browsers.
Here are a few typical user agents which identify a desktop browser:
 

  • Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20110814 Firefox/6.0
  • Mozilla/5.0 (X11; Linux i686; rv:6.0) Gecko/20100101 Firefox/6.0
  • Mozilla/6.0 (Macintosh; I; Intel Mac OS X 11_7_9; de-LI; rv:1.9b4) Gecko/2012010317 Firefox/10.0a4
  • Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:2.2a1pre) Gecko/20110324 Firefox/4.2a1pre

etc. etc.

If you're trying to mirror a website which has implemented some kind of restriction based on a "valid" user agent, wget has the -U option enabling you to fake the user agent.

If you get the "Sorry, but the download manager you are using to view this site is not supported" message, fake / change wget's User-Agent with a command like:

# wget -mk -w 10 -np -e robots=off \
--random-wait \
--referer="http://www.google.com" \
--user-agent="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6" \
--header="Accept:text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5" \
--header="Accept-Language: en-us,en;q=0.5" \
--header="Accept-Encoding: gzip,deflate" \
--header="Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7" \
--header="Keep-Alive: 300" \
http://www.website-to-mirror.com/

For the sake of some wget anonymity, to make wget permanently hide its user agent and pretend to be a Mozilla Firefox running on MS Windows XP, use a .wgetrc file in your home directory like the one shown below.
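A minimal sketch of such a ~/.wgetrc entry; the UA string is just the same example XP Firefox one used above:

echo 'user_agent = Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6' >> ~/.wgetrc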

5. Make a complete mirror of a website under a domain name

To retrieve a complete working copy of a site with wget, a good way is like so:

# wget -rkpNl5 -w 10 --random-wait www.website-to-mirror.com

Where the arguments meaning is:
-r – Retrieve recursively
-k – Convert the links in documents to make them suitable for local viewing
-p – Download everything (inline images, sounds and referenced stylesheets etc.)
-N – Turn on time-stamping
-l5 – Specify recursion maximum depth level of 5

6. Make a dynamic pages static site mirror, by converting CGI, ASP, PHP etc. to HTML for offline browsing

Often website pages end in .php / .asp / .cgi … extensions. An example of what I mean is, for instance, the URL http://php.net/manual/en/tutorial.php. You see the URL page is tutorial.php; once mirrored with wget, the local copy will also end up in .php and therefore will not be suitable for local browsing, as the local browser does not know how to interpret the .php extension.
Therefore, to copy a website with non-HTML extensions and make it offline-browsable in HTML, there is the --html-extension option, e.g.:

# wget -mk -w 10 -np -e robots=off \
--random-wait \
--html-extension --convert-links http://www.website-to-mirror.com

A good practice in mirror making is to set a download rate limit. Setting such a limit is good for both sides, the local host that is downloading and the remote server, and it is also useful when mirroring websites consisting of many enormous files (documentary movies, some music etc.).
To set a download limit, add the --limit-rate= option. Passing --limit-rate=200k to wget would limit the download speed to 200 KB/s.

Another useful thing to ensure wget has made an accurate mirror is wget logging. To use it, pass -o ./my_mirror.log to wget, as in the example below.
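Putting the two together, a rate-limited mirror with a log could look like this (same placeholder site as in the earlier examples):

# wget -mk -w 10 -np --random-wait --limit-rate=200k -o ./my_mirror.log http://website-to-mirror.com/sub-directory/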
 

Install Google Chrome Web Browser on 32 and 64 bit Debian Lenny and Squeeze/Sid Linux

Sunday, July 25th, 2010

Linux Tux Google Chrome

I’ve decided to write a short post on how to install in a quick manner Google Chrome on Debian GNU/Linux.

There are a few reasons why you would consider installing Chrome; however, the most obvious one is the browser speed.
I should admit the browsing experience with Chrome looks and feels far better compared to Iceweasel (e.g. Firefox) on Debian.
It could be that web page loading speed with Epiphany or Opera is similar to Chrome in terms of velocity; apart from the faster browsing experience with Google Chrome, I've seen reports online that Google Chrome sometimes behaves better when it comes to multimedia audio and video streams online.

Another thing I noticed in Google Chrome is that it's generally much lighter and loads the base browser much faster than Iceweasel.

The cleanest way to install Chrome on Debian Linux is using Google's Linux repositories.

So to install it, add the following Google Linux repo to your /etc/apt/sources.list:

# Google software repository
deb http://dl.google.com/linux/deb/ stable non-free main

e.g.

debian-deskop:~# echo "deb http://dl.google.com/linux/deb/ stable non-free main" >> /etc/apt/sources.list

Then update your repositories list with apt-get:

debian-desktop:~# apt-get update

Next choose your preferred Google Chrome release among the available ones (beta, stable and unstable).
I've chosen to install the Google Chrome stable release, apt-getting it like shown below:

debian-desktop:~# apt-get install google-chrome-stable

Now Google Chrome is ready to use; to start using it, either start it up from the Gnome / KDE menus or exec the command:

debian-desktop:~$ google-chrome

So far so good, you will now have a modern browser. However, what is really irritating is the default behaviour of the Chrome install: by default it tampers with the default browser configured for the whole Linux desktop system, in other words it automatically links:

/etc/alternatives/gnome-www-browser to -> /usr/bin/google-chrome as well as,
/etc/alternatives/x-www-browser to -> /usr/bin/google-chrome

Well, I wasn't happy with that unwarranted install behaviour of Google Chrome, therefore I decided to revert my default Gnome and system browser back to Epiphany.

First I removed the links to /usr/bin/google-chrome

debian-desktop:~# rm -f /etc/alternatives/gnome-www-browser
debian-desktop:~# rm -f /etc/alternatives/x-www-browser

And thereafter I linked it back to Epiphany

debian-desktop:~# ln -sf /usr/bin/epiphany /etc/alternatives/gnome-www-browser
debian-desktop:~# ln -sf /usr/bin/epiphany /etc/alternatives/x-www-browser
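A cleaner way to achieve the same on Debian, instead of removing and re-creating the symlinks by hand, is to let the alternatives system do it; update-alternatives will list the installed browsers and let you pick the default interactively:

debian-desktop:~# update-alternatives --config x-www-browser
debian-desktop:~# update-alternatives --config gnome-www-browser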

Use apt-get with Proxy howto – Set Proxy system-wide in Linux shell and Gnome

Friday, May 16th, 2014

linux-apt-get-configure-proxy-howto-set-proxy-systemwide-in-linux

I just set up a VMWare Virtual Machine on my HP notebook and installed Debian 7.0 stable Wheezy. Though VMWare identified my office internet and automatically configured NAT, I couldn't access the internet from a browser until I remembered that all HP traffic goes through a default-set browser proxy.
After setting the proxy in Iceweasel, internet pages started opening normally; however, as every other kind of traffic also had to go through HP's proxy, package management with apt-get (apt-get update, apt-get install etc.) was failing with errors:


# apt-get update

Ign cdrom://[Debian GNU/Linux 7.2.0 _Wheezy_ – Official i386 CD Binary-1 20131012-12:56] wheezy Release.gpg
Ign cdrom://[Debian GNU/Linux 7.2.0 _Wheezy_ – Official i386 CD Binary-1 20131012-12:56] wheezy Release
Ign cdrom://[Debian GNU/Linux 7.2.0 _Wheezy_ – Official i386 CD Binary-1 20131012-12:56] wheezy/main i386 Packages/DiffIndex
Ign cdrom://[Debian GNU/Linux 7.2.0 _Wheezy_ – Official i386 CD Binary-1 20131012-12:56] wheezy/main Translation-en_US
Err http://ftp.by.debian.org wheezy Release.gpg
  Could not connect to ftp.by.debian.org:80 (86.57.151.3). – connect (111: Connection refused)
Err http://ftp.by.debian.org wheezy-updates Release.gpg
  Unable to connect to ftp.by.debian.org:http:
Err http://security.debian.org wheezy/updates Release.gpg
  Cannot initiate the connection to security.debian.org:80 (2607:ea00:101:3c0b:207:e9ff:fe00:e595). – connect (101: Network is unreachable) [IP: 2607:ea00:101:3c0b:207:e9ff:fe00:e595 80]
Reading package lists…

 

This error is caused because apt-get is trying to access the above http URLs directly, and because port 80 is filtered out from the HP office it fails. In order to make it work I had to configure apt-get to use the proxy host – here is how:

a) Create /etc/apt/apt.conf.d/02proxy file (if not already existing)
and place inside:
 

Acquire::http::Proxy "https://web-proxy.cce.hp.com";
Acquire::ftp::Proxy "ftp://web-proxy.cce.hp.com";


To do it from a console / gnome-terminal issue:
echo 'Acquire::http::Proxy "https://web-proxy.cce.hp.com:8088";' >> /etc/apt/apt.conf.d/02proxy
echo 'Acquire::ftp::Proxy "https://web-proxy.cce.hp.com:8088";' >> /etc/apt/apt.conf.d/02proxy

That's all; now apt-get will tunnel all traffic via the HTTP and FTP proxy host web-proxy.cce.hp.com and apt-get works again.

Talking about proxifying Linux's apt-get, it is possible to also set proxy shell variables, which are read and understood by many console programs like the console browsers lynx, links and elinks, as well as the wget and curl commands, e.g.:

 

export http_proxy=http://192.168.1.5:5187/
export https_proxy=$http_proxy
export ftp_proxy=$http_proxy
export rsync_proxy=$http_proxy
export no_proxy="localhost,127.0.0.1,localaddress,.localdomain.com"

For proxies protected with a username and password the exported variables should look like so:

echo -n "username:"
read -e username
echo -n "password:"
read -es password
export http_proxy="http://$username:$password@proxyserver:8080/"
export https_proxy=$http_proxy
export ftp_proxy=$http_proxy
export rsync_proxy=$http_proxy
export no_proxy="localhost,127.0.0.1,localaddress,.localdomain.com"

To make these Linux proxy settings system-wide on Debian / Ubuntu there is the /etc/environment file; add to it:
 

http_proxy=http://proxy.server.com:8080/
https_proxy=http://proxy.server.com:8080/
ftp_proxy=http://proxy.server.com:8080/
no_proxy="localhost,127.0.0.1,localaddress,.localdomain.com"
HTTP_PROXY=http://proxy.server.com:8080/
HTTPS_PROXY=http://proxy.server.com:8080/
FTP_PROXY=http://proxy.server.com:8080/
NO_PROXY="localhost,127.0.0.1,localaddress,.localdomain.com"


To make the proxy global (system-wide) for the shell environments of most (non-Debian-specific) Linux distributions, create a new file /etc/profile.d/proxy.sh and place in it something like:

function proxy(){
echo -n "username:"
read -e username
echo -n "password:"
read -es password
export http_proxy="http://$username:$password@proxyserver:8080/"
export https_proxy=$http_proxy
export ftp_proxy=$http_proxy
export rsync_proxy=$http_proxy
export no_proxy="localhost,127.0.0.1,localaddress,.localdomain.com"
echo -e "nProxy environment variable set."
}
function proxyoff(){
unset HTTP_PROXY
unset http_proxy
unset HTTPS_PROXY
unset https_proxy
unset FTP_PROXY
unset ftp_proxy
unset RSYNC_PROXY
unset rsync_proxy
echo -e "nProxy environment variable removed."
}

To set a global proxy (make the proxy system-wide) for a user in the GNOME Desktop environment, launch gnome-control-center

And go to Network -> Network Proxy

/images/gnome-configure-systemwide-proxy-howto-picture1

/images/gnome-configure-systemwide-proxy-howto-picture2

To make the proxy settings system-wide also for GUI Gnome GTK3 applications, use gsettings:

gsettings set org.gnome.system.proxy mode 'manual'
gsettings set org.gnome.system.proxy.http host 'your-proxy.server.com'
gsettings set org.gnome.system.proxy.http port 8080
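To switch the GNOME proxy off again later, set the mode back to 'none':

gsettings set org.gnome.system.proxy mode 'none'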

Linux convert and read .mht (Microsoft html) file format. MHT format explained

Thursday, June 5th, 2014

linux-open-and-convert-mht-file-format-to-html-howto
If you're using Linux as a desktop system, sooner or later you will receive an email with instructions or an HTML page stored in the .mht file format.
So what is mht? MHT is a web page archive format (short for MIME HTML document). MHTML saves the web page content and incorporates external resources, such as images, applets, Flash animations and so on, into HTML documents. Usually those .mht files were produced with Microsoft Internet Explorer, saving pages through the:

File -> Save As (Save WebPage) dialog saves pages in .MHT.

To open those .mht files on Linux, where Firefox is available, add the UNMHT FF extension to the browser. Besides allowing you to view MHT on Linux, in case some customer requires a copy of an HTML page in MHT, UNMHT also allows you to save complete web pages, including text and graphics, into a MHT file.
There is also support for MHT opening and saving in the Google Chrome browser via a plugin called IETAB, but unfortunately IETAB is not supported on Linux.
Anyway, IETAB is worth mentioning here: if you're a Windows user and you want to browse pages compatible only with Internet Explorer, IETAB emulates IE exactly by using the IE rendering engine inside Chrome and supports ActiveX controls. IETAB is a great extension for QA (web testers) using Windows for desktop who prefer not to use IE for security reasons. IETab supports IE6, IE7, IE8 and IE9.

Another way to convert a .MHT file into HTML is to use the Linux KDE mhttohtml tool.

linux-kde-converter-mhttohtml

Another approach to open .MHT files in Linux is to use the Opera browser for Linux, which has support for .MHT.

Note that because MHT files could store potentially malicious content (like embedded malware), it is always wise when opening MHT on Windows to make sure you have scanned the file with an antivirus program. Mails containing .MHT from unknown recipients often contain viruses or malware, and links embedded into a MHT file could easily expose you to spoofing attacks. MHT files are encoded as a combination of plain-text MIME parts and the BASE64 encoding scheme; MHT's mimetype is:

MIME type: message/rfc822
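Since MHT is plain MIME, one more quick-and-dirty way I would try to get to its content on a Linux shell (an assumption on my side, so test it on a sample file first) is to extract the MIME parts with munpack from the mpack package; the file name below is just a placeholder:

# apt-get install mpack
# munpack saved-page.mht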