Monday, January 28, 2013

Setting up FTP for your website on Ubuntu 12.04

File Transfer Protocol (FTP) is a standard network protocol used to transfer files from one host to another over a TCP-based network, such as the Internet. Its most common use is uploading website files to a webserver: you develop your site locally, upload it via FTP, then refresh the page to see the changes. If you're setting up your own webserver, it's essential to have some way of uploading files, and FTP is a convenient way to do it. First you need to set up and configure an FTP server on your webserver; in this tutorial we will be using ProFTPD, a proven, high-performance and scalable FTP server with a focus on simplicity, security, and ease of configuration.


I've gone through many FTP server tutorials in the past, and they all work fine for uploading to your home directory. In this situation, though, we don't want to upload to our home directory; we want to upload to our website directory, in this case /var/www. First we need to set up a new user for our FTP login details:

sudo useradd [username]
sudo passwd [username]
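
For example, using a hypothetical dedicated account named ftpuser:

sudo useradd ftpuser   # "ftpuser" is just an example name; any dedicated account will do
sudo passwd ftpuser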


Now we need to add the user to the www-data group so it can write to your webserver's storage. Be warned: this will set the user's primary group to www-data, so ensure you have created a new user especially for use as an FTP account.
sudo usermod -g www-data [username]
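
You can confirm the change took effect with the id command; www-data should now show as the user's primary group (the numeric IDs below are illustrative and will vary between systems):

id ftpuser
uid=1001(ftpuser) gid=33(www-data) groups=33(www-data)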

In most setups your website is stored in /var/www, and by default members of the www-data group often cannot write there, so we need to change the permissions to allow any user in the www-data group to write to the /var/www folder:

sudo chmod -R g+w /var/www
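
Note that group write permission only helps if /var/www is actually group-owned by www-data. You can check the current owner and group with ls, and change the group ownership if needed (this is the same idea as the chown fix in the troubleshooting section below):

ls -ld /var/www
sudo chgrp -R www-data /var/www   # only needed if the group shown above is not www-data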

Now we are ready to install the server itself. Simply type:
sudo apt-get install proftpd
and select the standalone option when asked.

Now we need to make some small configuration changes to ProFTPD. First, open the proftpd.conf file:
sudo nano /etc/proftpd/proftpd.conf

and edit the following lines to match this example:
# Use this to jail all users in their homes
DefaultRoot                     /var/www
# Set the user and group that the server normally runs at.
User                            www-data
Group                           www-data
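
Before restarting, you can optionally have ProFTPD parse the edited file and report any syntax errors; on most builds the flag below performs a configuration check without starting the server:

sudo proftpd --configtest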

Now we need to restart ProFTPD so that it recognises the new settings:
sudo /etc/init.d/proftpd restart

and that's it: your FTP server should now be set up to let you upload, edit and download your website. To test this, simply type:
ftp 127.0.0.1

and enter your login details. Now try making a folder with the following command:
mkdir test

If you can do this with no errors, then your FTP server has been correctly configured.
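
A successful test session looks roughly like this (the banner, prompt and account name will differ on your system; ftpuser is the hypothetical account from earlier):

ftp 127.0.0.1
Connected to 127.0.0.1.
220 ProFTPD Server ready.
Name (127.0.0.1:you): ftpuser
331 Password required for ftpuser
Password:
230 User ftpuser logged in
ftp> mkdir test
257 "/test" - Directory successfully created
ftp> quit
221 Goodbye.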


Troubleshooting
---------------
1. If you face permission problems during upload, run the following command:
  sudo chown -R www-data:www-data /var/www

2. Use passive mode in FileZilla and/or Dreamweaver; if passive connections still fail, see the sketch below.
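
Passive-mode problems are usually caused by a firewall or NAT between the client and the server. One common fix is to pin ProFTPD's passive data connections to a fixed port range (opened in your firewall) in proftpd.conf. This is a minimal sketch: the port range is arbitrary, and MasqueradeAddress is only needed behind NAT:

# Restrict passive-mode data connections to a known, firewall-friendly port range
PassivePorts 49152 65534
# Behind NAT, advertise the server's public IP to clients (placeholder address)
# MasqueradeAddress 203.0.113.10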


Wednesday, January 23, 2013

ADDING A ROBOTS.TXT FILE TO YOUR MAGENTO STORE



A web-robot or robot can be described as a program that executes a specific function automatically on the web; be it for search engine indexing or for HTML and link validation. Googlebot and Bingbot are two of the most common web-robots.
Bandwidth is the measure of data sent across the Internet; each time a person visits your website, a portion of bandwidth is used. The same applies to web-robots: each time a web-robot visits your site, it uses a small portion of bandwidth.
Ordinarily the bandwidth that web-robots use is relatively small, but they can sometimes consume gigabytes of it, which is a problem for those with bandwidth limits to adhere to with their hosting providers.
Your website being visited regularly by web-robots is by no means a bad thing; it is in fact perfectly normal if you are regularly adding new content. Problems arise when web-robots become stuck in infinite loops on your website. These loops can be caused by custom scripts, but are most often caused when session IDs are served with each URL that is indexed. The constant activity of web-robots trapped in such a loop is what can cause unusually heavy bandwidth usage.
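
As an illustration (these URLs are hypothetical), a crawler that is handed a fresh session ID on every request sees the same page as an endless stream of "new" URLs; the Disallow: /*?SID= rule in the file below guards against exactly this pattern:

http://www.mywebsite.com/some-product.html?SID=abc123
http://www.mywebsite.com/some-product.html?SID=def456
http://www.mywebsite.com/some-product.html?SID=ghi789
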
A robots.txt file is a simple text file that sits in your web root folder and controls which web-robots are allowed to visit your website, and what they may view during that visit. By preventing certain directories from being indexed you can also gain SEO benefits, since duplicated content is kept out of search indexes. Furthermore, robots.txt allows you to specify a Crawl-delay to stop a web-robot from constantly crawling your website, helping to reduce the footprint it makes on your bandwidth allocation.
For your convenience, we have included a widely available robots.txt file for use with Magento which is beneficial both in terms of improving your SEO as well as reducing bandwidth usage and server load.


 # $Id: robots.txt,v magento-specific 2010/28/01 18:24:19 goba Exp $
#
# robots.txt
#
# This file is to prevent the crawling and indexing of certain parts
# of your site by web crawlers and spiders run by sites like Yahoo!
# and Google. By telling these "robots" where not to go on your site,
# you save bandwidth and server resources.
#
# This file will be ignored unless it is at the root of your host:
# Used:  http://example.com/robots.txt
# Ignored: http://example.com/site/robots.txt
#
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/wc/robots.html
#
# For syntax checking, see:
# http://www.sxw.org.uk/computing/robots/check.html
# Website Sitemap
# Sitemap: http://www.mywebsite.com/sitemap.xml

# Crawlers Setup
User-agent: *
Crawl-delay: 30
# Allowable Index
Allow: /*?p=
Allow: /index.php/blog/
Allow: /catalog/seo_sitemap/category/
Allow: /catalogsearch/result/
Allow: /media/
# Directories
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /js/
Disallow: /lib/
Disallow: /magento/
Disallow: /pkginfo/
Disallow: /report/
Disallow: /scripts/
Disallow: /shell/
Disallow: /skin/
Disallow: /stats/
Disallow: /var/
# Paths (clean URLs)
Disallow: /index.php/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalogsearch/
Disallow: /checkout/
Disallow: /control/
Disallow: /contacts/
Disallow: /customer/
Disallow: /customize/
Disallow: /newsletter/
Disallow: /poll/
Disallow: /review/
Disallow: /sendfriend/
Disallow: /tag/
Disallow: /wishlist/
# Files
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /STATUS.txt
# Paths (no clean URLs)
Disallow: /*.js$
Disallow: /*.css$
Disallow: /*.php$
Disallow: /*?p=*&
Disallow: /*?SID=
Disallow: /*?limit=all

# Uncomment if you do not wish for Google to index your images
#User-agent: Googlebot-Image
#Disallow: /


Full credit for the above robots.txt file goes to its original creator. Discussion of the file can be found in a thread on the Magento Community Forums (see the source link at the end of this post).

To install the robots.txt file to your domain, you can follow these few simple steps:
Step 1: Copy the robots.txt file above into a plain text file named robots.txt on your computer.
Step 2: If your Magento is installed within a sub-directory, you will need to modify the robots.txt file accordingly. For example, you would change 'Disallow: /404/' to 'Disallow: /your-sub-directory/404/', 'Disallow: /app/' to 'Disallow: /your-sub-directory/app/', and so on.
Step 3: If your domain has a sitemap.xml, you can uncomment the '# Sitemap:' line near the top of the robots.txt file and include the URL to your sitemap.xml.
Step 4: Upload the robots.txt file to your web-root folder. This can be done by placing the file within your 'httpdocs/' directory either by logging into your Plesk Hosting Control Panel with your credentials, or through your FTP client of choice.
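
Once uploaded, you can verify that the file is being served from the root of your domain, substituting your own domain for the placeholder here:

curl http://www.mywebsite.com/robots.txt
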
If you have any feedback on the above robots.txt file, feel free to leave a post below. It would be good to discuss and improve the above robots.txt file for the benefit of everyone.



Original sources: http://www.nublue.co.uk/forums/topic/318/adding-a-robotstxt-file-to-your-magento-store/