Hadoop Installation on Single Machine
To download and install Hadoop, the prerequisites are:
1. A 64-bit Linux-based OS, such as Ubuntu, CentOS, Fedora, etc. I preferred to use Ubuntu 12.04 LTS, and later 14.04 LTS (the upcoming version).
2. Java 1.6 or 1.7 JDK (a quick check follows below).
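Before proceeding, it helps to confirm the JDK is installed and note where it lives (the /usr/lib/jvm path below is an assumption matching the JAVA_HOME used later in this post; yours may differ):
>java -version
>ls /usr/lib/jvm/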
Go to the Downloads folder:
>cd Downloads
Un-zip the Hadoop tar file (the rest of this post uses version 1.2.1, so the file name should match your download):
>sudo tar xzf hadoop-1.2.1.tar.gz
I created a folder in /home/hduser/:
>mkdir Installations
Move the un-zipped Hadoop folder into the Installations directory:
>sudo mv /home/hduser/Downloads/hadoop-1.2.1 /home/hduser/Installations/
Give some permissions to the hadoop folder:
>sudo addgroup hadoop
>sudo chown -R hduser:hadoop /home/hduser/Installations/hadoop-1.2.1
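If the hduser account is not yet a member of the new hadoop group, you can add it as well (this assumes hduser is the account you are working as, per the paths used in this post):
>sudo adduser hduser hadoop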
Restart the terminal, then open the .bashrc file:
>gksudo gedit .bashrc
Add the following lines to the end of this file:

# Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)
export JAVA_HOME=/usr/lib/jvm/jdk1.6.0_45/

# Set Hadoop-related environment variables
export HADOOP_HOME=/home/hduser/Installations/hadoop-1.2.1

# Some convenient aliases and functions for running Hadoop-related commands
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"

# If you have LZO compression enabled in your Hadoop cluster and
# compress job outputs with LZOP (not covered in this tutorial):
# Conveniently inspect an LZOP compressed file from the command
# line; run via:
#
# $ lzohead /hdfs/path/to/lzop/compressed/file.lzo
#
# Requires installed 'lzop' command.
#
lzohead () {
    hadoop fs -cat $1 | lzop -dc | head -1000 | less
}

# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin
Save the file and exit the terminal.
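If you prefer not to reopen the terminal, reloading the file in the current session should also work:
>source ~/.bashrc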
Now it's time to modify the configurations in core-site.xml, hdfs-site.xml, mapred-site.xml, and hadoop-env.sh.
In hadoop-env.sh, export JAVA_HOME as shown in the sketch below.
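A minimal sketch of the change, reusing the JDK path already set in .bashrc above (uncomment and edit the existing JAVA_HOME line in conf/hadoop-env.sh; your JDK path may differ):
# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/jdk1.6.0_45/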
In the core-site.xml file, add the fs.default.name property between the <configuration> tags, as sketched below. This property names the default file system: a URI whose scheme and authority determine the FileSystem implementation. The URI's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class, and the URI's authority is used to determine the host, port, etc. for a filesystem.
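A minimal sketch of that property; the NameNode port 54310 is an assumption here, so use whichever free port you plan to run HDFS on:
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>The name of the default file system. A URI whose
  scheme and authority determine the FileSystem implementation. The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class. The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>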
In hdfs-site.xml, we need to set the replication factor, as sketched below. The actual number of replications can be specified when the file is created; the default is used if replication is not specified at create time.
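A sketch of that property; a replication factor of 1 is the usual choice on a single machine, since there is only one DataNode to hold the blocks:
<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>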
In mapred-site.xml, configure the JobTracker address, as sketched below. This is the host and port that the MapReduce job tracker runs at; if "local", then jobs are run in-process as a single map and reduce task.
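A sketch of that property; the JobTracker port 54311 is an assumption, so any free port works:
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
  <description>The host and port that the MapReduce job tracker runs
  at. If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>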
Open the terminal and go through the following steps for first use:
>cd $HADOOP_HOME
>bin/hadoop namenode -format
>start-all.sh
>jps
After running jps, you should find five daemons (NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker), which means the cluster is ready.
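A typical jps listing looks roughly like the following (the PIDs are illustrative, and jps also lists itself):
>jps
2287 NameNode
2407 DataNode
2538 SecondaryNameNode
2628 JobTracker
2757 TaskTracker
2881 Jps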
Go to the web browser for the Hadoop GUI:
http://localhost:50070
In a coming blog post, I am going to cover the multi-node cluster setup.
Happy Hadooping !!