# Hadoop and HBase Setup on Windows

This document covers a basic single-node setup of Hadoop and HBase on Windows. The software being set up:

  1. Hadoop (3.4.2)
  2. HBase (2.6.3)
  3. JDK (17) on WSL

The steps are as follows:

  1. Enable WSL on Windows from the 'Turn Windows features on or off' dialog.

    1. Search for 'Windows features' in the Start menu.
    2. Scroll down to the 'Windows Subsystem for Linux' option and enable it.
    3. Click OK.
    4. Let it install the required components, then restart.
  2. Install Ubuntu from the Microsoft Store (see the note after this step for a command-line alternative).

    1. Open the Ubuntu app from the Start menu.
    2. Set the username and password for the subsystem.
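    Note: on recent Windows 10/11 builds, steps 1 and 2 can be combined into a single command run from an administrator PowerShell prompt, which enables WSL and installs Ubuntu in one go:

    ```
    # run from an elevated PowerShell; enables WSL and installs Ubuntu
    wsl --install -d Ubuntu
    ```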
  3. Install the Java JDK using the following commands:

    1. sudo apt install openjdk-17-jdk
    2. Define the JAVA_HOME environment variable (by default the JDK is installed in /usr/lib/jvm/java-17-openjdk-amd64):
    3. export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64 (to make this survive new sessions, see the sketch below)
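    Since an export typed at the prompt lasts only for the current session, a minimal sketch for making JAVA_HOME permanent (assuming the default install path above) is to append it to ~/.bashrc:

    ```
    # persist JAVA_HOME across sessions
    echo 'export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64' >> ~/.bashrc
    source ~/.bashrc    # apply to the current session
    echo $JAVA_HOME     # verify the variable is set
    ```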
  4. After setting these up, we will now start with the Hadoop setup.

  5. Here Hadoop will be set up as a pseudo-distributed system. To set up Hadoop, follow these steps:

    1. Install ssh:

      1. Run: sudo apt install openssh-server
      2. Start the service: sudo service ssh start
      3. Check the service: sudo service ssh status
      4. (Optional) Register ssh as a startup service: sudo systemctl enable ssh (requires systemd; see the note below)
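      Note for step 4: systemctl works only when systemd is enabled, which older WSL installs do not do by default. A sketch of enabling it (assuming WSL version 0.67.6 or newer) is to add the lines below to /etc/wsl.conf, then run wsl --shutdown from Windows and reopen Ubuntu. Without systemd, sudo service ssh start still works; it just has to be run each session.

      ```
      # /etc/wsl.conf -- enable systemd so systemctl can manage services
      [boot]
      systemd=true
      ```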
    2. Install Hadoop:

      0. (Optional) Change to the user's home directory: cd ~

      1. Download Hadoop: wget https://downloads.apache.org/hadoop/common/hadoop-3.4.2/hadoop-3.4.2-lean.tar.gz
      2. Extract the archive: tar -xzf hadoop-3.4.2-lean.tar.gz
      3. Change directory: cd hadoop-3.4.2
      4. Add the environment variables to ~/.bashrc so the hdfs and start-*.sh commands used below are on the PATH: export HADOOP_HOME=~/hadoop-3.4.2 followed by export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
      5. Apply the changes: source ~/.bashrc
      6. Add the JAVA_HOME location to Hadoop:
        1. Open the file $HADOOP_HOME/etc/hadoop/hadoop-env.sh: nano $HADOOP_HOME/etc/hadoop/hadoop-env.sh
        2. Set the JAVA_HOME variable: export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64
        3. Save with Ctrl+S, then exit with Ctrl+X.
      7. Set up passwordless ssh:
        1. Generate an ssh RSA key pair: ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
        2. Add the generated public key to the authorized keys: cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
        3. Restrict the file permissions to read/write for the user: chmod 600 ~/.ssh/authorized_keys
        4. Verify the login works without a password: ssh localhost
      8. Format the HDFS namenode: hdfs namenode -format (this assumes the pseudo-distributed config shown in the sketch after this list is in place)
      9. Start the Hadoop services (the scripts are on the PATH from step 4):
        1. Run: start-dfs.sh
        2. Run: start-yarn.sh
      10. Check the Hadoop setup with a sample MapReduce job: hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.4.2.jar pi 2 5
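      A note for steps 8 and 9: formatting the namenode and running start-dfs.sh assume the standard single-node (pseudo-distributed) properties from the Hadoop docs are in place. A minimal sketch of the two config files, using a local namenode on port 9000 and a replication factor of 1 (edit them with nano $HADOOP_HOME/etc/hadoop/core-site.xml and nano $HADOOP_HOME/etc/hadoop/hdfs-site.xml):

      ```
      <!-- core-site.xml: point the default filesystem at a local HDFS namenode -->
      <configuration>
        <property>
          <name>fs.defaultFS</name>
          <value>hdfs://localhost:9000</value>
        </property>
      </configuration>

      <!-- hdfs-site.xml: single node, so keep only one replica of each block -->
      <configuration>
        <property>
          <name>dfs.replication</name>
          <value>1</value>
        </property>
      </configuration>
      ```

      After start-dfs.sh and start-yarn.sh, running jps should list NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager.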
    3. Setting up HBase:

      1. Download HBase 2.6.3: wget https://downloads.apache.org/hbase/2.6.3/hbase-2.6.3-bin.tar.gz
      2. Extract the downloaded archive: tar -xzf hbase-2.6.3-bin.tar.gz
      3. Enter the HBase base directory: cd hbase-2.6.3
      4. Set the JAVA_HOME variable in the HBase config file:
        1. Edit the file hbase-env.sh: nano conf/hbase-env.sh
        2. Set the JAVA_HOME variable to the JDK 17 path installed earlier (not a JDK 8 path): export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64
        3. Save and exit the file: Ctrl+S, then Ctrl+X
      5. Configure Standalone mode:
        1. Open file hbase-site.xml: nano conf/hbase-site.xml
        2. Set the following properties (full XML in the sketch after these steps): hbase.cluster.distributed = false and hbase.rootdir = file:///home/<-user-directory-name-here->/hbase-data
        3. save and exit: CTRL-S then, CTRL-X
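        A sketch of what the edited properties in hbase-site.xml look like, keeping the <-user-directory-name-here-> placeholder from step 2 to be filled in with the actual username:

        ```
        <configuration>
          <!-- run HBase in standalone (non-distributed) mode -->
          <property>
            <name>hbase.cluster.distributed</name>
            <value>false</value>
          </property>
          <!-- store HBase data on the local filesystem (the directory made in step 6) -->
          <property>
            <name>hbase.rootdir</name>
            <value>file:///home/<-user-directory-name-here->/hbase-data</value>
          </property>
        </configuration>
        ```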
      6. Make data directory: mkdir -p ~/hbase-data
      7. Start the HBase service from the HBase base directory: ./bin/start-hbase.sh
      8. Run the HBase shell: ./bin/hbase shell
      9. Run commands in the HBase shell:
        1. create 'test', 'cf'
        2. put 'test', 'row1', 'cf:a','value1'
        3. scan 'test'
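        As a quick check that the write landed, the row can also be read back without opening an interactive session; a sketch using the shell's non-interactive flag, run from the HBase base directory:

        ```
        # should print row1 with cf:a = value1
        echo "get 'test', 'row1'" | ./bin/hbase shell -n
        ```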
      10. Stop the HBase service: ./bin/stop-hbase.sh
    4. The setup should now be complete.
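One closing note: stop-hbase.sh stops only HBase; the Hadoop daemons started in step 5 keep running. To shut everything down cleanly, stop YARN and HDFS as well:

```
stop-yarn.sh   # stops the ResourceManager and NodeManager
stop-dfs.sh    # stops the NameNode, DataNode, and SecondaryNameNode
```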
