Friday, January 9, 2015

Bulk Loading Data into Cassandra Using SSTableLoader

Why Use SSTableLoader:
                        When you want to move a large amount of data from another database into Cassandra, SSTableLoader is one of the best options. It streams ready-made SSTable files to the cluster, which is usually much faster than inserting the rows one at a time.

Steps to load the data into Cassandra:

  • Create a keyspace in Cassandra.
  • Create a table based on your requirements using cqlsh.
  • Create a .csv file from the existing data.
  • Use SSTableLoader to move the data into Cassandra.

Step 1: Creating the keyspace.
                CREATE KEYSPACE sample WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor' : 1 };
Step 2: Creating a table based on your requirements.
                CREATE TABLE sample.users (
                    key uuid,
                    firstname ascii,
                    lastname ascii,
                    password ascii,
                    age ascii,
                    email ascii,
                    PRIMARY KEY (key, firstname)
                );
     
                The statement above creates the users table. Its primary key is (key, firstname): key is the partition key and firstname is a clustering column.
Step 3: Creating the .csv file based on your table.

How to create a CSV file using Java:

Sample program to create the CSV file:
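import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

public class CreateCsv {
       public static void main(String[] args) {
              generateCsvFile("E:/csv/records.csv");
       }

       // Writes one million sample rows of the form "<i>,26" to the given file.
       // For a real load, each row needs one field per column of the target table.
       public static void generateCsvFile(String csvName) {
              // try-with-resources closes the writer even if a write fails
              try (BufferedWriter writer = new BufferedWriter(new FileWriter(csvName))) {
                     for (int i = 0; i < 1000000; i++) {
                            writer.append(Integer.toString(i));
                            writer.append(',');
                            writer.append("26");
                            writer.append('\n');
                     }
                     // Flush before reporting so a failed write is not announced as a success
                     writer.flush();
                     System.out.println("Success");
              } catch (IOException e) {
                     e.printStackTrace();
              }
       }
}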
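
The sample.users table from Step 2 has six columns, so a CSV file meant for that table needs six fields per row, with a valid UUID in the first one. The following is a minimal sketch of such a generator, not a definitive implementation. The class name UsersCsv, the output file users.csv and all field values are hypothetical placeholders chosen for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

// Hypothetical helper: writes rows shaped like sample.users
// (key, firstname, lastname, password, age, email).
final class UsersCsv {

    // Builds one CSV line with six fields; every value is made-up placeholder data.
    static String row(int i) {
        return String.join(",",
                UUID.randomUUID().toString(),  // key (uuid)
                "first" + i,                   // firstname
                "last" + i,                    // lastname
                "secret" + i,                  // password
                "26",                          // age (declared ascii in the table, so plain text)
                "user" + i + "@example.com");  // email
    }

    // Writes `count` rows to `target`, one row per line.
    static void write(Path target, int count) throws IOException {
        // The table columns are ascii, so the file is written as US-ASCII.
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.US_ASCII)) {
            for (int i = 0; i < count; i++) {
                writer.write(row(i));
                writer.write('\n');
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed output location; adjust the path and row count as needed.
        write(Paths.get("users.csv"), 1000);
    }
}
```

None of the placeholder values contain a comma, so no quoting is needed here. Real data that may contain commas would need a proper CSV library or an escaping scheme.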

These steps are mandatory after you create the project for SSTableLoader:

·        Add all the Cassandra jars to the project. These jars are available in the lib and tools folders of the Cassandra tar or zip file provided by DataStax.
·        Also add the cassandra.yaml file from the conf folder of the same Cassandra tar or zip file.
·        Also add the .csv file to the project. For example, I put sstable.csv in my project.
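
Once the .csv file is in the project, each line has to be split back into the columns of sample.users before it can be written out as table data. The following is a minimal sketch of that parsing step using only the Java standard library. The class name UserRow and the sample line are hypothetical, and the sketch assumes a plain comma-separated file with no quoted or embedded commas.

```java
import java.util.UUID;

// Hypothetical value class mirroring the columns of sample.users.
final class UserRow {
    final UUID key;
    final String firstname;
    final String lastname;
    final String password;
    final String age;
    final String email;

    private UserRow(UUID key, String firstname, String lastname,
                    String password, String age, String email) {
        this.key = key;
        this.firstname = firstname;
        this.lastname = lastname;
        this.password = password;
        this.age = age;
        this.email = email;
    }

    // Splits one CSV line into the six columns and rejects malformed rows.
    static UserRow parse(String line) {
        String[] f = line.split(",", -1); // -1 keeps trailing empty fields
        if (f.length != 6) {
            throw new IllegalArgumentException(
                    "expected 6 fields but got " + f.length + ": " + line);
        }
        // UUID.fromString throws IllegalArgumentException if the key is not a valid UUID.
        return new UserRow(UUID.fromString(f[0].trim()), f[1], f[2], f[3], f[4], f[5]);
    }

    public static void main(String[] args) {
        UserRow row = parse(
                "123e4567-e89b-12d3-a456-426614174000,Ann,Lee,secret,26,ann@example.com");
        System.out.println(row.key + " " + row.firstname);
    }
}
```

Rejecting short or malformed rows at this point catches a CSV file that does not match the table before any data reaches Cassandra.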
