Saturday, January 31, 2015

Good Blog on Hive

http://rishavrohitblog.blogspot.com/2013/10/introduction-to-hives-partitioning.html

Wednesday, January 28, 2015

Blog On Map-Reduce

http://blog.matthewrathbone.com/2013/02/20/real-world-hadoop---implementing-a-left-outer-join-in-hive.html

http://www.thecloudavenue.com/2014/05/user-recomdations-with-big-data-1.html

http://www.bidn.com/blogs/cprice1979/ssas/4283/mapreduce-ninja-moves-combiners-shuffle-amp-doing-a-sort

Sunday, January 25, 2015

Sqoop Questions

http://hadooptutorial.info/sqoop-interview-questions-and-answers-for-experienced/



1) The number of map tasks that work on a table can be controlled by

2) By default, parallelism (multiple MR jobs working on one table) is off.

3) Sqoop import create-hive-table: the role of Create-Hive-Import is

4) The best-performing mode will be

5) We can't import a MySQL table without a primary key into HDFS.

6) The number of map tasks required to transfer the whole data set into HDFS is decided by

7) When we use direct import (for example, the mysqldump tool), the use of JDBC can be avoided.

8) The default file format during a Sqoop import to HDFS is

9) Multiple map tasks can't work on a single-table import.

10) Table A has a column Imagedata of type BLOB. Which one is the fastest possible method?

Map-Reduce Questions

Identity Reducer
  • http://stackoverflow.com/questions/10630447/hadoop-difference-between-0-reducer-and-identity-reducer
http://hadooptutorial.info/merging-small-files-into-sequencefile/




Saturday, January 17, 2015

CCD-410 Preparation




General Topics 
  • http://hadoop-gyan.blogspot.com/2012/11/map-reduce.html 
Partition 
  • http://www.hadoopmaterial.com/2013/10/custom-partitioner.html
  • http://hadooptutorial.wikispaces.com/Custom+partitioner
Sequence File 
  • http://thinkbiganalytics.com/hadoop-sequence-files-and-a-use-case/
  • http://dailyhadoopsoup.blogspot.com/2014/01/sequence-file.html
Sorting using Hadoop – TotalOrderPartitioner
  • https://pipiper.wordpress.com/2013/05/02/sorting-using-hadoop-totalorderpartitioner/
NLineInputFormat in Java MapReduce 
  • http://hadooped.blogspot.com/2013/09/nlineinputformat-in-java-mapreduce-use.html

MultipleInputs
  • http://www.lichun.cc/blog/2012/05/hadoop-multipleinputs-usage/
DBInputFormat & DBOutputFormat
  • https://archanaschangale.wordpress.com/tag/dbinputformat/
job.setJarByClass
  • http://www.bigdataspeak.com/2014/06/what-is-need-to-use-jobsetjarbyclass-in.html
Difference between job.submit & job.waitForCompletion
  • http://stackoverflow.com/questions/16702298/what-is-the-difference-between-job-submit-and-job-waitforcomplete-in-hadoop
  • http://stackoverflow.com/questions/7483624/running-jobs-parallely-in-hadoop
Map Join & Reducer Join 
  • http://codingjunkie.net/mapreduce-reduce-joins/
  • http://hadoop-learner.blogspot.com/2013/07/hadoop-certification.html
  • http://dailyhadoopsoup.blogspot.com/2014/02/mapreduce-inputs-and-splitting.html
Sample Questions: 
  • http://www.aiopass4sure.com/cloudera-exams/ccd-410-exam-questions/which-invocation-correctly-passesmapredjobname-with-a-value-of-example-to-hadoop.html
MultipleInputs
  • http://halitics.blogspot.com/2013/10/multiple-input-files-example-prog.html