Big Data Analytics MCQ

Question 31 : Pick a hash function h that maps each of the N elements to at least log2 N bits. If R is the maximum number of trailing zeros seen among the hash values, the estimated number of distinct elements is

  1. 2^R
  2. 2^(-R)
  3. 1-(2^R)
  4. 1-(2^(-R))
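
The setup in Question 31 matches the Flajolet-Martin approach to counting distinct elements, where R is the longest run of trailing zero bits seen in any hash value. A minimal sketch, assuming a single hash function built from SHA-256 truncated to 32 bits (the function names and the hash choice are illustrative assumptions, not part of the question):

```python
import hashlib


def trailing_zeros(x: int, width: int = 32) -> int:
    """Count trailing zero bits of x; by convention return width when x is 0."""
    if x == 0:
        return width
    # x & -x isolates the lowest set bit; its bit_length gives its position + 1.
    return (x & -x).bit_length() - 1


def fm_estimate(stream, width: int = 32) -> int:
    """Single-hash Flajolet-Martin style estimate based on the maximum tail length R."""
    r = 0
    for item in stream:
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:4], "big")  # keep 32 bits of the hash
        r = max(r, trailing_zeros(h, width))
    return 2 ** r


if __name__ == "__main__":
    # Repeated elements hash to the same value, so they never change R.
    print(fm_estimate(["a", "b", "a", "c", "b"]))
```

A single hash gives a coarse estimate that is always a power of two; practical variants combine many hash functions, for example by taking the median of group averages.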

Question 32 : Which of the following is not a characteristic of stream data?

  1. Continuous
  2. Ordered
  3. Persistent
  4. Huge

Question 33 : Which of the following is a column-oriented database that runs on top of HDFS?

  1. Hive
  2. Sqoop
  3. HBase
  4. Flume

Question 34 : Which of the following decides the number of partitions that are created on the local file system of the worker nodes?

  1. Number of map tasks
  2. Number of reduce tasks
  3. Number of file input splits
  4. Number of distinct keys in the intermediate key-value pairs

Question 35 : Which of the following is not a class of points in the BFR algorithm?

  1. Discard Set (DS)
  2. Compressed Set (CS)
  3. Isolation Set (IS)
  4. Retained Set (RS)

Question 36 : Which of the following is not one of the 5 Vs of big data?

  1. Volume
  2. Variable
  3. Velocity
  4. Value

Question 37 : Which algorithm is used to find fully connected subgraphs in social media mining?

  1. CURE
  2. CPM
  3. SimRank
  4. Girvan-Newman Algorithm

Question 38 : A ________________ query Q is a query that is issued once over a database D, and then logically runs continuously over the data in D until Q is terminated.

  1. One-time Query
  2. Standing Query
  3. Ad hoc Query
  4. General Query

Question 39 : What is the effect of a spider trap on PageRank?

  1. A particular page gets the highest PageRank
  2. All the pages of the web will get a PageRank of 0
  3. No effect on any page
  4. Affects a particular set of pages

Question 40 : Which of the following is the correct option for MongoDB?

  1. MongoDB is a column-oriented data store
  2. MongoDB uses XML more than JSON
  3. MongoDB is a document store database
  4. MongoDB is a key-value data store

Question 41 : _________ systems focus on the relationship between users and items for recommendation.

  1. DGIM
  2. Collaborative-Filtering
  3. Content Based and Collaborative Filtering
  4. Content Based

Question 42 : The graphical representation of an SNA is made up of links and _____________.

  1. People
  2. Networks
  3. Nodes
  4. Computers

Question 43 : Hadoop is a framework that works with a variety of related tools. Common Hadoop ecosystem components include ____________

  1. MapReduce, Hummer and Iguana
  2. MapReduce, Hive and HBase
  3. MapReduce, MySQL and Google Apps
  4. MapReduce, Heron and Trumpet

Question 44 : Which of the following statements about data streaming is true?

  1. Stream data is always unstructured data.
  2. Stream data often has a high velocity.
  3. Stream elements cannot be stored on disk.
  4. Stream data is always structured data.

Question 45 : Which of the following is a NoSQL database type?

  1. SQL
  2. JSON
  3. Document databases
  4. CSV

Question 46 : Techniques for fooling search engines into believing your page is about something it is not, are called _____________.

  1. term spam
  2. page rank
  3. phishing
  4. dead ends

Question 47 : The police set up checkpoints at randomly selected road locations, then inspected every driver at those locations. What type of sample is this?

  1. Simple Random Sample
  2. Stratified Random Sample
  3. Cluster Random Sample
  4. Uniform sampling

Question 48 : Which of the following statements about standard Bloom filters is correct?

  1. It is possible to delete an element from a Bloom filter.
  2. A Bloom filter always returns the correct result.
  3. It is possible to alter the hash functions of a full Bloom filter to create more space.
  4. A Bloom filter always returns TRUE when testing for a previously added element.
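
The behaviour Question 48 asks about can be explored with a small standard Bloom filter. This is a minimal sketch, assuming a fixed bit array and hash positions derived from seeded SHA-256 digests; the class name, method names, and default sizes are illustrative assumptions:

```python
import hashlib


class BloomFilter:
    """Standard Bloom filter: bits are only ever set, never cleared."""

    def __init__(self, num_bits: int = 64, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item: str):
        # Derive num_hashes bit positions by prefixing the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(item))


if __name__ == "__main__":
    bf = BloomFilter()
    for word in ["hive", "hbase", "yarn"]:
        bf.add(word)
    print(bf.might_contain("hive"))   # an added element is always reported
    print(bf.might_contain("flume"))  # may be a false positive, never a false negative
```

Because several elements can share a bit, clearing bits to remove one element could also remove others, which is why the standard structure offers no delete operation.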

Question 49 : Which of the following is responsible for managing cluster resources and using them to schedule users’ applications?

  1. Hadoop Common
  2. YARN
  3. HDFS
  4. MapReduce

Question 50 : ___________ refers to inconsistency in the data, which hampers the analysis process and creates hurdles for those who wish to analyze such data.

  1. Variability
  2. Variety
  3. Volume
  4. Complexity