Others Interview Questions and Answers (265) - Page 6

What is Thrift in NoSQL parlance?

Thrift is a software framework and an interface definition language for
developing cross-language services and APIs. Services generated using Thrift work
efficiently and seamlessly between C++, Java, Python, PHP, Ruby, Erlang, Perl,
Haskell, C#, Cocoa, Smalltalk, and OCaml. Thrift was developed at Facebook and
open-sourced in 2007. It later entered the Apache Incubator and is now a top-level Apache project.
What are the NoSQL categories?

Because every NoSQL provider tries to solve a different problem, NoSQL implementations can be categorized by the way they store and model data.

a) Key-Value Store Database

b) Column Family Database

c) Document Store Database

d) Graph Database

e) Multivalued Database

f) Object Database

g) Triple Store Database

h) Tuple Store Database

i) Tabular Database
Briefly explain Key-Value database

Key-value stores let an application store its data as schema-less (key, value) pairs. The data can be kept in a structure like the hash table type of a programming language, so that each value is accessed by its key. Such storage offers only a single way to access the values, which limits how the data can be queried, but it eliminates the need for a fixed data model.

Key             : Value
--------------- : --------------
Id1_Name        : Niladri Biswas
Id1_Citizenship : Indian

Id2_Name        : Mike Curz
Id2_Citizenship : American

Since each object is guaranteed to have a unique key, we can query the database for that key and get the result back from whichever node holds the object.
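
The lookup-by-key idea can be sketched in a few lines of Python. This is only an illustration, not how any particular product works: the `KeyValueStore` class, the three in-memory "nodes", and the hash-based placement are all assumptions made for the example.

```python
import hashlib


class KeyValueStore:
    """Toy key-value store that spreads keys over several in-memory nodes."""

    def __init__(self, node_count=3):
        # Each "node" is just a dict here; a real store would use separate servers.
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # Hash the key so that the same key always maps to the same node.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key, default=None):
        return self._node_for(key).get(key, default)


store = KeyValueStore()
store.put("Id1_Name", "Niladri Biswas")
store.put("Id1_Citizenship", "Indian")
store.put("Id2_Name", "Mike Curz")
store.put("Id2_Citizenship", "American")

print(store.get("Id1_Name"))   # Niladri Biswas
print(store.get("Id3_Name"))   # None: there is no such key
```

Note that the only access path is the exact key. Finding "everyone whose citizenship is Indian" would mean scanning every node, which is the trade-off described above.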

Examples include Riak, Dynamo etc.
What are some of the common features of NoSQL?

a) Easy to use in conventional load-balanced clusters

b) Persistent data (not just caches)

c) Scale to available memory

d) Have no fixed schema and allow schema migration without downtime

e) Have individual query systems rather than using a standard query language

f) Are ACID within a node of the cluster and eventually consistent across the cluster
Explain in short the Columnar database

Column-oriented NoSQL databases store data column by column, in contrast with the row-oriented format of an RDBMS. Column-oriented storage is space-efficient: when a value doesn’t exist for a column, that column is simply not stored, so nulls consume no space. Each unit of data can be thought of as a set of key/value pairs, where the unit itself is identified by a primary identifier, often referred to as the primary key or row-key. The units of data are also stored in sorted order, ordered by row-key.
Column family databases are extremely scalable, though somewhat less so than key-value stores. In exchange, they work better with more complex data sets.
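
As a rough sketch of the storage model (sparse columns per row, rows kept sorted by row-key), consider the following Python. The `ColumnFamily` class and the `user:001` style row-keys are invented for illustration and do not mirror the API of HBase, Cassandra or any other product.

```python
from bisect import bisect_left, insort


class ColumnFamily:
    """Toy column family: sparse columns per row, rows ordered by row-key."""

    def __init__(self):
        self.rows = {}       # row-key -> {column name: value}
        self.row_keys = []   # row-keys kept in sorted order

    def put(self, row_key, column, value):
        if row_key not in self.rows:
            self.rows[row_key] = {}
            insort(self.row_keys, row_key)   # keep the key list sorted
        self.rows[row_key][column] = value

    def get(self, row_key, column, default=None):
        return self.rows.get(row_key, {}).get(column, default)

    def scan(self, start, stop):
        """Return (row_key, columns) pairs with start <= row_key < stop, in key order."""
        lo = bisect_left(self.row_keys, start)
        hi = bisect_left(self.row_keys, stop)
        return [(key, self.rows[key]) for key in self.row_keys[lo:hi]]


users = ColumnFamily()
users.put("user:002", "name", "Mike Curz")
users.put("user:001", "name", "Niladri Biswas")
users.put("user:001", "citizenship", "Indian")

# No "citizenship" column was ever stored for user:002, so it takes no space.
print(users.get("user:002", "citizenship"))   # None
print([key for key, _ in users.scan("user:001", "user:003")])
```

Because the row-keys are sorted, a range scan only has to locate the start of the range and read forward, which is why row-key design matters so much in this model.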
Examples of Column family databases are: Apache HBase, Hypertable, Cassandra, Google’s Bigtable etc.
Briefly explain Document Database

The word "document" in document databases connotes loosely structured sets of key/value pairs, typically JSON (JavaScript Object Notation), and not word-processing documents or spreadsheets (though these could be stored too).
Document databases treat a document as a whole and avoid splitting a document into its constituent name/value pairs. At a collection level, this allows for putting together a diverse set of documents into a single collection. Document databases allow documents to be indexed not only on their primary identifier but also on their properties. Document stores are used for large, unstructured or semi-structured records.
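
A minimal Python sketch of these two ideas (whole documents of differing shape in one collection, plus an index on a property) might look like this. The `Collection` class and its method names are hypothetical and are not the MongoDB or CouchDB API; the sketch also assumes indexed values are hashable and that each `_id` is inserted only once.

```python
from collections import defaultdict


class Collection:
    """Toy document collection with optional secondary indexes."""

    def __init__(self):
        self.docs = {}      # _id -> whole document (a dict)
        self.indexes = {}   # field name -> {field value: set of _ids}

    def create_index(self, field):
        index = defaultdict(set)
        for doc_id, doc in self.docs.items():
            if field in doc:
                index[doc[field]].add(doc_id)
        self.indexes[field] = index

    def insert(self, doc):
        # The document is stored as a whole, whatever fields it has.
        self.docs[doc["_id"]] = doc
        for field, index in self.indexes.items():
            if field in doc:
                index[doc[field]].add(doc["_id"])

    def find(self, field, value):
        if field in self.indexes:
            ids = self.indexes[field].get(value, set())
            return [self.docs[i] for i in sorted(ids)]
        # No index on this field: fall back to scanning every document.
        return [d for d in self.docs.values() if d.get(field) == value]


people = Collection()
people.create_index("citizenship")
people.insert({"_id": 1, "name": "Niladri Biswas", "citizenship": "Indian"})
people.insert({"_id": 2, "name": "Mike Curz", "citizenship": "American",
               "languages": ["English"]})   # extra field, no schema change

print(people.find("citizenship", "Indian"))   # served from the index
print(people.find("name", "Mike Curz"))       # served by a full scan
```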
Examples: MongoDB, CouchDB etc.
Explain briefly the concept of Triple Store Database

In a classic relational database, data is stored in records. Each record contains multiple fields. These fields contain data that may belong to some object. The relation between a field and the object it belongs to is not represented as data in the database; it is available only as metadata in the form of the column definition (name, datatype, collation, foreign keys). An object is not modelled explicitly, but implicitly through a series of linked tables.

A Triple Store Database is a network of interrelated triples ("subject-predicate-object" triplets) whose predicates are part of the data themselves. Moreover, each object has an identifier that is not just an integer that means something only inside the database; it is a URI that may have a distinct meaning worldwide.

A triple is a record containing three values: either (URI, URI, URI) or (URI, URI, value). In the first form the triple relates one object to another, as in the fact "Vox inc. is a supplier" (here "Vox inc.", "is a", and "supplier" are all semantic entities identified by a URI). In the second form the triple links a constant value to a subject, as in "Vox inc.'s phone number is 0842 020 9090".
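
The two forms of triple can be sketched in Python as plain tuples. The `http://example.org/` URIs and the `match` helper are made up for illustration; a real triple store would use its own vocabulary and a query language such as SPARQL.

```python
EX = "http://example.org/"   # hypothetical namespace, for illustration only

triples = {
    # (URI, URI, URI): relates one object to another
    (EX + "VoxInc", EX + "isA", EX + "Supplier"),
    # (URI, URI, value): links a constant value to a subject
    (EX + "VoxInc", EX + "phoneNumber", "0842 020 9090"),
}


def match(store, subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# Everything known about Vox inc.
for triple in match(triples, subject=EX + "VoxInc"):
    print(triple)

# Who is a supplier? The predicate is queried like any other piece of data.
print(match(triples, predicate=EX + "isA", obj=EX + "Supplier"))
```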

Examples: Mulgara
Explain in brief Multivalue Database

A MultiValue database uses three-dimensional data modelling, built mainly on three levels of data structure: Fields, Values, and Subvalues.

A field, or column, is the same as it would be in a normalized database.

A Value is a further breakdown of a column. For example, in a normalized database there might be columns defined for Address1, Address2, Address3. In a MultiValue database there would be a definition for a column named Address, and stored in that column would be either one, two, three, or more values. These different values would be delimited by a special character known as a Value Mark.

A Subvalue is a further breakdown of a value. For example, if there is a column defined for Phone, and there is a value called Home, there may be two subvalues for the home phone number, perhaps the main number and a home office phone number.
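
The Field / Value / Subvalue nesting can be sketched in Python as a string split on three delimiter characters. The delimiter codes below (254, 253, 252) follow the convention commonly described for Pick-style systems, but treat them as an assumption here, along with the helper names and the made-up sample address and phone numbers.

```python
FIELD_MARK = chr(254)      # separates fields (assumed Pick-style convention)
VALUE_MARK = chr(253)      # separates values inside a field
SUBVALUE_MARK = chr(252)   # separates subvalues inside a value


def parse_record(raw):
    """Split a raw record into nested lists: fields -> values -> subvalues."""
    return [
        [value.split(SUBVALUE_MARK) for value in field.split(VALUE_MARK)]
        for field in raw.split(FIELD_MARK)
    ]


def extract(record, field, value=1, subvalue=1):
    """1-based lookup of a single subvalue."""
    return record[field - 1][value - 1][subvalue - 1]


# Field 1: name; field 2: Address with two values;
# field 3: Phone with one value (home) holding two subvalues.
raw = FIELD_MARK.join([
    "Vox inc.",
    VALUE_MARK.join(["12 High Street", "Flat 4"]),
    SUBVALUE_MARK.join(["555-0100", "555-0101"]),
])

record = parse_record(raw)
print(extract(record, 2, 2))      # second address value
print(extract(record, 3, 1, 2))   # second subvalue of the home phone
```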

Another distinct advantage of the MultiValue world is that the tables are extremely flexible. Columns can usually just be added to the database definition and used immediately. There is no need to shut down the database, lock out the users, add the column, and rebuild the database. A new column is simply added to a dictionary and that column is then immediately available.

Examples: jBASE etc.
Explain Graph Database

A graph is data that has a defined structure; lists, trees, maps, and objects are all examples of graph data structures. A graph is the general data structure for storing related data.
The records in a graph database are called “Nodes”, which are connected through “Relationships” that always have a direction. The traditional visual representation uses circles for Nodes and lines for Relationships. The most common example is the relationship between people on a social network such as Facebook. A graph database is a big, dense network structure. While it could take an RDBMS hours to sift through a huge linked list of people, a graph database can use graph algorithms such as shortest-path search to make such queries more efficient.
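
To make the idea of directed relationships and shortest-path queries concrete, here is a small Python sketch using breadth-first search over an adjacency list. The people and the "follows" relationship are invented, and real graph databases use their own storage formats and query languages; this only illustrates the traversal.

```python
from collections import deque

# Directed "follows" relationships between nodes (hypothetical sample data).
follows = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Dave"],
    "Carol": ["Dave"],
    "Dave": ["Erin"],
    "Erin": [],
}


def shortest_path(graph, start, goal):
    """Breadth-first search: returns one shortest path as a list, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


print(shortest_path(follows, "Alice", "Erin"))   # ['Alice', 'Bob', 'Dave', 'Erin']
print(shortest_path(follows, "Erin", "Alice"))   # None: relationships are directed
```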
Examples include AllegroGraph, Sones, Neo4j etc.
What are the Pros and Cons of using Key Value store?

Pros:

a) With little or no need to maintain indexes, key-value stores are often designed
to be horizontally scalable, extremely fast, or both.

b) They’re particularly suited for problems where the data are not highly related.

For example, in a web application, users’ session data meet this criterion; each user’s session activity will be different and largely unrelated to the activity of other users.

Cons:

Often lacking indexes and scanning capabilities, KV stores won’t help us if
we need to perform queries on our data other than basic CRUD
operations (Create, Read, Update, Delete).
What are some of the Pros and Cons of Columnar database?

Pros:

Columnar databases have been traditionally developed with horizontal scalability
as a primary design goal. As such, they’re particularly suited to “Big
Data” problems, living on clusters of tens, hundreds, or thousands of nodes.
They also tend to have built-in support for features such as compression and
versioning. The canonical example of a good columnar data storage problem
is indexing web pages. Pages on the Web are highly textual (benefits from
compression), somewhat interrelated, and change over time (benefits from
versioning).

Cons:

Different columnar databases have different features and therefore different
drawbacks. But one thing they have in common is that it’s best to design
your schema based on how you plan to query the data. This means you should
have some idea in advance of how your data will be used, not just what it’ll
consist of. If data usage patterns can’t be defined in advance (for example,
fast ad hoc reporting), then a columnar database may not be the best fit.
What can be the pros and cons of using Document database?

Pros:

Document databases are suited to problems involving highly variable domains.
When we don’t know in advance what exactly our data will look like, document
databases are a good bet. Also, because of the nature of documents,
they often map well to object-oriented programming models. This means less
impedance mismatch when moving data between the database model and
application model.

Cons:

If we are used to performing elaborate join queries on highly normalized relational
database schemas, we’ll find the capabilities of document databases lacking.
What is Master-Slave Replication?

With master-slave distribution, the data is replicated across multiple nodes. One node is designated as the master, or primary. This master is the authoritative source for the data and is usually responsible for processing any updates to that data. The other nodes are slaves, or secondaries. A replication process synchronizes the slaves with the master.

a) Master-slave replication is most helpful for scaling when you have a read-intensive dataset. It scales horizontally to handle more reads: adding slave nodes adds read capacity.

b) A second advantage of master-slave replication is read resilience: Should the master fail, the slaves can still handle read requests.

Masters can be appointed manually or automatically. Manual appointment typically means that when we configure the cluster, we configure one node as the master. With automatic appointment, we create a cluster of nodes and they elect one of themselves to be the master. Apart from simpler configuration, automatic appointment means that the cluster can automatically appoint a new master when a master fails, reducing downtime.
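
The behaviour described above can be sketched as a toy in-memory cluster in Python. Everything here is an assumption made for illustration: the class names, the synchronous copy to slaves, round-robin reads, and an "election" that simply promotes the first live node (real systems use a proper consensus or election protocol).

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class MasterSlaveCluster:
    def __init__(self, names):
        self.nodes = [Node(n) for n in names]
        self.master = self.nodes[0]   # manual appointment: first configured node
        self._next_reader = 0

    def slaves(self):
        return [n for n in self.nodes if n is not self.master and n.alive]

    def write(self, key, value):
        # All updates go through the master, the authoritative copy.
        if not self.master.alive:
            raise RuntimeError("no master available for writes")
        self.master.data[key] = value
        for slave in self.slaves():   # replication (synchronous in this toy)
            slave.data[key] = value

    def read(self, key):
        # Spread reads over live slaves; fall back to the master if there are none.
        readers = self.slaves() or [self.master]
        node = readers[self._next_reader % len(readers)]
        self._next_reader += 1
        return node.data.get(key)

    def elect_master(self):
        # Stand-in for automatic appointment: promote the first live node.
        candidates = [n for n in self.nodes if n.alive]
        if not candidates:
            raise RuntimeError("no live nodes to promote")
        self.master = candidates[0]


cluster = MasterSlaveCluster(["n1", "n2", "n3"])
cluster.write("session:42", "cart=3")

cluster.master.alive = False                        # the master fails
read_after_failure = cluster.read("session:42")     # slaves still serve reads

cluster.elect_master()                              # n2 is promoted
cluster.write("session:43", "cart=1")               # writes work again
print(read_after_failure, cluster.master.name)
```

Between the master failing and `elect_master()` being called, reads still succeed but `write` raises an error, which is the "read resilience" point made in (b).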
What is Peer-to-Peer Replication?

Master-slave replication helps with read scalability but doesn’t help with scalability of writes. It provides resilience against failure of a slave, but not of a master. Essentially, the master is still a bottleneck and a single point of failure. Peer-to-peer replication attacks these problems by not having a master. All the replicas have equal weight, they can all accept writes, and the loss of any of them doesn’t prevent access to the data store. With a peer-to-peer replication cluster, we can ride out node failures without losing access to data. Furthermore, we can easily add nodes to improve performance.
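
A minimal Python sketch of the peer-to-peer idea follows. The `Peer` class and `make_cluster` helper are hypothetical, and the sketch deliberately ignores the hard parts of real peer-to-peer systems: conflicting concurrent writes and bringing a recovered node back up to date.

```python
class Peer:
    """Toy replica: any live peer accepts writes and forwards them to the others."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True
        self.peers = []

    def write(self, key, value):
        if not self.alive:
            raise RuntimeError(self.name + " is down")
        self.data[key] = value
        for peer in self.peers:       # propagate to every live replica
            if peer.alive:
                peer.data[key] = value

    def read(self, key):
        if not self.alive:
            raise RuntimeError(self.name + " is down")
        return self.data.get(key)


def make_cluster(names):
    peers = [Peer(n) for n in names]
    for p in peers:
        p.peers = [q for q in peers if q is not p]
    return peers


a, b, c = make_cluster(["a", "b", "c"])
a.write("x", 1)       # any node can take the write
a.alive = False       # losing a node...
b.write("x", 2)       # ...does not stop writes
print(c.read("x"))    # 2
```

Node `a` still holds the old value when it comes back, which hints at why peer-to-peer systems need a synchronization and conflict-resolution step that this sketch leaves out.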
What is the advantage if we use peer-to-peer replication and sharding together?

Using peer-to-peer replication and sharding together is a common strategy for column-family databases. In such a setup we might have tens or hundreds of nodes in a cluster with data sharded over them. A good starting point for peer-to-peer replication is a replication factor of 3, so each shard is present on three nodes. Should a node fail, the shards on that node are rebuilt on the other nodes.
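
The placement and rebuild steps can be sketched in Python as follows. The round-robin placement and the "first surviving node that lacks the shard" replacement rule are simplifying assumptions for illustration; real databases use their own placement strategies and also have to copy the data itself.

```python
REPLICATION_FACTOR = 3


def place_shards(shard_count, nodes, rf=REPLICATION_FACTOR):
    """Assign each shard to rf consecutive nodes (simple round-robin placement)."""
    return {
        shard: [nodes[(shard + i) % len(nodes)] for i in range(rf)]
        for shard in range(shard_count)
    }


def handle_failure(placement, failed, nodes):
    """Rebuild each shard that lived on the failed node on a surviving node."""
    survivors = [n for n in nodes if n != failed]
    for owners in placement.values():
        if failed in owners:
            owners.remove(failed)
            # Pick the first survivor that does not already hold this shard.
            # Assumes at least rf surviving nodes, otherwise next() raises.
            replacement = next(n for n in survivors if n not in owners)
            owners.append(replacement)
    return placement


nodes = ["n1", "n2", "n3", "n4", "n5"]
placement = place_shards(6, nodes)
print(placement[0])                  # ['n1', 'n2', 'n3']

handle_failure(placement, "n3", nodes)
print(placement[0])                  # ['n1', 'n2', 'n4']
```

After the failure every shard is back on three nodes, so the cluster can tolerate a further node loss without any shard dropping to a single copy.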
What are the types of Replication?

Replication comes in two forms:

- Master-slave replication makes one node the authoritative copy that handles writes while slaves synchronize with the master and may handle reads.

- Peer-to-peer replication allows writes to any node; the nodes coordinate to synchronize their copies of the data.

Master-slave replication reduces the chance of update conflicts but peer-to-peer replication avoids loading all writes onto a single point of failure.
What is Spanner?

Spanner is Google's scalable, multi-version, globally-distributed, and synchronously-replicated database. It is the first system to distribute data at global scale and support externally-consistent distributed transactions. It’s a database that embraces ACID, SQL, and transactions, and that can be distributed across thousands of nodes spanning multiple data centers across multiple regions.
What is "Schematized Semi-relational Tables" for Google Spanner?

It is a hierarchical approach to grouping tables that allows Spanner to co-locate related data into directories that can be easily stored, replicated, locked, and managed on what Google calls spanservers. Spanner uses a modified SQL syntax that allows data to be interleaved, and the Spanner paper mentions some changes to support columns encoded with Protobufs.
What is NClass?

It is a free UML class designer tool for Windows and non-Windows (Mono) users.

It provides functionality such as a multilingual GUI, source code generation, and reverse engineering from .NET assemblies.

Currently it supports two languages: C# and Java.
What is Knockout?

Knockout is a JavaScript library that lets us build interactive, responsive UIs on top of a clean underlying data model by applying the observer pattern. It brings the Model-View-ViewModel (MVVM) pattern to JavaScript, offering declarative bindings similar to those in WPF or Silverlight applications. It works in mainstream browsers such as IE 6+, Firefox 2+, Chrome, and Safari.