Friday, April 22, 2011

Opaque Pointer and Binary Compatibility

23, April 2011
Binary Compatibility:
When you compile a small application that prints "hello, world", we are implicitly depending on third party libraries usually developed by compiler vendor. If there is bug, or upgraded software from the vendor, can we run our application without recompiling? If yes, we say that the library is binary compatible.

Or in-other words, let us say we supplied a library, and later we come up with great ideas that requires changes in many classes layouts, how can you ship the upgraded libraries without recompiling client code? The solution is Handle/Body idiom (refer Advanced C++ book by Jim Copelin).

Binary compatibility depends on various factors Application Binary Interface definition, source compatibility opaque pointer in code.

Opaque Pointer (Also referred as Handle/Body Idiom, Pimpl Idiom, Cheshire Cat etc.):

To achieve binary compatibility we separate implementation from interface, so easy to say? The interface will have an indirection (pointer or Handle) to implementation (Body). An excellent article on Qt d-pointer (a.k.a. data pointer) is available on Qt developer page and KDE developer page.

Note that the concept of Opaque Pointer or Pointer to Implementation (PIMPL) is used extensively in many software libraries. I came across PIMPL in mobile software, Qt, Boost, many more. I am preparing an article for GeeksforGeeks, soon it will be published.

Also note the concept of q-pointer (pointer back to Handle class) that references the API class for any functionality defined in API class hierarchy.

Layout Engines - WebKit, Gecko, ...

22, April 2011
Layout Engine:

If we want to display an image on screen, how can we do it? We need an entity that paints the screen with the image content. These links helps as digital diary.
The traditional software frameworks like Windows Form, Qt widget, Java Swings etc. provides functionality to render the screen using static information called objects. There are more sophisticated rendering engines that are part of web browsers. The concept of layout engine came from early development of web (Netscape navigator).
Layout engine is the heart of any web browser; examples are WebKit (Chrome, Safari, Konqueror browsers) Gecko (Firefox, Epic browsers), Presto (Opera browser), Trident (Internet Explorer browser) etc. are few examples of well-known layout engines. Almost all of these layout engines are written in advanced C++.
The concept of layout engines is not limited to browsers. One can use it to design applications that needs screen rendering. For example, the outlook mail client, Winamp player display, RealPlayer displays are based on Microsoft propriety Trident engine. Gecko is also used in Thinderbird email client, Google Picasa, and other Internet application suites.
Examples of Webkit layout engine other than browser application can be found in Qt example code. The Qt documentation application is based on Webkit layout engine. It is interesting to imagine the applications of these layout engines. If we are handy with the engine design and few web standards, we can adopt the engine alone for variety of applications.
With the advancement of Cloud Computing Technology, the computer geeks need a thin client application that connects to a server in cloud. Brower based thin clients are going to play major role in future. Chrome design (see Webkit2 also) and development is one example. IE9 also very interesting. Time will answer well.
Script Engine:
Well, the layout engine itself can’t help much. It should be flexible to program for dynamism. The client side scripting languages like JavaScript play a critical role here. But one needs to parse these scripts in the browser at runtime. The parsing entity is called Script Engine. For more details read about Chakra and V8 script engines. Firefox uses optimized SpiderMonkey as script engine. The performance of script engine contributes to overall performance of browser considerably.
For details on Gecko see this link. For details of WebKit see this link. WebKit wiki. The code is free, start exploring their design and implementation. We can see the practical use of C++, design patterns, advanced C++ idioms, etc.

Friday, April 15, 2011

Trie – An efficient way to search data

15 April, 2011

Trie – An efficient way to search data
Trie is an information retrieval data structure efficiently extracts required data. For example, see the following,
  1. When we type a name on smart phones it displays all the contacts containing the typed-in string. What data structure will fill such needs?
  2. In advanced editors like Visual Studio, Qt Creator, etc… when we type the identifier name, it automatically displays all the members part of that object. How is it possible?
  3. On Linux (even on emulated DOS in Windows), if we type characters “gcc” + TAB key, it displays all the commands with “gcc” as prefix. What data structures is appropriate here?
All the above cases use a kind of trie data structure. Perhaps, the practical implementation may vary based on kind of application.
There is a demand for tries on Geeksforgeeks. We will be posting few articles on these in coming weeks.

Monday, April 11, 2011

Differences between Semaphores and Mutexes

11, Apr 2011
What are the differences between semaphores and mutexes?. Are binary semaphore and mutex same or just a notational convenience?
It is typical interview question. Surprisingly, we get correct answers rare.
From my learnings during RTOS design, I am preparing an article for Geeksforgeeks (link), it will be published soon.
An interesting question, if you observe Windows word 2003, Adobe PDF reader, only one instance will be running in the task manager. Where as Windows Excel 2003 will create as many processes as number of excel documents opened. How to implement, only one instance of process must be running, invoking another instance should prompt "An instance is already running" message?

Sunday, April 10, 2011

Beautiful Architecture - Worth Reading (Productive Weekend)

10, Apr 2011.
Got new book "Beautiful Architecture". I got the book from local book shop. Good to know that the royalty on the book will be donated to "Doctors Without Boarders".
The book is collection of essays from experts that designed high end, and high availability systems. Yet to finish few chapters, left good learning experience.
Few topics to name,
1. Scalable Architectures
2. Resource Oriented Architectures
3. Data Centric Architectures (Facebook)
4. Virtualization (Xen Hypervisor)
5. Fault Tolerant Operating Systems
6. GNU Emacs and KDE (Qt)
7. Metacircular Virtual Machines
Happy and worthy weekend.

Friday, April 8, 2011

An Interview Experience

8 April, 2011
I don't like to present in interview panels unless really required. Today I had to be in an interview panel. Good puzzle I recapped after long time.
Two players A and B playing a game (similar to game of Nim). Given a pile of N coins, the players need to pick coins from the pile. In each move a player can pick atleast 1 and utmost m coins. The player who makes the last move is the winner.
For example if A starts the game, he can pick 1, 2, 3, ... m coins from the pile. Then B can pick 1, 2, 3, .... m coins from the remaining pile. Next A, followed by B, so on...
In the process, the player who makes last move (picks all remaining coins <= m from the pile) is the winner.
Now, given N = 15, m = 3, and A starts the game with some 1 <= x <= m coins. How many different ways are there that A can win the game?
Hint: Use state space analysis method (i.e. trees).

Pascal’s Identity - Theoretical Significance

April 8, 2011

We know that C (n, r) = C (n – 1, r – 1) + C (n – 1, r)
It is quite easy to prove it mathematically. But what is its significance in real life examples?
C (n, r) is selecting r elements from n elements. Pick a special element k from these n elements, we left with (n – 1) elements.
Now this r elements selection C (n, r) can be grouped into two categories,
A)      r elements that contain the element k.
B)      r elements that do not contain the element k.
From A) the special element k is in all r selections. Since k is part of r elements, we need to choose remaining (r – 1) elements from remaining (n – 1) elements, this can be done in C (n – 1, r – 1) ways.
From B) the special element k is not there in all r selections. It means, we will have to select all the r elements from available (n – 1) elements (we must exclude element k from n). This can be done in C (n – 1, r) ways.
Now it is evident that, the first term of RHS is combination containing special element k, and the second term is combination without the special element k. Sum of these two is selecting r elements from n elements.
Interesting proof, little confusing to understand, but useful in practice.

Pascal's Identity and Dynamic Programming:
If we note the identity, it is the base for recursive calculation of binomial coefficients. However, when plain recursion is used, it will lead to an exponential algorithm.
Recursion with memoization can result in an efficient algorithm. The idea behind this dynamic programming (programming we mean a method of tabulation) is to tabulate the first occurrence of binomial co-efficients.
Pascal's triangle can serve as the tabulation in the dynamic method. Each co-efficient is sum of coefficients in the previous row. The table can grow till min(k, i) where 0 <= i <= n.
Try the dynamic algorithm.
Note that there is no need of two dimensional array to memoization of partial results. This fact is evident from Pascal's triangle with an extra variable to store nCk.