<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Flickering Tubelight &#187; Tutorials</title>
	<atom:link href="http://flickeringtubelight.net/blog/category/technical-information-or-tutorials-on-some-engineeringmathematical-concepts/feed/" rel="self" type="application/rss+xml" />
	<link>http://flickeringtubelight.net/blog</link>
	<description></description>
	<lastBuildDate>Sat, 05 Feb 2011 02:30:57 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>A simple problem that led us to Ramanujan&#8217;s work on Integer Partitioning</title>
		<link>http://flickeringtubelight.net/blog/2010/09/a-simple-problem-that-led-us-to-ramanujans-work-on-integer-partitioning/</link>
		<comments>http://flickeringtubelight.net/blog/2010/09/a-simple-problem-that-led-us-to-ramanujans-work-on-integer-partitioning/#comments</comments>
		<pubDate>Sun, 12 Sep 2010 14:12:04 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Family]]></category>
		<category><![CDATA[Tidbits]]></category>
		<category><![CDATA[Tutorials]]></category>

		<guid isPermaLink="false">http://flickeringtubelight.net/blog/?p=182</guid>
		<description><![CDATA[Raghu, my cousin, sent me an email with the following problem a few months ago.
Question
Manish was on his way to an interview. On the way, he encountered his long lost cousin, Vijay, whom he hadn&#8217;t met in more than a decade. They started catching up on lost time. Manish learned that Vijay had 3 sons. [...]]]></description>
			<content:encoded><![CDATA[<p>Raghu, my cousin, sent me an email with the following problem a few months ago.</p>
<h2>Question</h2>
<p>Manish was on his way to an interview. On the way, he encountered his long lost cousin, Vijay, whom he hadn&#8217;t met in more than a decade. They started catching up on lost time. Manish learned that Vijay had 3 sons. When he asked about their ages, Vijay replied, &#8220;You&#8217;re going for an interview, right? Consider this a trial question. Figure out their ages from this: The product of the ages of my three sons is 36.&#8221; To this, Manish grumbled that he needed more information. Vijay, then, pointed to a sign board across the street that displayed the address of the area and said that the sum of the ages of his three children was equal to the last two digits of the pin code (zip code) of that area. Manish demanded still more information. Finally, Vijay said, &#8220;My eldest son wore a black shirt today. This is all I can tell you.&#8221;</p>
<p>What were the ages of the three children?<span id="more-182"></span></p>
<h2>Solution</h2>
<p>Say the ages are a, b and c. We know from clue 1 that a.b.c = 36, where &#8220;.&#8221; represents multiplication. The first step in identifying such factors is to factorize 36 into its smallest factors &#8211; 1.2.2.3.3. The next step is to figure out how to group these 5 numbers into 3 groups. Note that we can have more 1s in the factorization. For example, what if 2 kids are 1 year old and 1 &#8220;kid&#8221; is 36 years old? So, to allow that to happen, we need another 1 in the factors. So the factors are 1.1.2.2.3.3. By trial and error, we recognize that the ways to make 3 non-empty groups from 6 objects are groups of the following sizes:</p>
<p>1 + 1 + 4 (call it Grouping Style 1, or GS1)</p>
<p>1 + 2 + 3 (call it Grouping Style 2, or GS2)</p>
<p>and 2 + 2 + 2 (call it Grouping Style 3, or GS3)</p>
<p>GS1 requires selecting 1 out of 6 for the first component, 1 out of the remaining 5 for the second component, and the remaining 4 automatically go into the third component. There are 6 ways to choose the first component, 5 ways to choose the second component and 1 way to choose the third component. This gives us a total of 6.5=30 combinations for GS1. Many of these will turn out to be identical. After some work, we can whittle down the GS1 groupings to the following:</p>
<p>1,1,36</p>
<p>1,2,18</p>
<p>1,3,12</p>
<p>2,2,9</p>
<p>2,3,6</p>
<p>3,3,4</p>
<p>GS2 requires selecting 1 out of 6 for the first component, 2 out of 5 for the second component and the remaining 3 automatically go into the third component. There are 6 ways to choose the first component, 5C2 = 10 ways to choose the second component, and 1 way to choose the third component. This gives us a total of 6.10=60 combinations for GS2. In reality it is easier to work it out by trial and error. After some work, and after recognizing and ignoring the combinations that we have already seen under GS1, we can whittle down the GS2 groupings to the following:</p>
<p>1,4,9</p>
<p>1,6,6</p>
<p>Similar analysis for GS3 gives us no new combinations.</p>
<p>Before we can use clue 2, we need to add up the ages in these combinations. Doing so, we get the following:</p>
<p>1+1+36=38</p>
<p>1+2+18=21</p>
<p>1+3+12=16</p>
<p>2+2+9=13</p>
<p>2+3+6=11</p>
<p>3+3+4=10</p>
<p>1+4+9=14</p>
<p>1+6+6=13</p>
<p>Notice that all the combinations give us unique totals, except for two combinations which both give us 13. The fact that the second clue did not suffice to answer the question indicates that the total of the ages must have been 13. The children could be aged (2,2 and 9) or (1,6 and 6). Any other value for the total age and the answer would have been clear after clue 2.</p>
<p>Clue 3 tells us of the existence of an eldest child. The color of the shirt is immaterial. In the (1,6 and 6) combination, there is no eldest child. There are 2 elder children, who are twins. In the (2,2 and 9) combination, there is an eldest child. Hence, that is the answer. The children are aged 2 years, 2 years and 9 years.</p>
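<p>The whole chain of reasoning can be checked by brute force. The following is a minimal Python sketch, not part of the original puzzle; the names <code>triples</code>, <code>by_sum</code>, <code>ambiguous</code> and <code>answer</code> are only illustrative. It lists every age triple with product 36, keeps the sums shared by more than one triple, and then applies the &#8220;eldest son&#8221; clue.</p>
<pre><code class="language-python">from collections import defaultdict
from itertools import combinations_with_replacement

# All age triples (a, b, c), sorted ascending, whose product is 36.
triples = [t for t in combinations_with_replacement(range(1, 37), 3)
           if t[0] * t[1] * t[2] == 36]

# Clue 2: group the triples by sum; only a shared sum leaves Manish stuck.
by_sum = defaultdict(list)
for t in triples:
    by_sum[sum(t)].append(t)
ambiguous = [ts for ts in by_sum.values() if len(ts) != 1]

# Clue 3: keep the candidates whose largest age is not repeated.
answer = [t for ts in ambiguous for t in ts if t[1] != t[2]]

print(len(triples))   # 8
print(ambiguous)      # [[(1, 6, 6), (2, 2, 9)]]
print(answer)         # [(2, 2, 9)]
</code></pre>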
<h2>Intersecting Ramanujan&#8217;s trail</h2>
<p>With the specific problem out of the way, let us think about a generalization to the problem. India&#8217;s best known mathematician, Srinivasa Ramanujan, hiked (given his genius, he probably breezed) along a mathematical thought process, probably in 1913, leading to his work on Integer Partitions. In attempting to solve this problem, I seem to have unknowingly stepped onto this trail briefly. Let me explain. Remember that we had to figure out in how many ways 6 objects could be grouped into 3 groups. I listed these out as groupings with (1,1,4), (1,2,3) and (2,2,2) objects. There are 3 grouping styles possible, no more, no less. But as the total number of objects grows larger, or the number of groups to create changes, the number of such grouping styles becomes harder to figure out. At least they seem to follow no simple pattern. For example, to group 6 objects into 2 groups, there are also 3 ways &#8211; {5,1}, {4,2} and {3,3}. To group 7 objects into 2 groups, there are again only 3 groupings &#8211; {6,1}, {5,2} and {4,3}. To group 7 objects into 3 groups, however, there are 4 groupings &#8211; {1,1,5}, {1,2,4}, {1,3,3}, and {2,2,3}.</p>
<p>The question is, is there a more general formula to figure this (the number of distinct ways to group N objects into G groups) out? And another question is &#8211; what is Integer Partitioning and how is that related to this problem?</p>
<p>So before going into the theory, let us try to look at this problem from different angles, and we may see an opening to solving it. Let us use the following example &#8211; find the number of ways of grouping 7 objects into 2 groups. We discovered (through some mental enumerations, I admit) that there are three grouping styles &#8211; {6,1}, {5,2}, {4,3} &#8211; possible here. But now, let me draw your attention to a simple fact &#8211; notice that the sum of the numbers in each of the groupings is equal to 7. Hardly a surprise, you say. There were 7 objects to begin with. And regardless of how we group them, the total object count is 7. Big deal! But it is often rewarding to look at problems from a different perspective. We now know that &#8220;grouping 7 objects into 2 groups&#8221; is essentially the same as &#8220;finding 2 positive integers that add up to 7&#8221;. How many ways are there to add up 2 positive integers to get a total of 7? 6+1=7 and 5+2=7 and 4+3=7. No more ways to do it. So, the number of ways to group N things into G groups is equal to the number of ways to add G positive integers to make N. Allow me to create a shorthand notation for this count. I will use P(N,G). P for partition, but really you can use any symbol. It only makes sense when N&gt;=G. Otherwise P(N,G)=0.</p>
<p>P(N,G) = number of ways to add G positive integers to make N = number of ways to divide N objects into G groups</p>
<p>There are a couple of simple properties of P(N,G) which are easy to observe. There is only 1 way to divide N objects into N groups, and that is to assign 1 object per group. Similarly there is 1 way to divide N objects into 1 group, and that is to place all objects into 1 group. Symbolically,</p>
<p>P(N,N) = 1</p>
<p>P(N, 1) = 1</p>
<p>OK, I think we have beaten that one into submission (if not to death).</p>
<p>Note that we have only defined it. We were after a general formula for P(N,G). And we do not have that yet.</p>
<p>Now, let us visualize this slightly differently, meditate upon it a bit, and see if we can uncover some other properties about P(N,G). The following figure shows the objects, and groupings, visually. This style of representation, called a Ferrers diagram, uses one dot per object and arranges the dots of each group along a <em>column</em>. There are 3 groupings shown &#8211; these are our friends {6,1}, {5,2} and {4, 3}. If we stare at this for some time we realize that the reason there are 2 columns is because we wanted 2 groups. This means there is at least 1 <em>row</em>, the first row, which has 2 dots in each of the groupings. Now, imagine that we remove those two dots from each of the groupings. What are we left with?</p>
<p><a href="http://flickeringtubelight.net/blog/wp-content/uploads/2010/09/IntegerPartitioning1.png"><img class="alignnone size-medium wp-image-660" title="IntegerPartitioning1" src="http://flickeringtubelight.net/blog/wp-content/uploads/2010/09/IntegerPartitioning1-299x174.png" alt="" width="299" height="174" /></a></p>
<p>The figure below shows the case where the 2 dots in the first row are removed. We are left with 5 dots. But more importantly notice the groupings that are left. {5}, {4,1} and {3,2}. These are some of the ways you can group 5 objects. This time, we do not necessarily group 5 objects into 2 groups &#8211; since we have a grouping, {5}, with only 1 group. It may be interesting to see how many ways there are in all to group 5 objects &#8211; not necessarily into 1 or 2 groups, but rather into <em>any</em> number of groups.</p>
<p><a href="http://flickeringtubelight.net/blog/wp-content/uploads/2010/09/IntegerPartitioning2.png"><img class="alignnone size-medium wp-image-661" title="IntegerPartitioning2" src="http://flickeringtubelight.net/blog/wp-content/uploads/2010/09/IntegerPartitioning2-299x168.png" alt="" width="299" height="168" /></a></p>
<p>The following figure shows <em>all</em> the ways to group 5 objects.</p>
<p>1 group &#8211; {5}</p>
<p>2 groups &#8211; {4,1}, {3,2}</p>
<p>3 groups &#8211; {3,1,1}, {2,2,1}</p>
<p>4 groups &#8211; {2,1,1,1}</p>
<p>5 groups &#8211; {1,1,1,1,1}</p>
<p><a href="http://flickeringtubelight.net/blog/wp-content/uploads/2010/09/IntegerPartitioning3.png"><img class="alignnone size-full wp-image-659" title="IntegerPartitioning3" src="http://flickeringtubelight.net/blog/wp-content/uploads/2010/09/IntegerPartitioning3.png" alt="" width="748" height="268" /></a></p>
<p>Note that the number of ways to group 5 into 1 and 2 groups in this figure exactly matches the 3 groupings in the previous figure. That is P(7,2) = P(5,1)+P(5,2)! That is a breakthrough. So maybe there is a way to break down any P(N,G) into smaller and smaller pieces and add up the totals of the smaller pieces. For example, P(N,G) = P(N-G,1)+P(N-G,2)+ &#8230; + P(N-G,G), where any term with more groups than objects counts as 0. Let&#8217;s try this for P(7,2).</p>
<p>P(7,2) = P(5,1) + P(5,2) = 1 + P(3,1) + P(3,2) = 1 + 1 + P(1,1) = 1 + 1 + 1 = 3</p>
<p>Not quite a closed form, but a recursive solution.</p>
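<p>The recursion translates almost directly into code. Here is a minimal Python sketch, assuming the two properties P(N,N) = 1 and P(N,1) = 1 from earlier as stopping conditions; the function name <code>p</code> and the caching decorator are my own choices, not part of the derivation.</p>
<pre><code class="language-python">from functools import lru_cache

@lru_cache(maxsize=None)
def p(n, g):
    """Number of ways to write n as a sum of exactly g positive integers."""
    if n in range(g):          # n is smaller than g: not enough objects
        return 0
    if g == 1 or g == n:       # the two simple properties noted earlier
        return 1
    # Remove one dot from each of the g columns, then regroup the
    # remaining n - g dots into 1, 2, ..., g groups.
    return sum(p(n - g, k) for k in range(1, g + 1))

print(p(7, 2))                             # 3
print(p(7, 3))                             # 4
print(sum(p(5, k) for k in range(1, 6)))   # 7, all the groupings of 5
</code></pre>
<p>The explicit check for g == n is needed because removing one dot per column leaves nothing to regroup in that case, so the sum alone would return 0.</p>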
<h2>Integer Partitioning and a Generating Function for it</h2>
<p>By the way, this is a good place to (finally) get into Integer Partitioning. In the figure above we listed out <em>all</em> the ways to group 5 objects. That <em>is</em> the Integer Partitioning of 5. I use P(5) to represent it.</p>
<p>P(5) = P(5,1) + P(5,2) + P(5,3) + P(5,4) + P(5,5)</p>
<p>Though it is possible to use the recursive technique I described above to solve for P(5) term by term, there is a very interesting alternative way to figure it out. It relies on a near mind-bending technique, <em>Generating Functions</em>. It is a powerful technique, used in many different places. I have not yet been able to pin down if the abstraction involved in using Generating Functions is pure genius or if it simply falls out of the mechanics of mathematics if only you start from the correct perspective. Let me attempt to convey how Generating Functions work in this specific situation.</p>
<p>We want to find out all the possible ways to group N objects. But instead of starting with this specific problem we turn the problem on its head and solve a much more general problem. Let us use a bunch of 1s, a bunch of 2s, a bunch of 3s etc. and see what we can add them up to. We do this exhaustively. That is, we do not let <em>any</em> combination fall through the cracks. Then, we can count up all the different ways a bunch of addends give us the total N. For example, if N is 5, we do this exhaustive (but structured) adding of all integers (repetitions allowed) and see how many combinations add to 5. We can be sure that none of the addends will be 6 or greater. Similarly, the <em>number</em> of addends will also be no greater than 5, because each addend is at least 1, and with 5 1s we already are at 5. So, the combinations we need to look at only need to cover the addends 1 through 5, and no more than 5 of those addends.</p>
<p>This exhaustive search seems hard to do. But <em>generating functions</em> provide a structured method to approach this. Let us do it one step at a time. We will look at the potential contributions to the total, of each candidate addend one at a time.  Let us look at the addend 1. What totals can 1 contribute to? 1s can add up to 1 in 1 way, {1}. 1s can add up to 2 in 1 way, {1,1}. 1s can add up to 3 in 1 way {1,1,1} and so on. That is, 1 alone can add up to <em>any</em> N, AND, in only 1 way for each given N. In fact if there is no 1 in the addends, then 1 can contribute 0 to the total. In generating function terms this would be represented as follows:</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_69a03512c3aee910860f83d0009cf6e3.png" align="absmiddle" class="tex" alt="1+ x+x^2+x^3+x^4+ ... " /></p>
<p>What? Where did that come from? That is almost always the reaction in my own mind when I see that. Essentially, the exponent of <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_1f31f8c0da2e32b6acaa5b9a0e5154e9.png" align="absmiddle" class="tex" alt="x^k" />, which is k, indicates the total that the addends (remember we are only talking about the addends being 1) add up to. The coefficient of <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_1f31f8c0da2e32b6acaa5b9a0e5154e9.png" align="absmiddle" class="tex" alt="x^k" />, which is 1, indicates the number of ways the addends can be added up to get to k. There is only 1 way to do this using only 1s. And there may be no 1s in the addends, and that is the term <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_1fbafd7495b60c9332f7887a3ae2c07e.png" align="absmiddle" class="tex" alt="x^0=1" />.</p>
<p>But what if 2s are allowed? There can be 1 two, or 2 twos, or 3 twos, etc. Thus the generating function based on contributions only from twos is as follows:</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_5dba174bc073f3ef8503acb33dc3e5e0.png" align="absmiddle" class="tex" alt="1+ x^2 + x^4 + x^6 + ... " /></p>
<p>This basically means twos can add up to 0 or 2 or 4 or 6 or 8 etc.</p>
<p>Now, we know that we can have both 1s and 2s in the set of addends. In fact, we can have 1s, and 2s, and 3s, and 4s etc. Let us restrict the addends to 1s and 2s for the time being. There can be 1 one and 1 two, or 1 one and 2 twos or 2 ones and 1 two etc. How do we mathematically express these combinations? This is where the elegance of generating functions becomes clear. By allowing the contributions of the addends to be in the power of x position, products of <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_347b99be8c291ade0c6b4d680e18916a.png" align="absmiddle" class="tex" alt="x^a" /> and <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_dcfb056901a422c68ba50c438ae7c635.png" align="absmiddle" class="tex" alt="x^b" /> correctly give us the sum in the power&#8217;s position, <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_0e23fc9dd1637fc9aa5c3cb8c3c2f91f.png" align="absmiddle" class="tex" alt="x^(a+b)" />.  Further, if there are multiple ways to add up to a given total, they show up in the coefficient position. That is, as an example, say we want to figure out the Integer Partitions of any number N, using only the addends 1 and 2, we can <em>multiply</em> the individual addends&#8217; generating polynomials.</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_5f53fc51b5b4e8fcdf493e680d47b4b7.png" align="absmiddle" class="tex" alt="(1+x+x^2+x^3+x^4+...) . (1+x^2+x^4+x^6+...)" /></p>
<p>This product of infinite polynomials will give us every possible way to add up 1s and 2s to get us to a specific total N. Say N=5; then we need not worry about any power above <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_aca32a2d4ede6a4a5babdc499b929bff.png" align="absmiddle" class="tex" alt="x^5" /> in either polynomial.</p>
<p>(1+x+x^2+x^3+x^4+x^5) . (1+x^2+x^4)</p>
<p>And after some math, we end up with the necessary coefficient for <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_aca32a2d4ede6a4a5babdc499b929bff.png" align="absmiddle" class="tex" alt="x^5" />, and that is the number of ways to get to 5 using only 1s and 2s. Here the coefficient is 3, matching {1,1,1,1,1}, {2,1,1,1} and {2,2,1}.</p>
<p>Now, just extending this to allow contributions from the addends 3, 4, 5 etc., we get the full generating function for Integer Partitioning.</p>
<p>P(N) = coefficient of <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_bb8ff14b3a9066cd00876b872d73a795.png" align="absmiddle" class="tex" alt="x^N" /> in <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_7e371625f7427d6278d6d104e2d52ddb.png" align="absmiddle" class="tex" alt="(1+x+x^2+x^3+x^4+...) . (1+x^2+x^4+x^6+...) . (1+x^3+x^6+x^9+...) . (1+x^4+x^8+x^12+...)..." /></p>
<p>Whew! So we have that out of the way. What amazes me no end is how generating functions are able to hijack this polynomial product form to solve (or, more accurately, bring structure to) seemingly impossible scenarios that need counting. Notice that we still do not have a closed form for the Partition of Integers, P(N). Computers can be employed for the multiplication of terms and accumulation of coefficients.</p>
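<p>To make the &#8220;computers can be employed&#8221; remark concrete, here is a minimal Python sketch of the truncated polynomial product. The function name <code>series_product</code> is hypothetical. It relies on a standard shortcut: adding <code>coeffs[total - a]</code> to <code>coeffs[total]</code> in increasing order of <code>total</code> has the same effect as multiplying by (1 + x^a + x^2a + ...) and discarding every power above <code>n_max</code>.</p>
<pre><code class="language-python">def series_product(addends, n_max):
    """Coefficients of x^0 .. x^n_max in the product of the addend series."""
    coeffs = [1] + [0] * n_max           # start from the polynomial "1"
    for a in addends:
        # Multiply by (1 + x^a + x^(2a) + ...), truncated at x^n_max.
        for total in range(a, n_max + 1):
            coeffs[total] += coeffs[total - a]
    return coeffs

print(series_product([1, 2], 5)[5])              # 3, using only 1s and 2s
print(series_product(range(1, 6), 5)[5])         # 7, which is P(5)
print(series_product(range(1, 101), 100)[100])   # 190569292, which is P(100)
</code></pre>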
<h2>A Closed Form Approximation</h2>
<p>In 1918, Srinivasa Ramanujan and his advisor, G. H. Hardy, came up with a <em>closed form</em>, albeit a closed form for an <em>approximation</em> to P(N). With the current understanding of Integer Partitioning, this seems like an amazing accomplishment. This closed form was:</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_6cc545f579fa05596cc8fbb9e88f2ca2.png" align="absmiddle" class="tex" alt="P(N) \approx \frac{e^{\pi\cdot\sqrt{2N/3}}}{4N\sqrt{3}}" /></p>
<p>This produces pretty good approximations, and the relative error shrinks as N grows. For example, P(100) = 190,569,292, and this closed form gives around 199 million, which is roughly 4.6% too high.</p>
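<p>The comparison for N = 100 is easy to reproduce. The sketch below simply evaluates the closed form above in Python; the function name <code>hardy_ramanujan</code> is my own label, and the exact value of P(100) is the one quoted in the text.</p>
<pre><code class="language-python">import math

def hardy_ramanujan(n):
    """Closed-form approximation to P(n) quoted above."""
    return math.exp(math.pi * math.sqrt(2 * n / 3)) / (4 * n * math.sqrt(3))

exact_p100 = 190569292                     # value quoted in the text
approx = hardy_ramanujan(100)
print(round(approx / 1e6, 1))              # 199.3 (million)
print(round(approx / exact_p100, 3))       # 1.046
</code></pre>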
<h2>References</h2>
<p>1. Joseph Laurendi, Partitions of Integers, 2005, http://www.artofproblemsolving.com/Resources/Papers/LaurendiPartitions.pdf</p>
<p>2. Wikipedia, http://en.wikipedia.org/wiki/Partition_%28number_theory%29</p>
]]></content:encoded>
			<wfw:commentRss>http://flickeringtubelight.net/blog/2010/09/a-simple-problem-that-led-us-to-ramanujans-work-on-integer-partitioning/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Discovering Hamming Codes</title>
		<link>http://flickeringtubelight.net/blog/2010/07/discovering-hamming-codes/</link>
		<comments>http://flickeringtubelight.net/blog/2010/07/discovering-hamming-codes/#comments</comments>
		<pubDate>Mon, 19 Jul 2010 02:43:36 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Information]]></category>
		<category><![CDATA[Tidbits]]></category>
		<category><![CDATA[Tutorials]]></category>

		<guid isPermaLink="false">http://flickeringtubelight.net/blog/?p=480</guid>
		<description><![CDATA[Digital data, transmitted over a communication medium (wireless, optical fiber, copper wire), or stored in some storage medium (such as computer memory or hard disk), is prone to bit-flips and errors. For example, if the message &#8220;10110101000101010&#8243; means &#8220;BILL JOHN&#8221; and communication channel noise flips a bit, the message received may be &#8220;10010101000101010&#8243;, meaning, &#8220;KILL [...]]]></description>
			<content:encoded><![CDATA[<p>Digital data, transmitted over a communication medium (wireless, optical fiber, copper wire), or stored in some storage medium (such as computer memory or hard disk), is prone to bit-flips and errors. For example, if the message &#8220;10110101000101010&#8221; means &#8220;BILL JOHN&#8221; and communication channel noise flips a bit, the message received may be &#8220;10010101000101010&#8221;, meaning, &#8220;KILL JOHN&#8221;. Now, that could create a problem. The problem also exists in data that is sitting untouched on a digital storage medium. Have you ever noticed that some photo files on your computer, opened after years of storage, show strange colors or do not display fully? This could be due to some bit errors in the stored 1s and 0s that represent the image file data.<span id="more-480"></span></p>
<p>One way to avoid (or, at least drastically reduce) this problem is to either send or store multiple copies of the information. Say you send the message &#8220;BILL JOHN&#8221; 3 times. One of the times the error converts this to &#8220;KILL JOHN&#8221;, but two of the times the message transmits successfully. Then you know that most likely the intended message was &#8220;BILL JOHN&#8221;. Similarly, your bank probably stores your account information in multiple storage locations to avoid this and other problems (such as catastrophic data loss due to a fire).  However, redundancy is not very efficient, and may be overkill. Further, it may not always be possible to apply (say the communication takes a long time and repeat communication is not possible).</p>
<p>Another solution is to add a small amount of extra information to the message being communicated or stored to validate the correctness of the data. This is, in effect, redundancy, but is often a more space-efficient form of it. Its purpose is primarily to detect errors in the message data, not to handle catastrophic destruction of data (which is the primary purpose of full redundancy). The most common way to detect errors in transmitted data is by the use of parity bits. The data is first divided into blocks of bits, say 7-bit or 15-bit blocks. Next, for each block of bits a parity bit is added, thus making the block size 8 or 16 bits. For example, if the original data is &#8220;10110101000101010&#8221;, and the block size is 7 + 1 parity bit, then the data is first extended to make its length a multiple of 7. So, say the message is changed to &#8220;000010110101000101010&#8221;. Then, it is broken into 7-bit blocks &#8211; &#8220;0000101&#8221;, &#8220;1010100&#8221;, &#8220;0101010&#8221;. And finally, for each 7-bit block one more bit is added. Say, it is added to the right end. The parity bit is dependent on the data bits. It is set to a 1 or a 0 depending on the number of 1s in the block. If there is an odd number of 1s in the 7-bit block, the parity bit is set to 1 to make the total parity of the 8-bit block even. I am assuming even parity in this example; the parity may also be set up to be odd. Going back to the example, if there is already an even number of 1s in the 7-bit data block, the parity bit is set to 0. The recipient knows what the block size is, where the parity bit appears, and what kind of parity (Even or Odd Parity) is being used. This is part of a pre-established communication agreement. For example, &#8220;0000101&#8221; has 2 1s, that is, an even number of 1s. The parity bit is set to 0 and appended to these 7 bits to make the transmission block, &#8220;0000101<strong>0</strong>&#8221;.</p>
<p>Upon transmission, say there is a bit flip and the message becomes &#8220;00101010&#8221;. The receiver can then figure out that the parity, which is supposed to be even, is now odd, and therefore indicates an error. The receiver can then request a retransmission. Notice that the parity bit itself may be transmitted in error, and that triggers an error as well (even though in reality the data bits were all transmitted fine). The size of the block is set to be small enough so that the probability of a bit flip is very low. More importantly, the probability of <em>two</em> flips is then negligibly small. Notice that if there <em>are</em> two flips, however, a single bit parity scheme as described above will not work, because the parity comes out even again. There is then a need for a more involved parity scheme. If the channel or storage medium has a propensity for unidirectional flips (that is, say it can only flip a 1 to a 0, but never a 0 to a 1), then a counter may be maintained along with the data to count the number of 1s in the data block. If the number of 1s changes, then the receiver can detect an error. It is a little tricky once you realize that the number of 1s counted must include the number of 1s in the counter itself! The counter bits need to be transmitted as well, and are subject to bit flips as well.</p>
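<p>The even-parity scheme described above fits in a few lines of Python. This is only a sketch under the same assumptions as the example (even parity, parity bit appended on the right, blocks held as strings of 0s and 1s); the function names <code>add_even_parity</code> and <code>parity_ok</code> are made up for illustration.</p>
<pre><code class="language-python">def add_even_parity(block):
    """Append a parity bit so that the total number of 1s is even."""
    return block + str(block.count("1") % 2)

def parity_ok(block):
    """True if the received block still has an even number of 1s."""
    return block.count("1") % 2 == 0

sent = add_even_parity("0000101")
print(sent)                    # 00001010
print(parity_ok(sent))         # True
print(parity_ok("00101010"))   # False: the single flip is detected
print(parity_ok("00111010"))   # True: two flips go unnoticed
</code></pre>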
<p>In some situations, there may not be a second copy from which to re-request the data upon detecting the error. Or in case of a communication channel, it may take too long to request a replay. And regardless of the application, it seems mathematically challenging to come up with a way to send a message such that not only is the error detected, it is also corrected by the receiver. This was the challenge that Richard Hamming took up and solved in an ingenious way. Here I would like to think through the same problem and come up with a solution, which will then help us think through some of Hamming&#8217;s thoughts in the late 1940s.  We will stick to the single bit error scenario. We can always reduce the block size sufficiently that only a single bit error can occur with any meaningful probability.</p>
<p>The first insight that led to the discovery of the Hamming code seems to be that some extra meta data (just like parity) needs to be transmitted in addition to the main data. In the presence of this extra data the receiver must be able to identify a bit flip in any bit in the transferred message (including a bit flip in the meta bits).  How do we uniquely identify a bit that flipped? We need to identify the position of the bit that flipped. In a message with d bits of data and p bits of meta data, the total number of transmitted bits is d+p. To identify a unique position where the flip may have occurred, we need log2(d+p) bits. The genius of Hamming was in recognizing that the bits being transmitted should be grouped into several groups such that each bit was a member of a unique set of groups. Further, the binary representation of the bit positions was the simplest way to identify the groups.</p>
<p>Say the message were 10 bits long. That is, d+p=10. The 10 positions can be represented as:</p>
<p>0001 &#8211; message bit 1<br />
0010 &#8211; message bit 2<br />
0011 &#8211; message bit 3<br />
0100 &#8211; message bit 4<br />
0101 &#8211; message bit 5<br />
0110 &#8211; message bit 6<br />
0111 &#8211; message bit 7<br />
1000 &#8211; message bit 8<br />
1001 &#8211; message bit 9<br />
1010 &#8211; message bit 10</p>
<p>Notice that we start from 0001, and not 0000. This is because every position needs at least one 1 in its binary representation, so that, as we will see, it belongs to at least one group. This means that to represent 8 positions (d+p=8), we actually need 4 bits. So the number of groups = log2(d+p+1), rounded up to a whole number. Further, notice that the binary representation of every message-bit-position has 1s in a unique set of power-of-2 positions. This is obvious. After all, that is how we represent the positions uniquely. But the insight Hamming was able to draw was that all of the representations (left column above) with a 1 in a certain power-of-2 position could be clubbed together into a group. That is, in the example above, message bits with a 1 in the 2^0 position could be grouped into Club 0. Message bits with a 1 in the 2^1 position could be grouped into Club 1. Notice that some of the numbers which were members of Club 1 were also members of Club 0 (for example message bit 3). Similarly, Club 2 and Club 3 could be formed. Further, exactly 4 clubs were needed to identify every bit (including the meta bits) uniquely.</p>
<p>0 | 0 | 0 | 1 | &#8211; message bit 1 &#8211; Club 0<br />
0 | 0 | 1 | 0 | &#8211; message bit 2 &#8211; Club 1<br />
0 | 0 | 1 | 1 | &#8211; message bit 3 &#8211; Club 0, 1<br />
0 | 1 | 0 | 0 | &#8211; message bit 4 &#8211; Club 2<br />
0 | 1 | 0 | 1 | &#8211; message bit 5 &#8211; Club 0, 2<br />
0 | 1 | 1 | 0 | &#8211; message bit 6 &#8211; Club 1, 2<br />
0 | 1 | 1 | 1 | &#8211; message bit 7 &#8211; Club 0, 1, 2<br />
1 | 0 | 0 | 0 | &#8211; message bit 8 &#8211; Club 3<br />
1 | 0 | 0 | 1 | &#8211; message bit 9 &#8211; Club 0, 3<br />
1 | 0 | 1 | 0 | &#8211; message bit 10 &#8211; Club 1, 3</p>
<p>Notice that <em>no</em> two bit-position-representations <em>can</em> belong to the same set of Clubs. After all, every bit position is represented uniquely in binary, and a 1 in the binary representation corresponds to a &#8220;key&#8221; to a certain club. No two binary representations are the same and so no two bit-positions can belong to the same club combination.</p>
<p>The groups/clubs thus formed are:</p>
<p>Club 0 &#8211; message bits 1, 3, 5, 7, 9<br />
Club 1 &#8211; message bits 2, 3, 6, 7, 10<br />
Club 2 &#8211; message bits 4, 5, 6, 7<br />
Club 3 &#8211; message bits 8, 9, 10</p>
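<p>The club lists can be generated mechanically from the binary representations. Below is a minimal Python sketch; the function name <code>clubs</code> is hypothetical, and positions are numbered from 1 as in the table above. A position belongs to Club k when the 2^k digit of its binary representation is 1.</p>
<pre><code class="language-python">def clubs(total_bits):
    """Map each club number k to the positions whose 2^k binary digit is 1."""
    n_clubs = total_bits.bit_length()    # binary digits needed for the last position
    return {k: [pos for pos in range(1, total_bits + 1)
                if (pos // 2**k) % 2 == 1]
            for k in range(n_clubs)}

for k, members in clubs(10).items():
    print("Club", k, members)
# Club 0 [1, 3, 5, 7, 9]
# Club 1 [2, 3, 6, 7, 10]
# Club 2 [4, 5, 6, 7]
# Club 3 [8, 9, 10]
</code></pre>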
<p>Now, Hamming must have realized that he had created clubs with careful membership such that a unique member could be located if we knew which clubs it belonged to. So, a flipping bit can be located if that flip can identify all the clubs the bit belongs to. The clubs must start with some common property, and the flip must change that property for each club that it belongs to. This would help identify the clubs in which the flipped bit has membership, and by that unique combination, we can identify the flipped bit.</p>
<p>That unique property that Hamming gave each club was parity. Since no two clubs have the same membership, each club needed at least 1 unique parity bit. Since we want to keep the number of parity bits as small as possible to keep the message overhead as small as possible, the smallest solution would need as many parity bits as clubs. And we know that for transmitting d+p bits, we need log2(d+p+1) bits to represent the positions and, therefore, log2(d+p+1) clubs. This leads to the following observation:</p>
<p>The minimum overhead bits/parity bits = log2(d+p+1). But we assumed that the number of parity bits = p.</p>
<p>p &gt;= log2(d+p+1)</p>
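<p>Since p appears on both sides of the inequality, the easiest way to find the smallest p for a given d is to try p = 0, 1, 2, ... in turn. The sketch below does that, using the equivalent form 2^p &gt;= d+p+1; the function name <code>min_parity_bits</code> is hypothetical.</p>

```python
def min_parity_bits(data_bits):
    """Smallest p for which p bits can label all d + p positions, plus one spare label."""
    p = 0
    while data_bits + p + 1 > 2 ** p:
        p += 1
    return p


for d in (1, 4, 6, 11):
    print(d, "data bits need", min_parity_bits(d), "parity bits")
```

<p>For d = 6 this gives p = 4, which matches the 10-bit example used in this article.</p>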
<p>Now we know how many groups there are, which bits make up the groups and the minimum number of parity bits. In the above example with d+p=10, p=4. Thus for 6 data bits we need to transmit 4 parity bits. But we must recognize that the parity bits cannot simply be lumped together into casually selected positions. This is because without careful selection we may not have at least 1 parity bit per group. For example, if the first 4 bit positions of a 10-bit message are used for parity, there would be no parity bits in Club 3. If we instead used the last 4 bits in a 10-bit message for parity, each club would have at least one parity bit, but all 3 of Club 3&#8217;s members would be parity bits. This creates a dependency when filling out the parity bits on the sender side. First, parity bit 7 would have to be established based on Club 2, which has 3 data bits and only 1 parity bit. Then the remaining parity bits in Club 0 and Club 1 (bits 9 and 10) would have to be established, followed finally by the parity for Club 3 (bit 8). Hamming realized that it would be much easier to ensure that each parity bit belonged to exactly 1 club, no more no less. All the members of a club are then data bits except the one parity bit. This helps speed up setting the parity bits &#8211; there is no dependency and no fixed order in which the parity bits need to be established. The bit positions in the message which belong to only 1 club are positions with a single 1 and all else 0s in binary. These are the power-of-two bit positions: 1, 2, 4 and 8. Making these bits the parity bits has another advantage. At the receiving end, once parity is checked for the various clubs, if it turns out that Club 2 and Club 1 are not maintaining parity, then to identify the bit position which flipped all that is needed is to calculate 2^2 + 2^1 = 6. The bit at position 6 is the only bit which is a member of Club 2 and Club 1 and no other club.</p>
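<p>Here is a minimal Python sketch of the scheme just described, for the 10-bit example (6 data bits, parity bits at positions 1, 2, 4 and 8). It assumes even parity for every club, which the discussion above does not specify, and the names <code>encode</code> and <code>locate_flipped_bit</code> are made up for illustration.</p>

```python
PARITY_POSITIONS = (1, 2, 4, 8)  # power-of-two positions, one parity bit per club
MESSAGE_LENGTH = 10


def club_members(parity_position):
    """Bit positions whose binary form has a 1 in the place owned by this parity bit."""
    return [p for p in range(1, MESSAGE_LENGTH + 1)
            if (p // parity_position) % 2 == 1]


def encode(data_bits):
    """Place 6 data bits into a 10-bit message and set each club to even parity (assumed)."""
    message = [0] * (MESSAGE_LENGTH + 1)  # index 0 is unused so indices match positions
    data_positions = [p for p in range(1, MESSAGE_LENGTH + 1)
                      if p not in PARITY_POSITIONS]
    for position, bit in zip(data_positions, data_bits):
        message[position] = bit
    for parity_position in PARITY_POSITIONS:
        # The parity slot still holds 0 here, so this sums only the club's data bits.
        message[parity_position] = sum(message[p] for p in club_members(parity_position)) % 2
    return message


def locate_flipped_bit(message):
    """Add up the parity positions of all clubs with broken parity; 0 means no flip."""
    location = 0
    for parity_position in PARITY_POSITIONS:
        if sum(message[p] for p in club_members(parity_position)) % 2 == 1:
            location += parity_position
    return location


message = encode([1, 0, 1, 1, 0, 1])
received = list(message)
received[6] = 1 - received[6]  # simulate a flip at bit position 6
print(locate_flipped_bit(message), locate_flipped_bit(received))
```

<p>Because each parity bit sits in exactly one club, the parity bits can be filled in any order. The receiver reports 0 for the untouched message and 6 for the one with bit 6 flipped.</p>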
<p>References:<br />
I found this video of a class, held at the University of New South Wales, Australia, and taught by Professor Richard Buckland, to be very useful in helping me understand how Hamming Codes work. The Hamming Code material starts at minute 8:00. <a href="http://www.youtube.com/watch?v=kE7V7UI4jpk">http://www.youtube.com/watch?v=kE7V7UI4jpk</a></p>
]]></content:encoded>
			<wfw:commentRss>http://flickeringtubelight.net/blog/2010/07/discovering-hamming-codes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Thoughts on the mathematical constant e</title>
		<link>http://flickeringtubelight.net/blog/2010/05/thoughts-on-the-mathematical-constant-e/</link>
		<comments>http://flickeringtubelight.net/blog/2010/05/thoughts-on-the-mathematical-constant-e/#comments</comments>
		<pubDate>Sun, 16 May 2010 15:54:23 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Information]]></category>
		<category><![CDATA[Tidbits]]></category>
		<category><![CDATA[Tutorials]]></category>

		<guid isPermaLink="false">http://flickeringtubelight.net/blog/?p=432</guid>
		<description><![CDATA[e for exponential
The mathematical constant  shows up in strange places. Moreover, its significance is not as easy to grasp as that of the other famous constant, , because there is no easy physical object in whose context to imagine it. For example,   is the ratio of the circumference to the diameter of [...]]]></description>
			<content:encoded><![CDATA[<h2>e for exponential</h2>
<p>The mathematical constant <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e1671797c52e15f763380b45e841ec32.png" align="absmiddle" class="tex" alt="e" /> shows up in strange places. Moreover, its significance is not as easy to grasp as that of the other famous constant, <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_4f08e3dba63dc6d40b22952c7a9dac6d.png" align="absmiddle" class="tex" alt="\pi" />, because there is no easy physical object in whose context to imagine it. For example,  <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_4f08e3dba63dc6d40b22952c7a9dac6d.png" align="absmiddle" class="tex" alt="\pi" /> is the ratio of the circumference to the diameter of  a circle. Yes, it is irrational, but if you can get over that mystery (or ignore it for the time being), it is straightforward to <em>imagine</em> what <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_4f08e3dba63dc6d40b22952c7a9dac6d.png" align="absmiddle" class="tex" alt="\pi" /> is. Every circle does seem to have a certain <em>circleness</em>, which makes them all look the same. It is intuitively not hard to agree with the hunch that every circle has an unchanging ratio between the circumference and the diameter; and it makes sense to keep that ratio handy and give it a name.</p>
<p>The constant <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e1671797c52e15f763380b45e841ec32.png" align="absmiddle" class="tex" alt="e" /> is considerably more elusive. It appears, at first, to be a number you would not go hunting after. You just happen to stumble upon it during one of your mathematical excursions; it seems interesting enough that you pick it up and put it in your pocket for some potential use later. After stumbling upon the same thing on other mathematical excursions, it does, in hindsight, seem to be something rather useful. Something you <em>should</em> have gone looking for.<span id="more-432"></span></p>
<p>Jacob Bernoulli, sometime in the late 1600s, stumbled upon this constant when he was trying to calculate the maximum compound interest that can be earned on an investment by compounding continuously.</p>
<p>If you invest 1 dollar in an account that earns 100% annual interest, then after 1 year the value of your investment will be 2 dollars, assuming the compounding (interest calculation) happens once during the year. Assuming you do not touch the money and let it grow, by the end of 2 years the money would have doubled again and you&#8217;ll have 4 dollars in all. Now, imagine the bank decided to compound every 6 months instead of once a year. The annual interest rate remains the same, 100%. 50% is applied at the end of 6 months and another 50% is applied at the end of the year. The 1 dollar grows to 1.50 after the first 6 months. For the second half of the year, it is 1.50 dollars that grows in value. At the end of the first year, the total value in your account is 1.50 + (1.50*50%), that is, 2.25 dollars. Not bad! Just by compounding every 6 months you made 2.25 instead of 2. Now you start thinking (as, perhaps, did Jacob Bernoulli). What if the bank compounded every 3 months? What if the bank compounded every month? Every week? Every day? Every second? Every picosecond? It turns out that as the compounding gets more and more frequent, the value at the end of the year closes in on 2.718281&#8230; This means if your bank was really generous and compounded <em>continuously</em>, your 1 dollar would become 2.718281 dollars at the end of the year. Quite a swing there &#8211; between 2 and 2.71, wouldn&#8217;t you say? And this given that the interest rate is unchanged between the two scenarios. (Aside: In general, pay some attention to how often your investments are being compounded, not <em>just</em> to the annual rate of return.)</p>
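<p>The compounding experiment is easy to try numerically. With the year split into a number of equal periods, each period applies 100% divided by that number, so 1 dollar grows to (1 + 1/periods)^periods. The sketch below uses a hypothetical function name, <code>value_after_one_year</code>.</p>

```python
def value_after_one_year(periods):
    """Value of 1 dollar after one year at 100% annual interest, compounded `periods` times."""
    return (1 + 1 / periods) ** periods


# yearly, half-yearly, quarterly, monthly, weekly, daily, and a million times a year
for periods in (1, 2, 4, 12, 52, 365, 1_000_000):
    print(periods, value_after_one_year(periods))
```

<p>The first two rows are the 2 and 2.25 dollars worked out above; the later rows creep up towards 2.718281&#8230; without ever crossing it.</p>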
<p>That number, 2.718281&#8230;, shows up in such seemingly disparate scenarios that it may be hard to see the connection between those scenarios. I recall that the way this constant was introduced to me in school was entirely different from the discussion above. I was taught that <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_c9d81daf1dc94d468bf1ad47a8180461.png" align="absmiddle" class="tex" alt="c\cdot{e^{x}}" />, where <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_4a8a08f09d37b73795649038408b5f33.png" align="absmiddle" class="tex" alt="c" /> is a constant, is the only function whose derivative is equal to itself. That is, the slope of the curve is the value of the curve. Wow, I thought. That is quite a curve. I now feel that though that was correct, it may not have been the best way to introduce <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e1671797c52e15f763380b45e841ec32.png" align="absmiddle" class="tex" alt="e" />; it does not say <em>why</em> <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_c9d81daf1dc94d468bf1ad47a8180461.png" align="absmiddle" class="tex" alt="c\cdot{e^{x}}" /> is the <em>only</em> function that satisfies this property. Further, it is not explained why a function that does satisfy this property should even <em>be</em> of the basic form <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_378ef468365a2fd4ae953f909ad2dee0.png" align="absmiddle" class="tex" alt="a^{x}" />, where a is a constant.</p>
<p>I decided therefore to think through two things. First, I wanted to understand <em>why</em> <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_c9d81daf1dc94d468bf1ad47a8180461.png" align="absmiddle" class="tex" alt="c\cdot{e^{x}}" /> is the <em>only</em> function whose slope at a given <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_9dd4e461268c8034f5c8564e155c67a6.png" align="absmiddle" class="tex" alt="x" /> is the same as its value. I wanted to approach this problem by starting from this special property, and then trying to come up with the function <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_50bbd36e1fd2333108437a2ca378be62.png" align="absmiddle" class="tex" alt="f(x)" /> which would fit this requirement. Second, I wanted to tie together the two seemingly different arenas where <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e1671797c52e15f763380b45e841ec32.png" align="absmiddle" class="tex" alt="e" /> shows up &#8211; continuous compounding and as the base of the curve whose slope is equal to its value. There are other areas where <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e1671797c52e15f763380b45e841ec32.png" align="absmiddle" class="tex" alt="e" /> shows up as well. For example, if the probability of winning a wager is 1 in <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_7b8b965ad4bca0e41ab51de7b31363a1.png" align="absmiddle" class="tex" alt="n" />, and a man places <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_7b8b965ad4bca0e41ab51de7b31363a1.png" align="absmiddle" class="tex" alt="n" /> bets, then, when <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_7b8b965ad4bca0e41ab51de7b31363a1.png" align="absmiddle" class="tex" alt="n" /> is large, the probability that he loses every single bet is approximately <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e829777ab97c93ef78b95b37ec071e96.png" align="absmiddle" class="tex" alt="\frac{1}{e}" />. This question is related closely to the <em>hat check</em> problem (also known as <a href="http://en.wikipedia.org/wiki/Derangement">Derangements</a>). 
<img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e1671797c52e15f763380b45e841ec32.png" align="absmiddle" class="tex" alt="e" /> also shows up in <a href="http://en.wikipedia.org/wiki/Euler%27s_identity">Euler&#8217;s Identity</a>. I will need to do more digging before I attempt to understand, intuitively, why <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e1671797c52e15f763380b45e841ec32.png" align="absmiddle" class="tex" alt="e" /> shows up in those scenarios so I may not tackle these in much depth in this article.</p>
<h2><em>Why</em> is <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_c9d81daf1dc94d468bf1ad47a8180461.png" align="absmiddle" class="tex" alt="c\cdot{e^{x}}" /> the <em>only</em> function whose slope at a given <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_9dd4e461268c8034f5c8564e155c67a6.png" align="absmiddle" class="tex" alt="x" /> is the same as its value?</h2>
<p>Before I looked at this particular issue and attempted to uncover the function <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_50bbd36e1fd2333108437a2ca378be62.png" align="absmiddle" class="tex" alt="f(x)" /> which gave me this special property, I decided to start small. I decided to look at some other canonical functions that might have been interesting. For example, &#8220;What is a function whose value, <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_50bbd36e1fd2333108437a2ca378be62.png" align="absmiddle" class="tex" alt="f(x)" />, is equal to <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_9dd4e461268c8034f5c8564e155c67a6.png" align="absmiddle" class="tex" alt="x" />?&#8221;, &#8220;What is a function whose value is equal to <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_32f5240d0dbf2ccbe75ef7f8ef2015e0.png" align="absmiddle" class="tex" alt="x^2" />?&#8221;, &#8220;What is a function whose value is equal to <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_50bbd36e1fd2333108437a2ca378be62.png" align="absmiddle" class="tex" alt="f(x)" />?&#8221;. The first question is easy. It defines <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_f8abf34599677984dfe91b4f300389f6.png" align="absmiddle" class="tex" alt="f(x)=x" />. The curve for this is a straight line at a <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_4af7d16ee8abbb68901ca728d6d66eb5.png" align="absmiddle" class="tex" alt="45^{\circ}" /> angle through the origin. The second one is easy as well. It defines <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_d271cedde6675e55152d3c7a4236f775.png" align="absmiddle" class="tex" alt="f(x)=x^2" />. A parabola passing through the origin, (1,1) and (-1,1). The third question is the easiest of all, but the one that gets us closest to the question we are going after. 
It defines <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_a8057018c4637b6521b5be0a17519865.png" align="absmiddle" class="tex" alt="f(x)=f(x)" />! That is silly. Any function will fit. Notice, however, that in the third question the value of the function is dependent on the value of the function and not directly on <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_9dd4e461268c8034f5c8564e155c67a6.png" align="absmiddle" class="tex" alt="x" />. This starts to highlight the significance of the question we are trying to answer. We want to discover a function, <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_48879efd6f01be071ce5f17d2d102051.png" align="absmiddle" class="tex" alt="f(x)=f'(x)" />. There is something quirky going on here. The right hand side is the derivative of the left hand side. So which comes first? It seems like you cannot really know the slope at <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_50bbd36e1fd2333108437a2ca378be62.png" align="absmiddle" class="tex" alt="f(x)" />, until you define <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_50bbd36e1fd2333108437a2ca378be62.png" align="absmiddle" class="tex" alt="f(x)" />. But then, you need the slope of <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_50bbd36e1fd2333108437a2ca378be62.png" align="absmiddle" class="tex" alt="f(x)" /> to define <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_50bbd36e1fd2333108437a2ca378be62.png" align="absmiddle" class="tex" alt="f(x)" /> because its value <em>is</em> the slope of <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_50bbd36e1fd2333108437a2ca378be62.png" align="absmiddle" class="tex" alt="f(x)" />. It is a bit confusing. The trick is to realize that this function is continuous. 
So the value of the slope of <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_50bbd36e1fd2333108437a2ca378be62.png" align="absmiddle" class="tex" alt="f(x)" /> will be pretty close to the value of the slope of <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_38a932caefd461a1063aa32f21fb5ffd.png" align="absmiddle" class="tex" alt="f(x-\Delta{x})" />, the slope of the curve <em>just</em> prior to x. (This recursion also vaguely points to an upcoming infinite series.)</p>
<p>Now, let us try to find the function which satisfies the interesting property, <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_48879efd6f01be071ce5f17d2d102051.png" align="absmiddle" class="tex" alt="f(x)=f'(x)" />. In words, this means the <em>rate</em> of growth of this function at a given point is equal to the value of the function at that point. More on that in a bit. Let us use the fact that the function is continuous and therefore <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_74caf4d1ec90d3a36ea7c7bbfe65b516.png" align="absmiddle" class="tex" alt="f'(x)" />, the slope of the function <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_50bbd36e1fd2333108437a2ca378be62.png" align="absmiddle" class="tex" alt="f(x)" /> at <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_9dd4e461268c8034f5c8564e155c67a6.png" align="absmiddle" class="tex" alt="x" />, is given by the following, where <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_56f4fb7d80196b954a5a33facf4b9a23.png" align="absmiddle" class="tex" alt="\Delta{x}" /> is an infinitesimally small step.</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_cb461640770db1f53f41d005ba80f131.png" align="absmiddle" class="tex" alt="f'(x)=\frac{f(x+\Delta{x})-f(x)}{\Delta{x}}" /></p>
<p>Since we <em>know</em> that the function we are trying to uncover has the special property that <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_48879efd6f01be071ce5f17d2d102051.png" align="absmiddle" class="tex" alt="f(x)=f'(x)" />, we can substitute <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_50bbd36e1fd2333108437a2ca378be62.png" align="absmiddle" class="tex" alt="f(x)" /> instead of <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_74caf4d1ec90d3a36ea7c7bbfe65b516.png" align="absmiddle" class="tex" alt="f'(x)" /> in the previous equation, to get the following.</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_559b621365785f1f261b7e3ee234d980.png" align="absmiddle" class="tex" alt="f(x)=\frac{f(x+\Delta{x})-f(x)}{\Delta{x}}" /></p>
<p>This can be rewritten as</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_8907255c33a0872cf41376aea67ad54b.png" align="absmiddle" class="tex" alt="f(x)\cdot\Delta{x}=f(x+\Delta{x})-f(x)" /></p>
<p>which in turn can be rewritten as</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_77af21e01765b6968d7b9b8645dae9d2.png" align="absmiddle" class="tex" alt="f(x)\cdot(1+\Delta{x})=f(x+\Delta{x})" /></p>
<p>I will substitute <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_9dd4e461268c8034f5c8564e155c67a6.png" align="absmiddle" class="tex" alt="x" /> with <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_0d99278a0c5a59c1ed3ef4e8420682cb.png" align="absmiddle" class="tex" alt="x-\Delta{x}" /> (this is unfortunately one of those things which you do <em>after</em> you have worked out the steps and realize that doing something like this may make things a little cleaner to understand).</p>
<p>So we get, <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_ce43eacc5ce2f8253c77387cd0905b88.png" align="absmiddle" class="tex" alt="f(x-\Delta{x})\cdot(1+\Delta{x})=f(x)" /></p>
<p>Swapping the right and left sides, we get the following.</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_ee82a2c1deb4a2c008de1dc38cdb3095.png" align="absmiddle" class="tex" alt="f(x)=f(x-\Delta{x})\cdot(1+\Delta{x})" /></p>
<p>The above equation represents <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_50bbd36e1fd2333108437a2ca378be62.png" align="absmiddle" class="tex" alt="f(x)" /> in terms of <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_38a932caefd461a1063aa32f21fb5ffd.png" align="absmiddle" class="tex" alt="f(x-\Delta{x})" />. Using steps similar to the ones shown above if we express <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_38a932caefd461a1063aa32f21fb5ffd.png" align="absmiddle" class="tex" alt="f(x-\Delta{x})" /> in terms of <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_0742b20f5c77cfb3ed26433fa693a2b8.png" align="absmiddle" class="tex" alt="f(x-2\cdot\Delta{x})" /> we get the following.</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_37b2607446f16c5128f87a967aa050d0.png" align="absmiddle" class="tex" alt="f(x)=f(x-2\cdot\Delta{x})\cdot(1+\Delta{x})^2" /></p>
<p>Extending this process <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_7b8b965ad4bca0e41ab51de7b31363a1.png" align="absmiddle" class="tex" alt="n" /> times we get the following.</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_25a4cf252408c54a9ef8179507799639.png" align="absmiddle" class="tex" alt="f(x)=f(x-n\cdot\Delta{x})\cdot(1+\Delta{x})^n" /></p>
<p>If <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_7b8b965ad4bca0e41ab51de7b31363a1.png" align="absmiddle" class="tex" alt="n" /> is large enough that <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_72feff558dbc6136047b57d529b409a1.png" align="absmiddle" class="tex" alt="n\cdot\Delta{x}=x" /> then the above equation becomes very interesting.</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_7a69f9b00ed22198ac9029021a84216f.png" align="absmiddle" class="tex" alt="f(x)=f(0)\cdot(1+\frac{x}{n})^n" /></p>
<p>Notice that <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_9dd4e461268c8034f5c8564e155c67a6.png" align="absmiddle" class="tex" alt="x" /> is still not the exponent of any base. So <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_50bbd36e1fd2333108437a2ca378be62.png" align="absmiddle" class="tex" alt="f(x)" /> is still not an exponential curve, or at least it does not look like one. It does seem like it may expand into a polynomial whose highest-degree term is of degree <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_7b8b965ad4bca0e41ab51de7b31363a1.png" align="absmiddle" class="tex" alt="n" />. A polynomial, however large in degree, eventually grows slower than an exponential curve (as long as the degree is finite). However, note that <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_7b8b965ad4bca0e41ab51de7b31363a1.png" align="absmiddle" class="tex" alt="n" /> is not a finite integer. It is an infinitely large number because <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_56f4fb7d80196b954a5a33facf4b9a23.png" align="absmiddle" class="tex" alt="\Delta{x}" /> is infinitesimally small. So there is still hope that this polynomial may catch up to an exponential curve. Now, we apply a trick which changes the nature of this curve. The polynomial becomes an exponential curve, and the <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_9dd4e461268c8034f5c8564e155c67a6.png" align="absmiddle" class="tex" alt="x" /> goes up to a &#8220;position of power&#8221;. 
We simply reapply the equation <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_72feff558dbc6136047b57d529b409a1.png" align="absmiddle" class="tex" alt="n\cdot\Delta{x}=x" /> to the <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_15699da5162c6174d970b58e778138fc.png" align="absmiddle" class="tex" alt="\frac{x}{n}" /> and the <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_7b8b965ad4bca0e41ab51de7b31363a1.png" align="absmiddle" class="tex" alt="n" /> term in the above equation to get the following.</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_0887549c642d568058f69931f1f19ced.png" align="absmiddle" class="tex" alt="f(x)=f(0)\cdot(1+\Delta{x})^\frac{x}{\Delta{x}}" /></p>
<p>Now, say we represent <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_1d32d8aaaa8ce8210bc9de688c09308c.png" align="absmiddle" class="tex" alt="\frac{1}{\Delta{x}}" /> by a new variable, <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_8d9c307cb7f3c4a32822a51922d1ceaa.png" align="absmiddle" class="tex" alt="N" />. Then the above equation can be written as follows.</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_69f8cd0a4cb9b4df957ee96502f04bcf.png" align="absmiddle" class="tex" alt="f(x)=f(0)\cdot(1+\frac{1}{N})^{{x}\cdot{N}}" /></p>
<p>which in turn can be written as</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_786bb06d74932edeebe61d219c8abcc0.png" align="absmiddle" class="tex" alt="f(x)=f(0)\cdot((1+\frac{1}{N})^N)^x" /></p>
<p>And now we start seeing the exponential form emerge. The <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_9dd4e461268c8034f5c8564e155c67a6.png" align="absmiddle" class="tex" alt="x" /> is the power to which <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_0750543f1200a9510dc39b462722fe68.png" align="absmiddle" class="tex" alt="(1+\frac{1}{N})^N" /> must be raised to evaluate <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_50bbd36e1fd2333108437a2ca378be62.png" align="absmiddle" class="tex" alt="f(x)" />.</p>
<p>This term, <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_0750543f1200a9510dc39b462722fe68.png" align="absmiddle" class="tex" alt="(1+\frac{1}{N})^N" />, when expanded out (remember that <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_8d9c307cb7f3c4a32822a51922d1ceaa.png" align="absmiddle" class="tex" alt="N" /> is infinitely large) homes in on a constant, 2.718281&#8230; And that is why a special name is given to this term (<img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e1671797c52e15f763380b45e841ec32.png" align="absmiddle" class="tex" alt="e" />). With that name, the exponential factor on the right hand side becomes <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_ff2d26be6b0b506663911208302f91b3.png" align="absmiddle" class="tex" alt="e^x" />.</p>
<p>This takes the general form <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_eb415307499a489e6e9ebf6d2f506853.png" align="absmiddle" class="tex" alt="f(x)=c\cdot{e^x}" />, where <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_07bab375a7f255602b9f2985b304fe9a.png" align="absmiddle" class="tex" alt="c = f(0)" />, the value of the function at <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e11729b0b65ecade3fc272548a3883fc.png" align="absmiddle" class="tex" alt="x=0" />. In other words we have proved that a function that satisfies the property that the curve&#8217;s value at any point is equal to the slope of the curve at that point follows the general form of an exponential curve.</p>
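<p>The step-by-step relation used in the derivation, where each value is the previous value multiplied by (1 + step), can be run directly to see the exponential emerge. This is only a numerical sketch with a finite number of steps, not a proof, and the function name <code>grow</code> is made up.</p>

```python
import math


def grow(x, steps, start=1.0):
    """Walk from 0 to x in `steps` equal steps, multiplying by (1 + dx) at each one."""
    dx = x / steps
    value = start  # this plays the role of f(0)
    for _ in range(steps):
        value *= 1 + dx
    return value


for steps in (10, 1_000, 100_000):
    print(steps, grow(1.0, steps), math.exp(1.0))
```

<p>As the number of steps grows, the stepped value approaches the value of the exponential at the same point, scaled by the starting value.</p>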
<h2>How is continuous compounding related to the base of the exponential curve whose slope is  equal to its value?</h2>
<p>Now that we understand the general structure of the exponential curve and its special property, slope equals value, let us revisit the issue of continuous compounding. How is that related to this curve? It is intuitive once we realize that the <em>rate</em> at which money grows at any point in time is directly proportional to the <em>amount</em> of money in the account at that point in time. This is the same as saying the <em>slope</em> of the curve at a given point is directly proportional to the <em>value</em> of the function at that point. The interest rate sets the constant of <em>proportionality</em>. For example, if the interest rate is 100%, the value of the continuously compounding account after <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_9dd4e461268c8034f5c8564e155c67a6.png" align="absmiddle" class="tex" alt="x" /> units of time (<img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_9dd4e461268c8034f5c8564e155c67a6.png" align="absmiddle" class="tex" alt="x" /> years) is <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_8867b399ea90faa4cb5d5163025016f6.png" align="absmiddle" class="tex" alt="f(x) =c \cdot e^{100\% \cdot x}" />. Here <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_4a8a08f09d37b73795649038408b5f33.png" align="absmiddle" class="tex" alt="c" /> is the amount in the account at the beginning (value of the function at <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_f4bff8c9a4fa0cfee395df7e0dade218.png" align="absmiddle" class="tex" alt="x=0, f(0)" />). If the interest rate is 80%, then the value at time <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_9dd4e461268c8034f5c8564e155c67a6.png" align="absmiddle" class="tex" alt="x" /> years is <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_c61324159496444763875a29718388e8.png" align="absmiddle" class="tex" alt="f(x)=c\cdot e^{80\% \cdot x}" />. 
Once we see that the two phrases, &#8220;slope of a curve is proportional to its value&#8221; and &#8220;rate of growth of something is proportional to the amount of that something&#8221;, are saying the same thing, it is clearer that these two ways of defining or discovering <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e1671797c52e15f763380b45e841ec32.png" align="absmiddle" class="tex" alt="e" /> are equivalent.</p>
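<p>To see the two views agree, one can compare ordinary compounding at a very high frequency with the exponential formula. In the sketch below the function names, the 100 dollar starting amount and the 5% rate are all made-up illustration values.</p>

```python
import math


def compounded(principal, rate, years, periods_per_year):
    """Value after compounding `periods_per_year` times a year at the given annual rate."""
    return principal * (1 + rate / periods_per_year) ** (periods_per_year * years)


def continuously_compounded(principal, rate, years):
    """Value given by the exponential form c * e^(rate * x)."""
    return principal * math.exp(rate * years)


for rate in (1.00, 0.05):
    print(rate,
          compounded(100, rate, 3, 1_000_000),
          continuously_compounded(100, rate, 3))
```

<p>For each rate the two printed values agree to several decimal places, which is the numerical face of saying that growth proportional to the current amount leads to the exponential curve.</p>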
<h2>Other places <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e1671797c52e15f763380b45e841ec32.png" align="absmiddle" class="tex" alt="e" /> shows up</h2>
<p>Another place <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e1671797c52e15f763380b45e841ec32.png" align="absmiddle" class="tex" alt="e" /> shows up is in counting Derangements, as in the opposite of arrangements. A popular example used to explain this issue is the <em>hat check</em> problem. Say there are <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_7b8b965ad4bca0e41ab51de7b31363a1.png" align="absmiddle" class="tex" alt="n" /> people who check in at a party, and the butler takes their hats as they enter the party. There are boxes to hold the hats, one per guest. The boxes have names on them to identify which guest&#8217;s hat should go into each one. The butler, however, does not know the names of the guests. He puts the hats into those boxes randomly. When the number of hats (and boxes) is large, the probability that <em>no hat is in the right box</em> is approximately <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e829777ab97c93ef78b95b37ec071e96.png" align="absmiddle" class="tex" alt="\frac{1}{e}" />. I have been able to work this out mathematically, but I still have not been able to understand this intuitively. Maybe the reason <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e1671797c52e15f763380b45e841ec32.png" align="absmiddle" class="tex" alt="e" /> shows up here has nothing to do with the properties of <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_ff2d26be6b0b506663911208302f91b3.png" align="absmiddle" class="tex" alt="e^x" />. Maybe it just so happens that the same kind of limiting expression that closes in on e, <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_58cb916ff717ed6eed51eb31eaad66c4.png" align="absmiddle" class="tex" alt="(1+\frac{1}{n})^n" />, shows up here.</p>
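<p>The claim can be checked by counting exactly. The sketch below uses the standard recurrence for the number of derangements, D(n) = (n-1)(D(n-1) + D(n-2)), which is not derived in this article, and divides by the n! equally likely ways of boxing the hats. The function names are hypothetical.</p>

```python
import math


def derangements(n):
    """Number of ways to box n hats so that no hat ends up in its own box."""
    previous, current = 1, 0  # D(0) = 1 and D(1) = 0
    if n == 0:
        return previous
    for k in range(2, n + 1):
        previous, current = current, (k - 1) * (current + previous)
    return current


def probability_no_hat_in_right_box(n):
    """Exact probability that a random boxing of n hats puts every hat in a wrong box."""
    return derangements(n) / math.factorial(n)


for n in (2, 3, 4, 5, 10, 20):
    print(n, probability_no_hat_in_right_box(n), 1 / math.e)
```

<p>Already by 10 hats the exact probability and 1/e agree to many decimal places, so &#8220;large&#8221; does not have to be very large here.</p>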
<h3>Solution to Hat Check Problem: An Outline</h3>
<p>The probability that all hats are in the wrong boxes = 1 &#8211; probability that at least one hat is in the right box</p>
<p>= 1 &#8211; (probability that the <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_207540b5d67149d0d26722001b22e54e.png" align="absmiddle" class="tex" alt="1^{st}" /> hat is in the right box OR probability that the <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_d517e3a509a09010ea03e8b689dae5d0.png" align="absmiddle" class="tex" alt="2^{nd}" /> hat is in the right box OR &#8230; OR probability that the <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_dee7141541b0575f29b52d84bb7580f6.png" align="absmiddle" class="tex" alt="n^{th}" /> hat is in the right box)</p>
<p>Note that for the <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_97361f12a3555fc4fc4e2ffce1799ac3.png" align="absmiddle" class="tex" alt="i^{th}" /> hat to be in the right box the first i-1 hats have to avoid that box. Therefore,</p>
<p>Probability that the <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_97361f12a3555fc4fc4e2ffce1799ac3.png" align="absmiddle" class="tex" alt="i^{th}" /> hat is in the right box = (probability that the <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_207540b5d67149d0d26722001b22e54e.png" align="absmiddle" class="tex" alt="1^{st}" /> hat has not taken up the right  box of  the <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_97361f12a3555fc4fc4e2ffce1799ac3.png" align="absmiddle" class="tex" alt="i^{th}" /> hat).(probability that the <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_d517e3a509a09010ea03e8b689dae5d0.png" align="absmiddle" class="tex" alt="2^{nd}" /> hat has not taken up the right  box of  the <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_97361f12a3555fc4fc4e2ffce1799ac3.png" align="absmiddle" class="tex" alt="i^{th}" />  hat)&#8230;(probability that the <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_084e59c9abe583f3018a6e8ea6b6b9b0.png" align="absmiddle" class="tex" alt="i-1^{th}" /> hat has not taken up the right  box of  the <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_97361f12a3555fc4fc4e2ffce1799ac3.png" align="absmiddle" class="tex" alt="i^{th}" />  hat).(the <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_97361f12a3555fc4fc4e2ffce1799ac3.png" align="absmiddle" class="tex" alt="i^{th}" /> hat <em>does</em> take up the right box)</p>
<p>Using this, the probability that all hats are in the wrong boxes</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_933d03fccb87d944badf2c4497699935.png" align="absmiddle" class="tex" alt=" = 1 -  (     \frac{1}{n} + ( \frac{1}{n} \cdot (1-\frac{1}{n}))   +  (  \frac{1}{n}  \cdot  (1-\frac{1}{n})^2  )  +  (  \frac{1}{n}  \cdot  (1-\frac{1}{n})^3  )  + ... + (  \frac{1}{n}  \cdot  (1-\frac{1}{n})^n  )    )" /></p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_9c53f508e1ffbc9fec17b45f696f4646.png" align="absmiddle" class="tex" alt=" = 1 - ( \frac{1}{n}) \cdot (1 + (1-\frac{1}{n}) + (1-\frac{1}{n})^2  + (1-\frac{1}{n})^3 + ... + (1-\frac{1}{n})^n ) " /></p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_08033a55378c88eb5ee235a2c780609e.png" align="absmiddle" class="tex" alt=" = 1 - \frac{1}{n} \cdot A " /></p>
<p>where,</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_49934049f07e96662510c93af42fcc0b.png" align="absmiddle" class="tex" alt=" A = 1 + (1-\frac{1}{n}) + (1-\frac{1}{n})^2 + (1-\frac{1}{n})^3 + ... +  (1-\frac{1}{n})^n " /></p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_c2e78d8d3114caf8ff7bbb64f1e129af.png" align="absmiddle" class="tex" alt=" A \cdot (1-\frac{1}{n}) = (1-\frac{1}{n}) + (1-\frac{1}{n})^2 + (1-\frac{1}{n})^3 + ... +   (1-\frac{1}{n})^n + (1-\frac{1}{n})^{n+1} " /></p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_eb114c837a9f7e16843a0f83fd242908.png" align="absmiddle" class="tex" alt=" A \cdot (1-\frac{1}{n}) + 1 = 1 + (1-\frac{1}{n}) + (1-\frac{1}{n})^2 + (1-\frac{1}{n})^3 + ... +   (1-\frac{1}{n})^n + (1-\frac{1}{n})^{n+1} " /></p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_eae36eee648aa46b8b871c4f64d95c94.png" align="absmiddle" class="tex" alt=" A \cdot (1-\frac{1}{n}) + 1 = A + (1-\frac{1}{n})^{n+1} " /></p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_9ef776cb210292aba7a1b12cfdb630eb.png" align="absmiddle" class="tex" alt=" A = \frac{(1-\frac{1}{n})^{n+1} - 1}{(1-\frac{1}{n}) - 1} " /></p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_eb0d481fae6699c9a128549ab06a24c5.png" align="absmiddle" class="tex" alt=" A = \frac{(1-\frac{1}{n})^{n+1} - 1}{-\frac{1}{n}} " /></p>
<p>Putting this <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_7fc56270e7a70fa81a5935b72eacbe29.png" align="absmiddle" class="tex" alt="A" /> back in the equation for the probability we are looking for, we get,</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_859f8463e8461ae5b9879f594d550d27.png" align="absmiddle" class="tex" alt=" = 1 + (1-\frac{1}{n})^{n+1} - 1 " /></p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_203b286cc7e79623e1bb7c6e93770913.png" align="absmiddle" class="tex" alt=" = (1-\frac{1}{n})^{n+1} " /></p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_f00f81432918c4d66c9e8e808c4e5bf3.png" align="absmiddle" class="tex" alt=" = (\frac{n-1}{n})^{n+1} " /></p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e0da64a2acfb5251c39c5c111c4b126f.png" align="absmiddle" class="tex" alt=" = (\frac{n}{n-1})^{-n-1} " /></p>
<p>now, replacing <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_a438673491daae8148eae77373b6a467.png" align="absmiddle" class="tex" alt="n-1" /> by <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_8d9c307cb7f3c4a32822a51922d1ceaa.png" align="absmiddle" class="tex" alt="N" />,</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_877897a189ef3731d1ef592fea407842.png" align="absmiddle" class="tex" alt=" = (\frac{N+1}{N})^{-N-2} " /></p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_9c9f867217b465ae2eddd42e385543ab.png" align="absmiddle" class="tex" alt=" = (1+\frac{1}{N})^{-N} \cdot (1+\frac{1}{N})^{-2} " /></p>
<p>The second factor goes to 1 as <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_8d9c307cb7f3c4a32822a51922d1ceaa.png" align="absmiddle" class="tex" alt="N" /> increases. So we are left with,</p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_3d0b87f27a338090cf360b91bebe5520.png" align="absmiddle" class="tex" alt=" = \frac{1}{(1+\frac{1}{N})^{N}}" /></p>
<p><img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_454d15e1e90ac89c99605653ea34e126.png" align="absmiddle" class="tex" alt=" = \frac{1}{e}" /></p>
<p>I do not know why <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e1671797c52e15f763380b45e841ec32.png" align="absmiddle" class="tex" alt="e" /> shows up here. But, well, it does. Yet another place where <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e1671797c52e15f763380b45e841ec32.png" align="absmiddle" class="tex" alt="e" /> shows up with unabated vigor is when representing complex numbers, and, of course, in that famous Euler&#8217;s equation that brings together 5 of the most interesting quantities in mathematics, <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_66d24771ec114e071c1c13223f816869.png" align="absmiddle" class="tex" alt="e^{i\pi}+1=0" />. I will go into exploring that another time.</p>
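<p>Coming back to the hat check problem for a moment, here is a quick numerical sanity check of the limit. This is only a sketch and not part of the working above: the function names are made up for illustration, and the exact count uses the standard derangement recurrence D(n) = (n-1)(D(n-1) + D(n-2)) rather than the outline. It prints the exact probability, the (1-1/n)^(n+1) expression from the outline, and 1/e side by side.</p>

```python
import math
from fractions import Fraction


def derangements(n):
    """Count the arrangements of n hats in which no hat is in its own box.

    Uses the standard recurrence D(n) = (n - 1) * (D(n - 1) + D(n - 2)).
    """
    if n == 0:
        return 1
    if n == 1:
        return 0
    prev2, prev1 = 1, 0  # D(0), D(1)
    for k in range(2, n + 1):
        prev2, prev1 = prev1, (k - 1) * (prev1 + prev2)
    return prev1


def prob_no_match(n):
    """Exact probability that none of n randomly placed hats is in its own box."""
    return Fraction(derangements(n), math.factorial(n))


if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        exact = float(prob_no_match(n))
        outline = (1 - 1 / n) ** (n + 1)  # expression reached in the outline
        print(n, round(exact, 6), round(outline, 6), round(1 / math.e, 6))
```

<p>The exact probability can also be written as 1 &#8211; 1/1! + 1/2! &#8211; 1/3! + &#8230; &#177; 1/n!, which is the beginning of the series for e^x evaluated at x = -1. That may be one way to see why e turns up here.</p>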
<h2>References and Acknowledgments</h2>
<p>My discussions with  my friend Mani, and my investigations on the internet led to this write up. I found the Wikipedia site very useful to tie together the seemingly unconnected ways in which the constant <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e1671797c52e15f763380b45e841ec32.png" align="absmiddle" class="tex" alt="e" /> shows up in real life. The BetterExplained.com website has a very good illustrated tutorial for those who want to understand the concept of <img src="http://flickeringtubelight.net/blog/wp-content/cache/tex_e1671797c52e15f763380b45e841ec32.png" align="absmiddle" class="tex" alt="e" /> in simple terms.</p>
<p>Wikipedia &#8211; <a href="http://en.wikipedia.org/wiki/E_%28mathematical_constant%29">http://en.wikipedia.org/wiki/E_(mathematical_constant)</a></p>
<p>BetterExplained.com &#8211; <a href="http://betterexplained.com/articles/an-intuitive-guide-to-exponential-functions-e/">http://betterexplained.com/articles/an-intuitive-guide-to-exponential-functions-e/</a></p>
]]></content:encoded>
			<wfw:commentRss>http://flickeringtubelight.net/blog/2010/05/thoughts-on-the-mathematical-constant-e/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>We made a garden trellis with PVC piping</title>
		<link>http://flickeringtubelight.net/blog/2010/05/we-made-a-garden-trellis-with-pvc-piping/</link>
		<comments>http://flickeringtubelight.net/blog/2010/05/we-made-a-garden-trellis-with-pvc-piping/#comments</comments>
		<pubDate>Sun, 02 May 2010 18:20:06 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Experiences]]></category>
		<category><![CDATA[Information]]></category>
		<category><![CDATA[Tutorials]]></category>

		<guid isPermaLink="false">http://flickeringtubelight.net/blog/?p=277</guid>
		<description><![CDATA[We have two 4&#8242;x8&#8242; (4 feet by 8 feet) raised beds, which we use for vegetable plants. Kavita has been asking me to either buy or build a trellis for her climbing plants (cucumbers, tomatoes and eventually some types of squash and gourds). I read several websites online and decided to build a simple trellis [...]]]></description>
			<content:encoded><![CDATA[<div id="attachment_284" class="wp-caption alignleft" style="width: 420px"><a href="http://flickeringtubelight.net/blog/wp-content/uploads/2010/05/Trellis_fig1.jpg"><img class="size-full wp-image-284 " title="Trellis_fig1" src="http://flickeringtubelight.net/blog/wp-content/uploads/2010/05/Trellis_fig1.jpg" alt="" width="410" height="353" /></a><p class="wp-caption-text">Figure 1: Basic plan showing the material required</p></div>
<p>We have two 4&#8242;x8&#8242; (4 feet by 8 feet) raised beds, which we use for vegetable plants. Kavita has been asking me to either buy or build a trellis for her climbing plants (cucumbers, tomatoes and eventually some types of squash and gourds). I read several websites online and decided to build a simple trellis using PVC piping. It took one trip to the local Home Depot, and then about 2 hours of work. The cost for the material was under $10 (I already had all the tools needed).</p>
<p>We decided to build one and test it out before getting carried away  and  building more. We decided that we would roughly want the trellis to  be  4 feet wide by 5 feet high. At the Home Depot we did some quick   calculations based on the basic design we had in mind and came up with a   total of about 29 feet of PVC tube. The calculation is shown in Figure   1.</p>
<div id="attachment_285" class="wp-caption alignright" style="width: 292px"><a href="http://flickeringtubelight.net/blog/wp-content/uploads/2010/05/Trellis_fig2.jpg"><img class="size-full wp-image-285 " title="Trellis_fig2" src="http://flickeringtubelight.net/blog/wp-content/uploads/2010/05/Trellis_fig2.jpg" alt="" width="282" height="190" /></a><p class="wp-caption-text">Figure 2: Materials and Tools for the project    (4-way, + shaped, PVC connector missing)</p></div>
<p>The PVC pipes are sold in 10&#8242; pieces. We got 3 pieces. We also got  some string (I tried polypropylene string since I did not know any  better, we&#8217;ll see how that works out).</p>
<p>Figure 2 shows most of the material and tools. The one caveat is, since  I took this picture <em>after</em> completing the project, the one 4-way  1/2&#8243; PVC pipe connector used at the center of the frame is missing from  the picture. I had extra connectors of the other type, so I could use  them for the picture. Also, one other thing that is missing from the  picture is a power drill and drill bits. I used a 3/16&#8243; drill bit to  drill evenly spaced holes in the pipes to draw the string through, to  create a framework.</p>
<div id="attachment_287" class="wp-caption alignleft" style="width: 136px"><a href="http://flickeringtubelight.net/blog/wp-content/uploads/2010/05/Trellis_fig3.jpg"><img class="size-full wp-image-287 " title="Trellis_fig3" src="http://flickeringtubelight.net/blog/wp-content/uploads/2010/05/Trellis_fig3.jpg" alt="" width="126" height="170" /></a><p class="wp-caption-text">Figure 3: Trellis plan with the stringing shown</p></div>
<div id="attachment_283" class="wp-caption alignright" style="width: 133px"><a href="http://flickeringtubelight.net/blog/wp-content/uploads/2010/05/Trellis_fig4.jpg"><img class="size-full wp-image-283 " title="Trellis_fig4" src="http://flickeringtubelight.net/blog/wp-content/uploads/2010/05/Trellis_fig4.jpg" alt="" width="123" height="163" /></a><p class="wp-caption-text">Figure 4: Framework ready, stringing is yet to    be  completed</p></div>
<p>The spacing between the holes and how the trellis is supposed to look    eventually is shown in Figure 3.</p>
<p>The thing that took the most time was measuring and marking the PVC    pipes, cutting them to the right size with the saw, then measuring and    marking the locations for the holes for the string to go through and then drilling the holes with    the power drill. Since the PVC pipe keeps rolling about, making it stable before drilling is important. I just used an old rag to wrap around the pipe in order to hold it somewhat still.</p>
<p>Once the pieces were all ready, putting the trellis together took less than 10 minutes. There was no need for glue, since the connectors fit quite snugly. Figure 4 shows the trellis laid out on the lawn (with only one piece of string drawn through; Kavita will work on getting the rest of the mesh done this evening).</p>
<p>We are not sure how well this will hold up or how long it will last. I will update the post later with some pictures of the trellis in action.</p>
]]></content:encoded>
			<wfw:commentRss>http://flickeringtubelight.net/blog/2010/05/we-made-a-garden-trellis-with-pvc-piping/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Cool Gate-Level Logic Simulation Tool</title>
		<link>http://flickeringtubelight.net/blog/2009/11/cool-gate-level-logic-simulation-tool/</link>
		<comments>http://flickeringtubelight.net/blog/2009/11/cool-gate-level-logic-simulation-tool/#comments</comments>
		<pubDate>Sat, 14 Nov 2009 15:49:24 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Tidbits]]></category>
		<category><![CDATA[Tutorials]]></category>

		<guid isPermaLink="false">http://flickeringtubelight.net/blog/2009/11/14/cool-gate-level-logic-simulation-tool/</guid>
		<description><![CDATA[Dr. Mark Hill of University of Wisconsin Madison pointed out the existence of this really cool learning aid to his Logic Design class in October 2009. He also provides some guidelines about how to use the tool. Give it a shot if you like to play around with AND, OR, NOT gates and build [...]]]></description>
			<content:encoded><![CDATA[<p>Dr. Mark Hill of University of Wisconsin Madison pointed out the existence of this really cool learning aid to his Logic Design class in October 2009. He also provides some guidelines about how to use the tool. Give it a shot if you like to play around with AND, OR, NOT gates and build logic. You can build &#8220;memory&#8221; structures too, something that can <em>remember</em> state. The following text is from Dr. Hill&#8217;s email. Enjoy.<span id="more-184"></span></p>
<p><span style="color: #000000;">Here is a cool tool for playing with and visualizing simple logic gates:</span></p>
<p><a href="http://joshblog.net/projects/logic-gate-simulator/Logicly.html"><span style="color: #000000;">http://joshblog.net/projects/logic-gate-simulator/Logicly.html</span></a></p>
<p><span style="color: #000000;">Build your own circuit:</span></p>
<p><span style="color: #000000;">(a) Drag over a few gates in the middle.<br />
(b) Drag few switches to the left side as inputs<br />
(c) Drag one or more light bulbs to right as output(s)<br />
(d) Connect items together with wires by clicking on needed endpoints.</span></p>
<p><span style="color: #000000;">Operate it:<br />
(e) Click on switches to toggle inputs between one (blue) and zero (white).<br />
(f) Watch the wires change state &#8212; one (blue) and zero (white) and light bulbs go on (blue) or off (off).</span></p>
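<p>To give a flavor of the &#8220;memory&#8221; structures mentioned at the top, here is a small sketch in Python (not part of the tool, and not from Dr. Hill&#8217;s email) of a set-reset latch built from two cross-coupled NOR gates. The function names are my own, and the loop that re-evaluates the gates until the outputs settle is a simplification of what a real simulator does.</p>

```python
def nor(a, b):
    """Two-input NOR gate on 0/1 values."""
    return int(not (a or b))


def sr_latch(s, r, q, q_bar):
    """Return the settled (q, q_bar) outputs of a NOR-based set-reset latch.

    s and r are the set and reset inputs; q and q_bar are the previous outputs.
    """
    for _ in range(4):  # a few passes are enough for the outputs to settle
        new_q = nor(r, q_bar)
        new_q_bar = nor(s, new_q)
        if (new_q, new_q_bar) == (q, q_bar):
            break
        q, q_bar = new_q, new_q_bar
    return q, q_bar


if __name__ == "__main__":
    state = (0, 1)  # start with the latch cleared
    steps = [(1, 0, "set"), (0, 0, "hold"), (0, 1, "reset"), (0, 0, "hold")]
    for s, r, label in steps:
        state = sr_latch(s, r, *state)
        print(label, state)
```

<p>With both inputs at zero the latch keeps whatever value it was last set or reset to, which is the &#8220;remembering&#8221; part.</p>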
]]></content:encoded>
			<wfw:commentRss>http://flickeringtubelight.net/blog/2009/11/cool-gate-level-logic-simulation-tool/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Mathematics of Mortgage, Overpayment and Refinancing Decisions</title>
		<link>http://flickeringtubelight.net/blog/2009/10/mathematics-of-mortgage-overpayment-and-refinancing-decisions/</link>
		<comments>http://flickeringtubelight.net/blog/2009/10/mathematics-of-mortgage-overpayment-and-refinancing-decisions/#comments</comments>
		<pubDate>Wed, 07 Oct 2009 17:11:54 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Experiences]]></category>
		<category><![CDATA[Information]]></category>
		<category><![CDATA[Tutorials]]></category>

		<guid isPermaLink="false">http://flickeringtubelight.net/blog/2009/10/07/mathematics-of-mortgage-overpayment-and-refinancing-decisions/</guid>
		<description><![CDATA[With mortgage interest rates at historically low values refinancing home loans is an option currently being investigated by many here in the US. I, too, considered the same issue recently and discovered that this is not an easy decision to make. I developed a spreadsheet to figure out if this was a good idea. You [...]]]></description>
			<content:encoded><![CDATA[<p>With mortgage interest rates at historically low values, refinancing home loans is an option currently being investigated by many here in the US. I, too, considered the same issue recently and discovered that this is not an easy decision to make. I developed a spreadsheet to figure out if this was a good idea. You can download this spreadsheet by clicking on this <span style="text-decoration: underline;"><strong><a title="Spreadsheet to aid refinance decisions" href="http://flickeringtubelight.net/blog/wp-content/uploads/2009/10/mortgagerefinancecalculation.xls" target="_blank">link</a></strong></span> (Microsoft Excel 2003). In general, the spreadsheet was also intended to show how loan repayment terms are set, how banks make money on loans, when overpaying monthly payments makes sense, etc. Feel free to use the spreadsheet and improve upon it or tailor it to your situation. The rest of this article is a tutorial on how to make decisions about mortgages, how mortgages work in general, whether overpayment of the monthly payment makes sense, and what to consider when refinancing. The focus will be on the mathematical aspects of the decision making.<span id="more-169"></span></p>
<h2>Taking an Interest in Loan Mathematics</h2>
<p>Let us first try to understand, in simple terms, the philosophy of any loan process. In particular, I&#8217;ll focus on the home loan process. You intend to purchase a house. Why purchase, and not continue to rent? Well, that is an interesting question in its own right. But, to not get distracted, let us say you got tired of the rent going up each year, or moving every few years, or actually figured that it was economically better to buy rather than rent. So, you decide to buy a house. You need money. You go to the bank (a mortgage lender). Say you want $200K ($200 thousand). The bank gives you the money at a certain interest rate, say, 5%. What does that really mean? Here in the US, the norm is to calculate the remaining balance every month. The 5% is actually 12*0.4166%, where 0.4166%, or 0.004166 (that is, 5%/12), is the monthly interest rate. That is, after the first month, the outstanding balance is $200K + $200K*0.004166. In other words, because the bank did you a favor by giving you $200K which you did not have, it wants $200K*0.004166, which is $833.33. As a quick aside, notice that because the interest is calculated monthly, the annual interest rate is, in reality, greater than the 5% we started with. The real annual interest rate would be 1.004166^12 &#8211; 1 &#8776; 0.0512, or in other words about 5.12%. Nevertheless, the common practice is to quote this as a 5% base interest rate, and that is fine, as long as we know what it means.</p>
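<p>The monthly rate and the &#8220;real&#8221; annual rate from the aside above can be checked in a few lines. This is a minimal sketch; the function names are made up for illustration, and it assumes the monthly rate is simply the quoted annual rate divided by 12, as described above.</p>

```python
def monthly_rate(annual_rate):
    """Monthly interest rate for a quoted annual rate, e.g. 0.05 for 5%."""
    return annual_rate / 12


def effective_annual_rate(annual_rate, periods=12):
    """Annual rate actually accrued when interest compounds `periods` times a year."""
    return (1 + annual_rate / periods) ** periods - 1


if __name__ == "__main__":
    c = monthly_rate(0.05)
    print("monthly rate:", round(c, 6))
    print("first month's interest on 200K:", round(200_000 * c, 2))
    print("effective annual rate:", round(effective_annual_rate(0.05), 4))
```

<p>The effective rate comes out a little above 5.1%, which is the point of the aside: monthly compounding makes the quoted 5% slightly more expensive than it sounds.</p>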
<p>Now, continuing with our example, in the first month, the bank wants you to pay $833.33 as interest accrued over that month. Say you paid exactly $833.33. The outstanding balance at the beginning of the second month would then be exactly 200K again. And at the end of that second month, the interest would be $833.33 again. Say you pay the bank $833.33 again. The outstanding balance at the beginning of the third month will again be $200K. This pattern could go on endlessly. You may argue that this looks like renting. Every month you pay the rent. You don&#8217;t see any of that money. But with buying there is a fundamental difference. After 10 years of doing the above, that is, paying $833.33 each month, you decide to sell the house. The house itself, typically, appreciates in value. Say the value of the house is now $300K. You sell and make a $100K profit. You paid 10*12*$833.33 over the 10 years, which, coincidentally, comes out to almost exactly 100K. What that means is you basically lived in a house for 10 years for free (of course you did pay property taxes, painted the house a couple of times, bought a lawn mower, replaced light bulbs, and took care of the house in general). But overall, it sounds like a pretty sweet deal.</p>
<p>One word we all glossed over in this discussion is &#8220;typically&#8221;. Home prices &#8220;typically&#8221; appreciate. The bank does not gloss over that word. Suppose instead that the value of the house drops, and 10 years later it is worth only 150K. You have paid your &#8220;rent&#8221; for 10 years, and are ready to sell. The bank wants its 200K back, since you never paid any principal all these years. The sale, however, would only fetch you 150K. The bank has the title (ownership document) to the house. It will not let you sell. It says: give us 50K first, then sell for 150K, and give us that 150K as well. You do not have 50K to give to the bank. The loan is foreclosed &#8211; the bank keeps the title to the house, but the bank does not like this situation. The bank now owns the house. But the house is worth only 150K. The bank does not want to be in the business of selling a house, especially one that won&#8217;t bring them their original 200K back.</p>
<p>To prevent this scenario, the bank employs two interesting tactics. Firstly, it does not let you pay only the interest of $833.33 each month. It requires you to pay off some of that principal on top of the interest. Secondly, the amount of principal you pay atop the interest is calculated such that the loan is guaranteed to be paid off in a certain &#8220;term&#8221;. Further, to keep the payment terms simple for the customer, the total payment each month remains unchanged. It is important to recognize that calculation of interest depends only on the interest rate. The calculation of the monthly payment, which includes both the interest and some piece of the principal, requires the notion of the &#8220;term&#8221;. The payment has to at least be the interest due that month. It is a bit more than that each month because of the principal paid. (In reality it ends up being even more because you pay a part of the annual property taxes, hazard insurance etc. each month &#8211; but we can ignore that for this discussion.) Paying a bit of principal each month causes the outstanding balance to reduce each month; this causes the interest payment to reduce a bit each month. This allows you to pay off even more principal each month, and that cascading effect finally ends exactly when the term runs out. Throughout this period, as mentioned earlier, the actual monthly payment does not change. The reduction in interest is compensated for by the increase in principal payment, which in turn reduces the outstanding balance and causes the next month&#8217;s interest payment to reduce even further. This constant monthly payment (interest + principal) is carefully calculated to achieve this effect. &#8220;Term&#8221; is the number of months by which the loan is supposed to be fully paid off. The shorter the term, the better the rates, in general, because to pay off a loan faster (shorter term), you have to pay greater amounts each month. 
So there needs to be an incentive for you to pay more money each month to the bank. And that incentive is the lower rate. Otherwise, wouldn&#8217;t you rather go with the longer term, pay less to the bank each month and invest that leftover in the stock market?</p>
<p>By forcing you to pay a bit of principal each month, the bank is earning less interest each month. But the good news for the bank is that if, after 10 years, you decide to sell the house and the value of the house is only 150K instead of the 200K you bought it for, the bank risks less. You have already paid off about 40K. So the loss for the bank is only about 10K, instead of the 50K it would have faced had it allowed you to pay only the interest each month. In other words, the bank wants you to pay principal each month not to help you reduce your interest payments, but rather to help it stave off any chance of losing money on the house if prices fall.</p>
<h2>Black Magic &#8211; Calculating the Monthly Payment</h2>
<p>When you talk to a mortgage banker on the phone, you will notice that they like to quickly tell you that your monthly payment would be some x amount. They use the phrase, &#8220;run the numbers&#8221;, with some pride. It is interesting and empowering to understand how the monthly payment is actually calculated.</p>
<p>We have all the information we need. At the beginning, we have a 200K loan. Let us use L to indicate this &#8220;Loan Amount&#8221;. Say, the monthly rate, which is 0.004166 in our example, is represented by c. Say, n represents the term, n months. Say, P represents the monthly payment. We want to determine P ourselves, instead of depending on our loan officer to tell us that information.</p>
<p>After the first month, the outstanding balance is:<br />
L + L*c &#8211; P<br />
= L(1+c) &#8211; P<br />
This is because the loan amount L increases by the monthly interest amount, L*c, but then we make the payment of P. This is the outstanding balance at the start of the second month.<br />
At the end of the second month, the outstanding balance is:<br />
{L(1+c) &#8211; P}(1+c) &#8211; P<br />
= L(1+c)^2 &#8211; P(1+c) &#8211; P<br />
= L(1+c)^2 &#8211; P{(1+c) + 1}<br />
At the end of the third month, the outstanding balance is:<br />
{L(1+c)^2 &#8211; P{(1+c) + 1}}(1+c) &#8211; P<br />
= L(1+c)^3 &#8211; P{(1+c)^2 + (1+c) + 1}<br />
If you are still with me, you may start seeing a pattern emerge. After n months, the outstanding balance will be:<br />
L(1+c)^n &#8211; P{(1+c)^(n-1) + (1+c)^(n-2) + &#8230; + (1+c)^2 + (1+c) + 1}<br />
which can be rewritten as:<br />
L(1+c)^n &#8211; P{1 + (1+c) + (1+c)^2 + &#8230; + (1+c)^(n-1)}<br />
Now. The punch line. After n months, we *know* that the outstanding balance should be 0. So<br />
0 = L(1+c)^n &#8211; P{1 + (1+c) + (1+c)^2 + &#8230; + (1+c)^(n-1)}<br />
P{1 + (1+c) + (1+c)^2 + &#8230; + (1+c)^(n-1)} = L(1+c)^n<br />
P = L(1+c)^n/{1 + (1+c) + (1+c)^2 + &#8230; + (1+c)^(n-1)}<br />
There you go. That is P, your payment each month. Phew! Done? Well, almost. The denominator in the above calculation is not Excel-friendly. Remember, you want this to go into a spreadsheet that can help you with decision making. The number of terms depends on n. Not good. Let&#8217;s try to find a closed form solution for the denominator. Thankfully, it is not hard. Notice that the denominator is of the form:<br />
1 + a + a^2 + a^3 + &#8230; + a^(n-1)<br />
where I replaced 1+c with a. Let us call the above sum X.<br />
X = 1 + a + a^2 + a^3 + &#8230; + a^(n-1)<br />
Adding a^n to both sides (as an aside, this kind of intuition is the reason Kavita hates math)<br />
X + a^n = 1 + a + a^2 + a^3 + &#8230; + a^(n-1) + a^n<br />
Shamelessly using some more of that darned intuition, we extract out a common factor, a, from the last n terms to get to:<br />
X + a^n = 1 + a{1 + a + a^2 + &#8230; + a^(n-1)}<br />
But notice that the stuff inside the {} is precisely what we defined X to be. So:<br />
X + a^n = 1 + a*X<br />
X &#8211; a*X = 1 &#8211; a^n<br />
X * (1 &#8211; a) = 1 &#8211; a^n<br />
Therefore,<br />
X = (1-a^n)/(1-a)<br />
Replacing a with (1+c),<br />
X = (1-(1+c)^n)/(1-(1+c))<br />
X = (1-(1+c)^n)/(-c)<br />
X = ((1+c)^n &#8211; 1)/c<br />
Finally, substituting this into the equation for P:<br />
P = L*c*(1+c)^n/{(1+c)^n &#8211; 1}<br />
Now we are seriously done with this calculation.</p>
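<p>For those who would rather check this outside of Excel, here is a minimal Python sketch of the closed form for P, next to the brute-force version that adds up the denominator term by term. The function names are made up for illustration; L, c and n have the same meaning as in the derivation above.</p>

```python
def monthly_payment(loan, c, n):
    """Closed form: P = L*c*(1+c)^n / ((1+c)^n - 1)."""
    growth = (1 + c) ** n
    return loan * c * growth / (growth - 1)


def monthly_payment_by_sum(loan, c, n):
    """Same payment, with the denominator 1 + a + a^2 + ... + a^(n-1) summed directly."""
    a = 1 + c
    denominator = sum(a ** k for k in range(n))
    return loan * a ** n / denominator


if __name__ == "__main__":
    loan, c, n = 200_000, 0.05 / 12, 360
    print(round(monthly_payment(loan, c, n), 2))
    print(round(monthly_payment_by_sum(loan, c, n), 2))
```

<p>Both versions should print the same number, which is a handy way to convince yourself that the closed form for the denominator is right.</p>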
<p>Let us try to use L=200K, c=0.004166 (more precisely, 0.05/12) and n=360 (a 30-year term, which is quite common in the US), and calculate P, your monthly payment. P comes out to $1073.64. The interest component is $833.33, and the principal is $1073.64 &#8211; $833.33 = $240.31. Because you pay off a tiny bit of the 200K principal, the outstanding balance at the beginning of the second month is $200000 &#8211; $240.31 = $199759.69. The interest for the second month is therefore going to be less than $833.33. In fact, it is $832.33. This $1 we pay less in interest goes towards the principal, which increases from $240.31 in the first month to $241.31 in the second. Looking a few months into this process, the interest payments are $833.33, $832.33, $831.33, $830.32, $829.30, etc., and the principal payments are $240.31, $241.31, $242.32, etc.</p>
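<p>The month-by-month split between interest and principal described above can be generated with a short loop. This is a sketch rather than the spreadsheet itself; the function name and the tuple layout are my own choices, and it uses c = 0.05/12 exactly.</p>

```python
def amortization_schedule(loan, c, n):
    """Return a list of (month, interest, principal, balance) for a fixed payment loan."""
    growth = (1 + c) ** n
    payment = loan * c * growth / (growth - 1)
    balance = loan
    rows = []
    for month in range(1, n + 1):
        interest = balance * c          # interest accrued on what is still owed
        principal = payment - interest  # the rest of the payment reduces the balance
        balance -= principal
        rows.append((month, interest, principal, balance))
    return rows


if __name__ == "__main__":
    schedule = amortization_schedule(200_000, 0.05 / 12, 360)
    for month, interest, principal, balance in schedule[:5]:
        print(month, round(interest, 2), round(principal, 2), round(balance, 2))
```

<p>The first few rows should line up with the numbers quoted above, and the balance in the last row should be zero, give or take floating point noise.</p>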
<p><a title="Fig1" href="http://flickeringtubelight.net/blog/wp-content/uploads/2009/10/refinance_fig1.GIF"><img src="http://flickeringtubelight.net/blog/wp-content/uploads/2009/10/refinance_fig1.thumbnail.GIF" alt="Fig1" /></a><a title="Fig2" href="http://flickeringtubelight.net/blog/wp-content/uploads/2009/10/refinance_fig2.GIF"><img src="http://flickeringtubelight.net/blog/wp-content/uploads/2009/10/refinance_fig2.thumbnail.GIF" alt="Fig2" /></a></p>
<p>Click on the thumbnails above to see the monthly and cumulative payment schedules. The first figure shows how much interest, principal and total payment needs to be made each month. The second figure translates that to a cumulative amount, that is, at any given point in time it tells us how much interest, principal and total payment you would have made. It is interesting to note from the first figure that in the first few years the bank makes most of the money it expects to make on the house (the interest tails off during the later years). The second figure shows that by the end of the loan term, you&#8217;d pay about 187K in interest, nearly as much as the loan itself!</p>
<h2>Does Overpayment Make Sense?</h2>
<p>At this point it is important to understand that the benefit of paying off the $240.31, $241.31, $242.32 etc. of principal each month is the reduction in the interest you pay in later months. Being vested in the house, that is, owning that piece of it, gives you no direct benefit; when the house sells, its value will not depend on how much of it you actually own. Think of it like this: the principal payments you make are investments whose rate of return is determined by the reduction in interest payments. Let us take an example. Say, somehow, you convince the bank to allow you to pay only the interest, $833.33, each month. You take the difference between your bank-determined payment of $1073.64 and your negotiated payment of $833.33 and invest it ($1073.64 &#8211; $833.33 = $240.31) in the stock market at a 10% annual rate of return. Either way, after 10 years, we&#8217;d have put in $240.31*12*10 = $28,837.20. Since the investment accrues a return each month, we need to calculate carefully how much profit we make (I use an Excel spreadsheet to do this; however, we could use a closed form expression similar to the one we developed above). At a 10% rate of return, we make about $21,000. If we put this same $240.31 into the principal payment each month, after 10 years, our profit (the interest savings compared to the case where we make no principal payment each month) is $8,478. Of course, the actual profit from investing is reduced by the tax you need to pay on that profit. Regardless, it is still a sweet deal to invest the money in the stock market, provided you can guarantee the 10% return on investment. Even if we assume a safe 6% rate of return (after taxes and everything), we stand to make $10,900 in the stock market vs. the $8,478 we &#8220;make&#8221; by putting it into the house. In any case, this is a moot point, since the bank will not allow you to make interest payments only.
What this discussion is intended to drive home is that it may not make sense to overpay above the monthly payment of $1073.64, unless you intend to stay at the house for a shorter term. If you stay in the house for a shorter term, then the stock market rate of return may be too risky, whereas paying into the house guarantees a certain rate of return.</p>
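<p>The comparison above can be reproduced with a short script. This is only a sketch and not the spreadsheet mentioned above: it assumes the $200K, 5%, 30 year loan used throughout this article, deposits made at the end of each month, and monthly compounding. The function names are made up for illustration.</p>

```python
def monthly_payment(loan, annual_rate, years):
    """Fixed monthly payment that pays off the loan in the given term."""
    r = annual_rate / 12
    n = years * 12
    return loan * r / (1 - (1 + r) ** -n)


def investment_profit(deposit, annual_return, months):
    """Profit from depositing a fixed amount at the end of every month."""
    r = annual_return / 12
    value = 0.0
    for _ in range(months):
        value = value * (1 + r) + deposit
    return value - deposit * months


def interest_saved_by_paying_principal(loan, annual_rate, years, months):
    """Interest saved over `months` compared with an interest-only loan."""
    r = annual_rate / 12
    payment = monthly_payment(loan, annual_rate, years)
    balance = loan
    interest_paid = 0.0
    for _ in range(months):
        interest = balance * r
        interest_paid += interest
        balance -= payment - interest  # the rest of the payment is principal
    return loan * r * months - interest_paid


payment = monthly_payment(200_000, 0.05, 30)
deposit = payment - 200_000 * 0.05 / 12  # first month's principal, about $240.31
print(round(investment_profit(deposit, 0.10, 120)))
print(round(investment_profit(deposit, 0.06, 120)))
print(round(interest_saved_by_paying_principal(200_000, 0.05, 30, 120)))
```

<p>Depending on when in the month the deposit is assumed to be made, the investment figures shift by a few hundred dollars, so they may not match the spreadsheet numbers to the dollar.</p>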
<p><a title="Fig3" href="http://flickeringtubelight.net/blog/wp-content/uploads/2009/10/refinance_fig3.GIF"><img src="http://flickeringtubelight.net/blog/wp-content/uploads/2009/10/refinance_fig3.thumbnail.GIF" alt="Fig3" /></a><a title="Fig4" href="http://flickeringtubelight.net/blog/wp-content/uploads/2009/10/refinance_fig4.GIF"><img src="http://flickeringtubelight.net/blog/wp-content/uploads/2009/10/refinance_fig4.thumbnail.GIF" alt="Fig4" /></a></p>
<p>The figures above show monthly and cumulative payments for a 15 year loan term, that is, a loan whose monthly payment has been calculated such that the loan is paid off in full in 15 years. Typically, 15 year loans have a slightly better rate than 30 year loans, to give you an incentive to give up more of your cash each month in payment. However, since I am continuing to use a 5% interest rate to plot these curves, they really indicate how your overall payment time line changes if you overpay each month on the 30 year loan. The overpayment amount is the difference between the monthly payment shown in this figure and the minimum monthly payment shown in the previous section. As you can see here, even in the first month you pay nearly as much towards principal as towards interest, and by the end of the loan term you have paid only about $85K in interest. The advantage of overpaying on a 30 year loan, rather than taking a 15 year loan, is that you are required to pay only in accordance with the 30 year term, but you may choose to overpay if you wish to reduce your interest payments. That way, if you occasionally miss your overpayment target, that is fine as long as you pay the minimum payment for that month. That said, as we discussed above, it may still make sense not to overpay if you can invest that money instead.</p>
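<p>To put numbers on the two payment schedules, here is a small sketch that compares the 30 year and 15 year terms. It assumes the same $200K loan at 5% for both terms (a real 15 year loan would likely carry a slightly lower rate), and the function name is hypothetical.</p>

```python
def amortize(loan, annual_rate, years):
    """Return (monthly payment, first-month principal, total interest)."""
    r = annual_rate / 12
    n = years * 12
    payment = loan * r / (1 - (1 + r) ** -n)
    first_principal = payment - loan * r
    total_interest = payment * n - loan
    return payment, first_principal, total_interest


for years in (30, 15):
    payment, first_principal, total_interest = amortize(200_000, 0.05, years)
    print(f"{years}-year: payment ${payment:,.2f}, "
          f"first-month principal ${first_principal:,.2f}, "
          f"total interest ${total_interest:,.0f}")
```

<p>The difference between the two monthly payments printed here is the overpayment needed to retire the 30 year loan on the 15 year schedule.</p>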
<h2>Does Refinancing Make Sense?</h2>
<p>Now that we have understood some of the nuances of the loan process, let us consider how to make a refinancing decision. Refinancing is the process of getting a new loan in order to pay off an existing loan. If this were a free process, that is, if there were no cost of refinancing, the decision would be very simple: if the new interest rate is better than the old interest rate, refinancing makes sense. However, there is typically a cost involved. The question then becomes how long you need to stay in the same house after refinancing to recoup the cost of refinancing. Let us take an example. Say you currently have a 5% loan with $200K outstanding, and a different lender offers a 4% loan with a $2000 closing cost. The current monthly payment is $1073.64. The new monthly payment is $954.83. Since the interest rate is lower, you&#8217;ll be paying less interest each month with the new loan, so over time the cumulative interest you pay the bank is lower with the new loan. For this example, after 1 year the total interest paid with the current loan is about $9900, whereas with the new loan it is about $7900. This is about $2000 in savings in 1 year just from the interest rate reduction. Since the interest you pay is like the fee the bank charges for its services, you have found a low-fee option. So in 1 year you have overcome the $2000 cost of refinancing. From year 2 onward you stand to gain by doing this refinance. (Note: I am ignoring the fact that mortgage interest is tax-deductible, that is, you get back the taxes you paid on the interest in your next year&#8217;s tax return. The absolute savings from interest reduction are therefore about 20 to 25% less than the savings I am quoting here and in the spreadsheet. It is easy to fix that though, if you choose to. Instead of saving $2000, you&#8217;d have actually saved only $1500 if you fall in the 25% tax bracket.)</p>
<p>Now, let us see what happens to the rest of the money you are paying each month, the principal. With the current loan the cumulative principal payment after 1 year is $2950. With the new loan the cumulative principal paid in 1 year is $3522. That is, you own more of the house. But this is not important by itself. Yes, you own more of the house, but the net effect is that you converted some cash into a bit of house. If you had not owned any of the house, you&#8217;d have been left with cash which you could have invested and grown. The house rises or falls in value equally regardless of whether you are invested in it or not.</p>
<p>But there is one more component to this equation other than the interest and principal. The overall monthly payment has reduced from $1073.64 to $954.83. That frees up $118.81 each month to be invested as you choose. Even if this were invested conservatively at a 6% rate of return, you end up with $1472 at the end of the first year. This is money that would not have been available at all with the current loan. So in fact, at the end of year 1, you have saved $2000 + $1472, the former coming from the interest savings and the latter from investing the cash freed up. This means the $2000 cost of refinancing will actually be made up sooner than 1 year. Given the above 6% assumption it is more like 7 months. If you plan to live in this house for 7 months or more, go for the refinancing.</p>
<p><a title="Fig5" href="http://flickeringtubelight.net/blog/wp-content/uploads/2009/10/refinance_fig5.GIF"><img src="http://flickeringtubelight.net/blog/wp-content/uploads/2009/10/refinance_fig5.thumbnail.GIF" alt="Fig5" /></a></p>
<p>The figure above shows the time to recoup the cost of refinancing, considering only the interest savings and also considering the case where the overall reduction in monthly payment can be invested at 6%.</p>
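<p>The break-even calculation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the spreadsheet behind the figure: both loans are taken to be 30 year loans on $200K, the closing cost is paid in cash, and the freed-up payment difference is deposited at the start of each month and compounded monthly. The function names are hypothetical.</p>

```python
def payment(loan, annual_rate, years=30):
    """Fixed monthly payment for a fully amortizing loan."""
    r = annual_rate / 12
    n = years * 12
    return loan * r / (1 - (1 + r) ** -n)


def months_to_recoup(loan, old_rate, new_rate, closing_cost, invest_return=None):
    """First month in which cumulative savings cover the closing cost.

    Counts the interest saved each month; if invest_return is given, also
    counts the value of the freed-up payment difference once invested.
    """
    old_pay, new_pay = payment(loan, old_rate), payment(loan, new_rate)
    old_bal = new_bal = loan
    interest_saved = 0.0
    invested = 0.0
    for month in range(1, 361):
        old_int = old_bal * old_rate / 12
        new_int = new_bal * new_rate / 12
        interest_saved += old_int - new_int
        old_bal -= old_pay - old_int
        new_bal -= new_pay - new_int
        if invest_return is not None:
            invested = (invested + old_pay - new_pay) * (1 + invest_return / 12)
        if interest_saved + invested >= closing_cost:
            return month
    return None  # never recouped within the loan term


print(months_to_recoup(200_000, 0.05, 0.04, 2000))
print(months_to_recoup(200_000, 0.05, 0.04, 2000, invest_return=0.06))
```

<p>The two results should land close to the one year and seven months discussed above; the exact month can differ by one depending on rounding and deposit timing.</p>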
<h2>Acknowledgements</h2>
<p>My understanding of the issues involved in refinancing, in particular, and mortgages, in general, is based upon my going through this decision-making process recently. Much of this understanding was developed during discussions with my friends Gordie and Srini. If you find flaws in my understanding please let me know. Some online resources that helped me were http://www.mtgprofessor.com/formulas.htm , http://en.wikipedia.org/wiki/Refinancing and http://en.wikipedia.org/wiki/Mortgage.</p>
]]></content:encoded>
			<wfw:commentRss>http://flickeringtubelight.net/blog/2009/10/mathematics-of-mortgage-overpayment-and-refinancing-decisions/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Notes on Memory Consistency and Cache Coherence</title>
		<link>http://flickeringtubelight.net/blog/2008/06/notes-on-memory-consistency-and-cache-coherence/</link>
		<comments>http://flickeringtubelight.net/blog/2008/06/notes-on-memory-consistency-and-cache-coherence/#comments</comments>
		<pubDate>Fri, 27 Jun 2008 20:54:40 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Information]]></category>
		<category><![CDATA[Tutorials]]></category>

		<guid isPermaLink="false">http://flickeringtubelight.net/blog/2008/06/27/notes-on-memory-consistency-and-cache-coherence/</guid>
		<description><![CDATA[Here are some of my notes on the topic of memory consistency and cache coherence, and how uniprocessor and multiprocessor cores have to be built to support the consistency models. Most of this was written up when I was preparing for my Qualifying Exam at NC State University last semester. This is a relatively complicated [...]]]></description>
			<content:encoded><![CDATA[<p>Here are some of my notes on the topic of memory consistency and cache coherence, and how uniprocessor and multiprocessor cores have to be built to support the consistency models. Most of this was written up when I was preparing for my Qualifying Exam at NC State University last semester. This is a relatively complicated topic to understand well, and there might still be several mistakes in how I understood the ideas. Also, this might only make sense, and be interesting, to people familiar with these areas of computer architecture. Here&#8217;s the pdf file: <a href="http://flickeringtubelight.net/blog/wp-content/uploads/2008/06/notesonconsistencyandcoherence.pdf" title="Notes on Memory Consistency and Cache Coherence">Notes on Memory Consistency and Cache Coherence</a></p>
]]></content:encoded>
			<wfw:commentRss>http://flickeringtubelight.net/blog/2008/06/notes-on-memory-consistency-and-cache-coherence/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Setup and Hold Violations in Digital Systems</title>
		<link>http://flickeringtubelight.net/blog/2008/05/setup-and-hold-violations-in-digital-systems/</link>
		<comments>http://flickeringtubelight.net/blog/2008/05/setup-and-hold-violations-in-digital-systems/#comments</comments>
		<pubDate>Mon, 19 May 2008 22:46:32 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Tutorials]]></category>

		<guid isPermaLink="false">http://flickeringtubelight.net/blog/2008/05/19/setup-and-hold-violations-in-digital-systems/</guid>
		<description><![CDATA[I wrote this up when trying to prepare for my PhD Qualifying Examination this past semester (Spring 2008). It is a pdf file. You can read it here.
]]></description>
			<content:encoded><![CDATA[<p>I wrote this up when trying to prepare for my PhD Qualifying Examination this past semester (Spring 2008). It is a pdf file. You can read it <a href="http://flickeringtubelight.net/blog/wp-content/uploads/2006/04/setupAndHoldViolations.pdf">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://flickeringtubelight.net/blog/2008/05/setup-and-hold-violations-in-digital-systems/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Finally, one Contact List to rule them all</title>
		<link>http://flickeringtubelight.net/blog/2007/07/finally-one-contact-list-to-rule-them-all-2/</link>
		<comments>http://flickeringtubelight.net/blog/2007/07/finally-one-contact-list-to-rule-them-all-2/#comments</comments>
		<pubDate>Sun, 29 Jul 2007 23:00:52 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Information]]></category>
		<category><![CDATA[Tutorials]]></category>

		<guid isPermaLink="false">http://flickeringtubelight.net/blog/2007/07/29/finally-one-contact-list-to-rule-them-all-2/</guid>
		<description><![CDATA[I have a Yahoo email account. I have a Google email account. I have a few more that I do not use much. Each email application or service provider typically has an Address Book or Contacts List, which we can use to list the names, email addresses and other information about people. For a while, [...]]]></description>
			<content:encoded><![CDATA[<p>I have a Yahoo email account. I have a Google email account. I have a few more that I do not use much. Each email application or service provider typically has an Address Book or Contacts List, which we can use to list the names, email addresses and other information about people. For a while, I have been thinking about getting the contact lists organized. There were several layers to the word &#8220;organized&#8221; and I was apprehensive about starting to peel those layers. The first layer of the problem was figuring out which mail server I wanted to stick to. The next thought was to create a superset of all the contact lists, currently scattered across applications and mail servers, in one place. The next issue was to find a way to update the contact list quickly, rather than clicking around web-based Address Book or Contact List applications such as the ones provided by Yahoo and Google Mail. Then there was the hope that I could keep a copy of the contact list locally on my personal computer, in case, at some point, I did not have internet access to get to the Yahoo Address Book.</p>
<p>This list of requirements seemed formidable in itself, yet what made me skeptical of a final solution was one last requirement. I had maintained a list of birthdays and anniversaries in a text file separate from the contact lists in the mail servers I mentioned. It was a simple text file, and a simple Perl script I wrote could go through it every day and send me an email if it found any upcoming event. I wanted to retain the ability to do such scripting and not have to maintain a separate text file version of the contact list just to be able to run such a reminder script.</p>
<p>After collecting and formulating these thoughts over a long time, I finally spent a few minutes last week looking for a solution to the multi-layered problem. Searching on the internet revealed that there WAS a relatively easy solution that fixes ALL the above problems, including giving me the ability to run a simple script to extract birthday and anniversary information! Here is the solution. Yahoo and Google Address Books allow the existing contact list to be exported as a CSV (Comma-Separated Values) file, and a CSV file to be imported to populate the Contact List or Address Book application. A CSV file, as the name suggests, is just a regular text file, with the fields belonging to a record typed across a single line and the comma symbol (&#8220;,&#8221;) separating the fields. A new record starts on a new line. The file can be opened with a regular text editor such as Notepad, Wordpad or Textpad in Windows and vi, pico or emacs in Unix. The file may also be opened in a spreadsheet such as Microsoft Excel, where the fields show up in separate columns and the lines show up in separate rows. This solves the problem of easily modifying the contact list in bulk and storing the contact list as a local file on your personal computer. The CSV file is compatible across Yahoo and Google, and probably across many other applications like Microsoft Outlook and Orkut (a web-based networking application). The CSV file can then be imported into Yahoo Mail, Google Mail or other such applications. Problem solved. Single contact list. Storable and updatable locally. Uploadable to multiple web-based servers.</p>
<p>The CSV file based common contact list also allowed me to enter the anniversary and birthday in appropriate columns. I wrote a script called contact.py in the Python scripting language to read the contact list file as a simple text file (in the CSV format) and search for upcoming events. This allowed me to get rid of the earlier text file I had my Perl script read. The CSV file, which I called contactlist.csv, was truly the one file I needed to retain for all my address-book related needs. Whenever I want to add a new contact or update information about an existing contact, I update the local copy of the contact list, contactlist.csv, and then import it into Yahoo Mail and Google Mail to keep them up to date. I have noticed that before I import the latest contactlist.csv file into Yahoo or Google, I need to delete all the existing contacts from Yahoo and Google, respectively. Once we have an empty contact list on the mail server, importing contactlist.csv recreates the complete list. Not starting with an empty contact list on the mail server creates duplicates, probably because the &#8220;import&#8221; function is not smart enough to recognize them.</p>
<p>Here is an example of what a few rows from the CSV file contactlist.csv look like. It gives us an idea of what the fields are. The example also shows that not all the fields in a CSV file need to be filled. A field can be left empty if we do not know the information relating to that field for a given contact. Also, I use xxxx for the year part of a date (such as a birthday or an anniversary date) in case I do not know the year. This is OK because the script that parses this CSV file, contact.py, which is shown later, does not use the year to determine if an anniversary is approaching. It only uses the month and day parts of the field.</p>
<p><small>First,Middle,Last,Nickname,Email,Messenger ID,Home,Work,Pager,Fax,Mobile,Other,Yahoo! Phone,Alternate Email 1,Alternate Email 2,Personal Website,Business Website,Title,Company,Work Address,Work City,Work State,Work ZIP,Work Country,Home Address,Home City,Home State,Home ZIP,Home Country,Birthday,Anniversary,Custom 1,Custom 2,Custom 3,Custom 4,Comments,Messenger ID1,Messenger ID2,Messenger ID3,Messenger ID4,Messenger ID5,Messenger ID6,Messenger ID7,Messenger ID8,Messenger ID9,Skype ID,IRC ID,ICQ ID,Google ID,MSN ID,AIM ID,QQ ID<br />
Shahrukh,Mayur,Khan,srk,srk@bollywood.com,,,,,,,,,,,,,,,,,,,,,,,,,1/2/xxxx,,,,,,,,,,,,,,,,,,,,,,</small></p>
<p>Here is the contact.py Python script, which works on the CSV file contactlist.csv with contents as shown above and sends a reminder email to your email account. You might have to adjust some of the values in the script (such as the mail server and the addresses) to get it to work. I present it here just as a hint.</p>
<pre><small>import csv
import datetime
import smtplib

filename = "contactlist.csv"
warnZone = 8  # number of days before the event at which the reminder is sent

# Days in each month of a non-leap year; the year of an event is ignored.
daysInMonth = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

def dayOfYear(month, day):
    """Return the 1-based day of the year for a month/day pair."""
    return sum(daysInMonth[:int(month) - 1]) + int(day)

def isUpcoming(event_doy, today_doy):
    """True if the event falls within warnZone days, wrapping past Dec 31."""
    return (today_doy &lt;= event_doy &lt;= today_doy + warnZone
            or event_doy &lt;= today_doy + warnZone - 365)

now = datetime.datetime.now()
today_doy = dayOfYear(now.month, now.day)

content = ""
with open(filename, newline="") as f:
    for row in csv.reader(f):
        if len(row) &lt; 31:
            continue  # skip blank or truncated rows
        name = " ".join(part for part in (row[0], row[1], row[2]) if part)
        # Column 29 is Birthday and column 30 is Anniversary (see the header row).
        for label, field in (("anniversary", row[30]), ("birthday", row[29])):
            parts = field.split("/")
            # Keeps "m/d/yyyy" values; skips empty fields and the header row.
            if len(parts) &gt; 1 and parts[0].isdigit() and parts[1].isdigit():
                if isUpcoming(dayOfYear(parts[0], parts[1]), today_doy):
                    content += "%s's %s is on %s/%s\n" % (name, label, parts[0], parts[1])

if content:  # only send mail when there is something to report
    smtpserver = "mailserver.department.company.com"
    RECIPIENTS = ["gol345die@gmail.com"]
    SENDER = "con789vey@po.doc.com"
    session = smtplib.SMTP(smtpserver)
    smtpresult = session.sendmail(SENDER, RECIPIENTS, content)
    session.quit()
    if smtpresult:
        # sendmail returns a dict of refused recipients and server replies
        errstr = ""
        for recip in smtpresult.keys():
            errstr += "Could not deliver mail to: %s Server said: %s %s\n" % (recip, smtpresult[recip][0], smtpresult[recip][1])
        raise smtplib.SMTPException(errstr)</small></pre>
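<p>The script above finds the Birthday and Anniversary fields by their position in the row (columns 29 and 30), which breaks if the export format ever gains or loses a column. A possible alternative, sketched below, is to look the fields up by their header names using csv.DictReader from the Python standard library. The sample data and the read_events helper are made up for illustration; the sample is trimmed to five columns.</p>

```python
import csv
import io

# Two-line sample in the same shape as contactlist.csv, trimmed to a few columns.
SAMPLE = "First,Middle,Last,Birthday,Anniversary\nShahrukh,Mayur,Khan,1/2/xxxx,\n"


def read_events(csv_text):
    """Return (name, kind, month, day) tuples for every filled-in date field."""
    events = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        name = " ".join(p for p in (record["First"], record["Middle"], record["Last"]) if p)
        for kind in ("Birthday", "Anniversary"):
            parts = (record.get(kind) or "").split("/")
            # Only the month and day are used; the year may be "xxxx".
            if len(parts) >= 2 and parts[0].isdigit() and parts[1].isdigit():
                events.append((name, kind, int(parts[0]), int(parts[1])))
    return events


print(read_events(SAMPLE))
```

<p>To run this against the real file, the text could be read with open("contactlist.csv", newline="").read() and passed to read_events.</p>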
]]></content:encoded>
			<wfw:commentRss>http://flickeringtubelight.net/blog/2007/07/finally-one-contact-list-to-rule-them-all-2/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Cycles per Instruction and its components</title>
		<link>http://flickeringtubelight.net/blog/2006/04/cycles-per-instruction-and-its-components/</link>
		<comments>http://flickeringtubelight.net/blog/2006/04/cycles-per-instruction-and-its-components/#comments</comments>
		<pubDate>Sun, 30 Apr 2006 18:30:20 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Tutorials]]></category>

		<guid isPermaLink="false">http://flickeringtubelight.net/blog/2006/04/30/cycles-per-instruction-and-its-components/</guid>
		<description><![CDATA[Cycles per Instruction is one way to evaluate the microarchitecture&#8217;s effect on a processor&#8217;s performance. Performance also depends on what constitutes a cycle (frequency of the machine) and what constitutes an instruction (the architecture of the machine). Interestingly, the ratio of these two, i.e. Cycles per Instruction, tells us about something other than those two [...]]]></description>
			<content:encoded><![CDATA[<p>Cycles per Instruction is one way to evaluate the microarchitecture&#8217;s effect on a processor&#8217;s performance. Performance also depends on what constitutes a cycle (frequency of the machine) and what constitutes an instruction (the architecture of the machine). Interestingly, the ratio of these two, i.e. Cycles per Instruction, tells us about something other than those two factors. CPI tells us how many cycles, at the given frequency, the machine takes to chomp down instructions of the given architecture. In other words, CPI tells us about the microarchitectural robustness of a design. It tells us, given a certain cycle period (frequency) and a certain group of instructions (architecture), how many cycles it takes to execute an &#8220;average&#8221; instruction. The &#8220;average&#8221; instruction depends on the workload (the program) being run through the processor. Therefore, when a CPI is quoted, the workload to which that CPI applies is quoted as well. The smaller the CPI, the better performing a computer system is, all other factors such as the frequency and architecture remaining the same. Of course, other factors such as area, complexity and power must be considered in addition to performance, and the definition of &#8220;better&#8221; changes accordingly.</p>
<p>CPI is affected by several factors. Architects and designers often need to know where to start when trying to improve performance by making microarchitectural changes. The overall CPI has to be broken into components, and the component that contributes the most is usually attacked first, since improvement in that design component could potentially reduce the CPI by the largest amount. I will come back to this point with an example shortly. For this write-up I will break the CPI into two main components: the &#8220;core&#8221; component and the &#8220;memory-nest&#8221; component. The core component deals with all the microarchitectural decisions within the processing unit, which in modern microprocessors includes several pipelines (superscalar machines), each with several stages and associated logic &#8211; instruction fetch, decode, issue, register access, execute, write-back etc. The memory-nest component deals with getting the appropriate instructions and data from the memory. Several levels of caching are provided to take advantage of certain properties seen in programs. These properties are spatial locality (if a certain memory location is used, something nearby will be used soon) and temporal locality (if a certain memory location is used, it will likely be used again). To keep the data close at hand, and to avoid the potentially hundreds of processor cycles it takes to go to the memory every time a new instruction or piece of data is needed, instructions and data are stored in caches. There are some great resources on the web and in books which talk about caches in more detail. For this write-up it suffices to say that caches are usually arranged in a hierarchy. Level 1 is the smallest cache but is the fastest to access and is called the L1Cache.
The Level 2 cache is bigger than the Level 1 cache, but because it is bigger and is usually only accessed after it is known that the instruction or data the core was looking for was not available in the L1Cache, it is slower. It is called the L2Cache. Usually the hierarchy has 2-3 levels of caching.</p>
<p>Therefore, if the CPI is mostly affected not by the design of the core itself but by the sizes or geometries of the caches, the designers would be better served by knowing that the memory-nest, or more accurately a certain level of cache, should be the first place to look for potential improvements, rather than the core. This brings us to the two main topics of this article. First, what quantities in the cache must be measured to help understand its effect on the CPI? Second, from these statistics collected on the caches, how do we build the overall CPI number?</p>
<h3>Cache Statistics &#8211; What they are, what they mean</h3>
<p>When modeling caches there are 4 statistics that are often talked about. They are not completely independent of each other and are, therefore, easy to get confused about. The 4 main performance-related statistics collected for a cache are:</p>
<p>MissesPerReference<br />
HitsPerReference<br />
MissesPerInstruction<br />
HitsPerInstruction</p>
<p>MissesPerReference is the fraction of all the accesses to the cache for which the cache was unable to provide the data requested. HitsPerReference is the complement: the fraction of all the accesses to the cache for which the cache was able to provide the data requested. Because these quantities are complementary, only one of the two needs to be quoted.</p>
<p>MissesPerReference + HitsPerReference = 1</p>
<p>MissesPerInstruction is not that straightforward. It is the ratio of the number of accesses to the cache that missed to the total number of &#8220;instructions in the workload&#8221;! This is a strange measure at first, and takes some getting used to. It&#8217;s the ratio of a design-dependent, variable quantity (cache misses) to an unchanging quantity (the number of instructions in the workload). In the earlier measure, MissesPerReference, the denominator, the number of references to the cache, varies depending on what happened before this cache was reached by the instruction flow: the number of references to this cache depends on how well the previous levels of cache did. HitsPerInstruction, similarly, is the ratio of the number of hits in the cache to the total number of instructions in the workload. MissesPerInstruction and HitsPerInstruction do NOT add up to 1! They add up to the number of references made to the cache per instruction.</p>
<p>These 4 quantities are depicted by an example in Figure 1.</p>
<p><a href="http://flickeringtubelight.net/blog/wp-content/uploads/2006/04/CachesAndCPIfig1.gif"><img class="alignnone size-full wp-image-550" title="CachesAndCPIfig1" src="http://flickeringtubelight.net/blog/wp-content/uploads/2006/04/CachesAndCPIfig1.gif" alt="" width="698" height="498" /></a></p>
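<p>As a quick numerical check of these definitions, here is a small sketch that computes all 4 statistics from raw counts. The counts are hypothetical and are not the ones shown in Figure 1; the cache_stats helper is made up for illustration.</p>

```python
def cache_stats(references, misses, instructions):
    """Return the four per-reference and per-instruction ratios for one cache."""
    hits = references - misses
    return {
        "MissesPerReference": misses / references,
        "HitsPerReference": hits / references,
        "MissesPerInstruction": misses / instructions,
        "HitsPerInstruction": hits / instructions,
    }


# Hypothetical workload: 1000 instructions, 400 L1 references, 40 L1 misses.
# Every L1 miss becomes an L2 reference; 10 of those miss in the L2.
l1 = cache_stats(references=400, misses=40, instructions=1000)
l2 = cache_stats(references=40, misses=10, instructions=1000)
for name, stats in (("L1", l1), ("L2", l2)):
    print(name, stats)
```

<p>Note how the L2 looks poor per reference (a quarter of its references miss) while its misses per instruction are small, because few instructions ever reach it.</p>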
<h3>Why collect PerInstruction statistics</h3>
<p>So the question is, &#8220;what is the reason we need these two ways of looking at a cache&#8217;s performance?&#8221;. Why doesn&#8217;t the simpler PerReference style of collecting stats suffice? The answer is that the PerReference numbers are great for designing the cache under study provided the number of references does not change, which implies, the design before the cache in study remains unchanged. The PerInstruction numbers are useful to get an idea of how much the cache under study is contributing to the overall CPI. We will shortly see how that overall CPI is calculated from the PerInstruction numbers. Before the PerReference numbers can be used to improve cache designs, the PerInstruction numbers must point out that it is worthwhile to go after a cache design because the cache is adding a big component to the CPI of the system. And we want to keep the CPI low as far as possible. That is why we want to keep track of the PerInstruction statistics.</p>
<h3>How to break down overall CPI into components</h3>
<p>There are two ways to break down the overall CPI or, looking at it the other way, two ways to build the overall CPI.<br />
One way is to look at what fraction of instructions &#8220;pass through&#8221; each level (whether they hit or miss at that level is immaterial) in the hierarchy and add up the cycles they spend at each level. The idea is depicted in Figure 2.<br />
The other way is to look at what fraction of instructions &#8220;finish at&#8221; each level (i.e. they hit) and add up the cycles those fractions take to finish their journey through the memory hierarchy. The idea is depicted in Figure 3.</p>
<h5>Method 1: Breaking CPI into components for each level of the memory hierarchy an instruction passes through</h5>
<p><a href="http://flickeringtubelight.net/blog/wp-content/uploads/2006/04/CachesAndCPIfig2.gif"><img class="alignnone size-full wp-image-551" title="CachesAndCPIfig2" src="http://flickeringtubelight.net/blog/wp-content/uploads/2006/04/CachesAndCPIfig2.gif" alt="" width="612" height="222" /></a></p>
<p>This method is often used to represent a CPI stack, which assigns a portion of the CPI to each level. This is very useful in figuring out which level of the cache to go after for improvements.</p>
<p>CPI = CoreComponent + L1Component + L2Component + L3Component + MemComponent</p>
<p>Here the overall CPI is reached by adding CPI contributions at each level of the memory hierarchy. Simulation models for the core often generate &#8220;infinite&#8221; CPIs. Infinite CPIs are used to keep the effect of the memory hierarchy separate from the effect of the core design, so separate teams can work on both models in parallel. The effect of the memory hierarchy on the CPI has to be added in to the &#8220;infinite&#8221; CPI to get the overall CPI.</p>
<p>infL1CPI = CoreComponent + L1Component<br />
infL2CPI = CoreComponent + L1Component + L2Component<br />
and so on.</p>
<p>The calculation of infL1CPI assumes the L1 always hits, but with the normal hit latency. So it is an infinitely large L1 whose directory lookup and array access miraculously take the same time as in a realistically sized L1.<br />
The calculation of infL2CPI assumes the L2 always hits, but with the normal hit latency. The L1 is modeled realistically; the L2 is an infinite L2 with the correct hit latency.</p>
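<p>As a rough illustration of how the &#8220;infinite&#8221; CPIs relate to the overall CPI, here is a minimal Python sketch. The function names and all the numbers are made up for this example; it only adds up the components named above and does not model any real simulator.</p>

```python
def infinite_cpis(core_component, l1_component, l2_component):
    """Return (infL1CPI, infL2CPI) built from the CPI stack components."""
    inf_l1_cpi = core_component + l1_component
    inf_l2_cpi = inf_l1_cpi + l2_component
    return inf_l1_cpi, inf_l2_cpi


def overall_cpi_from_inf_l2(inf_l2_cpi, l3_component, mem_component):
    """Add the rest of the memory hierarchy on top of an infinite-L2 CPI."""
    return inf_l2_cpi + l3_component + mem_component


# Hypothetical component values, in cycles per instruction.
inf_l1_cpi, inf_l2_cpi = infinite_cpis(1.0, 0.6, 0.5)
overall_cpi = overall_cpi_from_inf_l2(inf_l2_cpi, 0.56, 0.8)
```

<p>The point of the split is that the core team can hand over infL2CPI, and the cache team only has to supply the remaining components.</p>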
<p>Going back to<br />
CPI = CoreComponent + L1Component + L2Component + L3Component + MemComponent<br />
Say we have cache models for the L1, L2 and L3 caches and the memory. How do we get the L1Component, L2Component, L3Component and MemComponent, so that we can get the overall CPI for the system? This is where the hitsPerInstruction and missesPerInstruction numbers come in very handy.<br />
L1Component = L1ReferencesPerInstruction * L1Latency &#8211;(A1)<br />
L2Component = L2ReferencesPerInstruction * (L2Latency &#8211; L1Latency) &#8211;(A2)<br />
L3Component = L3ReferencesPerInstruction * (L3Latency &#8211; L2Latency) &#8211;(A3)<br />
MemComponent = MemReferencesPerInstruction * (MemLatency &#8211; L3Latency) &#8211;(A4)</p>
<p>Notice that each component is the product of the fraction of instructions that reference that level and the amount of time those instructions are guaranteed to spend at that level. Two observations here:<br />
1. the ReferencesPerInstruction at each level is the sum of hitsPerInstruction and missesPerInstruction for that level<br />
2. an easy way to determine the extra time a level adds to the execution of every instruction that references it is to take the difference between the hit latency at that level and the hit latency one level before.</p>
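<p>Equations (A1) to (A4) and the two observations can be sketched in a few lines of Python. This is only an illustration: the function name <code>cpi_stack</code>, the dictionary layout and every number below are assumptions made up for the example, and the latencies are taken to be cumulative (the total cycles for a hit at that level).</p>

```python
def cpi_stack(core_component, hits_per_instr, misses_per_instr, latency):
    """Return the CPI stack: one component per level, as in (A1)-(A4)."""
    stack = {"Core": core_component}
    previous_latency = 0.0
    for level in ("L1", "L2", "L3", "Mem"):
        # Observation 1: references = hits + misses at this level.
        references = hits_per_instr[level] + misses_per_instr[level]
        # Observation 2: extra time this level adds over the level before it.
        stack[level] = references * (latency[level] - previous_latency)
        previous_latency = latency[level]
    return stack


# Hypothetical per-instruction rates; misses at one level are the
# references (hits + misses) at the next level.
hits = {"L1": 0.25, "L2": 0.03, "L3": 0.015, "Mem": 0.005}
misses = {"L1": 0.05, "L2": 0.02, "L3": 0.005, "Mem": 0.0}
# Hypothetical cumulative hit latencies, in cycles.
latency = {"L1": 2, "L2": 12, "L3": 40, "Mem": 200}

stack = cpi_stack(1.0, hits, misses, latency)
total_cpi = sum(stack.values())
```

<p>With these made-up numbers the L1, L2, L3 and memory components come out to 0.6, 0.5, 0.56 and 0.8, for an overall CPI of 3.46. The largest slice of the stack points at the level worth going after first.</p>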
<h5>Method 2: Breaking CPI into components for each level of the memory hierarchy an instruction finishes at</h5>
<p><a href="http://flickeringtubelight.net/blog/wp-content/uploads/2006/04/CachesAndCPIfig3.gif"><img class="alignnone size-full wp-image-552" title="CachesAndCPIfig3" src="http://flickeringtubelight.net/blog/wp-content/uploads/2006/04/CachesAndCPIfig3.gif" alt="" width="578" height="215" /></a></p>
<p>This method is often used when the aim is simply to calculate the overall CPI, without breaking it into its CPI stack components.</p>
<p>CPI = CoreComponent + (L1HitsPerInstruction*L1Latency) + (L2HitsPerInstr*L2Latency) + (L3HitsPerInstr*L3Latency) + (MemHitsPerInstr*MemLatency) &#8211;(B)</p>
<p>Another way to say this is<br />
CPI = Time spent by all instructions which do not have to even go to the L1Cache(s)<br />
+ time spent by the fraction which hit L1 and face the L1Latency on top of the CoreComponent<br />
+ time spent by the fraction which hit L2 and face the L2Latency on top of the CoreComponent<br />
+ time spent by the fraction which hit L3 and face the L3Latency on top of the CoreComponent<br />
+ time spent by the fraction which hit memory and face memory latency on top of the CoreComponent</p>
<p>Notice that each component looked at this way is the product of the fraction of instructions which hit in a particular level of the hierarchy and the time it takes to service those instructions.</p>
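<p>A minimal Python sketch of equation (B) follows. The function name <code>cpi_by_finish_level</code> and all the numbers are assumptions made up for illustration; the latencies are again taken to be cumulative hit latencies.</p>

```python
def cpi_by_finish_level(core_component, hits_per_instr, latency):
    """Equation (B): weight each level's latency by the fraction that hits there."""
    cpi = core_component
    for level in ("L1", "L2", "L3", "Mem"):
        cpi += hits_per_instr[level] * latency[level]
    return cpi


# Hypothetical fractions of instructions that finish (hit) at each level.
hits = {"L1": 0.25, "L2": 0.03, "L3": 0.015, "Mem": 0.005}
# Hypothetical cumulative hit latencies, in cycles.
latency = {"L1": 2, "L2": 12, "L3": 40, "Mem": 200}

cpi = cpi_by_finish_level(1.0, hits, latency)
```

<p>For these made-up inputs the result is 3.46. When the references at each level are counted as the hits at that level plus the hits at every deeper level, the four memory terms here add up to the same total as (A1) to (A4); the two methods only slice that total differently.</p>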
<p>CAVEAT: There is a distinct difference between the CoreComponent in equations A and in equation B. In equations A, CoreComponent applies to all the instructions, since all the instructions pass through the core. In equation B, CoreComponent applies only to those instructions which do not have to leave the core to access the L1 cache (or caches, since Level 1 is usually split into separate instruction and data caches). To simplify this, simulation models of the core are often tightly coupled with at least the L1 cache or L1 caches.</p>
<p>There is a significant amount of overlap in accessing the caches, especially the level 1 cache, and therefore the time spent accessing one instruction or data granule might hide the time to access another. This can throw off the CPI calculations. That is why this style of breaking up the CPI into its components and adding them up only works as an estimate. A realistic CPI can only be obtained by building a more detailed understanding of the parallelism in a workload into the equations above, or by just creating a more accurate simulation model with the core model working hand-in-hand with the memory nest model. One way to bring the above equations a little closer to reality is to consider only loads (i.e. both instruction and data loads) for the missesPerReference, hitsPerInstruction etc. Stores usually do not directly affect CPI since they are handled by mechanisms that are non-critical to CPI calculations.</p>
<h6>Posted By: Anil Krishna	at 2:30PM on Sunday, April 30th, 2006</h6>
]]></content:encoded>
			<wfw:commentRss>http://flickeringtubelight.net/blog/2006/04/cycles-per-instruction-and-its-components/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

