Wednesday 5 December 2012

Problem Solving Episode

See: http://www.scribd.com/doc/115708583/Slog-Episode

In case the repository for the photos isn't linked: http://s1306.beta.photobucket.com/user/cateagleowl/media/awt1/ScreenShot2012-12-04at13422PM.png.html?sort=3&o=48

Sunday 25 November 2012

FSAs, Languages, and REs

I was really happy to see that we covered product construction, because I had been expecting it to be formalised. As soon as I heard the name, I knew it would be about taking a sort of "cross product" of machines to form a larger one. I just wish it had been taught formally like this before the first quiz on it.
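
To keep the idea concrete for myself, here is a little Python sketch of product construction (the representation of a machine as a tuple of sets/dicts is my own shorthand, not the course's formal quintuple notation):

def product(M1, M2, combine):
    # M1, M2 = (states, alphabet, delta, start, accepting), with delta[(q, c)] = next state.
    # combine decides when a pair of states accepts, e.g. "and" gives the intersection of languages.
    Q1, Sigma, d1, s1, F1 = M1
    Q2, _,     d2, s2, F2 = M2
    states = {(q1, q2) for q1 in Q1 for q2 in Q2}            # the "cross product" of states
    delta  = {((q1, q2), c): (d1[(q1, c)], d2[(q2, c)])      # run both machines in lockstep
              for (q1, q2) in states for c in Sigma}
    accept = {(q1, q2) for (q1, q2) in states if combine(q1 in F1, q2 in F2)}
    return (states, Sigma, delta, (s1, s2), accept)

So product(M1, M2, lambda a, b: a and b) should accept L(M1) intersected with L(M2), and swapping in "or" should give the union.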

Then there was the proof that:
L = L(M) for some DFSA M iff L = L(M') for some NFSA M' iff L=L(R) for some RE R.
which really I was also expecting, but just the fact that it could be formalised made me really happy.

The first, L = L(M) for some DFSA M iff L = L(M') for some NFSA M', was I guess not shown because it seemed a bit obvious, but I have to look into it in my own time.

The second, L = L(M') for some NFSA M' iff L = L(R) for some RE R, is technically shown by the series of iffs, but again I need to look at this in my own time.

The third, L = L(M) for some DFSA M iff L = L(R) for some RE R, was very interesting. To show the forward direction (=>), we did state elimination (and I'm always fond of simplifying long statements by assigning variables to them, which proved quite useful even in simple examples of this), cutting the machine down to just the start and end states and reading off the regex, which was neat.

To do the opposite direction (<=), we sort of reconstructed what was necessary by treating machines as almost recursive structures. This was something I didn't expect, but I thought it was really good lateral thinking. When I do it myself, though, I find that I like cutting down the states as much as possible, but I guess I'll have to play it safe until I practice and see whether there are any holes in the logic from cutting down too much.

On another note, I find that you have to be really, really careful when constructing regular expressions to describe a language. I also think that I should practice the equivalences/laws/rules on regexes more, until they are as easy to manipulate as boolean expressions (though even those can cause trouble sometimes).
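
One low-tech thing I've been doing to build confidence with regex equivalences is brute-force checking two expressions against all short strings using Python's re module (its syntax is a superset of our REs - | instead of +, and so on - and of course this only checks up to a length, it doesn't prove anything):

import re
from itertools import product

def agree_up_to(r1, r2, alphabet="01", max_len=8):
    # do r1 and r2 match exactly the same strings of length <= max_len?
    for n in range(max_len + 1):
        for tup in product(alphabet, repeat=n):
            s = "".join(tup)
            if bool(re.fullmatch(r1, s)) != bool(re.fullmatch(r2, s)):
                return s        # a witness string on which they disagree
    return None                 # no disagreement found up to max_len

print(agree_up_to(r"(0|1)*", r"(0*1*)*"))   # expect None: these denote the same language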

With regards to the proof that a regular expression denotes a language (we need to prove "="), I found it kind of sketchy. There seems to be a very thin line between assuming what you need to prove and staying within your domain of assumptions. Especially when you're trying to prove that a string in the language must be one matched by the regex - it's almost like the reader must unequivocally follow the proof writer's many little unwritten proofs by contradiction to reach the conclusion. For example, if a string x is in the language of strings that begin and end with the same bit, then x must have the form 'tut', where u is zero or more symbols from {0,1} and t is 0 or 1. But "must" connotes the feeling "well, there is no other way to do it but this", and that "must"ness is really a dangerous thing. It seems that the proof works by translating the language into our own terms, then translating that into a form that yields the expected regular expression. I guess if it works, it works. But I want to do some reading on that.

In terms of A3, I really learned how to do "proofs by correctness" (what a weird name) for iterative algorithms: creating an iterative algorithm and proving it correct by simultaneously finding a loop invariant and tracing the execution of the algorithm in an intuitive sense. It took me many hours to analyse, but it turns out this technique is a very classy way to solve these kinds of problems.

Finally, we took on the "Consequences of Regularity", i.e. we showed that not every language can be represented by a DFSA. The feel of the proof was as slimy as that of the proof that the power set of a countably infinite set is uncountable, but the concept of "pumping" to reach a contradiction with the finiteness of the DFSA is way cooler.
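
To make "pumping" concrete for myself, the standard example: take L = {0^n 1^n : n >= 0} and suppose some DFSA with k states accepted it. Feed it the string 0^k 1^k. While reading the first k zeros the machine visits k + 1 states, so by pigeonhole some state repeats; that gives a decomposition 0^k 1^k = xyz with y non-empty, xy inside the block of zeros, and the machine in the same state after reading x as after reading xy. Then xyyz is also accepted, but it has more 0s than 1s, so it is not in L - contradiction, so no DFSA accepts L.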

----------------------------------------------------------------------------------------------------------------------------------

I'm feeling not too bad about the upcoming exam. But alas, lady luck strikes again --

Saturday 17 November 2012

Finite State Automata and Languages

We can recursively define ∑* as the smallest set such that
1) ε is in ∑*, and 
2) for all c in ∑, for all y' in ∑*, y'c is in ∑*.

With this definition in mind, we can prove properties about strings through structural induction - I will probably give more examples in the future. Thm 7.4 seems to be a good start, because it reflects the definition of ∑* above: if we are trying to prove a property (for all strings y).P(y) using structural induction, we first check P(ε), then write y = y'c as above, assume P(y'), and proceed to show P(y'c) for all c in ∑.
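
As a sanity check on the recursive definition, here's a tiny Python sketch of my own that builds ∑* up to a given length by exactly the y'c rule:

def sigma_star_up_to(Sigma, n):
    # epsilon is in Sigma*; and for y' in Sigma* and c in Sigma, y'c is in Sigma*.
    level, result = [""], [""]
    for _ in range(n):
        level = [y + c for y in level for c in Sigma]   # every y'c built from the previous level
        result.extend(level)
    return result

print(sigma_star_up_to("01", 2))   # ['', '0', '1', '00', '01', '10', '11']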

DFSA Stingy Vending Machine:
- The table of δ represents exactly the (deterministic) mappings of δ: Q x ∑ --> Q (regardless of the starting state), so it is clear to me that the table is δ itself in the quintuple that describes the automaton.

In terms of the extended transition function δ*, I read def'n 7.11 on pg. 201 of the textbook to see exactly how it is defined by structural induction. In terms of the lecture, I can now see the significance of writing s = s'a when evaluating δ*(q, s), with s' in ∑* and a in ∑, so that we can show δ*(q, s) = δ(δ*(q, s'), a) in this case.
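
Here's a toy Python version I wrote to convince myself that the table really is δ, and that δ* is just structural recursion on s = s'a (the machine below is a made-up one that tracks the parity of 1s, not the vending machine from lecture):

# delta as a lookup table: exactly the (state, symbol) -> state mappings
delta = {(0, "0"): 0, (0, "1"): 1,
         (1, "0"): 1, (1, "1"): 0}

def delta_star(q, s):
    if s == "":                                 # base case: delta*(q, epsilon) = q
        return q
    s_prime, a = s[:-1], s[-1]                  # write s = s'a, with s' in Sigma* and a in Sigma
    return delta[(delta_star(q, s_prime), a)]   # delta*(q, s) = delta(delta*(q, s'), a)

print(delta_star(0, "1101"))   # 1, since "1101" contains an odd number of 1s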

----------------------------------------------------------------------------------------------------------------------------------
Term Test 2:
- Got what I expected, I know what/how to do the questions now (as shown by previous blog posts).

Program correctness:
- Withholding practice until exam time.




Friday 16 November 2012

Recursive and Iterative Correctness

In general, I really liked doing complete induction in recursive correctness proofs, because it really exercised proof by cases (which I enjoy) and complete induction (which is cool because it interweaves with this subject of program correctness, and you get the opportunity to practice being careful in translating the induction hypothesis to the recursive calls).

Iterative correctness is certainly tougher... In general, I find that to do an iterative correctness proof, I need to:
1) Find an appropriate LI (loop invariant) which helps prove the postconditions from the preconditions,
2) Prove that the precons (preconditions), together with term (termination), imply the postcons (postconditions), using the established LI,
3) Prove term by finding a decreasing sequence in the natural numbers, to show that the loop does terminate.

Proof of recBinSearch(x, A, b, e), which returns index p such that A[p - 1] < x <= A[p], except for boundaries:

Precons:  1) x and the elements of A are comparable,
          2) b and e are valid indices, b <= e,
          3) A[b...e] is sorted in non-decreasing order.
Postcons: 1) b <= p <= e + 1,
          2) b < p implies A[p - 1] < x,
          3) p <= e implies x <= A[p].

My major concern here was with postcons 2 and 3. I thought that what was returned was p such that A[p] = x, when in actuality it is p such that A[p - 1] < x <= A[p]. This makes postcons 2 and 3 important because of the hidden implications of the latter (A[p - 1] < x <= A[p], except for boundaries). Firstly, b < p implying A[p - 1] < x means that we return the index p of the "first occurrence" of x in A[b...e], since A is sorted in non-decreasing order (even if p = e + 1 in line 5). Secondly, p <= e implying x <= A[p] means that p != e + 1 (by postcon 1, p <= e + 1, so p can't be any larger than that), so line 2's if statement is true, so x <= A[b = p]. The important thing to take away is that it is possible for A[p - 1] < x < A[p] (so A[p] != x, i.e. p is not returned such that A[p] = x). An example of this occurring is when x is "beyond the boundaries" (hence the "except for boundaries" part), so x is less than A[b = p], as with recBinSearch(0, [0, 0, 1, 1, 4, 6, 8], 2, 3), or when x belongs at an index beyond p = e + 1, as with recBinSearch(100, [0, 1, 2, 3, 5, 6, 8, 13, 56, 67, 100, 130], 0, 8).
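
For reference, this is my reconstruction of recBinSearch in Python (from memory, so the line numbers I mention above refer to the course notes' version, not to this sketch):

def recBinSearch(x, A, b, e):
    # Precondition: b <= e are valid indices and A[b..e] is sorted non-decreasing.
    # Postcondition: returns p with b <= p <= e + 1, A[p - 1] < x if b < p, and x <= A[p] if p <= e.
    if b == e:
        if x <= A[b]:
            return b
        return e + 1
    m = (b + e) // 2                      # floor of the midpoint
    if x <= A[m]:
        return recBinSearch(x, A, b, m)
    return recBinSearch(x, A, m + 1, e)

print(recBinSearch(0, [0, 0, 1, 1, 4, 6, 8], 2, 3))                          # 2, and A[2] = 1 > 0
print(recBinSearch(100, [0, 1, 2, 3, 5, 6, 8, 13, 56, 67, 100, 130], 0, 8))  # 9 = e + 1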

Proof of MergeSort(A, b, e): with A'[b...e] being the same elements as A[b...e] but sorted in non-decreasing order, and all other elements of A' unchanged from A:
- Interesting that the rest needs to be unchanged, to ensure that the recursive calls do not mess up each other's elements.
- Proving Merge:
  --> Interesting that we needed to create a strictly decreasing sequence (to find the last iteration using the WOP); it's very clever.
  --> About the loop invariant for Merge being e + 1 - i, I need to find out where the + 1 comes from (the length of the list making the index fall out, perhaps?) - see my sketch of Merge below.
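
Here's my guess at the shape of Merge in Python (the notes' index names surely differ, so I still need to check their e + 1 - i against this; in my version the natural decreasing quantity is the number of elements not yet copied):

def merge(A, b, m, e):
    # merge sorted A[b..m] and A[m+1..e] back into A[b..e], sorted non-decreasing
    B = []                           # merged result, built up one element per iteration
    i, j = b, m + 1
    while i <= m or j <= e:          # (m - i + 1) + (e - j + 1) strictly decreases each iteration
        if j > e or (i <= m and A[i] <= A[j]):
            B.append(A[i]); i += 1
        else:
            B.append(A[j]); j += 1
    A[b:e + 1] = B                   # copy the merged result back over the original segment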

Proof of power(x, y), returning z = x^y, with y a natural number and x a real number:
- The way the proof works seems similar to the multiplication proof in the textbook.
- In proving termination, i.e. that {y - m_i} is a decreasing sequence in the natural numbers, I was wondering why we didn't need to use induction. Instead, we showed that having an (i + 1)th iteration of the loop implies y - m_(i + 1) < y - m_i. (See the sketch below.)
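
A sketch of how I picture power (the variable names are my guess at the lecture's), with the invariant and the decreasing quantity as comments:

def power(x, y):
    # Precondition: x is a real number, y is a natural number. Returns z = x^y.
    z, m = 1, 0
    while m < y:          # loop invariant: z = x^m and 0 <= m <= y
        z = z * x
        m = m + 1         # y - m drops by exactly 1, so {y - m_i} is decreasing in the naturals
    return z

print(power(2.0, 10))     # 1024.0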

"Correctness by Design":
- I thought this was a very powerful technique. It is possible to rewrite recursive functions as iterative ones, and vice-versa - but here, careful iterative and recursive correctness techniques are used in combination to work out how to design the actual function "correctly" (so we can prove correctness afterwards).
- I liked how we could extrapolate how our function would work by drawing out the list A and how it looks before, during, and after sorting. It seems a crucial technique for finding loop invariants.
- I really wish this could be covered in detail again; it makes me a bit nervous if the short coverage of this topic is a cover for very complex questions on the exam. But it also gives room for more creative, interesting questions (#0).



#0: It excites me - now if only it wasn't for my horrid exam schedule...

----------------------------------------------------------------------------------------------------------------------------------
Final words on complexity examples:
- Discussed the closest pair of points problem with Danny; it made much more sense knowing:
   1) The computer sees the set of points P as [(x0, y0), (x1, y1), ...], but the order of the tuples carries no meaning in terms of sortedness. That is why we immediately create two sorted lists, Px = P sorted by x and Py = P sorted by y, which are passed on to the recursive function. Each sort takes O(nlogn) with mergesort - the end result we expect for the entire algorithm is still O(nlogn), so we don't care if part of the algorithm already costs O(nlogn).
   2) The sorted lists are used to walk through the points (O(n) to walk through each list), first in one direction, say the x direction, to see if the points are close that way; then we make recursive calls as the "map" of points sorted by x is split into left and right parts. On the way "up", to "merge" and find a minimum between the sets of points on the left and on the right, we take the smallest distance delta, which is the minimum of the minimums found on the left and on the right.
   3) We can use delta to examine the points closest to the splitting line, and check whether any distance in that neighbourhood is less than delta (since distances across the line, between points on opposite sides, have not yet been examined). We do this using the list sorted in the second direction, say by y. For every pair of nearby points in that list (imagine yourself as the computer, moving through the list of tuples sorted by y) whose distance in the y direction is less than delta, we check whether the euclidean distance is less than delta, and this determines whether there is a closer pair than the ones isolated to a single side of the splitting line. (See the sketch after this list.)
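
To pin the whole thing down, I wrote out a Python sketch of how I now picture the algorithm (this is my own reconstruction, assuming distinct points, and not necessarily the exact version from lecture):

import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def closest_pair(P):
    # P is a list of distinct (x, y) tuples; returns the smallest pairwise distance.
    Px = sorted(P, key=lambda p: p[0])      # sort by x once, O(n log n)
    Py = sorted(P, key=lambda p: p[1])      # sort by y once, O(n log n)
    return _closest(Px, Py)

def _closest(Px, Py):
    n = len(Px)
    if n <= 3:                              # small base case: brute force
        return min(dist(p, q) for i, p in enumerate(Px) for q in Px[i + 1:])
    mid = n // 2
    x_mid = Px[mid][0]                      # the splitting line
    Lx, Rx = Px[:mid], Px[mid:]
    left = set(Lx)
    Ly = [p for p in Py if p in left]       # keep each half's points in y order
    Ry = [p for p in Py if p not in left]
    delta = min(_closest(Lx, Ly), _closest(Rx, Ry))   # min of the two sides' minimums
    # the "merge" step: only points within delta of the line can beat delta across it
    strip = [p for p in Py if abs(p[0] - x_mid) < delta]
    for i, p in enumerate(strip):
        for q in strip[i + 1:]:
            if q[1] - p[1] >= delta:        # once the y-gap reaches delta, stop looking
                break
            delta = min(delta, dist(p, q))
    return delta

print(closest_pair([(0, 0), (3, 4), (1, 1), (7, 7), (1, 2)]))   # 1.0, from (1, 1) and (1, 2)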

Program correctness:
- Recursive: Need to practice writing proofs/counterexamples for whether recBinSearch(x, A, b, e) is still correct when m = ceiling((e + b) / 2) or when the test is x < A[m], and finally to edit the code to find the last of duplicate elements - I'll probably go over that during exam time, since it takes some extra time.
- Iterative: Need to practice the Merge correctness proof (pg. 67 of the course notes), but I'll probably do that during exam practice time, since the proof seems large-scale.
- In both cases, I'll have to hold off till exam time to do intense practice.
- Need practice for correctness by design.

Tuesday 6 November 2012

Time Complexity 4

Let's take:
1) Another look at finding closed forms from recurrences
2) And an overview of two Master theorem examples.

        I should have really thought about the intuition behind unwinding recurrences. My goal at the moment is to understand recurrence unwinding well enough that I can tackle such a question without erasing (i.e., get it right the first time). Let's say that we have an algorithm that takes T(n) time, with n being the input "length". I thought that finding a closed form was about unwinding until an arbitrary n, but that doesn't really have a purpose with regards to finding a closed form. If we know that n is an integer power of, say, 2, then there exists a positive integer k such that n = 2^k (it is important that k is positive because we want to unwind while T(n) has a recursive definition at n, and this happens in recursive definitions when n is greater than a certain breakpoint, which is greater than 1). Then we should unwind until that k, so that we can "cancel" things out, since n can be written in terms of k.

       Notice that n = 2^k is equivalent to k = log(base 2)n. Also notice that when unwinding, we find a closed form in terms of log(base 2)n. I figure that we choose k so that it matches what T(n) should be bounded by - and if we have the right k, then a closed form can be found (i.e., we won't end up with an infinite recurrence and be unable to "cancel" (#0) things out). We should also choose k so that things that get in the way, such as floors and ceilings, can be simplified.

       Finally, we can prove that T(n) is equivalent to its closed form by a form of induction (usually complete induction, since we need to assume that the claim holds for the recursive components of T(n) in order to show that it holds for T(n) itself).

       On the other hand, I notice that there are "simpler" recurrences to unwind, such as Assignment 2's OMCOFS recurrence. There, we didn't need to guess a k in terms of n to bound the recurrence by; we simply unwound until n. I guess one difference was that we were seeking a resemblance to the Fibonacci sequence, so we could simply unwind until we saw it. But if we had the simplified time complexity of binary search, running in T(n) = 1 + T(n/2) time for n > 1, we would still need to find a k in terms of n in order to reach a closed form. I guess we need such a k whenever T(n) can't "cancel" itself out after n substitutions (unwindings). I realize that we may need to do a bit of unwinding before we discover that necessary k in terms of n.
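
For instance, with that simplified binary search recurrence, my own quick unwinding (taking n = 2^k, i.e. k = log(base 2)n) goes:

T(n) = 1 + T(n/2)
     = 1 + 1 + T(n/4)
     = ...
     = k + T(n/2^k)
     = log(base 2)n + T(1),   once we unwind k = log(base 2)n times and hit the base case.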

       Another interesting thing to note: T(n) won't be able to "cancel" itself out if indices are chosen unwisely. By indices, I mean what we plug (apply the recurrence to) and chug (simplify the result) (MIT Mathematics for Computer Science 788). Things go wrong if I obtain something like T(n - (n + 1)) = T(-1), which is undefined for recurrences (even T(0) can sometimes be undefined). Usually, I see this happen when T(n) is applied to itself several times in one unwinding (like Assignment 2's question 1 H(n) recurrence, or tutorial 3's question 3 G(n) recurrence).


       Now onto the Master Theorem examples. In class, we covered two, namely n-bit multiplication and finding closest pairs of points.

       For n-bit multiplication, I finally understood why Gauss's trick saves a subproblem from being evaluated. The first thing to keep in mind is that we make a recursive call whenever we do a multiplication. Second, we should think like computers, in that we should think step by step. Since our focus here is on the multiplication algorithm's time complexity, when we see an expression like (a + b)(c + d), do not conjure up the idea that the computer simplifies it through FOIL (first, outside, inside, last) into the form (a*c + a*d + b*c + b*d) - then it would obviously seem to take time equivalent to brute-force multiplication (#-1). If we ignore this innate urge to expand the expression and instead see the bigger picture (#-2), we notice that the expression itself requires only one multiplication, the asterisk in (a + b) * (c + d). Then we know that Gauss's trick saves one recursive call. For more information, see the first part of http://www.cs.berkeley.edu/~jordan/courses/170-fall05/notes/dc.pdf, on multiplication.
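
To see the "three recursive calls instead of four" in actual code, here's a Python sketch of the trick (using decimal digits instead of bits, for readability - the recurrence has the same shape, T(n) = 3T(n/2) + O(n)):

def karatsuba(x, y):
    # multiply non-negative integers using Gauss's trick: 3 recursive multiplications, not 4
    if x < 10 or y < 10:
        return x * y
    half = max(len(str(x)), len(str(y))) // 2
    a, b = divmod(x, 10 ** half)            # x = a*10^half + b
    c, d = divmod(y, 10 ** half)            # y = c*10^half + d
    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    ad_plus_bc = karatsuba(a + b, c + d) - ac - bd   # (a+b)(c+d) = ac + (ad+bc) + bd
    return ac * 10 ** (2 * half) + ad_plus_bc * 10 ** half + bd

print(karatsuba(1234, 5678) == 1234 * 5678)   # True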

For finding closest pairs of points, I know that we need (though perhaps not exactly why we need):
    - to recursively find the minimum distance on each side of the closest points.
    - to find the points that might have a close distance across the "separating line".
But for the most part I'm dumbfounded by how sorting is applied to the problem. I have read some material online, but I can only hope to go to office hours at this point. My conjecture is that the problem with explaining the closest pair of points problem is a lack of focus on definitions and a failure to accommodate readers' intuitions (as in #1).


I'll be posting about recursive recurrences next.


#0: By "cancel", for the most part, I mean having a recurrence reach a base case:
T(n) = T(n - 1)
     = T(n - 2)
     .
     .
     .
     = T(n - n) = T(0) = 1 (assuming that T(n) has a defined "base case" like this).
 
#-1: (Psychology?) Note. My explanation here requires me to "break the fourth wall". I don't know the exact terms, but I see many textbooks try to explain the process of how the reader thinks as he/she approaches the problem. I think all that is necessary is for the book to admit to the reader that it is trying to explain what the reader does and should not think, i.e. to warn of the reader's preconceptions in answering a question.

#-2: This is what most textbooks and many professors seem to say, but it's this that leads to a failure in communication and understanding. Terms like "think about it", "see the bigger picture", and "think outside the box" imply some avant-garde, clear, and simple way to think about things, but they can really make learners feel like low-lifes. Explanations should be as explicit as possible, with relevant, pure material as backup.

----------------------------------------------------------------------------------------------------------------------------------

I'm creating this section below to record milestones through the course. I figured that this would be unnecessary, as my progress is implied by the topics I've posted on, but I'll take the time to make explicit how I feel so far about the course. (In all honesty, I am reluctant to take away from my immersion in the course material; writing this section forces me to break that "wall" explicitly.)

Induction:
- Simple: Solidified.
- Complete: Solidified.
- Well-Ordering: Solidified, but there is one extra piece of reading material I wish to look at.
- Structural: Solidified, but I am disappointed with the level of rigor the course had for the topic.
- Pitfalls: Solidified, but I hope to find more.

- Quizzes: Solidified. The TA should not have told us mathematical induction was the only one allowed.
- Test: Solidified.

Recurrences and recursion:
- Solidified, with me having learned this post's lesson on the test.

- Quizzes: Solidified. Looking forward to more examples.
- Test: Solidified, with me having learned this post's lesson on the test. The TA should not have withheld the geometric series from us, and we should not have been told to self-study recursive program correctness.

Program correctness:
- Recursive: Solidified (self-studied), but I hope to read a proof of this for merge sort.
- Iterative: I have yet to take a look at it.

Sunday 28 October 2012

Time Complexity 3

Currently, there are four things I want to make clear in this topic.
1) The Piazza question regarding the proof of Lemma 3.6 still gets me confused, and I want to clarify this during OH.

2) Again Lemma 3.6 (that is, the proof that T(n) is non-decreasing). The proof denotes n by k instead. The third case tries to show that for k > 2, P(1), P(2), ..., P(k-1) imply P(k). I tried the proof itself several times and I still don't agree that "Since P(k-1) holds, to prove that P(k) holds as well, it suffices to prove that T(k-1) <= T(k)." I suppose that (if an error were to exist, it would have to be here) the proof goes on to show that, since P(k-1) holds - "for every positive integer m, if m < k - 1, then T(m) <= T(k-1)" - we get T(m) <= T(k-1) <= T(k), the last inequality being needed because we are trying to show P(k), which says "for every positive integer m, if m < k, then T(m) <= T(k)", not merely something about T(k - 1). Of course this seems silly, because T(k-1) <= T(k) relies on the very thing we are trying to prove: that T is non-decreasing.

3) Danny skimmed over an intuition, using summation(a/(b^d)), for understanding the Master Theorem. I read the hefty (I imagine, partial) proof of the theorem in the course notes (at least I think it was there), but I couldn't see a clear connection.

4) In class, we moved on to apply the Master Theorem to more interesting examples, in particular multiplying n-bit numbers Gauss-wise (which I now understand after some review) and closest pairs of points (which I don't completely understand; the slides are not posted yet, but once they are I plan to go over them with scrutiny).

Meanwhile, I'm hoping that more examples come up on the Master Theorem, because I'm not entirely confident about the choice of a: "the number of pieces for subproblems / the number of recursive calls." It seems that only experience can aid me in understanding this.

EDIT: Most excellent! With regard to the above:
1) I see that, by the definition of non-decreasing, m < n implies T(m) <= T(n), and this is equivalent to: m <= n implies T(m) <= T(n). This is because if m = n, then obviously T(m) = T(n).
2) If you look at the final sentence above, I thought we were making a logical loop by needing T(k-1) <= T(k), but now I understand that this is exactly what Vassos was trying to prove. Proving T(k-1) <= T(k) is two-fold: for one, we break out of the logical loop, and two, we are able to cover the "final case" that I was so caught up on (since we are trying to show P(k), and P(k - 1) only covers the cases before it - the induction hypothesis cases - we need T(k - 1) <= T(k) as well).

Sunday 21 October 2012

Time Complexity 2


Proof of Lemma 3.4 (Course Notes):
Take T(n) = c               if n = 1
            2T(n/2) + dn    if n > 1.

Then let P(i) be the predicate (i.e. the lemma)
P(i): "For each natural number n that is a power of 2 s.t. i <= logn, T(n) = (2^i)T(n/(2^i)) + idn."

- In the base case (i = 0), the writer skips assuming an arbitrary natural number n and skips checking whether that n is a power of 2 s.t. i <= logn, and instead immediately proves P(0), because it is clear that T(n) = (2^0)T(n/(2^0)) + (0)dn. I just wanted to make clear that, since this is immediately obvious, I hope the professor/TAs would also approve of this "skip". (I've sketched the induction step below, for completeness.)
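
My own working of the induction step, just so future-me has it handy (for n a power of 2 with i + 1 <= logn, so that n/(2^i) > 1 and the recursive case of T applies):

T(n) = (2^i)T(n/(2^i)) + idn                          [by P(i)]
     = (2^i)(2T(n/(2^(i+1))) + d(n/(2^i))) + idn      [apply the recurrence to T(n/(2^i))]
     = (2^(i+1))T(n/(2^(i+1))) + dn + idn
     = (2^(i+1))T(n/(2^(i+1))) + (i+1)dn,             which is exactly P(i+1).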


Proof of Theorem 3.7 (Course Notes):
There exists a constant k >= 0 so that for all integers n >=2, T(n) <= knlogn.

I'm sure everyone likes having good intuition to reassure themselves that they can do a particular proof (#0). Hence, I myself spend a lot of time searching for the "intuitive spark" in every interesting proof I encounter. Oftentimes you have to skip to the "mathematical manipulation" part of the proof and see where you get stuck or what is obviously needed for the next step - then you can say: "Ah, now that I know every part that is needed, I can see that we should choose ____ at the beginning" (i.e., at this point in the mathematical evaluation/manipulation of the predicate, I have as much of an intuitive sense of the proof as I can possibly get).

So I searched for intuition here.

Basically, we have to apply Thm 3.5 - "If n is a power of 2, T(n) = cn + dnlogn" - to T(n), since it's the only thing we know about T(n)'s value in closed form.
Thm 3.5 requires that n be a power of 2, so let's introduce another variable n_bar, in terms of n, so that we can relate T(n) to T(n_bar).
But we want the relation to be T(n) <= T(n_bar), since our mindset is on the less-than-or-equal direction. This means we want n <= n_bar. Hopefully, creativity will conjure up
n_bar = 2^ceiling(logn), based on the facts "power of 2" and "n <= n_bar".
(If we are able to think far enough (and withstand the time pressure), we can think of the proof's mathematical manipulation that follows:)
Then (assuming we have proved that T(n) is non-decreasing), we can bound T(n) by T(n_bar) and apply Thm 3.5's expression to n_bar.
Lastly, remember to show that k >= 0 by referencing c >= 0 and d >= 0. (My attempt at the full chain is below.)
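
My own attempt at the chain, for the record (the constant probably differs from the notes'): since n_bar = 2^ceiling(logn) <= 2n and log(n_bar) = ceiling(logn) <= logn + 1,

T(n) <= T(n_bar)                          [T non-decreasing, n <= n_bar]
     = c(n_bar) + d(n_bar)log(n_bar)      [Thm 3.5, since n_bar is a power of 2]
     <= 2cn + 2dn(logn + 1)
     <= 2cnlogn + 2dnlogn + 2dnlogn       [n <= nlogn for n >= 2]
     = (2c + 4d)nlogn,                    so k = 2c + 4d >= 0 works.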


Proof of a bound for recursive binary search
- Let's say we try to prove, for all positive natural numbers n, that there exists a positive real number c such that T(n) >= clog(n), by complete induction.
- Then, in the case that n > 1, we must choose a different c' in terms of the c in clog(ceiling(n/2)), this c coming from using the induction hypothesis on T(ceiling(n/2)). This step was skipped both in class and in tutorial. Luckily (more likely than not they did know it), it was still correct to choose "c = 1", when really:
1) it should be "Then there exists a positive real number c' such that T(n) >= c'log(n).",
and 2) "# Choose c' <= 1 if c > 1, and c' = c otherwise", which technically is still c' <= 1, but with more rigour and clarity (at least that is the minimum level of rigour it takes to prove something to me).

Proof of a bound for MergeSort
- Proving a stronger claim shouldn't ever be something "pulled out of a hat". I hate pulling things out of a hat, because it's a sad beauty to withhold when you don't show how it works. Honestly, it is intuition that lets you choose T(n) <= clog(n - 1) over T(n) <= clog(n): you tried the latter and saw that subtracting 1 from n might actually help. It's akin to walking through a series of tunnels to reach the other side. When you find yourself at a dead end, you don't just turn 180 degrees and walk blindly back to the entrance; instead, while walking back, you look left and right to see if you missed any fork in the tunnel. More to come later...

----------------------------------------------------------------------------------------------------------------------------------

#0: This is precisely why I think mathematicians often say "Want ____" at every next challenging step of their proof, especially since their proof structure isn't as rigorous as ours (#1).

#1: Of course not to say they are incorrect, but I have a couple of guidelines I think every mathematician should follow because I see that every problem can be separated into "structural/organizational" and "creativity" steps.

The former includes things such as writing down the question in uniform symbolic notation (which 1) people are too proud to change to, and 2) English words that technically mean the same thing evoke different meanings depending on the audience - but life never stops, so we can't ever unify anything), forming predicates (keyword: Let), and writing down the proof structure (keywords: Assume when encountering a universal quantifier, and Choose when encountering an existential quantifier, assuming that we need to set the variable to do the proof). I mean, this is basically what was taught in CSC165, and with it any proof can be simplified greatly. Also important to note:
1) Indents are pretty (dang) useful
2) Stop interchanging "assume" with "let", "arbitrary" with "generic", "exists" with "there is" and "any" with "every".
3) There must be care and balance in choosing symbols and operators. At the subject level, fields should be consistent with their choices of symbols and operators and, as much as possible, share symbols and operators with other fields (at the very least the ones they interact with). At the teaching level, breaking ideas down into variables doesn't necessarily help teach a concept: it may be more mathematically rigorous, but at the cost of intuition (that's why MIT's MCS makes more intuitive sense than our course notes).
4) Stop interchanging quantifiers and predicates. Read from left to right! (unless there is much more payoff for intuition or clarity).
5) "a > x > b and a > y > b" is usually better shown as "a > x, y > b". (But sometimes my mind just seems to float away at this representation and I read it as two separate parts. I don't like assuming that students pay attention, but neither do I think much progress can be made with so many details like "a > x > b and a > y > b".)

The latter is based on one's knowledge and experience with the mathematical field of the problem.