Tuesday, October 5, 2010

Daily, 2010/10/01: caching and high-level thoughts

Announcement items:
  • A#3 pairings: 12 people still need partners [list elided]
  • Steve will assign partners after Monday's class if not already chosen
  • Study session this weekend thanks to Ray: Sun 1-4; Steve should be there; +1 BP to all!
  • Reminder: the exam will be CLOSED BOOK but "open wiki appendix"
    • Appendix is up: edit it!
    • First 10 substantive edits: +1 BP!
High-level comment:

Up to now, we've been moving at a rapid clip.  Today, we savoured caching and reflected on the term.

On the latter: we started class by discussing tradeoffs along the static vs. dynamic typing design axis (and sample languages that lie at different points along it).  Our familiarity with static vs. dynamic scope already gives us a fabulous entry point into a deep understanding of static and dynamic typing.  (We framed the conversation in terms of some work a friend of mine at UW is doing, which I'll blog about later.)

Details:

I want to take a brief detour before coming back to the class-so-far summary.  This weekend during Ray's review session, I noticed a few issues worthy of comment.

First and foremost: use your accumulated programming knowledge as you approach Racket!  That means doing things like breaking problems down when you have trouble solving them.  (Have a syntax error you cannot resolve?  Try to replicate it in a smaller program.  Still can't resolve it?  Post that smaller program and ask for help.  These aren't novel or impractical ideas; this is how you'd better be programming already in other languages!)  It also means that if one approach to a problem isn't working, you should set it aside and try a totally different approach.  (Can't access the fields of a binding directly, without using type-case?  Then use type-case instead, at least for now!)

Second: If Racket is causing you woes, go back and do those Racket exercises I posted in the first week (available at the top of our assignments page).

Third: Several people wondered about the practical use of our lazy extended interpreter.  Answer: not much.  The extended interpreter shows us how to implement laziness.  To see the practical use of laziness, look at that Haskell code!  Once you have lists (or any compound data structure), laziness becomes a powerful tool to create natural infinite (or immense) data structures and to reorganize and decompose programs as staged pipelines.
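
For a tiny, concrete taste (a sketch in plain Haskell, not code from class; fibs and firstTen are just illustrative names):

  -- The infinite stream of Fibonacci numbers, defined in terms of itself.
  -- Laziness means we only ever compute the prefix somebody actually demands.
  fibs :: [Integer]
  fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

  -- Demand just a little of it:
  firstTen :: [Integer]
  firstTen = take 10 fibs    -- [0,1,1,2,3,5,8,13,21,34]

Nothing in the program says "lazy" anywhere; the language's evaluation strategy is what makes the self-referential definition work at all.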

Lastly: Is Java eager or lazy?  You tell me, perhaps starting from code like: someObj.someMethod(1 / 0);

Now, here's a record of some of the high notes we've hit so far in class:
  • Interpreter structure: As we built our first simple interpreter, we were already exploring powerful tools.  BNF allows us to clearly describe concrete syntax in a form that makes sense both to humans (with CS expertise) and to programs, which can generate parsers for us largely automatically.  It also strongly hints at a good internal representation for the programs...

    Which brings us to abstract syntax and the abstract syntax tree.  Our internal representation for a program is a tree, which shouldn't surprise you!  After all, Computer Scientists love trees.  We know how to design them, build them, and manipulate them very effectively. 

    Traversal of the BNF to recognize a program in concrete syntax induces a tree.  To create our AST, we need merely name the nodes and their fields. 

    Once we've done that, the core structure of interpreters (and pre-processors and other analysis tools) writes itself: it's a tree traversal that divides into cases based on the type of AST node under consideration.  (A small Haskell sketch of exactly this structure appears just after this list.)

    If you're still not clear why something like (app (fun 'x (add (num 5) (num 7))) (id 'x)) is a tree, try drawing it out like this:

        app
        ├── fun
        │   ├── 'x
        │   └── add
        │       ├── num 5
        │       └── num 7
        └── id
            └── 'x

  • Identifier binding: The concept of binding names to expressions is fundamental.  (Indeed, as we'll learn later, binding is one of a group of three concepts that together can build every other mechanism we'll discuss.) 

    Discussing identifier binding also led to another key idea: that a language's semantics might be defined by a simple model but implemented very differently.  We started by establishing a core semantics for identifier binding and substitution.  Then, we repeatedly massaged and altered our implementation of that semantics for efficiency, striving to maintain the original semantics.  This is a common theme in programming languages.  Indeed, we saw it again recently, when we worked hard to restore the semantics of static scoping in a language with lazy evaluation once we applied the deferred substitution optimization.

    By the way, the pairing of a clean core semantics with a complex implementation that hews to it applies beyond programming languages.  In CPSC 313, for example, you'll likely hear about out-of-order and speculative execution.  When a processor rearranges instructions or executes them speculatively (without knowing whether they'll be skipped by as-yet-unexecuted branches), it must guarantee that the in-order, non-speculative semantics are preserved.  That means, among other things, delaying commits of register value changes and holding back any errors a speculative instruction generates until it's known to execute for real.
  • Functions: Functions gave us control flow: programs where you "move your finger around" while you're tracing execution.  If you have to "move your finger around", it means you're leaping from one context in your code to another.  Which context should the code actually execute in?  When you're tracing with your finger, it's obvious.  You execute in the context your finger is in.  When your interpreter is running the code, the answer is the same, but it's not as obvious how you implement it. 

    That brought us to static and dynamic scope, a design axis we lingered on not because any language should default to dynamic scope anymore, but because the contrast between static and dynamic properties (and the analyses that go with them) will arise over and over.

    So, how do we implement "moving the finger"?  There is no "finger" [a new motto for CPSC 311?], and function values can show up in contexts far from where they were built.  To be able to "jump back into" their contexts, the function values need to carry their original contexts along with them, which brings us to...

  • Closures: We used closures to make functions' contexts portable so the interpreter can jump back into a function's context later, but closures can do so much more!  They let us extend the environment with the parameter's binding (alas, with: functions make you obsolete), preserve the function's context (hello, static scope!), and defer evaluation of the function's body, which turned out to be just what we needed for...

  • Lazy evaluation: Lazy and eager evaluation were our first meaty design axis without a "correct" side.  The book adopts a core semantics for identifier binding that is eager.  Our core semantics (where inner withs disappear before outer withs do their substitution) was lazy.  Sadly, as soon as we switched to deferred substitution, our lazy implementation broke.  Once we added functions as well, our lazy implementation took a hard left turn into dynamic scoping.  Closures allowed us to resurrect our desired static scoping semantics while using lazy evaluation.  (A second sketch after this list shows one way to get that lazy behaviour with thunks.)

    Why bother with lazy evaluation?  Look at what Haskell offers us: simple, straightforward reasoning about infinite (or just unmanageably large or size-not-yet-determined) data structures combined with a "pipelined" structure for fitting together program elements.  What are the first 15 even perfect squares?  Well, build the natural numbers.  ([1..] will do.)  Build a pipeline stage that squares lists of numbers.  (map (^2) will do nicely.)  Build a stage that keeps only the even numbers.  (How about filter (\x -> x `mod` 2 == 0)?)  Now, put them all together and ask for just 15 results: take 15 (filter (\x -> x `mod` 2 == 0) (map (^2) [1..])).
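
Here, as promised above, is a minimal sketch of that interpreter structure, rendered in Haskell rather than in our Racket code so the types are explicit.  Everything in it (Expr, Value, Env, interp, and so on) is an illustrative stand-in for, not a copy of, what we built in class: an AST as a data type, the interpreter as a tree traversal with one case per AST node, and closures carrying their definition environment to give us static scope.

  import qualified Data.Map as Map

  -- Abstract syntax: one constructor per kind of AST node.
  data Expr = Num Integer
            | Add Expr Expr
            | Id  String
            | Fun String Expr        -- one-parameter function
            | App Expr Expr          -- function application
    deriving Show

  -- Values: numbers, and closures that remember their definition environment.
  data Value = NumV Integer
             | ClosureV String Expr Env
    deriving Show

  -- The environment: our "cache" of deferred substitutions.
  type Env = Map.Map String Value

  -- The interpreter writes itself: one case per node.
  interp :: Expr -> Env -> Value
  interp (Num n)      _   = NumV n
  interp (Add l r)    env = numOp (+) (interp l env) (interp r env)
  interp (Id x)       env = case Map.lookup x env of
                              Just v  -> v
                              Nothing -> error ("free identifier: " ++ x)
  interp (Fun x body) env = ClosureV x body env       -- capture the context!
  interp (App f a)    env =
    case interp f env of
      ClosureV x body defEnv ->
        -- Eager: force the argument's value now (seq, because our host
        -- language is itself lazy).  Static scope: extend the closure's
        -- *definition* environment, not the caller's.
        let argVal = interp a env
        in argVal `seq` interp body (Map.insert x argVal defEnv)
      _ -> error "applied a non-function"

  numOp :: (Integer -> Integer -> Integer) -> Value -> Value -> Value
  numOp op (NumV a) (NumV b) = NumV (op a b)
  numOp _  _        _        = error "expected numbers"

  -- For example, applying (fun x -> x + 7) to 5:
  example :: Value
  example = interp (App (Fun "x" (Add (Id "x") (Num 7))) (Num 5)) Map.empty
  -- => NumV 12

Notice that with never shows up: as the closures item says, a with is just sugar for applying a fun, e.g. App (Fun "x" body) namedExpr.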

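And here is the thunk sketch promised in the lazy-evaluation item, again just an illustrative rendering that reuses Expr, the numbers-only arithmetic, and Data.Map from the sketch above.  The only changes from the eager version are what the environment stores (thunks: unevaluated expressions paired with the environment they should later be evaluated in) and when arguments get evaluated (only when an identifier is actually looked up).

  -- A lazy (call-by-name) variant of the sketch above.
  data Thunk = Thunk Expr LazyEnv
    deriving Show

  type LazyEnv = Map.Map String Thunk

  data LazyValue = LNumV Integer
                 | LClosureV String Expr LazyEnv
    deriving Show

  linterp :: Expr -> LazyEnv -> LazyValue
  linterp (Num n)      _   = LNumV n
  linterp (Add l r)    env = lnumOp (+) (linterp l env) (linterp r env)
  linterp (Id x)       env =
    case Map.lookup x env of
      Just (Thunk e thunkEnv) -> linterp e thunkEnv   -- force only on use
      Nothing                 -> error ("free identifier: " ++ x)
  linterp (Fun x body) env = LClosureV x body env
  linterp (App f a)    env =
    case linterp f env of
      LClosureV x body defEnv ->
        -- Lazy: do NOT evaluate the argument.  Wrap it with the *current*
        -- environment so it still means what it meant at the call site.
        linterp body (Map.insert x (Thunk a env) defEnv)
      _ -> error "applied a non-function"

  lnumOp :: (Integer -> Integer -> Integer) -> LazyValue -> LazyValue -> LazyValue
  lnumOp op (LNumV a) (LNumV b) = LNumV (op a b)
  lnumOp _  _         _        = error "expected numbers"

  -- The nonsensical argument below is never forced, because the function
  -- ignores its parameter:
  lazyExample :: LazyValue
  lazyExample = linterp (App (Fun "ignored" (Num 42)) (App (Num 1) (Num 2))) Map.empty
  -- => LNumV 42.  The eager interp above raises "applied a non-function"
  -- on the same program, because it evaluates the argument before the call.

If the thunk (or the closure) didn't carry its environment along, we'd be right back in the dynamic-scoping trap described above.
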
Those are some profound ideas about what's possible (and desirable) in the languages we use to express programs.  And... that's just the first month!


Cheers,

Steve
