Audrey

2006.11.03

MiniPerl6 Parsed!

Today's hackathon was fruitful; fglock++ and I finished designing the AST nodes for MiniPerl6, and coded up a Grammar that parses most of the sanity tests.  The next step tomorrow is to write an emitter for it... Or maybe two emitters. :-)

In other hackathon news, part of MO's single-inheritance test, ported to MOH by cmarcelo++, now actually compiles and runs. There is still a lot to do to bring the full power onto the GHC runcore (multiple inheritance, C3, roles), but thanks to nothingmuch++'s comprehensive tests, it's a tractable process.

CONISLI begins tomorrow, so that's it for today.  See you!

2006.04.18

P6AST Draft.

The long overdue PIL2 node redesign braindump + brainstorming Gobby session took place tonight, commissioned by fglock and attended by a record seven lambdacamels.

The draft document really needs some tidying up, but it's a fairly complete picture of what I had in mind during the Tokyo hackathon.

Our design space was simplified enormously this time, thanks to two insights from Matz:

  1. All compile-time special forms should be reconstructable as run-time method calls
  2. The object space is itself an object, so precomposed objects at BEGIN{} time can be freely shuffled into runtime.

Ergo, even with separate compilation and BEGIN blocks, the AST can still be entirely dynamic and not contain any compile-time nodes. (Thanks also to ko1 for pointing me to Ruby 1.8.1 nodes.)

Also, because this tree is to be shared among three Perl 6 implementations -- none of which is written entirely in Perl 6 yet (i.e. each has significant Hs/Perl5/PIR parts) -- it is vitally important that all nodes can be translated back to Perl 6 surface syntax, and we have to forbid any backend-specific "optimization" nodes (e.g. pRawName in PIL1 for PIR codegen).

With this guarantee, the implementations can naturally converge into a self-hosting Perl 6 with maximal code sharing, instead of having subtle semantic differences.

The next step is to come up with a minimal surface form, suitable to embed inside rules, so we can manipulate the parse tree for Perl 6 with Perl 6, without requiring a fully self-hosting implementation. Then the most diverging part of the three implementations -- namely, parse tree construction -- can suffer less from the language barrier, and that would be a Good Thing. :-)

2006.02.21

Pre-hackathon for a faster Pugs.

Despite sleeping 11+ hours each day, I did get plenty of design and discussion done with gaal, in particular a pdd20-inspired refactoring for lexical pads, which I'll write about in another entry. The recent pX spike project, a Perl 6 rules implementation on Perl 5 that is then used to parse Perl 6 programs, is very much worth journaling about as well, but that'd take another entry too.

However, most of our pair-coding time was spent on improving the most egregious showstopper to would-be Pugs hackers -- namely, that the "make; make test" cycle simply took too much time.

This issue was brought to #perl6's attention as part of chromatic's poignant rant, which cited that Pugs took 8 hours to finish building and running all its tests.  And because he is a devout TDD follower, he'd like to run all tests after making any change to the Pugs internals, which would take (gasp) 16 hours.

Most of his other points in the rant can be resolved directly:

  • Test::Builder did fail its tests for a while, but was repaired along with other OO modules as part of release engineering before 6.2.11.  Adopting a regular release cycle will fix that.
  • Hooking up to Parrot as a runtime will no doubt bring more contributors (and get us faster-than-C performance), but hooking up to Perl 5 will obviously bring even more. Fortunately, we are doing both, plus JavaScript (and maybe CLR too, now that I'm going to YAPC::NA in Chicago and will probably visit LINQ folks en route.)
  • Because the Synopses are user-level requirements, Pugs would need its own PDD-like set of documentation that discusses the design of various compiler components. I'd like to resume the Pugs Apocrypha series of documents with nothingmuch et al during the Hackathon.

But the most pressing issue, "Pugs is slow", demands a technical solution: the cycle takes 4 hours on my laptop -- 30 minutes to compile and 210 minutes to finish testing, which is simply too much, even if we take into account that we have 616 test files and 11070 test cases.

The current situation was mainly caused by Prelude.pm, a module with built-ins (such as printf) implemented in Perl 6 itself and loaded for each Pugs run.  The problem is, compiling Prelude.pm takes 15 seconds here, and it would add another 2.5 hours to the test cycle if it had to be reloaded for each test file.
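The 2.5-hour figure follows directly from the numbers quoted in this entry. As a quick check (a sketch only; the helper name is made up for illustration):

```python
# Figures quoted in the post.
TEST_FILES = 616
PRELUDE_COMPILE_SECONDS = 15


def reload_overhead_hours(files: int, seconds_per_file: float) -> float:
    """Time spent recompiling one module once per test file, in hours."""
    return files * seconds_per_file / 3600


overhead = reload_overhead_hours(TEST_FILES, PRELUDE_COMPILE_SECONDS)
print(f"{overhead:.2f} hours")  # 2.57 hours
```

The same function with 5 seconds per file gives the rough cost of loading an uncached Test.pm for every test file.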

Many moons ago (July 2005), gaal hacked in support for precompiled Prelude, using the ./pugs -CPugs backend to turn Perl 6's parse tree into huge Haskell expressions, and rebuild the Pugs executable again with the Prelude statically linked.  This shaved the startup time from 15 seconds to 0.5 seconds.

The tradeoff is that this makes compilation of Pugs.Run awfully slow (20+ minutes for optimised builds) and consumes a lot of RAM (curiously, even more so on unoptimised builds). One can turn precompilation off by tweaking config.yml to say precompile_prelude: false, but that will make tests unbearably slow to finish.

Gaal set out to fix this problem once and for all, by using YAML as the cached intermediate format, much like Python's .pyc/.pyo bytecode files. We wrote a rule for DrIFT that can generate fromYAML and asYAML instance methods for all Haskell types, which provides roundtrip serialization to our Syck bindings.

The upshot is that the new ./pugs -CParse-YAML backend can turn Perl 6 into a YAML syntax tree, which can be loaded back during runtime using the Pugs::Internals::eval_p6y($file) primitive.  Thanks to Syck's speedy parser, the startup time is now 0.7 seconds without any additional time penalty to compiling Pugs.Run, bringing the total compilation time down to 8 minutes (optimised build; unoptimised takes 4 minutes).
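The caching idea itself is language-independent. Here is a minimal Python sketch of it, not the Pugs implementation: it uses JSON from the standard library as a stand-in for YAML, and the `parse` function, the `.json` suffix and the file contents are all assumptions made for illustration.

```python
import json
import os
import tempfile


def parse(source: str) -> dict:
    """Hypothetical stand-in for an expensive parser: one node per non-blank line."""
    statements = [line.strip() for line in source.splitlines() if line.strip()]
    return {"type": "Program", "statements": statements}


def load_tree(path: str) -> dict:
    """Return the parse tree for `path`, reusing the cached copy when it is fresh."""
    cache = path + ".json"  # assumed naming; the post caches YAML instead
    if os.path.exists(cache) and os.path.getmtime(cache) >= os.path.getmtime(path):
        with open(cache) as fh:
            return json.load(fh)
    with open(path) as fh:
        tree = parse(fh.read())
    with open(cache, "w") as fh:
        json.dump(tree, fh)
    return tree


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "Prelude.pm")
    with open(src, "w") as fh:
        fh.write("sub printf { ... }\n\nsub say { ... }\n")
    first = load_tree(src)   # parses the source and writes the cache
    second = load_tree(src)  # served from the cache file
    cache_written = os.path.exists(src + ".json")
```

The freshness check on modification times is what lets the cache sit next to the source file without a separate build step.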

This goes a long way in solving the compilation time problem; moving the DrIFT instances away from e.g. Pugs.AST.Internals to another module will probably save another minute or two.

Tomorrow we will apply the same technique to Test.pm (as well as other .pm files).  Seeing that each test file currently takes 5 seconds to load Test.pm, yamlizing it will likely save another hour off the test cycle.  And if we start making use of cached .t.yml.gz files next to each .t program, the entire build-test cycle can probably be reduced to 30 minutes or less.  That will be lovely indeed. :-)

2006.01.10

Repr Types landed.

Today I finally got around to proofreading and implementing Stevan's repr types spec, essentially recreating Parrot's PMC structures within our new VM, though with the saner interface design of leo's that I mentioned last week.

Currently, all the container types got their repr types, as well as a generic p6opaque for object attributes and a nil type for Larry's ::Class objects. Next up will be converting the current runtime's VTypes into their respective repr types: IO handles, Sockets, Threads, Processes, Rules, and maybe Perl5 SVs and Haskell Dynamics.

Now that container representations are safely behind us, Stevan is polishing up the S12 metaobject protocol to make sure it works with all repr types.  Once the IO parts of repr types are in and we can actually do input/output from the runcore, it'd be time to systematically copy+pasteport over all of Pugs.Prim and complete the long transition.  But we have a 6.2.11 to release first...

2006.01.07

Representation Types.

One of the underspecified corners in S12 is how class objects (such as ::Array, ::Int and ::Class) are represented internally, how those representations differ from the p6opaque layout, and how different representations interact.

Today Stevan's representation types design addressed these issues, and we figured out how to cleanly support Larry's "class objects are like undefined instance objects" intuition via the nil representation type. Eigenclasses and prototype-based systems are also very straightforward under this design. We'll make it happen over this weekend for sure. Woot!

2006.01.04

Context and coercion.

"Context" is implemented as type coercion calls, implicitly inserted
by the compiler.

There are five types used in these calls.  The "..." below denote
the positions where these contexts typically occur, using common
Perl 5 operations as examples:

Void              # ...; 1;
Single[::of]      # chdir ...;
Plural[::of]      # reverse ...;
RW[Single[::of]]  # chop ...;
RW[Plural[::of]]  # shift ...;

The compiler is responsible for translating sigils, "is rw" and "is context"
notations into the types above.  In the object model, the five types are
represented as roles.

Denotationally, the signatures for = and := are as below:

proto infix:<=> (RW[::T]: T) --> RW[T]
proto infix:<:=> (Plural[RW[::T]]: Plural[RW[Any]]) --> Plural[RW[::T]]

Note that both forms only have one invocant; dispatch is decided by the
left-hand side's type.

Each builtin type "does" exactly one of the five types above.

Int         does Single[Int];      # 3
Rule        does Single[Rule];     # rx{123}
Args        does Plural[Any];      # \(1,2,3)
Sigs        does Plural[RW[Any]];  # :($x,$y,$z)
Tuple[::T]  does Plural[T];        # (1,2,3)
Range[::T]  does Plural[T];        # (1..3)
Scalar[::T] does RW[Single[T]];    # $IN
Array[::T]  does RW[Plural[T]];    # @ARGS
Hash[::T]   does RW[Plural[T]];    # %ENV
List[::T]   does RW[Plural[T]];    # ($IN,@ARGS,%ENV)

Note that the list-associative infix:<,> has two variants: If all its arguments
do RW, it returns a List; otherwise it returns a Seq.  This matches the Perl5
behaviour:

perl -e '($_, $_) = 3'; # okay
perl -e '($_, 7) = 3';  # error
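
The two variants of the comma operator can be modelled in a few lines of Python. This is only a sketch of the rule as stated above; the class and function names are illustrative, not the Pugs runcore's.

```python
class RW:
    """Marker role for things that may appear on the left of an assignment."""


class Scalar(RW):
    def __init__(self, value=None):
        self.value = value


class List(RW):
    """Result of the comma operator when every argument does RW."""

    def __init__(self, items):
        self.items = list(items)


class Seq:
    """Result of the comma operator when some argument is a plain value."""

    def __init__(self, items):
        self.items = tuple(items)


def comma(*args):
    """Model of the comma operator: pick the variant from the arguments' roles."""
    return List(args) if all(isinstance(a, RW) for a in args) else Seq(args)


def assign(target, *values):
    """Model of assignment: only RW targets are accepted."""
    if isinstance(target, Scalar):
        target.value = values[0] if values else None
    elif isinstance(target, List):
        # Hand out values positionally; leftover containers become undefined.
        for i, item in enumerate(target.items):
            assign(item, *values[i:i + 1])
    else:
        raise TypeError(f"cannot assign to a {type(target).__name__}")
    return target


x, y = Scalar(), Scalar()
assign(comma(x, y), 3)       # okay: x receives 3, y is left undefined
try:
    assign(comma(x, 7), 3)   # error: 7 is not a container, so this is a Seq
    rejected = False
except TypeError:
    rejected = True
```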

For user-defined concrete classes without any "does" clauses, we automatically
derive a "does Single[::t]" for it.

During compile time, the local type inferencer looks up coerce:<as> calls for
each expression that does not match its expected type:

proto coerce:<as> (::from, ::to) --> ::to

If the type of ::from is known at compile time, the availability of these
coercion forms is checked at compile time as well, so it's subject to
constant folding.  Hence if the programmer wants to make "3 = 4" work, a
corresponding form must be declared at compile time:

multi coerce:<as> (Int, RW[Single[Int]]) {...}

otherwise it raises a compile time error.
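
A rough Python model of that lookup: a registry keyed by (from, to) type names, consulted whenever an expression's type does not match the expected one. The type names are plain strings here, and the Int-to-Str coercion is a hypothetical example, not something the post declares.

```python
class CompileError(Exception):
    """Raised when no coercion form is declared for a type mismatch."""


COERCIONS = {}  # (from_type, to_type) -> coercion function


def coerce_as(from_type, to_type):
    """Declare a coercion form, analogous to a `multi coerce:<as>` declaration."""
    def register(fn):
        COERCIONS[(from_type, to_type)] = fn
        return fn
    return register


def resolve(expr_type, expected_type):
    """Return the coercion to insert, or None when the types already match."""
    if expr_type == expected_type:
        return None
    try:
        return COERCIONS[(expr_type, expected_type)]
    except KeyError:
        raise CompileError(
            f"no coerce:<as> form from {expr_type} to {expected_type}"
        ) from None


@coerce_as("Int", "Single[Str]")  # hypothetical coercion, for illustration only
def int_to_str(value):
    return str(value)


converted = resolve("Int", "Single[Str]")(3)
try:
    resolve("Int", "RW[Single[Int]]")  # "3 = 4" with no form declared
    compiled = True
except CompileError:
    compiled = False
```

Declaring a form for ("Int", "RW[Single[Int]]") before the call would make the second lookup succeed, which mirrors the opt-in described above.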

Dereference and assignment semantics.

This is a brain-dump of how container types and contexts work in the new Pugs runcore.

Assuming everything other than an explicit method call is subject to MMD, we have this desugaring:

# Sugared syntax
${$x} = 5;

# Desugared syntax
&infix:<=>(&circumfix:<${ }>($x), 5);

It's straightforward to type the relevant &infix:<=> part, given that ::Scalar, ::Array and ::Hash are roles that do the ::Ref role.

proto infix:<=> ((Ref of ::v) ::a is rw, ::v) --> ::a;
multi infix:<=> ((Scalar of ::v) ::a is rw, ::v) --> ::a {...}

However, what should the type of &circumfix:<${ }> be?

I'm currently going with:

proto circumfix:<${ }> (Any) --> (Scalar of Any) is rw;
multi circumfix:<${ }> ((Scalar of Any) ::a) --> ::a is rw {...}

That is, it applies to an argument that can yield a Scalar object when used as an rvalue.

Below I'll use the notation value($x) to mean $x evaluated as an rvalue, analogous to
the variable($x) special form for marking explicit lvalues.

Here is the signature for prefix:<\>:

proto prefix:<\> (Any) --> (Ref of Any);
multi prefix:<\> (Ref ::T is rw) --> (Ref of ::T) {...}
multi prefix:<\> (::T) --> (Scalar of ::T) {...}

So these two are legal:

${ \3 };   # value(&prefix:<\>(3))   yields a ::Scalar container object
${ \$x }; # value(&prefix:<\>($x)) is just variable($x)

But these are not:

${ \3, }; # value(&prefix:<\>(&infix:<,>(3))) yields a ::Tuple, not ::Scalar
${ variable($x) }; # value(variable($x)) means the same as value($x)
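
To make the two legal cases concrete, here is a small Python model of the reference and dereference multis. It is a sketch under assumptions: dispatch is reduced to isinstance checks, the names are invented, and the rejection of a non-Scalar container is my reading of the signature rather than something the entry demonstrates.

```python
class Ref:
    """Role shared by all container types."""


class Scalar(Ref):
    def __init__(self, value=None):
        self.value = value


class Array(Ref):
    def __init__(self, items=()):
        self.items = list(items)


def make_ref(x):
    # Models the two prefix multis: a container is passed through unchanged,
    # while any other value is boxed in a fresh Scalar.
    return x if isinstance(x, Ref) else Scalar(x)


def deref_scalar(x):
    # Models the circumfix multi: only a Scalar container is accepted.
    if not isinstance(x, Scalar):
        raise TypeError(f"expected a Scalar, got {type(x).__name__}")
    return x


boxed = deref_scalar(make_ref(3))   # a fresh Scalar holding 3

x = Scalar(1)
same = deref_scalar(make_ref(x))    # the very same container as x
same.value = 5                      # so assigning through it updates x

try:
    deref_scalar(make_ref(Array([1, 2, 3])))
    array_accepted = True
except TypeError:
    array_accepted = False
```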

2006.01.03

Containers adopt Roles.

Today Stevan started writing out Roles for container types, so that multiple classes can implement the Hash/Array/Scalar interface, and operations like .{} and .[] can apply to user-defined types as well.

This is similar to the Perl 5 way of using the "tie" interface, as well as overloading @{} and %{}, but because Perl 5 is a strongly typed language with only five ($@%&*) container types, ultimately you need to decompose the user-defined class into one of those five things, XS-based solutions like PDL notwithstanding.

With roles, user-defined classes can be first class citizens that conform to various interfaces (pair, args, sigs, list, ranges, etc...), and it'd be much easier to write an ordered hash class that does both the Array and Hash interface.
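
As an illustration of the ordered-hash case, here is a Python sketch that uses abstract base classes as stand-ins for roles. Python composes these by inheritance rather than by role composition, and the method names (`fetch_elem`, `fetch_key`, `store`) are assumptions chosen for the example.

```python
from abc import ABC, abstractmethod


class ArrayRole(ABC):
    """Positional interface: what .[] would need."""

    @abstractmethod
    def fetch_elem(self, index: int): ...


class HashRole(ABC):
    """Keyed interface: what .{} would need."""

    @abstractmethod
    def fetch_key(self, key: str): ...


class OrderedHash(ArrayRole, HashRole):
    """A user-defined class that satisfies both interfaces at once."""

    def __init__(self):
        self._order = []   # keys in insertion order
        self._values = {}  # key -> value

    def store(self, key, value):
        if key not in self._values:
            self._order.append(key)
        self._values[key] = value

    def fetch_key(self, key):
        return self._values[key]

    def fetch_elem(self, index):
        return self._values[self._order[index]]


oh = OrderedHash()
oh.store("first", 1)
oh.store("second", 2)
by_key = oh.fetch_key("second")
by_index = oh.fetch_elem(0)
```

Code that only asks for `ArrayRole` or only for `HashRole` can accept an `OrderedHash` without knowing about the other interface.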

We are working toward something like Scala's traits hierarchy, starting with the bare minimum already defined in docs/quickref/data.

As the main Bootstrap.pil is getting huge with the container type interfaces, I factored them out into multiple small files in src/PIL/Native/Bootstrap/. The next step for me is to create another surface syntax for PIL2 -- this time a bare subset of syntactically-valid Perl 6 -- and compile it to the already-running-fine bootstrapped PILN runcore.

The compiler itself will have access to an object space, and simply serialize the final (garbage-collected) state as the executable image, ready to be run by invoking the main routine in &*::('').

Once we can pass the t/01-sanity/ tests with this compiler, the rest of the job is to port over all the desugaring, primitives, and other assorted magic from the old runcore, a process not unlike what iblech has been doing for the JavaScript runcore. Stay tuned...

2006.01.01

::Array and ::Hash lands.

Today Leo sent me a set of design notes, including his recent thinking about how Parrot's PMC layout and interfaces may be improved, allegedly inspired by pugs/docs/quickref/data. Coincidentally, I was feeling unhappy about the steadily-growing vtable of PILN's NativeObj structure, so I implemented Leo's design, and it worked beautifully.

The basic idea is that instead of having a large fixed set of things that all objects may do, we split them out into interfaces, which are simple mappings from method names to native code. At class composition time, the class selects a representation for its object by mixing in one or more interfaces.

The chosen interfaces determine which primitive operations the object can perform; these stay immutable during the entire runtime (just like the old vtables did), and as such may still be checked statically.

Under this scheme, we don't need to allocate a stub throw-an-exception entry for value classes' set method, and the distinction of Containers is clear: unlike ordinary Objects, they do not use the p6opaque representation (set_attr, get_attr...). I have just coded in p6array and p6hash representations, with primitive operations such as fetch_elem and store_elem.

This also addressed the problem of how attributes are handled when we extend builtin classes with Perl 6 code. Because "has $!x = 3" ultimately desugars into set_attr operations, which do not exist in a boxed p6integer, code like "class ::Int is extended { has $!foo }" can be rejected at compile time, which is probably a good thing.
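
The scheme can be sketched in Python with plain dictionaries as the interfaces. This is a model of the idea, not PILN's code: the object layout, the `unbox` operation and the `compose` helper are assumptions; only the operation names set_attr, get_attr, fetch_elem and store_elem come from this entry.

```python
from types import MappingProxyType


def get_attr(obj, name):
    return obj["attrs"][name]


def set_attr(obj, name, value):
    obj["attrs"][name] = value


def fetch_elem(obj, index):
    return obj["elems"][index]


def store_elem(obj, index, value):
    elems = obj["elems"]
    elems.extend([None] * (index + 1 - len(elems)))  # grow on demand
    elems[index] = value


def unbox(obj):  # hypothetical operation for the boxed integer representation
    return obj["int"]


# Each interface is a simple mapping from method names to native code.
P6OPAQUE = {"get_attr": get_attr, "set_attr": set_attr}
P6ARRAY = {"fetch_elem": fetch_elem, "store_elem": store_elem}
P6INTEGER = {"unbox": unbox}


def compose(name, interfaces, needs=()):
    """Mix interfaces into one read-only method table at composition time.

    `needs` lists the operations the class body desugars into; a missing
    one is reported here, before any object exists.
    """
    table = {}
    for iface in interfaces:
        table.update(iface)
    missing = sorted(set(needs) - set(table))
    if missing:
        raise TypeError(f"{name}: representation lacks {', '.join(missing)}")
    return MappingProxyType(table)


ArrayClass = compose("Array", [P6ARRAY])
arr = {"elems": []}
ArrayClass["store_elem"](arr, 2, "x")
fetched = ArrayClass["fetch_elem"](arr, 2)

try:
    # Models "class ::Int is extended { has $!foo }": needs set_attr.
    compose("Int", [P6INTEGER], needs=["set_attr"])
    extended_int_rejected = False
except TypeError:
    extended_int_rejected = True
```

Because the table is frozen once composed, a check made at composition time stays valid for the rest of the run.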

Tomorrow I'll move to ::Args and ::Sigs, and ::Code after them if all works out. Then we'll add some sugar to it and serve the dish by compiling PIL2 to it. It's already smelling good...

2005.12.31

Scalar container lands.

Scalar container support has landed in the PILN runtime.  To wit:

$ make pil
$ ./pil -e '::Scalar`create(3).STORE(9).FETCH()'
### Evaluated ###
9

Moreover, ::Int now represents a true BigInt (internally using the Haskell type Integer), which differs from its unboxed representation with limited precision.  This is correct according to the spec, and unlike the old runtime which uses BigInts everywhere.  Unicode-aware ::Str types will get the same treatment soon.
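
For readers unfamiliar with the container protocol, the ::Scalar one-liner earlier in this entry behaves like the following Python sketch. Only the method names create, STORE and FETCH come from the post; the class body is assumed.

```python
class Scalar:
    """Minimal model of a scalar container."""

    def __init__(self, value=None):
        self._value = value

    @classmethod
    def create(cls, value=None):
        return cls(value)

    def STORE(self, value):
        self._value = value
        return self  # returning the container lets calls be chained

    def FETCH(self):
        return self._value


result = Scalar.create(3).STORE(9).FETCH()
print(result)  # 9
```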

The next step is to implement Array and Hash container types, and revise PIL2 so they can be desugared to PILN. Which should all happen tomorrow, unless I get distracted into more Rules hacking...