After some discussion on haskell@, Sebastian Sylvan suggested a much more straightforward way of hyperizing computations. By request of Nicholas, here is the hyperization code before parallelization:
mapM evaluate xs
and this is the code after parallelization:
mvs <- forM xs $ \x -> do
    mv <- newEmptyMVar
    forkIO (evaluate x >>= putMVar mv)
    return mv
mapM takeMVar mvs
Or, in Perl 6 (without using hyper-operators themselves):
# Before
@xs.map(&evaluate);

# After
my @mvs = @xs.map: -> $x {
    my $mv = MVar.new;
    async { $mv.put: evaluate($x) };
    $mv;
};
.take for @mvs;
The main point here is that forkIO/async does not actually create an OS thread; instead, it creates a new task for the preemptive GHC runtime kernel, which then assigns it to one of the currently available CPUs via pre-spawned OS threads.
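To make the pattern concrete, here is a minimal standalone sketch of the same idea. It rests on a few assumptions: `parEvaluate` is a hypothetical helper name, and `evaluate` here is the one from `Control.Exception`, standing in for whatever evaluator the snippet above actually calls. It also ignores the case where a worker throws an exception, in which case its MVar would never be filled.

```haskell
import Control.Concurrent (forkIO, getNumCapabilities)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate)
import Control.Monad (forM)

-- Hypothetical helper: fork one lightweight GHC thread per element,
-- force each element to weak head normal form in its own thread,
-- then collect the results in the original order.
parEvaluate :: [a] -> IO [a]
parEvaluate xs = do
  mvs <- forM xs $ \x -> do
    mv <- newEmptyMVar
    _ <- forkIO (evaluate x >>= putMVar mv)
    return mv
  mapM takeMVar mvs

main :: IO ()
main = do
  -- Number of OS-level capabilities the runtime schedules threads onto
  -- (set with the -N RTS option).
  caps <- getNumCapabilities
  putStrLn ("capabilities: " ++ show caps)
  -- 50000 lightweight threads, one per lazy sqrt thunk.
  roots <- parEvaluate (map sqrt [1 .. 50000 :: Double])
  print (length roots)
```

Built with `ghc -threaded -rtsopts` and run with `+RTS -N2`, this forks 50000 lightweight threads while the runtime multiplexes them over two capabilities, which is why spawning one thread per element is affordable here.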
With this strategy, the numbers on feather now look better, and so do those on my dual-core MacBook:
$ time env GHCRTS=-N1 ./pugs -e 'my @x = 1..50000; @x.>>sqrt'
real 0m5.204s
user 0m5.093s
sys 0m0.088s

$ time env GHCRTS=-N2 ./pugs -e 'my @x = 1..50000; @x.>>sqrt'
real 0m3.404s
user 0m3.937s
sys 0m0.107s
Note that it's now taking more user-time than real-time, which means SMP is doing its job correctly.
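For a rough sense of scale, the timings above can be turned into a speedup figure. The sketch below only does arithmetic on the numbers already shown; the names `speedup`, `efficiency` and `cpuRatio` are illustrative, not anything from Pugs or GHC.

```haskell
-- Timings copied from the two runs above, in seconds.
realN1, realN2, userN2 :: Double
realN1 = 5.204  -- wall-clock time with -N1
realN2 = 3.404  -- wall-clock time with -N2
userN2 = 3.937  -- user CPU time with -N2

-- Wall-clock speedup from going from one core to two.
speedup :: Double
speedup = realN1 / realN2

-- Fraction of the ideal 2x speedup actually achieved.
efficiency :: Double
efficiency = speedup / 2

-- User CPU time per second of wall-clock time; a value above 1
-- means more than one core was busy on average.
cpuRatio :: Double
cpuRatio = userN2 / realN2

main :: IO ()
main = do
  putStrLn ("speedup:    " ++ show speedup)
  putStrLn ("efficiency: " ++ show efficiency)
  putStrLn ("cpu ratio:  " ++ show cpuRatio)
```

That works out to a speedup of roughly 1.53x on two cores, about 76% of the linear ideal, with around 1.16 seconds of user CPU time spent per wall-clock second.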
The profiler seems to point to GC performance as the major factor preventing true linear scalability, which GHC 6.8 will address by having a true multithreaded GC implementation. It'll be fun to try this little experiment again once it arrives. :-)
