Feature #142

Speedup: Direct Streaming

Added by Kornelius Kalnbach 8 months ago. Updated 4 months ago.

Status: Closed
Start: 01/11/2010
Priority: High
Due date:
Assigned to: Kornelius Kalnbach
% Done: 100%
Category: Performance
Target version: 1.0 RC1

Description


Related issues

related to CodeRay - Task #95: Review Streamable system Resolved 04/20/2009
blocks CodeRay - Feature #223: Token filters don't handle token groups Closed 04/02/2010
blocks CodeRay - Feature #222: Tokens#split_into_lines New 04/02/2010

Associated revisions

Revision 563
Added by Kornelius Kalnbach 4 months ago

Direct Streaming! See #142 and Changes.textile.

Revision 567
Added by Kornelius Kalnbach 4 months ago

Got rid of the old streaming system (see #142).

History

Updated by Kornelius Kalnbach 7 months ago

  • % Done changed from 0 to 20

Concept is done, implementation pending. Involves a lot of rewriting, but the code will be cleaner eventually.

Updated by Kornelius Kalnbach 5 months ago

Well, this one blocks 4 other tickets now! It seems I should start digging. Expect everything to break before I can put it back together. Expect a faster, slimmer, more flexible CodeRay when I'm done :)

Updated by Kornelius Kalnbach 5 months ago

  • Priority changed from Normal to High

Updated by Kornelius Kalnbach 4 months ago

Early benchmarks show 20-30% speedup for Ruby 1.8, 1.9, and JRuby when using an intermediary Tokens representation.

With direct streaming, the speedup is 25% on Ruby 1.8, 35% on Ruby 1.9, and almost 40% on JRuby.

Updated by Kornelius Kalnbach 4 months ago

New benchmarks reveal solid 20-35% speedups for MRI and up to 80% more speed for JRuby. I'm pretty happy with that :)

Updated by Kornelius Kalnbach 4 months ago

  • % Done changed from 20 to 70

It works, and it's faster :) I think the only things left are cleaning up the old Streamable system and making the old CodeRay.scan(code, lang).format API work again.

I'll let Ken Thompson speak for me: "One of my most productive days was throwing away 1000 lines of code."

Updated by Kornelius Kalnbach 4 months ago

  • Status changed from New to Closed
  • % Done changed from 70 to 100

It works now.

Note that the old CodeRay.scan(code, lang).format API still uses an intermediary Tokens representation. If you want to benefit from direct streaming, you have to use CodeRay.encode(code, lang, format) or CodeRay::Duo[lang, format].encode(code).
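
To illustrate the difference between the two paths, here is a minimal toy sketch. It is not CodeRay's implementation: ToyScanner, ToyTokens, ToyEncoder, and the text_token method are hypothetical names invented for this example. The assumption is only that a scanner can push each token into any receiver, which is either an intermediary token list or an encoder.

```ruby
# Toy sketch only: every class and method name here is hypothetical,
# not part of CodeRay's API.

# Scanner: recognizes tokens and pushes each one into a receiver.
class ToyScanner
  TOKEN = /(?<space>\s+)|(?<number>\d+)|(?<ident>[A-Za-z_]\w*)|(?<operator>.)/m

  def initialize(code)
    @code = code
  end

  # Hand every token to the receiver as soon as it is recognized.
  def tokenize(receiver)
    @code.scan(TOKEN) do
      match = Regexp.last_match
      kind = match.names.find { |name| match[name] }
      receiver.text_token(match[0], kind.to_sym)
    end
    receiver
  end
end

# Intermediary representation: stores all tokens before any encoding.
class ToyTokens < Array
  def text_token(text, kind)
    push([text, kind])
  end

  # Replay the stored tokens into an encoder.
  def encode(encoder)
    each { |text, kind| encoder.text_token(text, kind) }
    encoder.result
  end
end

# Encoder: turns each token into output immediately.
class ToyEncoder
  attr_reader :result

  def initialize
    @result = String.new
  end

  def text_token(text, kind)
    @result << (kind == :space ? text : "<#{kind}>#{text}</#{kind}>")
  end
end

code = "x = 42"

# Two-step path, analogous to CodeRay.scan(code, lang).format:
tokens   = ToyScanner.new(code).tokenize(ToyTokens.new)
two_step = tokens.encode(ToyEncoder.new)

# Direct streaming, analogous to CodeRay.encode(code, lang, format):
direct = ToyScanner.new(code).tokenize(ToyEncoder.new).result

puts two_step == direct
```

Both paths produce the same output. In this sketch, the direct path simply never builds the intermediate token array.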

Updated by Kornelius Kalnbach 4 months ago

Direct streaming also helps with memory use: the memory consumed while running the benchmark script is down by about 60% in MRI (from ~60MB to ~20MB).
