Lightpanda migrates DOM implementation to Zig

(lightpanda.io)

93 points | by gearnode 3 hours ago

6 comments

  • barishnamazov 3 hours ago
    This reminds me of the Servo project's journey. Always impressed to see another implementation of the WHATWG specs.

    It's interesting to see Zig being chosen here over Rust for a browser engine component. Rust has kind of become the default answer for "safe browser components" (e.g., Servo, Firefox's oxidation), primarily because the borrow checker maps so well to the ownership model of a DOM tree in theory. But in practice, DOM nodes often need shared mutable state (parent pointers, child pointers, event listeners), which forces you into Rc<RefCell<T>> hell in Rust.

    Zig's manual memory management might actually be more ergonomic for a DOM implementation specifically because you can model the graph relationships more directly without fighting the compiler, provided you have a robust strategy for the arena allocation. Excited to learn from Lightpanda's implementation when it's out.
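
    A minimal sketch of the arena idea, written in Rust for comparison: all nodes live in one Vec and refer to each other by index, so parent and child links need no Rc<RefCell<T>>. The names Dom, Node and NodeId are hypothetical and are not taken from Lightpanda or Servo.

    ```rust
    /// Index of a node inside the arena (hypothetical name).
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    struct NodeId(usize);

    struct Node {
        tag: String,
        parent: Option<NodeId>,
        children: Vec<NodeId>,
    }

    /// All nodes are owned by a single Vec; links are plain indices.
    #[derive(Default)]
    struct Dom {
        nodes: Vec<Node>,
    }

    impl Dom {
        /// Allocate a detached node and return its index.
        fn create(&mut self, tag: &str) -> NodeId {
            self.nodes.push(Node {
                tag: tag.to_string(),
                parent: None,
                children: Vec::new(),
            });
            NodeId(self.nodes.len() - 1)
        }

        /// Link `child` under `parent` in both directions.
        /// (This sketch does not detach `child` from a previous parent.)
        fn append_child(&mut self, parent: NodeId, child: NodeId) {
            self.nodes[child.0].parent = Some(parent);
            self.nodes[parent.0].children.push(child);
        }

        /// Walk parent links up to the root and collect tag names.
        fn ancestors(&self, mut id: NodeId) -> Vec<&str> {
            let mut out = Vec::new();
            while let Some(p) = self.nodes[id.0].parent {
                out.push(self.nodes[p.0].tag.as_str());
                id = p;
            }
            out
        }
    }

    fn main() {
        let mut dom = Dom::default();
        let html = dom.create("html");
        let body = dom.create("body");
        let p = dom.create("p");
        dom.append_child(html, body);
        dom.append_child(body, p);
        println!("{:?}", dom.ancestors(p)); // prints ["body", "html"]
    }
    ```

    The tradeoff is that an index is only as valid as the Vec behind it: nothing here stops a stale NodeId from being used after its slot is reused.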

    • fbouvier 2 hours ago
      Hi, I am Francis, founder of Lightpanda. We wrote a full article explaining why we chose Zig over Rust or C++, if you are interested: https://lightpanda.io/blog/posts/why-we-built-lightpanda-in-...

      Our goal is to build a headless browser, rather than a general purpose browser like Servo or Chrome. It's already available if you would like to try it: https://lightpanda.io/docs/open-source/installation

      • nicoburns 1 hour ago
        I see you're using html5ever for HTML parsing, and like its trait/callback-based API (me too). It looks like style/layout is not in scope at the moment, but if you're ever looking at adding style/layout capabilities to Lightpanda, then you may find it useful to know that Stylo [0] (CSS / style system) and Taffy [1] (box-level layout) are both available with a similar style of API (also Parley [2], which has a slightly different API style but can be combined with Taffy to implement inline/text layout).

        [0]: https://github.com/servo/stylo

        [1]: https://github.com/DioxusLabs/taffy

        [2]: https://github.com/linebender/parley

        ---

        Also, if you're interested in contributing C bindings for html5ever upstream, then let me know / maybe open a GitHub issue.

      • parhamn 2 hours ago
        Off-topic note: I read the website and a few pages of the docs, and it's unclear to me what I can safely use Lightpanda for. Say I wanted to swap it in as my engine on Playwright: what are the tradeoffs? What is implemented, and what isn't?
        • fbouvier 1 hour ago
          Thanks for the feedback, we will try to make this clearer on the website. Lightpanda works with Playwright, and we have some docs[1] and examples[2] available.

          Web APIs and CDP specifications are huge, so this is still a work in progress. Many websites and scripts already work, while others do not; it really depends on the case. For example, on the CDP side, we are currently working on adding an Accessibility tree implementation.

          [1] https://lightpanda.io/docs/quickstart/build-your-first-extra...

          [2] https://github.com/lightpanda-io/demo/tree/main/playwright

          • epolanski 37 minutes ago
            I was actually interested in using Lightpanda for E2E tests, to be honest, because halving the feedback cycle would be very valuable to me.
        • h33t-l4x0r 2 hours ago
          I think it's really more of an alternative to JSDom than it is an alternative to Chromium. In other words, it's not going to fool any websites that care about bots into thinking it's a real browser.
      • barishnamazov 2 hours ago
        Thanks Francis, appreciate the nice & honest write-up with the thought process (while keeping it brief).
      • quotemstr 1 hour ago
        Choosing something like Zig over C++ on simplicity grounds is going to be a false economy. C++ features exist for a reason. The complexity is in the domain. You can't make a project simpler by using a simplistic language: the complexity asserts itself somehow, somewhere, and if a language can't express the concept you want, you'll end up with circumlocution "patterns" instead.

        Build system complexity disappears when you set it up too. Meson and such can be as terse as your Curl example.

        I mean, it's your project, so whatever. Do what you want. But choosing Zig for the stated reasons is like choosing a car for the shape of the cupholders.

        • hnlmorg 46 minutes ago
          That’s not fully true though. There are different types of complexity:

          - project requirements

          - requirements forced upon you due to how the business is structured

          - libraries available for a particular language ecosystem

          - paradigms / abstractions that a language is optimised for

          - team experiences

          Your argument is more akin to saying “all general purpose languages are equal” which I’m sure you’d agree is false. And likewise, complexity can and will manifest itself differently depending on language, problems being solved, and developer preferences for different styles of software development.

          So yes, C++ complexity exists for a reason (though I’d personally argue that “reason” was due to “design by committee”). But that doesn’t mean that reason is directly applicable to the problems the LightPanda team are concerned about solving.

        • vegabook 35 minutes ago
          C++ features for complexity management are not ergonomic though, with multiple conflicting ideas from different eras competing with each other. Sometimes demolition and rebuild from foundations is paradoxically simpler.
    • pron 41 minutes ago
      I don't think that a language that was meant to compete with C++ and in 10+ years hasn't captured 10% of C++'s (already diminished) market share could be said to have become "kind of the default" for anything (and certainly not when that requires generalising from n≅1).
      • anal_reactor 15 minutes ago
        The problem is that the number of browser engines is n=2.
    • galangalalgol 1 hour ago
      Too late now, but is the requirement for shared mutable state inherent in the problem space? Or is it just because we still thought OOP was cool when we started on the DOM design?
    • pjmlp 2 hours ago
      And use-after-free, when that arena's memory goes away.
      • pron 36 minutes ago
        But arenas have substantial benefits. They may be one of the few remaining reasons to use a low-level (or "systems programming") language in the first place. Most things are tradeoffs, and the question isn't what you're giving up, but whether you're getting the most for what you're paying.
        • pjmlp 3 minutes ago
          Arenas are also available in languages with automatic memory management, e.g. D, C# and Swift, to use only modern languages as examples.

          Thus I don't consider that a reason good enough for using Zig, while throwing away the safety from modern languages.

    • IshKebab 2 hours ago
      I don't think it's really that bad in Rust. If you're happy with an arena in Zig you can do exactly the same thing in Rust. There are a ton of options listed here: https://donsz.nl/blog/arenas/

      Some of them even prevent use after free (the "ABA mitigation" column).
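
      To illustrate the kind of mitigation that column refers to, here is a minimal generational-index sketch: each slot carries a generation counter, and a handle is only honored while its generation matches. GenArena, Slot and Handle are hypothetical names, not the API of any crate listed in the post.

      ```rust
      /// Handle = slot index plus the generation it was issued for.
      #[derive(Clone, Copy, Debug, PartialEq, Eq)]
      struct Handle {
          index: usize,
          generation: u32,
      }

      struct Slot<T> {
          generation: u32,
          value: Option<T>,
      }

      struct GenArena<T> {
          slots: Vec<Slot<T>>,
          free: Vec<usize>, // indices of empty slots available for reuse
      }

      impl<T> GenArena<T> {
          fn new() -> Self {
              GenArena { slots: Vec::new(), free: Vec::new() }
          }

          /// Store a value, reusing a freed slot when one exists.
          fn insert(&mut self, value: T) -> Handle {
              if let Some(index) = self.free.pop() {
                  let slot = &mut self.slots[index];
                  slot.value = Some(value);
                  Handle { index, generation: slot.generation }
              } else {
                  self.slots.push(Slot { generation: 0, value: Some(value) });
                  Handle { index: self.slots.len() - 1, generation: 0 }
              }
          }

          /// Remove a value; bumping the generation invalidates old handles.
          fn remove(&mut self, h: Handle) -> Option<T> {
              let slot = self.slots.get_mut(h.index)?;
              if slot.generation != h.generation {
                  return None;
              }
              let value = slot.value.take()?;
              slot.generation = slot.generation.wrapping_add(1);
              self.free.push(h.index);
              Some(value)
          }

          /// Look up a value; a stale handle yields None instead of aliasing.
          fn get(&self, h: Handle) -> Option<&T> {
              let slot = self.slots.get(h.index)?;
              if slot.generation == h.generation {
                  slot.value.as_ref()
              } else {
                  None
              }
          }
      }

      fn main() {
          let mut arena = GenArena::new();
          let old = arena.insert("div");
          arena.remove(old);
          let new = arena.insert("span"); // reuses slot 0, now at generation 1
          println!("{:?} {:?}", arena.get(old), arena.get(new)); // prints None Some("span")
      }
      ```

      The check happens at run time on every access, so a stale handle becomes a detectable None rather than a silent read of the wrong node.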

      • kryps 17 minutes ago
        No, you can't do the same thing in Rust, because Rust crates and the standard library generally use the global allocator and not any arena you want to use in your code.
      • mijoharas 1 hour ago
        I'm not super experienced with Zig, but I always think that in the same way that Rust forces you to think about ownership (by having the borrow checker; note: I think of this as a good thing personally), Zig makes you think upfront about your allocation (by making everything that can allocate take an allocator argument).

        It makes everything very explicit, and you can always _see_ where your allocations are happening in a way that you can't (as easily, or as obviously, imo) in Rust.

        It seems like something I quite like. I'm looking forward to Rust getting an effects system/allocator API to help a little more with that side of things.

        • silon42 53 minutes ago
          The problem is deallocation... unless you tie the allocated object to an arena allocator with a lifetime somehow (Rust can model that).
          • mijoharas 45 minutes ago
            Yep, Rust forces you to think about lifetimes. Zig only suggests it (because you're forced to think about allocation, which usually makes you naturally think about the lifetime) but does not help you with it or ensure correctness.

            It's still nice sometimes to ensure that you have to think about allocation everywhere, and can change the allocation strategy to something that works for your use case (hence why I'm looking forward to the allocator API in Rust to get the best of both worlds).

      • pjmlp 1 hour ago
        Which is hardly any different from me using PurifyPlus back in 2000.
    • 7bit 53 minutes ago
      > without fighting the compiler

      It's unfortunate that "writing safe code" is constantly being phrased in this way.

      The borrow checker is a deterministic safety net. Claiming Zig is easier ignores that its lack of safety checks is what makes it feel easier; if Zig had Rust’s guarantees, the complexity would be the same. Comparing them like this is apples vs. oranges.

      • pron 28 minutes ago
        That's a very narrow way of looking at things. ATS has a much stronger "deterministic safety net" than Rust, yet the reason to use Rust over ATS is that "fighting the compiler" is easier in Rust than in ATS. On the other hand, if any cost were worth whatever level of safety Rust offers for any project, then Rust wouldn't exist, because there are far more popular languages with equal (or better) safety. So Rust's design itself is an admission that 1. more compile-time safety is not always better when it complicates the language (otherwise everyone who uses Rust should use ATS), and 2. not every cost is worth paying for safety (otherwise Rust wouldn't exist in the first place).

        Safety has some value that isn't infinite, and a cost that isn't zero. There are also different kinds of safety with different value and different costs. For example, spatial memory safety appears to have more value than temporal safety (https://cwe.mitre.org/top25/archive/2025/2025_cwe_top25.html) and Zig offers spatial safety. The question is always what you're paying and what you're getting in return. There doesn't appear to be a universal right answer. For some projects it may be worth it to pay for more safety, and for others it may be better to pay for something else.

      • senko 46 minutes ago
        The fact that Zig doesn't have Rust's guarantees doesn't mean Zig does not have safety checks. The safety checks that Zig does have are different, and are different in a way that's uniquely useful for this particular project.

        Zig's checks absolutely don't go to the extent that Rust's do, which is kind of the point here. If you do need to go beyond safe code in Rust, Zig is safer than unsafe code in Rust.

        Saying Zig lacks safety checks is unfortunate, although I wouldn't presume you meant it literally and just wanted to highlight the difference.

  • kristopolous 1 hour ago
    I've been using it for months now ever since I saw their presentation at GitHub

    This is a common flow for me

        lightpanda url | markitdown (microsoft) | sd (day50 streamdown)

    I even have it as a shell alias, wv(). It's way better than the crusty old lynx and links on sites that need JS.

    It's solid. Definitely worth a check

  • nicoburns 2 hours ago
    This table is informative as to exactly what lightpanda is: https://lightpanda.io/blog/posts/what-is-a-true-headless-bro...

    TL;DR: It does the following:

    - Fetch HTML over the network

    - Parse HTML into a DOM tree

    - Fetch and execute JavaScript that manipulates the DOM

    But not the following:

    - Fetch and parse CSS to apply styling rules

    - Calculate layout

    - Fetch images and fonts for display

    - Paint pixels to render the visual result

    - Composite layers for smooth scrolling and animations

    So it's effectively a net+DOM+script-only browser with no style/layout/paint.

    ---

    Definitely fun for me to watch as someone who is making a lightweight browser engine with a different set of trade-offs (net+DOM+style/layout/paint-only with no script)

    • karel-3d 2 hours ago
      When I was working before on something that used headless browser agents, the ability to do a screenshot (or even a recording) was really great for debugging... so I am not sure about the "no paint". But hey everything in life is a trade-off.
      • pzo 1 hour ago
        Yeah, I feel the same. I think even having a screenshot of part of the rendered page, or of the full page, can be useful even for machines, considering how heavy HTML can be to parse and how expensive it is for LLM context. Sometimes a (sub)screenshot is just a better kind of compression.
        • fbouvier 1 hour ago
          Yes, HTML is too heavy and too expensive for LLMs. We are working on a text-based format more suitable for AI.
    • warpech 53 minutes ago
      > So it's effectively a net+DOM+script-only browser with no style/layout/paint.

      > ---

      > Definitely fun for me to watch as someone who is making a lightweight browser engine with a different set of trade-offs (net+DOM+style/layout/paint-only with no script)

      Both projects (Lightpanda, DioxusLabs/blitz) sound very interesting to me. What do you think about rendering patterns that require both script+layout for rendering, e.g. virtual scrolling of large tables?

      What would be a good pattern to make virtual scrolling work with Lightpanda or Blitz?

      • nicoburns 46 minutes ago
        So Blitz does technically have scripting, it's just Rust scripting rather than JavaScript scripting. So the plan for virtual scrolling would likely be to implement it in Rust.

        If your aim is to render a UI (ala Electron/Flutter) then we have a React-style framework (Dioxus) that runs on top of Blitz, and allows you access to the low-level Rust API of the DOM for advanced use cases (although it's still a WIP and this API is a bit rough atm). I'm also hoping to eventually have a built-in `RecyclerView`-like widget for this (that can bypass the style/layout systems for much more efficient virtual scrolling).

  • lewdwig 1 hour ago
    A language which is not 1.0, and has repeatedly changed its IO implementation in a non-backwards-compatible way is certainly a courageous choice for production code.
  • everlier 1 hour ago
    Wow. Lightpanda is an absolutely bonkers project. I'd have paid dearly for such an option a few years back.
  • portly 31 minutes ago
    Love to see Zig winning!