- by cmovq on 5/27/25, 9:32 PM - > gl.drawArrays(gl.TRIANGLES, 0, 6); - Using two triangles for this isn't ideal, because you get duplicate fragment invocations along the diagonal seam where the triangles meet. It is slightly more efficient to use one larger triangle that extends outside the viewport; the offscreen parts are clipped and do not generate any additional fragments [1]. - [1]: https://wallisc.github.io/rendering/2021/04/18/Fullscreen-Pa... 
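A minimal sketch of the single-triangle approach described in this comment, in JavaScript. The helper names (`FULLSCREEN_TRIANGLE`, `drawFullscreenTriangle`, `triangleCovers`) are hypothetical and not taken from the post or the linked article. It assumes a shader program with a `vec2` position attribute is already bound. The triangle's corners at (3, -1) and (-1, 3) lie outside clip space, so the GPU clips them while the visible square from -1 to 1 stays fully covered.

```javascript
// Clip-space (x, y) positions of one triangle that covers the whole viewport.
// The visible region is the square [-1, 1] x [-1, 1]; the corners at (3, -1)
// and (-1, 3) fall outside it and are clipped.
const FULLSCREEN_TRIANGLE = new Float32Array([
  -1, -1,
   3, -1,
  -1,  3,
]);

// Hypothetical helper: uploads the triangle and issues the draw call.
// `gl` is assumed to be a WebGL rendering context and `positionLoc` the
// location of a vec2 position attribute in the currently bound program.
// A real application would create and fill the buffer once, not per draw.
function drawFullscreenTriangle(gl, positionLoc) {
  const buffer = gl.createBuffer();
  gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
  gl.bufferData(gl.ARRAY_BUFFER, FULLSCREEN_TRIANGLE, gl.STATIC_DRAW);
  gl.enableVertexAttribArray(positionLoc);
  gl.vertexAttribPointer(positionLoc, 2, gl.FLOAT, false, 0, 0);
  gl.drawArrays(gl.TRIANGLES, 0, 3); // 3 vertices instead of 6
}

// CPU-side check that a clip-space point lies inside the triangle.
// The three edges are x >= -1, y >= -1 and x + y <= 2.
function triangleCovers(x, y) {
  return x >= -1 && y >= -1 && x + y <= 2;
}
```

Because there is no interior edge, no pixel along a seam is shaded twice. The fragment shader is unchanged; only the vertex data and the vertex count in `drawArrays` differ from the two-triangle version.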
- by flakiness on 5/27/25, 7:18 PM - CUDA is better for sure, but the pure functional nature of the traditional shader is conceptually much simpler and I kind of relish the simplicity. There is no crazy tiling or anything. Just per-pixel parallelism [1]. It won't be as fast as those real, highly-tuned kernels, but it's still nice to see something simple that does something non-trivial. It reminded me of the early "GPGPU" days (early 2000s?) - [1] https://github.com/nathan-barry/gpt2-webgl/blob/main/src/gpt... 
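To illustrate the per-pixel model this comment describes, here is a CPU-side JavaScript sketch that emulates a fragment-shader matrix multiply. It is not the code from the linked repository; the names `fragment` and `matmulPerPixel` are hypothetical, and flat `Float32Array`s stand in for textures. Each output texel is computed by a pure function of its own coordinates, which is what lets the GPU run all of them in parallel without tiling.

```javascript
// Emulates the shader's main(): computes one output texel of C = A * B.
// A is M x K and B is K x N, both stored row-major in Float32Array "textures".
// The function sees only its own output coordinate and the read-only inputs.
function fragment(row, col, a, b, K, N) {
  let sum = 0;
  for (let k = 0; k < K; k++) {
    sum += a[row * K + k] * b[k * N + col];
  }
  return sum;
}

// Emulates the draw call: one fragment invocation per output texel.
// On a GPU these invocations run in parallel; here they run in a loop.
function matmulPerPixel(a, b, M, K, N) {
  const out = new Float32Array(M * N);
  for (let row = 0; row < M; row++) {
    for (let col = 0; col < N; col++) {
      out[row * N + col] = fragment(row, col, a, b, K, N);
    }
  }
  return out;
}

// Usage: [[1, 2], [3, 4]] * [[5, 6], [7, 8]]
const c = matmulPerPixel(
  new Float32Array([1, 2, 3, 4]),
  new Float32Array([5, 6, 7, 8]),
  2, 2, 2
);
// c holds 19, 22, 43, 50
```

In an actual WebGL pipeline the output array would be a texture attached to a framebuffer object, which can then be bound as the input texture of the next pass.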
- by nathan-barry on 5/27/25, 6:14 PM - A few weeks back, I implemented GPT-2 using WebGL and shaders. Here's a write-up on how I made it, covering how I used textures and framebuffer objects to store and move around weights and intermediate outputs while using WebGL. 
- by grg0 on 5/28/25, 1:17 AM - The lost art? Shader programming is very much relevant to this day. Many of the statements in this post are also either incorrect or inaccurate; there's no need for the sensationalism. And like somebody else has mentioned below, WebGPU adds compute shaders. I think the post would be better if it just focused on the limitations of pre-compute APIs and how to run a network there, without the other statements. 
- by rezmason on 5/27/25, 6:59 PM - Nice writeup! I'm a fan of shader sandwiches like yours. 
Judging from the stated limitations and conclusion, I bet this would benefit tremendously from a switch to WebGPU. That's not a criticism! If anything, you're effectively demonstrating that WebGL, a sometimes frustrating but mature ubiquitous computing platform, can be a valuable tool in the hands of the ambitious. Regardless, I hope to fork your repo and try a WebGPU port. Good stuff! 
- by nickpsecurity on 5/27/25, 7:42 PM - People have complained that Nvidia dominates the market. Many were looking at their older GPUs for cheap experimentation. One idea I had was just using OpenCL to take advantage of all the cross-platform support it has. Even some FPGAs support it. - Good to see GPT-2 done with shader programming. A port of the techniques used in smaller models, like TinyLlama or Gemma-2B, might lead to more experimentation with older or cheaper hardware [on-site]. 
- by swoorup on 5/28/25, 5:17 AM - IMHO there are JS libraries that go through the traditional rendering-based shader path to emulate general-purpose computation on the GPU, gpu.js for example: https://gpu.rocks/#/
- by lerp-io on 5/27/25, 10:52 PM - i think u can use webgpu compute shaders now 
- by 0points on 5/28/25, 7:21 AM - lost art ey? - what about mega shaders, vulkan llm etc etc?