Optimizing Jupyter Notebooks for LLMs

by alexmolas on 1/17/25, 8:00 PM with 5 comments
by 331c8c71 on 1/21/25, 5:29 PM

There is also jupytext [1], which does two-way sync between notebooks and regular Python files. For instance, one could use it to keep the Python files in git repos (rather than notebooks, which contain a lot of noise and are not diff-friendly at all).

[1] https://jupytext.readthedocs.io/en/latest/
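To make the "noise" point concrete, here is a minimal standard-library sketch of the kind of conversion involved. It is not jupytext and does not use its API; the function name `notebook_to_percent_script` and the sample notebook are made up for illustration, and the `# %%` cell markers only approximate the percent-style script layout. It shows that outputs, execution counts, and metadata stay out of the text that would be committed.

```python
import json


def notebook_to_percent_script(notebook_json: str) -> str:
    """Render a notebook's cells as a '# %%' script, dropping outputs and metadata.

    Hypothetical helper for illustration only; jupytext's real output differs in detail.
    """
    nb = json.loads(notebook_json)
    chunks = []
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        # The notebook JSON format allows "source" to be a string or a list of strings.
        if isinstance(source, list):
            source = "".join(source)
        if cell.get("cell_type") == "code":
            chunks.append("# %%\n" + source)
        elif cell.get("cell_type") == "markdown":
            # Markdown cells become comment lines so the file stays valid Python.
            commented = "\n".join(("# " + line).rstrip() for line in source.splitlines())
            chunks.append("# %% [markdown]\n" + commented)
    return "\n\n".join(chunks) + "\n"


# A tiny made-up notebook: one markdown cell, one code cell with a stored output.
EXAMPLE_NOTEBOOK = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Demo"]},
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["x = 1 + 1\n", "print(x)"],
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["2\n"]}],
        },
    ],
})

if __name__ == "__main__":
    print(notebook_to_percent_script(EXAMPLE_NOTEBOOK))
```

With jupytext itself, the usual route is to pair the notebook with a script once (`jupytext --set-formats ipynb,py:percent notebook.ipynb`) and then run `jupytext --sync notebook.ipynb`; check the documentation [1] for the options that fit your setup.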

by mscolnick on 1/21/25, 5:55 PM

You might be interested in marimo [1], which does not store your outputs in the notebook artifact. Also, the files are stored as pure Python instead of JSON, which LLMs much prefer and find easier to parse.

[1]: https://github.com/marimo-team/marimo

by westurner on 1/18/25, 5:10 PM

Jupyter + LLM tools: Ipython-GPT, Elyra, jetmlgpt, jupyter-ai; CoCalc, Colab, NotebookLM

jupyterlab/jupyter-ai: https://github.com/jupyterlab/jupyter-ai

"[jupyter/enhancement-proposals#128] Pre-proposal: standardize object representations for ai and a protocol to retrieve them" https://github.com/jupyter/enhancement-proposals/issues/128

by Helmut10001 on 1/21/25, 5:33 PM

Regarding converting ipynb files to *.py: use jupytext [1]. It will automatically convert notebooks to both *.md and *.py. This also lets you `from notebook1 import *` in notebook2, which is great for splitting long notebooks into many sequential ones. The trick is to add cell tags ("active-ipynb") to those cells that you don't want to appear in the converted *.py file; e.g., you want method definitions to be importable, but not plot() calls and the like.

I was a little bit disappointed by the topic focus: I was hoping to explore/use LLMs in notebooks for data processing, not for augmenting my code writing. I used stable diffusion (automatic1111 API) in Jupyter once and wrote a blog post about it [2]. However, I haven't used any LLMs so far in Jupyter Lab to do data processing. I did use OpenAI's API in Jupyter [3], but found it too limiting and too much of a black box.

[1]: https://jupytext.readthedocs.io/en/latest/

[2]: https://ad.vgiscience.org/links/posts/2023-06-27-stable-diff...

[3]: https://kartographie.geo.tu-dresden.de/ad/2022-12-22_OpenAI_...
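The cell-tag trick from the first paragraph of this comment can be sketched with the standard library alone. This is an assumption-laden illustration, not jupytext's implementation: the helper `importable_source` and the sample notebook are hypothetical, and the sketch simply omits tagged cells, which may differ from how jupytext actually treats "active-ipynb" cells in the paired *.py file.

```python
import json

SKIP_TAG = "active-ipynb"  # tag name taken from the comment; handling here is simplified


def importable_source(notebook_json: str, skip_tag: str = SKIP_TAG) -> str:
    """Collect code cells that lack `skip_tag`, so only definitions remain.

    Hypothetical helper for illustration; not part of jupytext.
    """
    nb = json.loads(notebook_json)
    kept = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        # Cell tags live under the cell's metadata in the notebook JSON.
        tags = cell.get("metadata", {}).get("tags", [])
        if skip_tag in tags:
            continue  # notebook-only cell (plots, exploratory prints)
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        kept.append(source)
    return "\n\n".join(kept) + "\n"


# Made-up stand-in for notebook1: one definition cell, one tagged exploratory cell.
NOTEBOOK1 = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["def double(x):\n", "    return 2 * x"],
        },
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {"tags": ["active-ipynb"]},
            "outputs": [],
            "source": ["print('exploratory output', double(21))"],
        },
    ],
})

if __name__ == "__main__":
    # Stand-in for `from notebook1 import *`: run the kept cells in a fresh namespace.
    namespace = {}
    exec(importable_source(NOTEBOOK1), namespace)
    print(namespace["double"](21))
```

Only the `double` definition survives the filter, so a later notebook can reuse it without triggering the exploratory cell.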