Googlielmo's blog

Posts

Showing posts from July, 2021

Turning Python Scripts into Working Web Apps Quickly with Streamlit

I just realized that I have been using Streamlit for almost a year now and have posted about it on Twitter and LinkedIn several times, but have never written a blog post about it. Communication is key in Data Science and Machine Learning. Being able to showcase work in progress and share results with the business makes the difference. Verbal and non-verbal communication skills are important. A tool that supports this kind of conversation with a mixed audience, one that may not have a technical background or would rather hear about results and business value, is of great help. I found that Streamlit fits this scenario well. Streamlit is an Open Source (Apache License 2.0) Python framework that turns data or ML scripts into shareable web apps in minutes (no kidding). Python only: no front‑end experience required. To start with Streamlit, just install it through pip (it is available in Anaconda too): pip install streamlit and you are ready to execute the working de...

The Codex Paper Has Been Published: the Idea Behind GitHub Copilot

The Codex paper was published yesterday. Codex is a GPT language model fine-tuned on publicly available code from GitHub, with Python code-writing capabilities. A distinct production version of Codex powers GitHub Copilot. This paper focuses on the work leading to the early Codex models. The main task is the generation of standalone Python functions from docstrings, and the automated evaluation of the correctness of code samples through unit tests (this is in contrast to natural language generation, where samples are typically evaluated by heuristics or by human evaluators). To solve a problem in the test set, the authors generate multiple samples from the models and check whether any of them passes the unit tests. The raw training dataset was collected in May 2020 from 54 million public software repositories hosted on GitHub, containing 179 GB of unique Python files under 1 MB. It was then filtered by removing files which were likely auto-generated, had average line l...
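To make the evaluation idea above concrete, here is a minimal sketch of "a problem counts as solved if any sampled completion passes the unit tests". Everything in it is an assumption made for illustration: the toy task, the `solution` function name, the helper names, and the hand-written candidate strings, which only stand in for model samples. It is not the paper's evaluation harness.

```python
from typing import List, Tuple

# Hypothetical task: return the absolute value of an integer.
# Each unit test is a pair (input, expected output).
TESTS: List[Tuple[int, int]] = [(3, 3), (-4, 4), (0, 0)]

# Hand-written stand-ins for sampled completions (not real model output).
CANDIDATES: List[str] = [
    "def solution(x):\n    return x\n",                     # wrong for negatives
    "def solution(x):\n    return -x\n",                    # wrong for positives
    "def solution(x):\n    return x if x >= 0 else -x\n",   # correct
]


def passes_tests(source: str, tests: List[Tuple[int, int]]) -> bool:
    """Run one candidate against every unit test; any error counts as a failure."""
    namespace: dict = {}
    try:
        # exec is unsafe on untrusted code; a real harness would sandbox this step.
        exec(source, namespace)
        fn = namespace["solution"]
        return all(fn(arg) == expected for arg, expected in tests)
    except Exception:
        return False


def is_solved(candidates: List[str], tests: List[Tuple[int, int]]) -> bool:
    """A problem is solved if at least one candidate passes all unit tests."""
    return any(passes_tests(src, tests) for src in candidates)


results = [passes_tests(src, TESTS) for src in CANDIDATES]
print(results)                       # per-candidate pass/fail
print(is_solved(CANDIDATES, TESTS))  # True if any candidate passed
```

Only the third candidate passes here, which is enough for the problem to count as solved. Sampling more candidates raises the chance that at least one is correct.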

Python Calculations in Jupyter with Handcalcs

Jupyter notebooks allow LaTeX rendering inside Markdown. This way you can write complex math equations within a notebook. While LaTeX is the de facto standard for scientific documents, its syntax isn't very friendly or intuitive. handcalcs is an Open Source library for converting Python calculations into rendered LaTeX: just write the symbolic formula, followed by numeric substitutions, and that's it. After installing it (it is available through PyPI), in the simplest case you just need to import the render class and use the %%render magic command to render the content of a cell: Here is another example of equation rendering and numeric substitution: It is also possible to render just the symbolic equation: or even generate the corresponding LaTeX code: By default handcalcs renders code vertically, but it is possible to use the %%render params magic to save space by rendering in a single line, or to show just the result of a calculation: handcalcs allows you to adjust precision, use Gr...
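As a rough illustration of "symbolic formula, followed by numeric substitutions", a rendered calculation has approximately the shape below. The variables and values are made up for this sketch (a = 3, b = 4), and the exact layout handcalcs emits may differ.

```latex
\begin{aligned}
a &= 3 \\
b &= 4 \\
c &= \sqrt{a^{2} + b^{2}} = \sqrt{3^{2} + 4^{2}} = 5
\end{aligned}
```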