I recently finished reading this amazing book by Peter and Andrew Bruce. It is definitely one of the must-have books for beginner and mid-level data scientists, and even more so for the beginner statistician.
There has been a lot of controversy around OpenAI not releasing the full model for GPT-2. Leaving that aside, however, they did publish the 117M version, a smaller model that still works. And it works really nicely.
So let’s dive in and see how we can run it ourselves.
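As of the 117M release, the openai/gpt-2 repository ships download and sampling scripts; the commands below are a rough sketch of that flow (the exact script names have shifted between repo versions, so check the repository’s README before running):

```shell
git clone https://github.com/openai/gpt-2.git
cd gpt-2
pip3 install -r requirements.txt      # needs a TensorFlow 1.x environment
python3 download_model.py 117M        # fetch the released 117M weights
python3 src/generate_unconditional_samples.py --model_name 117M
```

The last command prints generated samples to the terminal; the repo also includes an interactive conditional-sampling script if you want to prompt the model yourself.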
About a month ago OpenAI published a post opening the gates of their organization to social scientists. I think this touches on a profound matter that was not fully discussed in that blog post. Besides summarizing some of its points, I’ll also go into that undiscussed subject.
As a final step in moving my blog to GitHub Pages, I had to set up HTTPS. The GitHub guides are very detailed on this, so much so that it’s easy to get lost in them.
My case was peculiar, as it combined several conditions:
- My repository is a project repository (it is not a repository named after my user or my organization).
- I don’t want to transfer the whole domain DNS (apex domain) to GitHub nameservers, since I might want to use other subdomains (or the main domain) for my own purposes.
- I want the blog to have HTTPS enabled, even when only static content is served.
I couldn’t find anywhere how to set up the configuration for a case like this, but it’s really easy.
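For a case like this, the configuration amounts to three pieces (`example.com` and `username` below are placeholders, not my actual domain): a CNAME record at the DNS provider for just the blog subdomain, a `CNAME` file in the project repository, and the “Enforce HTTPS” option in the repository’s Pages settings. In zone-file terms:

```
; at the DNS provider: delegate only the blog subdomain, not the apex
blog.example.com.  3600  IN  CNAME  username.github.io.
```

The repository’s `CNAME` file then contains the single line `blog.example.com`, and GitHub provisions a certificate for it once “Enforce HTTPS” is ticked. The apex domain and every other subdomain stay under your own control.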
I just exported my WordPress blog using the jekyll-import tool. This is my first post with this new setup. I very much like the simplified approach of writing Markdown and having a static site, rather than running a server and a database on my own.
Not all things were entirely straightforward, so here’s a quick list of the steps I took, which might be helpful to others.
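The import run itself looked roughly like this, assuming the blog was exported from WordPress.com as a `wordpress.xml` file (check jekyll-import’s documentation for the importer matching your WordPress flavor):

```shell
gem install jekyll-import
ruby -r rubygems -e 'require "jekyll-import";
    JekyllImport::Importers::WordpressDotCom.run({
      "source" => "wordpress.xml"
    })'
```

This drops the converted posts into a `_posts` directory, ready for a Jekyll site.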
I recently participated in a meetup with the promising title “Ethics in AI”. Dr. Chris McKillop conducted the meetup, and she not only has a lot of theoretical background under her belt, but also a great deal of experience working in the field of Data Science and AI.
Most blogs and manuals will recommend the simpler approaches to reducing the size of your Docker image. We’ll go a little further today, but let’s reiterate them anyway:
- Use reduced versions of base images (Alpine is the usual recommendation), and avoid SDK images for final images
- Use multi-stage builds, and don’t copy temporary files or sources into the final image
- Take care of your .dockerignore, ignoring as much as possible
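Taken together, those three points look roughly like this. A minimal sketch, assuming a Go service (the app name and paths are placeholders):

```dockerfile
# build stage: full SDK image, never shipped
FROM golang:1.21-alpine AS build
WORKDIR /src
COPY . .
RUN go build -o /app .

# final stage: reduced base image, only the binary is copied over
FROM alpine:3.19
COPY --from=build /app /app
ENTRYPOINT ["/app"]
```

A .dockerignore excluding `.git`, build artifacts, and local data keeps the `COPY . .` from dragging them into the build context in the first place.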
Having said that, it is possible that you’ll still end up with a huge Docker image, and it’s difficult to understand what the next step is from there.
This is where this post comes in.
I just finished reading Working Effectively with Legacy Code, by Michael Feathers. The title is perfectly descriptive and quite ambiguous at the same time. Let me explain why.
Text generation with NLTK, markovify, Tumblr, and Docker
(Image used without permission from Gunshow comic: Robot that screams.)
If the word offends you, you might not want to read this article, since I use it a lot below the fold. I think, however, that it’s the most appropriate term.
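The markovify library handles the generation, but the core idea behind this kind of text generation is small enough to sketch in plain Python (the function names here are my own, not markovify’s API):

```python
import random
from collections import defaultdict

def build_chain(words, order=1):
    """Map each tuple of `order` consecutive words to the words that follow it."""
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return chain

def generate(chain, order=1, length=20):
    """Random-walk the chain, starting from a random state."""
    out = list(random.choice(list(chain)))
    for _ in range(length):
        followers = chain.get(tuple(out[-order:]))
        if not followers:  # dead end: nothing ever followed this state
            break
        out.append(random.choice(followers))
    return " ".join(out)

corpus = "the cat sat on the mat and the cat slept".split()
chain = build_chain(corpus)
print(generate(chain, length=8))
```

Raising `order` makes the output more coherent but more likely to just quote the corpus verbatim, which is the same trade-off markovify’s state size exposes.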
I just finished reading “Head First Data Analysis”, by Michael Milton, and it was the first book I’ve ever read from the “Head First” series.
The book was very approachable, with concepts introduced only when they were necessary to make a good, valid, practical point. The whole structure of the book revolves around practice and “real life” examples (although greatly simplified) to show how methodical, logical steps naturally lead to a good analysis mechanism.