Python Development on macOS
Data Science Python Setup: A Quick Guide
Follow this blog post to set up a flexible and reproducible dev environment for data science projects on your Apple device. I recently got an M3 MacBook Pro and wanted to share this simple yet effective setup. To get started, you don't need anything other than your device, a bit of patience, and this quick guide!
Install Homebrew
Apple still doesn't ship a suitable package manager of its own, so Homebrew remains the go-to tool for the job. If you aren't using Homebrew yet, head over to Homebrew's GitHub page, download the .pkg file listed under Assets, and follow the installer's instructions to complete the install.
Install Python
It's time to install at least one Python version. Which versions you need depends on your projects' packages and other external requirements. For now, let's install 3.10.
brew install python@3.10
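If you want to confirm which interpreter a command actually launches, a small check like the one below can help. It is only a sketch built on the standard library, and the function name describe_interpreter is made up for this post. On an Apple silicon Mac with a native Homebrew install, I would expect the machine entry to read arm64.

```python
import platform
import sys


def describe_interpreter() -> dict:
    """Collect a few facts about the Python that is running this script."""
    return {
        "version": platform.python_version(),  # for example "3.10.x"
        "executable": sys.executable,          # full path of the interpreter binary
        "machine": platform.machine(),         # CPU architecture reported by the OS
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Save it under a name of your choice (say check_python.py) and run it with python3.10 check_python.py to see the details of the freshly installed version.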
Install pipx with Homebrew
pipx lets us install and run Python applications in isolated environments. This keeps your system clean and gives you an easy way to list, upgrade, or uninstall Python applications. Head over to its GitHub repository and follow the installation guide there.
Or just run the following commands in the terminal:
brew install pipx
pipx ensurepath
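The ensurepath step edits your shell configuration, so the change usually only shows up in a new terminal window. To see whether the tools installed so far can be found on your PATH, something along these lines should do. It is a rough sketch; find_tools is a helper name I picked for illustration.

```python
import shutil


def find_tools(names):
    """Map each command name to its full path on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # The command names below are the ones installed in the steps so far.
    for name, path in find_tools(["brew", "python3.10", "pipx"]).items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If one of the entries comes back as not found, open a fresh terminal and try again before reinstalling anything.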
Now that you have pipx and a functioning Python version on your machine, it's time to install Poetry and create our first project!
Get Poetry
We use pipx to install and manage Poetry like this:
pipx install poetry
Afterwards you can check that it's working by typing poetry into your terminal. It should list all the commands and options available on the CLI. For more information about Poetry and its configuration options, take a look at their website here.
Create your first project
Now that Poetry is installed properly, we can use it to initialize our first project, called ds-playground, with:
poetry new ds-playground
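Poetry auto-generates some folders and files for us, which makes project creation a breeze. The result looks roughly like this (depending on your Poetry version, the package folder may sit inside a src directory):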
ds-playground
├── pyproject.toml
├── README.md
├── ds_playground
│   └── __init__.py
└── tests
    └── __init__.py
Nevertheless, make sure you understand what at least the pyproject.toml file is for. This file manages all our project dependencies and acts as a central source of truth.
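Since pyproject.toml is plain TOML, you can read it from Python and list what the project declares. The snippet below is a minimal sketch: the sample content is purely illustrative (including the pandas constraint), the exact tables Poetry writes differ between versions, and declared_dependencies is a name I invented. Note that tomllib only ships with Python 3.11 and later; on 3.10 the sketch assumes you have installed the tomli backport.

```python
try:
    import tomllib  # part of the standard library from Python 3.11 onwards
except ModuleNotFoundError:
    import tomli as tomllib  # assumption: the third-party "tomli" backport is installed

# Illustrative content only; what Poetry actually generates varies by version.
SAMPLE_PYPROJECT = """
[tool.poetry]
name = "ds-playground"
version = "0.1.0"

[tool.poetry.dependencies]
python = "^3.10"
pandas = "^2.0"
"""


def declared_dependencies(pyproject_text: str) -> list:
    """Return the dependencies declared in a pyproject.toml document."""
    data = tomllib.loads(pyproject_text)
    found = []
    # Layout 1: Poetry's own table, mapping package names to version constraints.
    poetry_deps = data.get("tool", {}).get("poetry", {}).get("dependencies", {})
    found.extend(name for name in poetry_deps if name != "python")
    # Layout 2: the standard [project] table, holding a list of requirement strings.
    found.extend(data.get("project", {}).get("dependencies", []))
    return found


if __name__ == "__main__":
    print(declared_dependencies(SAMPLE_PYPROJECT))
```

To inspect a real project, read the file's text first (for example with pathlib.Path("pyproject.toml").read_text()) and pass it to the function instead of the sample.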
Great! Now let's add some basic dependencies by opening the terminal in ds-playground and calling the add command with the package names.
poetry add jupyter pandas seaborn numpy scikit-learn matplotlib
If you want a package but aren't sure of its exact name, you can use poetry search package-name to get a list of available packages matching your query.
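Once the add command has finished, a quick way to check that the libraries can be found by the project's interpreter is sketched below. Keep in mind that the name you import is not always the name you install: scikit-learn is imported as sklearn. The helper missing_modules is a name I made up, and Jupyter is left out of the list because it is mostly a bundle of command-line tools rather than a single library.

```python
import importlib.util

# Import names for the libraries added above (scikit-learn is imported as "sklearn").
EXPECTED_MODULES = ["pandas", "seaborn", "numpy", "sklearn", "matplotlib"]


def missing_modules(names):
    """Return the module names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(EXPECTED_MODULES)
    print("all packages found" if not missing else f"missing: {', '.join(missing)}")
```

Run it through Poetry (poetry run python followed by the script name) so that it uses the project's environment rather than your system Python.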
Get an IDE
You can of course choose whichever development application you want. Heck, you could even stay in the CLI with something like Neovim. I prefer using Visual Studio Code (VS Code) for most of my dev needs. To install VS Code, simply download the correct version from here.
Once you've successfully installed VS Code, it's time to set it up for Python development. To do this, we will install its Python extension.
Launch VS Code, then navigate to the extensions marketplace by clicking the square-shaped icon on the left-hand sidebar, or use Cmd+Shift+X to open it. Now, search for "Python" and install the first extension provided by Microsoft.
With this extension, you get features such as IntelliSense, linting, debugging, and many others that make Python development more productive. It also supports running tests with pytest and unittest.
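To ensure we're using Poetry's Python environment within VS Code, we need to specify which Python interpreter to use. Open the VS Code command palette with Cmd+Shift+P, type or select "Python: Select Interpreter", and you will see a list of detected Python environments. By default, Poetry keeps its virtual environments in a central cache directory outside the project. If you would rather have a hidden .venv folder inside the project, run poetry config virtualenvs.in-project true before the environment is created. You can print the path of the active environment with poetry env info --path. If the environment doesn't show up in the list, try reloading VS Code.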
Select your newly created ds-playground environment. Now the Python extension will use the specific Python interpreter for your project, staying in line with Poetry's isolated dependency management.
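To double-check that VS Code really picked up the project's environment, you can run a tiny script like this one from the editor. It is a sketch relying on the standard library only; in_virtual_environment is just a name I chose.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtual environment:", in_virtual_environment())
```

If the second line prints False, or the interpreter path points at the Homebrew Python instead of the project's environment, go back to "Python: Select Interpreter" and pick the ds-playground entry again.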
A quick note on VS Code for data science:
VS Code's built-in Jupyter Notebook support means you can carry out your exploratory data science tasks without leaving the editor. You can create a new *.ipynb notebook file, or open an existing one, and run computations right in it. That covers data exploration, visualization, data cleaning and transformation, statistical modeling, and machine learning.
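Under the hood, a notebook file is a JSON document, so you can even generate a starter notebook from a script. The following is a minimal sketch of the version 4 notebook format with a single code cell; minimal_notebook and write_notebook are hypothetical helper names, and in practice you will usually just create notebooks from within VS Code.

```python
import json
from pathlib import Path


def minimal_notebook(code: str) -> dict:
    """Build a notebook document (format version 4) with one code cell."""
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": None,  # the cell has not been run yet
                "metadata": {},
                "outputs": [],
                "source": code,
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def write_notebook(path: Path, code: str) -> None:
    """Serialize the notebook to disk as JSON."""
    path.write_text(json.dumps(minimal_notebook(code), indent=1), encoding="utf-8")
```

Calling write_notebook(Path("first_steps.ipynb"), "import pandas as pd") would leave a file in the current folder that you can open in VS Code; the file name and the cell content here are only examples.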