· The Data
· Categorical Distribution Plots
∘ Box Plots
∘ Violin Plots
∘ Boxen Plot
· Categorical Estimate Plots
∘ Bar Plot
∘ Point Plot
∘ Count Plot
· Categorical Scatter Plots
∘ Strip Plot
∘ Swarm Plot
· Combining Plots
· Faceting Data with Catplot
· Documentation and Links
In this post we will use one of Seaborn’s conveniently available datasets about the Titanic, which I’m sure many readers have seen before. …
It is worth publishing your Python packages on conda and conda-forge
I recently wrote a post guiding users through the process of publishing their Python package to PyPI to be installed via pip. However, many users prefer to use conda. I like to have my packages available on conda since it plays nicely with packages that have external dependencies. Sometimes a package is available only on a conda channel, and it is best to avoid mixing pip and conda installations in a single environment in order to steer clear of dependency issues.
Seaborn is a fantastic plotting library that I wish I had started using earlier in my Python career. I have always been a Matplotlib user, and I would spend hours on some projects fine-tuning the aesthetics of my plots so that they would really capture colleagues’ attention during presentations. My first PI always said that if you don’t show up to a meeting with plots, you weren’t prepared, so I always had something ready to go. Little did I know how simple Seaborn makes plotting in Python, even while producing more visually appealing canned plots than Matplotlib.
We will…
One of Python’s greatest strengths is the ability for users to package their code and publish it for fellow Pythonistas to use in their workflows. Without libraries like pandas, numpy, and matplotlib, Python would not be the wonderfully flexible language we have all come to love.
You don’t need to create a package as expansive as numpy to benefit other users in the community. Even very niche workflows generally have a group of users working on similar problems. Packaging your software for others to use can save time and bring other developers to your work to help improve your code…
Visualizing data is a critical part of any data scientist’s workflow. Plots are an excellent way to visualize what is happening in a data set, and they are ideal for sharing findings with others. In true Python fashion, there are quite a few plotting libraries available to you, but when should you use each one?
In this post, I will discuss my four favorite plotting libraries: Pandas, Matplotlib, Seaborn, and Plotly. While all four of these libraries are capable of producing plots, they vary significantly in the effort required to produce a plot and in the visual quality of the final product. …
Packaging Python software that has helped improve your or your team’s workflow can be very beneficial to the greater Python community; it makes your software more robust and can also improve your ability to use it in-house. However, without the proper infrastructure in place, your Python package will likely either break over time or be too difficult for other users to use efficiently.
Making your codebase public can also give you exposure to other great programmers working on similar problems; the more people using your code, the more likely it is to grow and improve. If you have ever…
I love learning from data and using it to create useful insights. My background involves extensive use of Python and machine learning to study snowpack.