Below you will find pages that utilize the taxonomy term “testing”
Posts
Airflow
Airflow for hands-off ETL
Almost exactly a year ago, I joined Yahoo, which more recently became Oath.
The team I joined is called the Product Hackers, and we work with large amounts of data. By large amounts, I mean billions of rows of log data.
Our team does both ad-hoc analyses and ongoing machine learning projects. To support those efforts, our team initially wrote scripts to parse logs and ran them with cron to load the data into Redshift on AWS.
A tutorial within a tutorial on building reusable models with scikit-learn
Things I learned while following a tutorial on how to build reusable models with scikit-learn.
• When in doubt, go back to pandas.
• When in doubt, write tests.
• When in doubt, write helper methods to wrap existing objects, rather than creating new objects.
Ingesting “clean” data is easy, right?
Step 1 of this tutorial began with downloading data using requests and saving that to a CSV file. So I did that. I’ve used requests before, so I had no reason to think it wouldn’t work.
Test-driven data pipelining
When to test, and why:
• Write a test for every method.
• Write a test any time you find a bug! Then make sure the test passes after you fix the bug.
• Think of tests as showing how your code should be used, and write them accordingly. The next person who’s going to edit your code, or even just use your code, should be able to refer to your tests to see what’s happening.
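Here is a minimal sketch of what those three rules might look like together, assuming pytest-style test functions with plain asserts. The `parse_log_line` helper, its tab-separated log format, and the blank-line bug are all hypothetical, made up for illustration and not taken from the post.

```python
def parse_log_line(line):
    """Split one tab-separated log line into a dict (hypothetical format)."""
    line = line.strip()
    if not line:
        # Guard added after the (hypothetical) blank-line bug was found.
        return None
    timestamp, user_id, event = line.split("\t")
    return {"timestamp": timestamp, "user_id": user_id, "event": event}


def test_parse_log_line_shows_expected_usage():
    # Doubles as documentation: this is how the method is meant to be called.
    row = parse_log_line("2018-01-01T00:00:00\tu42\tclick\n")
    assert row == {
        "timestamp": "2018-01-01T00:00:00",
        "user_id": "u42",
        "event": "click",
    }


def test_parse_log_line_skips_blank_lines():
    # Regression test: written when the bug was found, kept passing after the fix.
    assert parse_log_line("\n") is None
```

The first test reads like a usage example for the next person who touches the code; the second pins down a bug so it can't quietly come back.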