Today I got my certificate for the Python for Data Science and Machine Learning Bootcamp on Udemy. This was a great course, and I really enjoy the way Jose Portilla teaches.

The class runs from Pandas and visualizations through machine learning and cloud computing with PySpark, all the way to TensorFlow. I can now train test split in my sleep 🙂

One of the best features of this course is that Mr. Portilla goes over the theory behind a number of machine learning techniques, such as Linear Regression, Decision Trees, and K-Means, AND walks through using each of them step by step.

This class reinforces that having normalized, clean data is key to generating good insights. Coming up with good features and an effective training set is as important as using the right algorithm. I am proficient in cleaning data at this point and am hoping to offer it as a service.
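To make "normalized, clean data" a little more concrete, here is a minimal sketch in plain Python. It assumes "clean" means dropping missing values and "normalized" means rescaling to zero mean and unit variance, which is only one of several common meanings. The function name and the sample numbers are made up for illustration and are not from the course.

```python
from statistics import mean, pstdev


def clean_and_standardize(values):
    """Drop missing entries (None), then rescale to zero mean and unit variance."""
    present = [float(v) for v in values if v is not None]
    mu = mean(present)
    sigma = pstdev(present)  # population standard deviation
    if sigma == 0:
        # Every value is identical, so there is no spread to scale by.
        return [0.0 for _ in present]
    return [(v - mu) / sigma for v in present]


# Hypothetical raw column with one missing value.
raw = [2, None, 4, 4, 4, 5, 5, 7, 9]
scaled = clean_and_standardize(raw)
print(scaled)
```

In a real project the same scaling should be fitted on the training rows only and then applied to the test rows, so the test set does not leak into the model.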

When I finished this class I tested myself by grabbing some tweets and splitting them into training and test sets so that I could predict whether any given tweet was posted by a Republican or a Democrat:

[Image: train test split]

It wasn’t long before I got my prediction accuracy up to 80%.

[Image: 80% accuracy]
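For anyone curious about the general shape of this kind of experiment, here is a rough sketch in plain Python. It is not the code from my project: the tweets are made-up placeholder strings, the word-count "model" is a deliberately simple stand-in for a real classifier, and every function name here is hypothetical. The idea is just to hold out part of the labeled data, fit on the rest, and measure accuracy on the held-out part.

```python
import random
from collections import Counter


def train_test_split(rows, test_size=0.3, seed=42):
    """Shuffle a copy of rows and hold out a fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]


def fit_word_counts(train):
    """Count how often each word appears under each label."""
    counts = {}
    for text, label in train:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts


def predict(counts, text):
    """Pick the label whose training tweets used these words most often.

    Ties (including all-unknown words) go to the alphabetically first label.
    """
    words = text.lower().split()
    return max(sorted(counts), key=lambda label: sum(counts[label][w] for w in words))


def accuracy(counts, test):
    """Fraction of held-out rows whose predicted label matches the true label."""
    hits = sum(predict(counts, text) == label for text, label in test)
    return hits / len(test)


# Placeholder data: invented tokens, not real tweets.
tweets = [
    ("alpha beta vote", "republican"),
    ("gamma alpha today", "republican"),
    ("beta gamma vote", "republican"),
    ("alpha vote today", "republican"),
    ("gamma beta alpha", "republican"),
    ("delta epsilon vote", "democrat"),
    ("zeta delta today", "democrat"),
    ("epsilon zeta vote", "democrat"),
    ("delta vote today", "democrat"),
    ("zeta epsilon delta", "democrat"),
]

train, test = train_test_split(tweets, test_size=0.3, seed=42)
model = fit_word_counts(train)
print(f"held-out accuracy: {accuracy(model, test):.0%}")
```

scikit-learn's `train_test_split` in `sklearn.model_selection` does the splitting step with more options, and it is what you would normally reach for instead of writing your own.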

I will be continuing my education with TensorFlow. It is a very exciting technology with unlimited possibilities. Watch this space…

