Today I got my certificate for the Python for Data Science and Machine Learning Bootcamp on Udemy. This was a great course and I really enjoy the way Jose Portilla teaches.
The class covers topics from Pandas and visualizations through machine learning, cloud computing with PySpark, all the way to TensorFlow. I can now train test split in my sleep 🙂
One of the best features of this course is the way Mr. Portilla goes over the theory behind a number of machine learning techniques, such as Linear Regression, Decision Trees, and K Means, AND then walks through using each of them step by step.
This class reinforces that having normalized, clean data is key to generating good insights. Coming up with good features to train on and building effective training sets is as important as using the right algorithm. I am proficient in cleaning data at this point and am hoping to offer it as a service.
When I finished this class, I checked myself by grabbing some tweets and running a train test split on them so that I could predict whether any given tweet was posted by a Republican or a Democrat.
It wasn’t long before I got my prediction accuracy up to 80%.
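For anyone curious what that kind of check looks like, here is a minimal sketch using scikit-learn. It is not my actual notebook: the tiny dataset is made up purely for illustration (with neutral `party_a` / `party_b` labels), and the TF-IDF plus logistic regression pipeline and the `train_and_score` helper are assumptions I picked to keep the example short. Real tweets would need a lot more cleaning first.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline


def train_and_score(texts, labels, test_size=0.25, seed=42):
    """Split the data, fit a text classifier, and return (model, accuracy).

    Hypothetical helper for illustration only.
    """
    # Hold out part of the data so the score reflects unseen examples.
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, random_state=seed, stratify=labels
    )
    # Turn raw text into TF-IDF features, then fit a linear classifier.
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return model, accuracy


# Made-up placeholder texts, NOT real tweets; labels are deliberately generic.
texts = [
    "we need more funding for parks and libraries",
    "parks and libraries make our towns better",
    "libraries deserve more funding this year",
    "new parks are opening across the county",
    "support your local libraries and parks",
    "funding for parks is a priority",
    "road repairs should come first in the budget",
    "the budget must focus on road repairs",
    "fix the roads before anything else",
    "road repairs are long overdue",
    "our budget needs to prioritize roads",
    "repairs to roads and bridges cannot wait",
]
labels = ["party_a"] * 6 + ["party_b"] * 6

model, accuracy = train_and_score(texts, labels)
print(f"Held-out accuracy: {accuracy:.2f}")
```

With a dataset this small the score means very little; the point is just the shape of the workflow: split first, fit on the training part only, and measure on the part the model has never seen.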
I will be continuing my education with TensorFlow. It is a very exciting technology with unlimited possibilities. Watch this space…