As explained in my earlier discussions on this topic, data science is all about analyzing data that has been collected in structured or unstructured form. As a data scientist, it is very important to think outside the box and build up relationships in the data. These could be based on time, day, season, gender, weather, mood, age bracket, seniority level, or any other pattern you find in the data. When the data is processed this way, many hidden problems come to light and become much easier to resolve. There are three main case studies that show how successfully organizations are using data science:
- Uber
- Toronto Transportation Commission
-
Data science helps the authorities take a proactive approach. It helps
One of the simplest and most common types of machine learning is regression, which estimates the relationships between variables. In the context of machine learning and data science, regression specifically refers to estimating a continuous dependent variable, or response, from a list of input variables, or features.
Let me explain regression in the simplest possible terms. If you have ever taken a cab ride, you understand regression. Here is how it works. The moment you sit in a cab, you see that there is a fixed amount on the meter. It says $2.50. Whether the cab moves or you get off right away, this is what you owe the driver the moment you step into the cab. That is a constant. You have to pay that amount if you have stepped into a cab.

Then, as the cab starts moving, the fare increases by a certain amount for every meter or hundred meters. So there is a relationship between distance and the amount you pay above and beyond that constant. And if you are not moving because you are stuck in traffic, then you pay more for every additional minute. So as the minutes increase, your fare increases. As the distance increases, your fare increases. And while all this is happening, you have already paid a base fare, which is the constant.

This is what regression is. Regression tells you what the base fare is, what the relationship is between the time and the fare you have paid, and what the relationship is between the distance you have traveled and the fare you have paid. Even if you do not know those relationships and only know how far and how long people traveled and how much they paid, regression allows you to compute the constant you did not know, that it was $2.50, and to compute the relationship between the fare and the distance and between the fare and the time. That is regression.
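To make the cab example concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the $2.50 base fare comes from the example above, but the per-kilometre rate, the per-minute rate, the simulated rides, and all function names are made up for this sketch. The code generates ride records from those rates and then uses ordinary least squares, one common way to fit a linear regression, to recover the base fare and the two rates from the records alone.

```python
import random

BASE_FARE = 2.50      # constant charge quoted in the cab example
RATE_PER_KM = 1.75    # hypothetical per-kilometre rate (assumption)
RATE_PER_MIN = 0.50   # hypothetical per-minute rate (assumption)


def simulate_rides(n_rides, seed=0):
    """Generate made-up (distance_km, minutes, fare) records, without noise."""
    rng = random.Random(seed)
    rides = []
    for _ in range(n_rides):
        distance_km = rng.uniform(0.5, 20.0)
        minutes = rng.uniform(2.0, 45.0)
        fare = BASE_FARE + RATE_PER_KM * distance_km + RATE_PER_MIN * minutes
        rides.append((distance_km, minutes, fare))
    return rides


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_fare_model(rides):
    """Least-squares fit of fare = base + per_km * distance + per_min * minutes."""
    rows = [(1.0, distance_km, minutes) for distance_km, minutes, _ in rides]
    fares = [fare for _, _, fare in rides]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * f for r, f in zip(rows, fares)) for i in range(3)]
    return solve_linear_system(xtx, xty)


base, per_km, per_min = fit_fare_model(simulate_rides(200))
print(f"estimated base fare:       {base:.2f}")
print(f"estimated rate per km:     {per_km:.2f}")
print(f"estimated rate per minute: {per_min:.2f}")
```

The fitting function never sees the three constants at the top. It only sees how far and how long each ride was and what was paid, which is exactly the situation described in the cab example. Real fare data would be noisy, so the estimates would be close to the true values rather than exact.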
Using the Cloud for Data Science:
Using the Cloud gives you instant access to open source technologies like Apache Spark without the need to install and configure them locally. It also gives you access to the most up-to-date tools and libraries without the worry of maintaining them and keeping them current. The Cloud is accessible from anywhere and in every time zone. You can use cloud-based technologies from your laptop, your tablet, and even your phone, which makes collaboration easier than ever before. Multiple collaborators or teams can access the data simultaneously and work together on producing a solution. IBM offers the IBM Cloud, Amazon offers Amazon Web Services (AWS), and Google offers Google Cloud Platform. IBM also provides Skills Network Labs (SN Labs) to learners registered at any of the learning portals on the IBM Developer Skills Network, where you have access to tools like Jupyter Notebooks and Spark clusters so you can create your own data science project and develop solutions. With practice and familiarity, you will discover how the Cloud dramatically enhances productivity for data scientists.
In this lesson, you have learned:
- The typical work day for a Data Scientist varies depending on what type of project they are working on.
- Many algorithms are used to bring out insights from data.
- Accessing algorithms, tools, and data through the Cloud enables Data Scientists to stay up-to-date and collaborate easily.