
necessity for being able to use data for input into machine learning algorithms.

      There will be many situations when an AI system needs to process or analyze a corpus of data with far less structure than the organized data typically found in a financial or transactional system. Fortunately, learning algorithms can be used to extract meaning from ambiguous queries and to make sense of unstructured data inputs.
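      As a hedged illustration of that idea, the following sketch matches an ambiguous free-text query against a small corpus of unstructured documents using TF-IDF vectors and cosine similarity from scikit-learn. The corpus, the query string, and the parameter choices are hypothetical and not part of the original text.

# Hypothetical sketch: matching an ambiguous query against unstructured text.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Quarterly revenue and transaction ledger for the finance team",
    "Customer support chat transcripts with unresolved complaints",
    "Sensor logs collected from factory-floor equipment",
]
query = "customer complaints"  # ambiguous, unstructured input

vectorizer = TfidfVectorizer(stop_words="english")
doc_vectors = vectorizer.fit_transform(documents)
query_vector = vectorizer.transform([query])

# Rank the documents by similarity to the query and report the best match.
scores = cosine_similarity(query_vector, doc_vectors)[0]
best = scores.argmax()
print(f"Best match (score {scores[best]:.2f}): {documents[best]}")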

      Learning and reasoning go hand in hand, and the number of learning techniques can become quite extensive. The following is a list of some learning techniques that may be leveraged when using machine learning and data science:

       Active learning

       Deductive inference

       Ensemble learning

       Inductive learning

       Multi-instance learning

       Multitask learning

       Online learning

       Reinforcement learning

       Self-supervised learning

       Semi-supervised learning

       Supervised learning

       Transduction

       Transfer learning

       Unsupervised learning

      Some learning types are more complex than others. Supervised learning, for example, comprises many different types of algorithms, and transfer learning can be leveraged to accelerate solving other problems. All model learning for data science requires an information architecture that can cater to the needs of training models. Additionally, the information architecture must provide you with a means to reason through a series of hypotheses to determine an appropriate model or ensemble for use either standalone or infused into an application.

      Models are frequently divided along the lines of supervised learning and unsupervised learning. The division can become less clear with the inclusion of hybrid learning techniques such as semi-supervised, self-supervised, and multi-instance learning models. In addition to supervised learning and unsupervised learning, reinforcement learning models represent a third primary learning method that you can explore.

      Two specific techniques used with supervised learning include classification and regression.

       Classification is used for predicting a class label that is computed from attribute values.

       Regression is used to predict a numerical label; the model is trained on existing observations so that it can predict the label for a new observation.
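      The following is a minimal, hedged sketch of both techniques, assuming scikit-learn and a small synthetic dataset rather than any data from the original text.

# Hypothetical sketch: supervised classification and regression with scikit-learn.
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))           # attribute values (features)

# Classification: predict a class label computed from the attribute values.
y_class = (X[:, 0] + X[:, 1] > 0).astype(int)
clf = LogisticRegression().fit(X, y_class)
print("Predicted class:", clf.predict([[0.5, 0.5]]))

# Regression: predict a numerical label for a new observation.
y_reg = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)
reg = LinearRegression().fit(X, y_reg)
print("Predicted value:", reg.predict([[0.5, 0.5]]))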

      An unsupervised learning model operates on input data without any specified output or target variables. As such, unsupervised learning does not use a teacher to help correct the model. Two problems often encountered with unsupervised learning include clustering and density estimation. Clustering attempts to find groups in the data, and density estimation helps to summarize the distribution of data.

      K-means is one type of clustering algorithm, where each data point is assigned to the cluster whose mean (centroid) it is closest to. Kernel density estimation is a density estimation algorithm that combines kernels centered on closely related data points to estimate the overall distribution.
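      The following is a hedged sketch of both ideas using scikit-learn; the synthetic data and the parameter choices (two clusters, a Gaussian kernel with a bandwidth of 0.5) are illustrative assumptions only.

# Hypothetical sketch: k-means clustering and kernel density estimation.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(1)
# Synthetic one-dimensional data drawn from two groups.
data = np.concatenate([rng.normal(-2.0, 0.5, 50), rng.normal(2.0, 0.5, 50)]).reshape(-1, 1)

# K-means: each point is assigned to the cluster whose mean (centroid) is closest.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data)
print("Cluster centers:", kmeans.cluster_centers_.ravel())

# Kernel density estimation: kernels centered on the data points summarize the distribution.
kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(data)
log_density = kde.score_samples(np.array([[-2.0], [0.0], [2.0]]))
print("Estimated densities:", np.exp(log_density))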

      In the book Artificial Intelligence: A Modern Approach, 3rd edition (Pearson Education India, 2015), Stuart Russell and Peter Norvig describe how an unsupervised model can learn patterns in its input without any explicit feedback.

       The most common unsupervised learning task is clustering: detecting potentially useful clusters of input examples. For example, a taxi agent might gradually develop a concept of “good traffic days” and “bad traffic days” without ever being given labeled examples of each by a teacher.

      Reinforcement learning uses feedback as an aid in determining what to do next. In the example of the taxi ride, receiving or not receiving a tip along with the fare at the completion of a ride serves to imply goodness or badness.
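      As a hedged, minimal sketch of that feedback loop, the following toy tabular Q-learning update treats the presence or absence of a tip at the end of a ride as the reward. The states, actions, reward probabilities, and learning parameters are hypothetical and not taken from the original text.

# Hypothetical toy example: tabular Q-learning with a tip as the reward signal.
import random
from collections import defaultdict

actions = ["fast_route", "scenic_route"]
q_table = defaultdict(float)            # Q-values keyed by (state, action)
alpha, gamma, epsilon = 0.1, 0.9, 0.2   # learning rate, discount, exploration rate

def choose_action(state):
    # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: q_table[(state, a)])

def ride_reward(action):
    # Assumed reward model: the faster route earns a tip more often.
    tip_probability = 0.8 if action == "fast_route" else 0.3
    return 1.0 if random.random() < tip_probability else 0.0

state = "pickup"
for _ in range(1000):
    action = choose_action(state)
    reward = ride_reward(action)        # feedback: tip received or not
    best_next = max(q_table[(state, a)] for a in actions)
    # Standard Q-learning update driven by the observed reward.
    q_table[(state, action)] += alpha * (reward + gamma * best_next - q_table[(state, action)])

print({a: round(q_table[(state, a)], 2) for a in actions})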

      The main statistical inference techniques for model learning are inductive learning, deductive inference, and transduction. Inductive learning is the common machine learning approach of using evidence, in the form of data, to determine an outcome. Deductive inference reasons top-down and requires that each premise be met before reaching a conclusion. In contrast, induction is a bottom-up type of reasoning that uses data as evidence for an outcome. Transduction refers to predicting specific examples directly from other specific examples in a domain, without first deriving a general model.
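      A hedged sketch of the contrast follows, using scikit-learn's LabelSpreading, which is transductive in the sense that it assigns labels to the specific unlabeled points it is given; the data points and parameter values are illustrative assumptions.

# Hypothetical sketch: transduction assigns labels to specific given examples.
import numpy as np
from sklearn.semi_supervised import LabelSpreading

# Two obvious groups of one-dimensional points.
X = np.array([[0.0], [0.2], [0.4], [5.0], [5.2], [5.4]])
# Only one labeled example per group; -1 marks the unlabeled points.
y = np.array([0, -1, -1, 1, -1, -1])

model = LabelSpreading(kernel="knn", n_neighbors=2)
model.fit(X, y)

# The transduction_ attribute holds the labels inferred for the given points themselves.
print("Transduced labels:", model.transduction_)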

      LEARNING

      The variety of opportunities to apply machine learning is extensive. The sheer variety helps explain why so many different modes of learning are necessary:

       Advertisement serving

       Business analytics

       Call centers

       Computer vision

       Companionship

       Creating prose

       Cybersecurity

       Ecommerce

       Education

       Finance, algorithmic trading

       Finance, asset allocation

       First responder rescue operations

       Fraud detection

       Law

       Housekeeping

       Elderly care

       Manufacturing

       Mathematical theorems

       Medicine/surgery

       Military

       Music composition

       National security

       Natural language understanding

       Personalization

       Policing

       Political

       Recommendation engines

       Robotics, consumer

       Robotics, industry

       Robotics, military

       Robotics, outer space

       Route planning

       Scientific discovery

       Search