Title | Practical Data Analysis with JMP, Third Edition |
---|---|
Author | Robert Carver |
Genre | Software |
Series | |
Publisher | Software |
Year of publication | 0 |
ISBN | 9781642956122 |
We Want to Hear from You
SAS Press books are written by SAS Users for SAS Users. We welcome your participation in their development and your feedback on SAS Press books that you are using. Please visit sas.com/books to do the following:
● Sign up to review a book
● Recommend a topic
● Request information on how to become a SAS Press author
● Provide feedback on a book
Do you have questions about a SAS Press book that you are reading? Contact the author through https://support.sas.com/author_feedback.
SAS has many resources to help you find answers and expand your knowledge. If you need additional help, see our list of resources: sas.com/books.
About The Author
Robert Carver is Professor Emeritus of Business Administration at Stonehill College in Easton, Massachusetts, and he recently retired as a senior lecturer at the Brandeis University International Business School in Waltham, Massachusetts. At both institutions, he was instrumental in establishing programs in Data Science and Business Analytics, taught courses on quantitative methods in addition to general management courses, and won teaching awards at both schools. His primary research interest is statistics education. A JMP user since 2006, Carver holds an AB in political science from Amherst College in Amherst, Massachusetts, and an MPP and PhD in public policy from the University of Michigan at Ann Arbor.
Learn more about this author by visiting his author page at support.sas.com/carver. There you can download free book excerpts, access example code and data, read the latest reviews, get updates, and more.
Chapter 1: Getting Started: Data Analysis with JMP
Goals of Data Analysis: Description and Inference
Graph Builder: An Interactive Tool to Explore Data
Exporting and Sharing JMP Reports
Saving and Reproducing Your Work
Overview
Statistical analysis and visualization of data have become an important foundation of decision making and critical thinking. Professionals in numerous walks of life—from medicine to government, from science to sports, from commerce to public health—all rely on the analysis of data to inform their work. In this first chapter, we take our first steps into the important and rapidly growing practice of data analysis.
Goals of Data Analysis: Description and Inference
The central goal of this book is to help you build your capacity as a statistical thinker through progressive experience with the techniques and approaches of data analysis, specifically by using the features of JMP. Before we turn to JMP, we begin with some remarks about activities that require data analysis.
People gather and analyze data for many different reasons. Engineers test materials or new designs to determine their utility or safety. Coaches and owners of professional sports teams track their players’ performance in different situations to structure rosters and negotiate salary offers. Chemists and medical researchers conduct clinical trials to investigate the safety and efficacy of new treatments. Demographers describe the characteristics of populations and market segments. Investment analysts study recent market data to fine-tune investment portfolios. Increasingly, “smart” devices continuously generate high volumes of data touching on varying topics. All of the individuals who are engaged in these activities have consequential, pressing needs for information, and they turn to the techniques of statistics to meet those needs.
There are two basic types of statistical analysis: description and inference. We perform descriptive analysis in order to summarize or describe an ongoing process or the current state of a population, that is, a group of individuals or items that is of interest to us. Sometimes we can collect data from every individual in a population (every professional athlete in a sport, or every firm in which we currently own stock), but more often we are dealing with a sample, which is simply a subset of the population. When we study ongoing processes, we nearly always deal with samples.
If a company reviews the records of all its client firms to summarize last month’s sales to all customers, the summary will describe the population of customers. If the same company wants to use that summary information to make a forecast of sales for next month, the company needs to engage in inference. When we use available data to reach a conclusion about something that we cannot observe, or about something that has not happened yet, we are drawing an inference. As we will come to understand, inferential thinking requires risk-taking. Learning to measure and minimize the risks involved in inference is a central part of the study of statistics.
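The difference between describing a whole population and inferring from a sample can be sketched in a few lines of Python. This is only an illustration, not part of the book's JMP workflow: the sales figures are randomly generated, and the population size of 500 firms and the sample size of 40 are assumptions chosen for the example.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: last month's sales for 500 client firms.
# These values are invented for illustration only.
population = [random.gauss(1000, 200) for _ in range(500)]

# Description: every member of the population is observed,
# so the mean is a summary of known facts, not an estimate.
population_mean = statistics.mean(population)

# Inference: only a sample of 40 firms is observed, so the sample mean
# is an estimate of the population mean and carries uncertainty.
sample = random.sample(population, 40)
sample_mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

print(f"population mean (description): {population_mean:.1f}")
print(f"sample mean (inference):       {sample_mean:.1f} +/- {standard_error:.1f}")
```

The standard error printed with the sample mean is one simple way to quantify the risk that comes with inference. The population mean needs no such qualifier, because nothing about the population was left unobserved.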
Types of Data
The practice of statistical analysis requires data—when we “do” analysis, we are analyzing data. It’s important to understand that analysis is just one phase in a statistical study. Later in this chapter, we will look at some data collected and reported by the World Population Division of the United Nations. Specifically, we will analyze the estimated life expectancy at birth for nations around the world in 2017. This set of data is a portion of a considerably larger collection spanning many years and assembled by numerous national and international agencies.
In this example, we have five variables that are represented as five columns within a data table. A variable is an attribute that we can count, measure, or record. The variables in this example are a 3-letter code, country name, region, year, and life expectancy. Typically, we will capture multiple observations of each variable—whether we are taking repeated measurements of stock prices or recording facts from numerous respondents in a survey or individual countries around the globe. Each observation (often called a case or subject in survey data) occupies a row in a data table. In this example, the observational units are countries.
Whenever we analyze a data set in JMP, we will work with a data table. The columns of the table contain different variables, and the rows of the table contain observations of each variable. In your statistics course, you will probably use the terms data set, variable, and observation (or case). In JMP, we more commonly speak of data tables, columns, and rows.
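The layout of a data table, with variables in columns and observations in rows, can be sketched outside JMP as well. The Python below is a minimal illustration: the column labels are this sketch's own names for the five variables described above, and the three rows are invented placeholders, not the United Nations figures.

```python
# Column labels for the five variables named in the text (labels are assumptions).
columns = ["code", "country", "region", "year", "life_expectancy"]

# Placeholder rows: invented values for illustration, NOT the actual UN data.
rows = [
    ("AAA", "Country A", "Region 1", 2017, 70.0),
    ("BBB", "Country B", "Region 1", 2017, 75.5),
    ("CCC", "Country C", "Region 2", 2017, 80.0),
]

# Each row is one observation (here, one country); each column is one variable.
table = [dict(zip(columns, row)) for row in rows]

# Reading down a single column collects every observation of one variable.
life_expectancy = [obs["life_expectancy"] for obs in table]

print(len(table), "rows x", len(columns), "columns")
print(life_expectancy)
```

In this sketch, each dictionary plays the role of a row in a JMP data table, and each key plays the role of a column.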
Throughout