Humanity has learned to collect, process, and use vast amounts of data in science, business, and everyday life. But what should we do about the data we do not have? Is it acceptable to ignore what we fail to notice? British statistician David Hand believes that doing so is short-sighted at best and, at times, extremely dangerous. In his book he identifies 15 types of data that remain in the shadows yet influence our decisions and actions. Examples include the distress signals that residents of poor neighborhoods could have sent if they had smartphones, the results of a medical study that were deliberately withheld or accidentally distorted, and data that became “dark” because of poorly chosen criteria for inclusion in a sample. Hand also explains which measures can mitigate the effect of “dark data” and how such data can be turned to one's advantage. The book will appeal to a wide range of readers interested in data science, programming, and statistics.
This is a book about how ecologists can integrate remote sensing and GIS in their research. It will allow readers to get started with the application of remote sensing and to understand its potential and limitations. Using practical examples, the book covers all necessary steps from planning field campaigns to deriving ecologically relevant information through remote sensing and modelling of species distributions. An Introduction to Spatial Data Analysis introduces spatial data handling using the open source software Quantum GIS (QGIS). In addition, readers will be guided through their first steps in the R programming language. The authors explain the fundamentals of spatial data handling and analysis, empowering the reader to turn data acquired in the field into actual spatial data. Readers will learn to process and analyse spatial data of different types and interpret the data and results. After finishing this book, readers will be able to address questions such as “What is the distance to the border of the protected area?”, “Which points are located close to a road?”, “Which fraction of land cover types exist in my study area?” using different software and techniques. This book is for novice spatial data users and does not assume any prior knowledge of spatial data itself or practical experience working with such data sets. Readers will likely include student and professional ecologists, geographers and any environmental scientists or practitioners who need to collect, visualize and analyse spatial data. The software used comprises the widely applied open source scientific programs QGIS and R. All scripts and data sets used in the book will be provided online at book.ecosens.org.
This book covers specific methods including: what to consider before collecting in situ data; how to work with spatial data collected in situ; the difference between raster and vector data; how to acquire further vector and raster data; how to create relevant environmental information; how to combine and analyse in situ and remote sensing data; how to create useful maps for field work and presentations; how to use QGIS and R for spatial analysis; and how to develop analysis scripts.
Statistics has played a key role in the scientific understanding of the world for centuries, and in the era of big data a basic understanding of the discipline and statistical literacy are becoming critically important. David Spiegelhalter invites you on an engaging introduction to the theory and practice of statistics, unburdened by technical detail. This book is intended both for students who want to get acquainted with statistics without delving into technical details and for a wide range of readers interested in the statistics they encounter at work and in everyday life. Even experienced analysts will find interesting examples and new insights for their practice. Published in Russian for the first time.
Learn how to apply the principles of machine learning to time series modeling with this indispensable resource. Machine Learning for Time Series Forecasting with Python is an incisive and straightforward examination of one of the most crucial elements of decision-making in finance, marketing, education, and healthcare: time series modeling. Despite the centrality of time series forecasting, few business analysts are familiar with the power or utility of applying machine learning to time series modeling. Author Francesca Lazzeri, a distinguished machine learning scientist and economist, corrects that deficiency by providing readers with a comprehensive and approachable explanation and treatment of the application of machine learning to time series forecasting. Written for readers who have little to no experience in time series forecasting or machine learning, the book comprehensively covers all the topics necessary to: understand time series forecasting concepts, such as stationarity, horizon, trend, and seasonality; prepare time series data for modeling; evaluate time series forecasting models’ performance and accuracy; and understand when to use neural networks instead of traditional time series models in time series forecasting. Machine Learning for Time Series Forecasting with Python is full of real-world examples, resources, and concrete strategies to help readers explore and transform data and develop usable, practical time series forecasts. Perfect for entry-level data scientists, business analysts, developers, and researchers, this book is an invaluable and indispensable guide to the fundamental and advanced concepts of machine learning applied to time series modeling.
Get ahead of the curve: learn about big data on the blockchain. Blockchain came to prominence as the disruptive technology that made cryptocurrencies work. Now, data pros are using blockchain technology for faster real-time analysis, better data security, and more accurate predictions. Blockchain Data Analytics For Dummies is your quick-start guide to harnessing the potential of blockchain. Inside this book, technologists, executives, and data managers will find information and inspiration to adopt blockchain as a big data tool. Blockchain expert Michael G. Solomon shares his insight on what the blockchain is and how this new tech is poised to disrupt data. Set your organization on the cutting edge of analytics, before your competitors get there! Learn how blockchain technologies work and how they can integrate with big data. Discover the power and potential of blockchain analytics. Establish data models and quickly mine for insights and results. Create data visualizations from blockchain analysis. Discover how blockchains are disrupting the data world with this exciting title in the trusted For Dummies line!
Organizations can make data science a repeatable, predictable tool, which business professionals use to get more value from their data. Enterprise data and AI projects are often scattershot, underbaked, siloed, and not adaptable to predictable business changes. As a result, the vast majority fail. These expensive quagmires can be avoided, and this book explains precisely how. Data science is emerging as a hands-on tool for not just data scientists, but business professionals as well. Managers, directors, IT leaders, and analysts must expand their use of data science capabilities for the organization to stay competitive. Smarter Data Science helps them achieve their enterprise-grade data projects and AI goals. It serves as a guide to building a robust and comprehensive information architecture program that enables sustainable and scalable AI deployments. When an organization manages its data effectively, its data science program becomes a fully scalable function that’s both prescriptive and repeatable. With an understanding of data science principles, practitioners are also empowered to lead their organizations in establishing and deploying viable AI. They employ the tools of machine learning, deep learning, and AI to extract greater value from data for the benefit of the enterprise. By following a ladder framework that promotes prescriptive capabilities, organizations can make data science accessible to a range of team members, democratizing data science throughout the organization. Companies that collect, organize, and analyze data can move forward to additional data science achievements: improving time-to-value with infused AI models for common use cases; optimizing knowledge work and business processes; utilizing AI-based business intelligence and data visualization; establishing a data topology to support general or highly specialized needs; successfully completing AI projects in a predictable manner; and coordinating the use of AI from any compute node, from inner edges to outer edges (cloud, fog, and mist computing).
When they climb the ladder presented in this book, businesspeople and data scientists alike will be able to improve and foster repeatable capabilities. They will have the knowledge to maximize their AI and data assets for the benefit of their organizations.