
DATA SCIENCE - HARNESSING THE POWER OF DATA
AUTHOR(S) -
Dr. ANIRBAN MANDAL, SHOKHJAKHON ABDUFATTOKHOV, SEJUTI SARKER TINNY, Dr. ANIRUDH REDDY CINGIREDDY
DOI – 10.61909/AMKEDTB052419
Genre/Subject – Computer Science, Data Science
Book code – AMKEDTB052419 pgs: 261
ISBN(E) – 978-81-972267-9-3
ISBN(P) – 978-81-972267-4-8
Published – 18/05/2024
AUTHOR(S)

Dr. Anirban Mandal
Dr. Anirban Mandal obtained his B.Sc Honours (Physics) from the University of Calcutta in 1999. He then completed his B.Tech and M.Tech at the Institute of Radiophysics and Electronics (CU) in 2002 and 2004, respectively. He was awarded a Ph.D (Engg) by Jadavpur University. He is currently working as an Associate Professor and Head of the Department of Electronics and Communications Engineering, Future Institute of Engineering and Management, Sonarpur, Kolkata. During his 20 years of academic journey, he has published more than 15 research articles and 2 patents on antenna and IoT-based designs. He now works on Data Science and AI & ML, with many contributions to national-level projects.

Mr. Shokhjakhon Abdufattokhov
Mr. Shokhjakhon Abdufattokhov was born on October 20, 1992 in Andijan, Uzbekistan. In 2011, he entered Turin Polytechnic University in Tashkent and earned a BSc degree with distinction in Mechanical Engineering in 2015. From 2015 to 2016, he worked as an engineer at the GM Uzbekistan car plant. Subsequently, he enrolled in the Erasmus Mundus Joint Master Degree Program in Mathematical Modelling in Engineering and successfully defended his thesis in 2018 at the University of L’Aquila, Italy. He is an associate professor in the Automatic Control and Computer Engineering Department at Turin Polytechnic University in Tashkent. He has published more than 30 articles in high-impact international journals and conferences. His research interests are data analysis, predictive modeling, system identification, machine learning and optimal control.

Ms. Sejuti Sarker Tinny
Ms. Sejuti Sarker Tinny is a high school graduate student with an insatiable curiosity for the world of Data Science. Despite her young age, she has already delved deep into the realms of statistics, programming and machine learning, fueled by a passion for unravelling the mysteries hidden within vast datasets. Tinny’s journey into data science began with a simple fascination for numbers, which quickly evolved into a fervent desire to understand the stories they tell. With a keen eye for detail and a knack for problem-solving, she has embarked on numerous data analysis projects, exploring diverse topics ranging from environmental trends to socioeconomic patterns. Tinny’s writing reflects her dedication to demystifying complex concepts and making them accessible to readers of all backgrounds. Through her contributions to this book, she hopes to inspire fellow students to embark on their adventures in the captivating world of data science.

Dr. Anirudh Reddy Cingireddy
I’m Dr. Anirudh Reddy Cingireddy, a versatile data scientist and computer science assistant professor with a wealth of experience in both academia and industry. With a Ph.D. in Computational Science focused on the classification of Parkinson’s disease, my expertise lies in leveraging machine learning techniques to extract insights from complex datasets.
From teaching a diverse range of undergraduate courses to advising honors students and actively participating in university events, my tenure-track role at East Central University underscores my commitment to nurturing future talents and fostering academic excellence. With a track record of peer-reviewed publications, paper reviews, and poster presentations on topics ranging from COVID-19 diagnostics to the application of machine learning in disease classification, I am dedicated to pushing the boundaries of interdisciplinary research.
EDITOR

SHUBHODIP SASMAL
Mr. Shubhodip Sasmal is an IT professional based in Atlanta, Georgia, with 15 years of experience, currently employed at Tata Consultancy Services. He earned his Bachelor’s Degree in Information Technology in 2007 from West Bengal University of Technology. Over the years, he has cultivated a comprehensive skill set encompassing Artificial Intelligence, Machine Learning, Cloud Computing, Database Management, and Data Processing. He is dedicated to staying at the forefront of technological advancements, consistently updating his skills to align with the evolving IT landscape. His proficiency extends beyond technical expertise, as he actively engages in professional development, contributing to a culture of continuous learning. With a track record of successfully leading initiatives and collaborating with cross-functional teams, he is passionate about leveraging his knowledge and experience to drive innovation in the dynamic realm of Information Technology.
ABOUT BOOK / ABSTRACT
“Data Science: Harnessing the Power of Data” is a comprehensive guide that delves into the diverse and dynamic field of data science. With ten meticulously crafted chapters, this book covers a wide array of topics ranging from foundational concepts to advanced methodologies and future trends in data science.
In Chapter 1, “Introduction to Data Science,” readers are introduced to the rise of data science, its definition, historical milestones, and its cross-disciplinary nature. The chapter explores the applications of data science across industries, its impact on decision-making, and the challenges it presents, emphasizing the importance of ethical data practices.
Chapter 2, “Foundations of Statistics and Mathematics,” provides readers with a solid grounding in statistical fundamentals, probability, and mathematical tools essential for data analysis. It discusses multivariate analysis techniques, linear algebra, and advanced statistical concepts, along with their real-world applications and challenges in statistical modeling.
Moving on to Chapter 3, “Programming for Data Science,” readers learn about choosing the right programming language, data manipulation using Python and R, data visualization techniques, and best practices in code structure and optimization. The chapter also covers version control, collaborative coding, and deploying data science solutions.
In Chapter 4, “Data Collection and Cleaning,” strategies for effective data collection, web scraping, handling missing data, outlier detection, and data transformation techniques are discussed. The chapter also addresses ethical considerations in data collection and cleaning processes.
Chapter 5, “Exploratory Data Analysis (EDA),” guides readers through uncovering patterns and trends, descriptive statistics, data visualization, advanced EDA techniques, feature engineering, time series analysis, and dimensionality reduction methods.
Chapter 6, “Machine Learning Fundamentals,” introduces machine learning concepts, types, feature selection, model evaluation metrics, hyperparameter tuning, cross-validation techniques, ensemble learning, and real-world applications, while discussing challenges and ethics in machine learning.
In Chapter 7, “Advanced Machine Learning Algorithms,” readers explore deep learning, CNNs, RNNs, transfer learning, GANs, reinforcement learning, explainability, limitations, and ethical considerations in advanced machine learning, along with case studies.
Chapter 8, “Big Data and Data Engineering,” covers understanding big data, Hadoop, Apache Spark, distributed storage systems, scalable data pipelines, real-time data processing, data warehousing, best practices, challenges, and ethical considerations.
Chapter 9, “Data Ethics and Privacy,” addresses the importance of ethical data practices, privacy concerns, regulatory compliance, bias, fairness, explainability, accountability, responsible data governance, ethical decision-making frameworks, and case studies.
Finally, Chapter 10, “Future Trends in Data Science,” discusses the evolving landscape, emerging technologies, data science applications, continuous learning, ethical challenges, democratization, and predictions for the future, and it concludes with the ongoing journey of data science.
This book serves as a comprehensive resource for students, professionals, and enthusiasts seeking to navigate the complexities of data science and unlock its transformative potential in today’s digital age.